Three years ago, multimedia on the Web was on a roll. The widespread acceptance of MP3s, combined with faster Internet connections, created an explosion of music and sound. A multiplicity of entertainments could be streamed or downloaded, and fidelity was improving.
For those of us who produce audio for the Web, the technology is now reasonably mature, but the same is not true for those who want to integrate sound with picture. Access to “fat pipes” certainly increases the options for decent video, though with serious limitations in time and space. And the masses are still linked to the Web via dial-up connections, where video is stuck at a true postage-stamp size. About the most you can say is that there is a picture, and it does at times move, but most would dismiss such limited video as a curiosity.
Despite today’s limitations, it’s important that audio professionals know what choices are available for creating and delivering multimedia content on the Web. There are some cool new technologies out there, and ingenious ways of using what we already have.
QUALITY GOALS
Before getting down to the nitty-gritty of the options, let me state what I believe to be the minimum acceptable quality benchmark: no loss in audio quality, and video at a level close to what we’ve come to expect and enjoy in audio. Continued growth in audio features, especially as they relate to picture, is another desirable goal.
For audio, the minimum standard is equivalent to the quality offered by good perceptual-coding, lossy data compression. (Of course, linear PCM is welcome wherever we can get it.) What I’d really like to see on the video side is something comparable to DVD or a good VHS tape, with a full-size picture, at full frame rate, and generally free of contouring and gross compression artifacts. At the low end, there are good applications where quarter-screen video can be used.
As video quality grows, the allocation of bandwidth to support surround sound becomes a much smaller percentage of the total, so we can expect that multichannel audio will eventually become part of the online video experience.
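The point about surround sound's shrinking share can be checked with simple arithmetic. The sketch below is illustrative only: the 384 kbps figure for a 5.1 audio stream and the video bit rates are assumed values, not numbers taken from any particular codec or service.

```python
def audio_share(audio_kbps: float, video_kbps: float) -> float:
    """Return the fraction of the total stream bit rate used by audio."""
    total = audio_kbps + video_kbps
    if total <= 0:
        raise ValueError("total bit rate must be positive")
    return audio_kbps / total


# Assumed figure, for illustration only: a 5.1 audio stream at 384 kbps.
SURROUND_KBPS = 384.0

# Assumed video bit rates, from dial-up class up to DVD class.
for video_kbps in (40.0, 300.0, 750.0, 4000.0):
    share = audio_share(SURROUND_KBPS, video_kbps)
    print(f"video {video_kbps:6.0f} kbps -> audio is {share:.0%} of the stream")
```

With the audio rate held fixed, its share of the stream falls steadily as the video rate rises, which is the trend described above.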
THE GREAT BANDWIDTH DIVIDE
More than any other application, digital video brings out the differences between broadband and dial-up connections. On a cable modem, DSL or T1 connection, video streaming and downloading are still not completely fluid, but the download time and quality are more or less acceptable. On a dial-up connection at 56 kbps or less, the experience is uncompelling, to say the least. Today, the reported combined share of cable modems and DSL in home applications is hovering around 6%, which means that 94% of potential Internet video consumers are sucking on a mighty thin straw.
Of course, even those who do have faster home connections are not exactly swimming in bandwidth. Corporate and university users with T1 or better connections have the best access, and some attractive applications target these sectors specifically. For the home user, though, it’s still a marginal proposition overall, and one that’s clearly split among haves and have-nots. Local and Wide Area Networks using Ethernet or other technology offer the best bandwidth, and these systems are found not only in corporate and academic settings, but also to an increasing degree in large condominium or other high-end residential developments. In these “Video-over-IP” configurations, a more-or-less vertical audience can be given access to very compelling applications for entertainment, training and information.
DIVX/MPEG-4 SP
In the past two years, a video-oriented subculture (not unlike the one that embraced early MP3) has sprung up around a compression standard commonly referred to as DivX. DivXNetworks, the parent company, claims more than 25 million free player downloads to date, and as was the case with MP3, the greatest activity is on college campuses and other places where young, active minds have time and access to generous bandwidth. Even so, at least one commercial site, DivX.com, is offering downloads of full-length feature films (at an average file size of 690 MB) for a fee.
DivX is based on the MPEG-4 video compression standard (the term DivX was adopted to provide a catchier handle for the community of users and to establish licensing separate from that of MPEG) and is actually identical to the lowest level of the MPEG-4 spec, referred to as Simple Profile, or MPEG-4 SP. The so-called Simple Profile defines the core of audio and video data compression for MPEG-4, taking advantage of technical developments beyond those in MPEG-1 or MPEG-2 to yield good quality at lower bit rates — dramatically lower in the case of video, somewhere between 0.5 and 1 Mbps. The picture quality can be compared to that of DVD, and download times are more or less reasonable, provided you have a fat pipe.
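To see how the 690 MB file size relates to the quoted bit rates, here is a minimal sketch of the arithmetic. It assumes decimal megabytes and a hypothetical 100-minute running time; the result is the combined audio-plus-video rate, not the video rate alone.

```python
def average_mbps(file_mb: float, minutes: float) -> float:
    """Average combined bit rate (Mbps) of a file with the given running time."""
    if minutes <= 0:
        raise ValueError("running time must be positive")
    bits = file_mb * 1_000_000 * 8          # decimal megabytes -> bits
    return bits / (minutes * 60) / 1_000_000


def minutes_at_rate(file_mb: float, mbps: float) -> float:
    """Running time (minutes) that fits in a file at a given average bit rate."""
    if mbps <= 0:
        raise ValueError("bit rate must be positive")
    bits = file_mb * 1_000_000 * 8
    return bits / (mbps * 1_000_000) / 60


# 690 MB is the average feature-film size quoted above;
# the 100-minute running time is an assumption for illustration.
print(f"{average_mbps(690, 100):.2f} Mbps average for a 100-minute film")
print(f"{minutes_at_rate(690, 1.0):.0f} minutes fit in 690 MB at 1 Mbps")
print(f"{minutes_at_rate(690, 0.5):.0f} minutes fit in 690 MB at 0.5 Mbps")
```

Under these assumptions, a feature-length film in 690 MB works out to a little under 1 Mbps overall, near the top of the 0.5 to 1 Mbps range.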
Taking it back a step, DivX/MPEG-4 SP compression is based on an earlier standard called H.263, which was developed for video conferencing. Faced with extreme bandwidth limitations and user demand for full motion, the engineers who developed H.263 came up with technical innovations that improved compression substantially over prior techniques.
This ended up becoming the basis for video compression as defined by the MPEG-4 specification, with MPEG-2 AAC for audio. AAC (Advanced Audio Coding) improves on MP3, though not to the same degree that H.263 improves on MPEG-2 on the video side. Formal blind listening tests have shown that AAC at 128 kbps delivers fidelity equivalent to that of MP3 at 192 kbps. (The full report on the ISO-sponsored testing effort can be found at www.tnt.uni-hannover.de/project/mpeg/audio/public/w2006.pdf.)
As occurred in the case of MP3 audio, the notions of MPEG-4 SP compression were picked up in the open-source developer community, resulting in the implementation of encoder-decoders that are free to everyone. For DivX, the tools of choice today are the DivX codec 4.11 for encoding, with a player application called The Playa.
Quite a lot of attention in MPEG-4 has gone into defining scalable stream delivery, capable of simultaneously addressing users with a wide range of viewing devices and bandwidth situations, including mobile users whose connection rates tend to swing wildly as they travel.
Once its full provisions are implemented, the MPEG-4 specification promises to be the key technology of multimedia for the Internet.
HOW GOOD IS DIVX?
In visiting DivX sites on the Web, you’ll frequently find the claim that DivX-encoded picture is of the same quality as DVD. This is a statement that I can’t fully endorse, having now looked at a fair number of clips. To say that DivX, as commonly practiced, is “DVD quality” is a bit like saying that MP3 sounds as good as CD.
There are issues with equivalency of playback platforms as well. All of the DivX decoders I’ve seen are software-based, which means that they are dependent on CPU performance and usually require a VGA display. Because MPEG-4 video decoding is more efficient than MPEG-2, a given CPU configuration may well look better playing DivX than it does playing DVD.
Though DivX/MPEG-4 beats the pants off anything else available for high-quality video at manageable bit rates, it still is fat enough to present issues in streaming and downloading. The broadband/56k divide is much in effect here. If you’re in the 94% of the populace that still dials in, then DivX is of marginal utility for you; a two-minute trailer can take an hour or more to download.
However, if you’ve got a lusty, fat pipe to work with, then there’s a lot of enjoyment to be had from DivX downloads. I’m still not sold on downloading 700MB feature films, because I think the DVD is probably a better deal. But there are truly nifty independent short films available that are well worth looking at.
PUTTING DIVX TO WORK
If you’re interested in using DivX for video downloads or other applications, it couldn’t be easier to get started. (Assuming, of course, that you’ve got the video!) Just download the encoder and player applications from www.divx.com and go to work. The current version runs on Windows and on Linux, with Mac support promised in the near future.
The encoder application accepts input in AVI format. For best results, be sure that your input files are to full video specification. If you’re working with less than full-resolution source video, then you may not be getting the most out of the medium.
For my money, DivX/MPEG-4 is at its best for short-form material. Luckily for us music types, that description fits music videos perfectly. At 15 to 30 MB for a two- to three-minute video, you may even get some dial-up traffic. But be warned: DivX is not going to play well on less than a PIII at 250 MHz, and I’d consider even that marginal.
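The dial-up penalty for files of this size is easy to estimate. The sketch below computes best-case transfer times, assuming decimal megabytes, a link running at its full nominal rate, and no protocol overhead. The 1.5 Mbps broadband rate is an assumed figure for comparison, and real dial-up throughput is typically below 56 kbps.

```python
def download_minutes(file_mb: float, link_kbps: float) -> float:
    """Best-case transfer time in minutes, ignoring protocol overhead."""
    if link_kbps <= 0:
        raise ValueError("link rate must be positive")
    bits = file_mb * 1_000_000 * 8          # decimal megabytes -> bits
    return bits / (link_kbps * 1_000) / 60


# The 1.5 Mbps broadband rate is an assumption for comparison only.
LINKS_KBPS = {"56k dial-up": 56.0, "assumed 1.5 Mbps broadband": 1500.0}

# File sizes quoted in this article.
FILES_MB = {
    "short video, low end": 15.0,
    "short video, high end": 30.0,
    "feature film": 690.0,
}

for file_name, size_mb in FILES_MB.items():
    for link_name, rate_kbps in LINKS_KBPS.items():
        minutes = download_minutes(size_mb, rate_kbps)
        print(f"{file_name:22s} over {link_name:27s}: {minutes:8.1f} min")
```

At a nominal 56 kbps, the 15 MB and 30 MB videos take roughly 36 and 71 minutes respectively, which squares with the "hour or more" estimate given earlier for a trailer on dial-up.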
WHAT ABOUT QUICKTIME?
Apple’s QuickTime technology, of course, is the daddy of all computer video formats, with a full 10 years in the market. When absolutely nothing else existed, CD-ROMs were full of games and movies in QuickTime format, and QuickTime downloads were the first to show up on the Web. So where is Apple’s now-venerable technology in the current arms race of Internet video?
QuickTime has undeniably lost a lot of market share (MediaMetrix reports just 4% market penetration) and momentum to the likes of RealMedia, Windows Media Player, and MPEG/DivX formats. That said, however, our old friend is far from dead. The MPEG-4 standard has adopted QuickTime’s file format, because it offers the flexibility needed for the ambitious scalability that is dear to the hearts of those driving the standards process. The media codecs, though, are coming from elsewhere, as described previously.
Apple also seems to have woken up to the need to bolster one of its flagship technologies, by upgrading video and audio quality, and by launching a campaign to re-establish QuickTime’s presence in recognized outlets. One major coup recently was the adoption of QuickTime as the format for all of the Star Wars: Episode II trailers on the Web.
STREAMING STANDARDS: WINDOWS MEDIA PLAYER AND REALNETWORKS
When someone mentions video on the Internet, the first thing that comes to mind is streaming video, a la RealNetworks and Windows Media Player. The presence and mindshare of streaming sites are undeniable, but when it comes to integration of audio and video, I feel that it may be the least satisfactory of the alternatives available, for two reasons: (a) the video sucks because of bandwidth limitations, and (b) the audio sucks because video takes up the lion’s share of available bit rate.
That said, the value of streaming video for promotion is undeniable. One of the best ways to get a potential customer to download a big video file, or order a Web-connected DVD, is to show a teaser in streaming format. For music sites, artist interviews make a sensible application of the medium, because the limitations of sound and picture are going to be less of a problem.
The Goliaths of streaming media, of course, are RealNetworks and Microsoft, and these two compete furiously. Pressured to start showing revenue, RealNetworks now charges a nominal sum for its latest player (RealOne), while Microsoft still offers free downloads of Windows Media Player, currently at version 7.01.
The infrastructure for delivery of content in both streaming formats is extensive, both from the companies and from third parties. Details on the ins and outs of encoding and delivery can be found on the Web at www.realnetworks.com and www.microsoft.com/windows/windowsmedia/default.asp.
CONCLUSION
High-quality audio and video on the Internet are facing some challenges as connection bandwidth is at a plateau, but new technologies, along with clever application of both existing and emerging standards, are pointing the way to truly satisfactory combinations of high-impact sound and picture. In this article, I’ve tried to show you that if you want to deliver kick-ass sound and picture over the Internet, or to create compelling interactive experiences with theatrical-level media, realistic options do exist. Look at the available encoders, and play with some sound locked to video. It takes work and creativity, but the possibilities are out there.
Former technical editor for Mix’s sister publication Electronic Musician, Gary S. Hall is pioneering 5.1 electronica, video and live performance, with collaborators in the international Chill Out scene centered in Bahia state, Brazil. Look forward to his forthcoming Web-connected title, Ouivir Mais e Pedir Menos (Listen More and Ask Less).
THE REVERSE OPTION
Web-Connected DVD
One way to deal with bandwidth issues is to circumvent them entirely, by putting the fat assets on hard media and mounting locally. The viewer gets instantaneous access to media files that can be as high-quality as you want them to be, including surround sound and/or high-density linear PCM audio. By linking the player application to the Internet, you can have the capability of a highly interactive and transactive entertainment experience, with local playback triggering calls to the Web or, conversely, having the site drive playback on cue with user interaction or sync to external events such as a Webcast. (See flow chart on page 72.)
With the advent of lower-cost DVD recorders and DVD authoring software, publishing to Web-connected DVD has become practical. Reasonably priced solutions exist for creating DVD-R discs on demand and delivering overnight to the end user. It’s a little-noted fact that the bandwidth of Federal Express exceeds that of the Internet, with half-a-dozen discs in an overnight box offering dozens of gigabytes in hand within a day.
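The overnight-box claim can be put in numbers. This sketch assumes six single-layer DVD-R discs at their nominal 4.7 GB (decimal) capacity and a 24-hour door-to-door delivery time; both the disc count and the delivery time are illustrative assumptions.

```python
def shipment_mbps(discs: int, gb_per_disc: float, hours: float) -> float:
    """Effective throughput (Mbps) of shipping discs that arrive after `hours`."""
    if hours <= 0:
        raise ValueError("delivery time must be positive")
    bits = discs * gb_per_disc * 1_000_000_000 * 8   # decimal gigabytes -> bits
    return bits / (hours * 3600) / 1_000_000


# Assumptions: six 4.7 GB DVD-R discs, delivered in 24 hours.
overnight = shipment_mbps(discs=6, gb_per_disc=4.7, hours=24)
dialup_mbps = 56 / 1000                              # nominal 56 kbps modem

print(f"overnight box: {overnight:.2f} Mbps sustained")
print(f"that is about {overnight / dialup_mbps:.0f} times a 56 kbps dial-up link")
```

Under these assumptions the box delivers about 28 GB at an effective 2.6 Mbps, sustained for a full day, or roughly 47 times the nominal rate of a dial-up modem.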
Web-connected DVD also offers the advantage of cross-platform playability, with the video and audio content of the disc being accessible (and sellable) to anyone with a DVD set-top player. From the audio pro’s standpoint, another big plus for Web-connected DVD as a vehicle is its support for 5.1 surround. Because DVD-Video supports 5.1 in AC-3 or DTS formats, it is straightforward to include this capability in a Web-connected DVD title.
Web-connection of a DVD-Video title potentially offers music artists a great benefit. The connection can be used to access “hidden” content, for example, to lure the listener online, where you can build a fanbase and engage the listener in dialog with the artist or with other fans. Of course, the opportunity to sell additional CDs, downloaded tunes, t-shirts and other items is not to be missed.
Tools of the DVD Trade
There have been attempts at practical Web-connectivity for DVD, but today the arena belongs nearly exclusively to Interactual Inc., of San Jose, Calif. Working with major movie studios, the company developed the first major market titles that connected DVD-Video with ROM and Internet content.
One of the big challenges in delivering Web-connected DVD to market was the player issue. Because the DVD-Video specification did not define connectivity in the medium, there is no standardized way of defining how DVD-Video content can interact with HTML-driven Web browsers. A custom player is needed, but there are tremendous economic, licensing and installed-base-conflict issues involved.
Interactual’s solution is to acknowledge whatever player application is installed, then make use of its licensed MPEG and AC-3 decoders, while replacing the navigation functions with its own Web-friendly application. The current version of this application is the Interactual Player 2.0, though millions of discs out there carry the earlier PCFriendly (Interactual’s original brand name) player. The Interactual Player, along with an auto-run installer script and the artwork and HTML codes that define the desired interaction, is formatted on the DVD disc, along with the DVD-Video content. When the end-user inserts the disc into a PC, the player application is installed to the hard disk and becomes the DVD player for that disc. Because the Interactual player knows exactly what to do to connect content and HTML, the user has a complete, custom experience of arbitrary complexity.
In developing a DVD title with Interactual ROM and Web-link features, the DVD-Video content is developed on any standard DVD authoring system. HTML pages that include calls to the Interactual player provide links to and from clips, scenes, and other features of the disc. One just has to note where these are in the structure of the DVD disc, and use the appropriate parameters to point to them from the Interactual HTML calls. Once the DVD-Video content and interactive scripts are developed and debugged, the final disc is formatted as a DVD-Video/DVD-ROM hybrid that includes the player applications, HTML and installer scripts. Note that Interactual does not support DVD-Audio. However, most DVD-Audio titles today are DVD-Video/Audio hybrids, and these can be linked with Interactual technology.
Users of Sonic Solutions’ professional and consumer DVD authoring tools have even simpler access to Web-connectivity. Sonic’s DVD Creator, DVD Fusion and DVDIt authoring applications include a feature called eDVD, which includes a limited license for the Interactual player. In the authoring tools, URLs can be entered directly for every video clip, chapter point and menu button, and the resulting DVD is automatically formatted to include the Interactual player, installation script and the necessary HTML to implement the specified links. This form of interaction is more limited than what can be achieved with the full Interactual development package, but if you’re using the Sonic Solutions tools, or planning to purchase them anyway, the price is right.
The notion of keeping bulky media assets local and driving interaction through the Internet potentially applies to media other than DVD-Video. MPEG-4 or other high-resolution audio-video content can be included on Enhanced CD, or DVD-ROM, and driven through embedded HTML. I refer to this mode of Internet media interaction as Enhanced Snail Net, or ESN, because the media assets come by ground rather than the Internet. In the case of bulky long-form assets at high resolution, this form of delivery is actually more efficient than FTP.
— Gary Hall