Understanding The DVD-Audio Format
With the world’s attention riveted on the Internet, the DVD-Audio format is facing a tough fight – will DVD-Audio establish itself as the successor to the Compact Disc, or will it face the same indifference as the ill-fated Enhanced CD format? Now that the first DVD-Audio players and titles are trickling out, it’s hard to remember why the format was proposed in the first place, and what it actually brings to the party. This article will attempt to clear up some of that confusion.
It must be said that the DVD-Audio format is a compromise and not everyone will be happy with it. But DVD-Audio significantly extends the technical and creative options of prerecorded audio, while simultaneously offering a consumer-friendly platform for music-related multimedia. Although it will take a while to make consumers aware of its presence, music industry professionals who begin exploring DVD-Audio’s capabilities should immediately find much to like.
Like CDs, DVD-Audio uses Pulse Code Modulation (PCM). But DVD-Audio takes significant technical strides beyond CD by supporting higher-resolution (word length and sample rate), multichannel sound and lossless compression. Depending on how these factors come into play on a given disc, the format also offers greatly extended playing times.
One of DVD’s most important characteristics is versatility. Beyond a simple audio format, DVD-Audio offers built-in support for the same kinds of “value-added” multimedia features that the record labels tried to support with the Enhanced CD format, which was undermined by technical incompatibilities and consumer apathy. DVD-Audio will support still pictures, text and video and, like DVD-Video, may use graphical menus.
How much (or little) a DVD-Audio release takes advantage of these multimedia features is entirely up to the producer. Unlike CDs, which are all conceptually the same, DVD-Audio albums can, and probably will, be all over the map in terms of audio and multimedia content. That means there will likely be no such thing as a “typical” DVD-Audio disc. And, because the audio market depends on a broad spectrum of player types – from inexpensive personal portables to boom boxes, automotive decks and home hi-fi players – there will be many types of DVD-Audio players.
DVD-Audio has been designed to account for all these variables and still maintain compatibility. All in all, the format allows a fair amount of creative flexibility, but demands strict adherence to the technical details of the specification, which can be complex. Before we dig into the specifics of the audio and other media types that make up DVD-Audio content, let’s examine the underlying structure of the format, as well as the various types of supported discs and players.
A Common Platform
In defining the DVD family of formats, the consumer electronics, entertainment and computing industries created the specifications for a single platform with multiple uses, which would allow for both interoperability in use and economies of scale in manufacturing. So all prerecorded DVDs are actually DVD-ROMs formatted in the UDF file system.
A DVD-Video, for instance, is a DVD-ROM disc that includes a VIDEO_TS directory (folder) containing all the presentation data (video, audio, etc.) and navigational data required by the DVD-Video specification. A DVD-Video player is designed to look for this DVD-Video “zone” and uses the data to play back DVD-Video content. For DVD-Audio, the equivalent zone is called AUDIO_TS.
Anything on a DVD that is not in these DVD-Video or DVD-Audio zones is referred to as being in the “DVD-Others” zone. This could be any kind of computer data, such as a huge database, an interactive game or a clip-art collection. A single-sided, single-layer DVD, known as a DVD-5, has a total storage capacity of 4.7 GB for the combined contents of all the disc’s zones; a DVD-9 (single-sided, dual-layer) has space for 8.54 GB. The DVD format also includes DVD-10 (double-sided, single-layer) and DVD-18 (double-sided, dual-layer).
A common underlying platform for all DVDs makes it easy to allow a DVD-Video or DVD-Audio disc to be played from a DVD-ROM drive (assuming the host computer is properly equipped) and includes extra features – HTML pages that link to the Web, for instance – that may be accessed on a computer. When the same disc is played in a set-top player, however, the DVD-Others material is ignored, and only the contents of the DVD-Video or DVD-Audio zone are played.
Managers and Groups
Many artists and/or labels may choose to use only DVD-Audio’s audio capabilities and will create what have been referred to as “Pure Audio” DVDs. In its simplest form, a Pure Audio disc will function much the same way as a CD; the player uses linear, track-based navigation to access the disc’s contents.
When a CD is inserted into a CD player, the player reads the TOC (table of contents) to find the addresses of all the tracks. With DVD-Audio, a player looks in the AUDIO_TS folder for a similar directory of the disc’s contents, referred to in DVD lingo as a “manager.” The manager that is equivalent to the CD’s TOC is the SAMG (Simple Audio Manager), essentially a list of up to 314 tracks. Every DVD-Audio disc is required to include a SAMG to enable track-based navigation.
A more sophisticated form of Pure Audio DVD will offer more flexible navigation by taking advantage of the DVD-Audio specification’s organizational hierarchy. Of this hierarchy’s five levels – album, group, title, track, index – users are generally only conscious of tracks and groups.
As with CD, a DVD-Audio track is a single selection, such as a song. The function of a group is to allow an album’s producer to specify multiple playlists of tracks. Up to nine groups are allowed per album (each side of a DVD-Audio disc is one album). Because a group is simply a playlist, a given track may be referenced by more than one group. For example, on an album with 20 audio tracks, one group might be a sequence of all the songs, another could be a “mellow” playlist of just acoustic numbers, and a third might be a “party” playlist of just dance tracks.
Using groups, producers may create up to nine different listening experiences from one set of material. To access a given track using the player’s remote, the user first enters a group number and then the number of the track within that group. Once any track within a given group has begun playing, the player will continue playing the rest of that group’s tracks on through the end of the group.
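The relationship between tracks and groups can be sketched in a few lines of Python. This is only an illustrative model, not the on-disc data layout: the `Album` class, its method names and the sample track titles are all hypothetical, and the only rules taken from the text are the nine-group limit, the fact that a track may appear in several groups, and playback running on to the end of the chosen group.

```python
from dataclasses import dataclass, field

MAX_GROUPS = 9  # per the article: up to nine groups per album


@dataclass
class Album:
    """Hypothetical in-memory model of one album (one disc side)."""
    tracks: list[str]  # track titles, in disc order
    groups: list[list[int]] = field(default_factory=list)  # indices into tracks

    def add_group(self, track_indices: list[int]) -> int:
        """Register a playlist and return its 1-based group number."""
        if len(self.groups) >= MAX_GROUPS:
            raise ValueError("an album may define at most nine groups")
        if any(i < 0 or i >= len(self.tracks) for i in track_indices):
            raise ValueError("group refers to a track that does not exist")
        self.groups.append(list(track_indices))
        return len(self.groups)

    def play_from(self, group_no: int, track_no: int) -> list[str]:
        """Titles heard after keying in a group number and then a track
        number within that group (both 1-based); playback continues to
        the end of the group."""
        group = self.groups[group_no - 1]
        return [self.tracks[i] for i in group[track_no - 1:]]


album = Album(tracks=["Opener", "Ballad", "Dance Mix", "Acoustic Closer"])
everything = album.add_group([0, 1, 2, 3])  # the full running order
mellow = album.add_group([1, 3])            # same tracks, reused
party = album.add_group([0, 2])
```

Calling `album.play_from(mellow, 1)` yields only the two tracks of the “mellow” playlist, even though both also belong to the full-sequence group.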
Multiple Players
As noted earlier, a successful audio format needs to be compatible with a range of players covering a wide variety of prices and playback environments, from the jogging trail to the living room. The mandated inclusion of SAMG allows manufacturers to design inexpensive machines that utilize only track-based navigation, and thus keeps DVD-Audio from being limited to the high end. Players that support the use of groups, however, ignore SAMG and instead read a different manager called AMG (Audio Manager). By requiring two different directories, the specification ensures that every disc can be read by different types of players that are designed for quite different segments of the consumer market.
To give hardware manufacturers even more flexibility in targeting their players to specific markets, the more sophisticated players – those that use AMG rather than SAMG – may or may not include video outputs to hook to a television for DVD-Audio’s multimedia features (SAMG players never have video outs). Even without video, AMG players may support text by means of a front-panel LED display that shows information, such as song titles (with a choice of languages).
To figure out how to play back a given disc, players with video outputs look at a section of AMG known as AMG/AVTT (the “AV” refers to audio-with-video). AMG players without video outputs, on the other hand, refer to AMG/AOTT (audio-only). If there is no graphical content on the disc, then AVTT and AOTT are essentially the same. By defining these two distinct sections of AMG, the specification allows a single type of disc to cover two quite different types of players.
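The way each class of player picks its directory can be summarized as a simple lookup. The sketch below is a restatement of the preceding paragraphs rather than real player firmware; the `PlayerType` enum and the `directory_for` function are names invented for illustration.

```python
from enum import Enum


class PlayerType(Enum):
    """Hypothetical labels for the player classes described in the text."""
    TRACK_BASED = "track-based, audio-only"
    GROUP_AUDIO_ONLY = "supports groups, no video output"
    GROUP_WITH_VIDEO = "supports groups, has video output"


def directory_for(player: PlayerType) -> str:
    """Return the manager (or manager section) the player consults."""
    if player is PlayerType.TRACK_BASED:
        return "SAMG"        # simple list of tracks, no groups
    if player is PlayerType.GROUP_WITH_VIDEO:
        return "AMG/AVTT"    # audio-with-video view of the disc
    return "AMG/AOTT"        # audio-only view of the disc


for player in PlayerType:
    print(f"{player.value}: reads {directory_for(player)}")
```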
Audio and Video Tracks
Some of the content on a DVD-Audio disc may be playable not only on DVD-Audio players but also on a DVD-Video player, because there are two basic types of tracks allowed in DVD-Audio: audio and video.
The basic unit of presentation data for an audio track is an AOB (Audio Object) file. Each AOB contains a PCM audio stream, plus an optional Dolby Digital (AC-3) audio stream. Optionally, an audio track can also be accompanied by still images (photos, graphics) and/or text. The still images are stored in presentation data files called ASVs (Audio Still Video), made up of an MPEG-2 encoded frame with optional highlightable subpicture overlays for information, such as lyrics or bios.
Up to 99 ASVs (not more than 2 Megabytes total) may be grouped together into an ASV Unit that plays over an “ASVU range” of one or more audio tracks. The producer decides the Display mode, which may be Slideshow (predefined image duration) or Browseable, and the order, which may be sequential, random or shuffle. The player preloads each ASVU into memory before starting to play the tracks in that range. There is no audio output for at least two seconds during this preloading, so the boundaries of ASVU ranges must be placed in such a way as to avoid muting during a continuous audio program.
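An authoring tool might check the two ASVU limits mentioned above before a disc image is built. The following is a minimal sketch under stated assumptions: the function name is hypothetical, the still-image sizes are made up, and “2 Megabytes” is read here as 2 × 1024 × 1024 bytes, which the text does not confirm.

```python
MAX_ASVS_PER_UNIT = 99
MAX_UNIT_BYTES = 2 * 1024 * 1024  # assumption: "2 Megabytes" taken as 2 MiB


def asvu_problems(still_sizes: list[int]) -> list[str]:
    """Return human-readable problems for one ASV Unit.

    still_sizes holds the encoded size in bytes of each still image.
    An empty result means both limits from the text are respected.
    """
    problems = []
    if len(still_sizes) > MAX_ASVS_PER_UNIT:
        problems.append(f"{len(still_sizes)} stills exceeds the limit of 99")
    if sum(still_sizes) > MAX_UNIT_BYTES:
        problems.append(f"{sum(still_sizes)} bytes exceeds the 2 MB budget")
    return problems


print(asvu_problems([100_000] * 10))   # ten modest stills
print(asvu_problems([30_000] * 100))   # too many stills, and too large
```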
The presentation data for a video track, meanwhile, comes from VOB (Video Object) files. As in DVD-Video, a VOB contains interleaved streams of MPEG-2 video, plus audio and optional subtitles. However, video is not handled identically under the two specifications. Except during menus with motion-video backgrounds, DVD-Audio does not support DVD-Video features, such as seamless branching, parental control, and some of the author-defined commands that make DVD-Video capable of complex interactivity.
The producer of a DVD-Audio decides whether or not the audio from a given video track should play back on audio-only players. If yes, then the video’s audio must include a linear PCM stream. (An optional Dolby Digital stream may also be present.) If no – if the audio would be pointless without the accompanying picture – then a linear PCM stream is not required. (Using Dolby Digital only would be allowed.) However, a video track that is not set up to allow audio playback on audio-only players may not be included in a group with any material that is. That means that such video tracks must be segregated into their own groups.
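The two rules in this paragraph lend themselves to a simple authoring check. The sketch below is hypothetical throughout: the `Track` fields and the `group_problems` function are invented names, and the code encodes only what the text states, namely that a video track meant for audio-only players needs linear PCM, and that tracks audio-only players cannot play may not share a group with tracks they can.

```python
from dataclasses import dataclass


@dataclass
class Track:
    """Hypothetical description of one track for authoring checks."""
    title: str
    is_video: bool = False
    has_lpcm: bool = True             # audio tracks always carry PCM
    plays_on_audio_only: bool = True  # producer's choice for video tracks


def group_problems(tracks: list[Track]) -> list[str]:
    """List rule violations for the tracks placed in one group."""
    problems = []
    for t in tracks:
        if t.is_video and t.plays_on_audio_only and not t.has_lpcm:
            problems.append(f"{t.title}: needs a linear PCM stream")
    if len({t.plays_on_audio_only for t in tracks}) > 1:
        problems.append("group mixes tracks that audio-only players "
                        "can and cannot play")
    return problems


song = Track("Song")
concert = Track("Concert", is_video=True)  # has LPCM, plays everywhere
clip = Track("Clip", is_video=True, has_lpcm=False,
             plays_on_audio_only=False)    # Dolby Digital only

print(group_problems([song, concert]))  # acceptable combination
print(group_problems([song, clip]))     # must be split into separate groups
```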
When a DVD-Audio player looks for the files (audio, stills and text) that it may need to play back audio tracks, it finds them in the AUDIO_TS folder. The data for video tracks, however, is stored in a VIDEO_TS folder like that used on a DVD-Video. The inclusion of a VMG (Video Manager) in this folder is what allows the disc’s video tracks to also be played in a DVD-Video player, which cannot recognize AMG (or SAMG). This setup allows creation of discs that will play back something on a DVD-Video player, even though the rest of the DVD-Audio material is only accessible on a DVD-Audio player.
While the implementation of all these playback variations may seem dauntingly complicated, it’s all intended to allow producers to use as many DVD-Audio features as they like – without creating discs that won’t play at all on certain types of players. In practice, the types of players that producers will likely need to target come down to three: track-based, audio-only players; audio-only players that support groups; and “Universal” players that have video outputs and also include the ability to play DVD-Video titles, such as consumer video releases.
Audio Formats
With a general understanding of how DVD-Audio works, we can focus on the format’s audio capabilities. As explained above, there are actually two types of tracks, audio and video, and each has its own audio requirements.
The audio requirements for video tracks, based on the DVD-Video spec, are the simpler of the two. Linear PCM streams are supported at 16, 20 and 24-bit word lengths. Two sample rates are supported: streams at 48 kHz may use up to eight channels; streams at 96 kHz may use two channels only. The maximum audio bit rate is limited to 6.144 Megabits per second. For Dolby Digital streams, up to 5.1 channels may be used, with a maximum bit rate of 448 kilobits per second.
The audio for audio tracks, which is really the heart of the specification, is quite a bit more complex. While players may optionally include support for formats such as Dolby Digital or DTS, support for PCM is required. Players are actually required to support two types of PCM: linear (the same type as on CDs, aka LPCM) and “packed” using Meridian Lossless Packing. MLP is a data-reduction technique that allows PCM to be expressed more compactly than LPCM – yielding storage and bandwidth efficiencies of 40% to 50%, depending on the program – and then reconstructed with bit-for-bit correspondence to the original signal (hence the term “lossless”).
MLP was included in the DVD-Audio specification to facilitate the use of high-resolution and multichannel sound. The supported resolutions use 16, 20 or 24-bit word lengths and are divided into two sample-rate families. One is based on the CD’s 44.1 kHz rate, and also includes 88.2 kHz and 176.4 kHz rates. The other is based on the standard audio-for-video rate of 48 kHz, and also includes the multiples 96 kHz and 192 kHz. Within each family, the highest rate is supported for mono or 2-channel playback, while the other rates allow up to six channels.
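The sample-rate families and their channel limits reduce to a small lookup table. The sketch below simply restates the figures given in this paragraph; the constant and function names are illustrative and do not come from the specification.

```python
# Sample rates in Hz, lowest to highest within each family.
SAMPLE_RATE_FAMILIES = {
    "44.1 kHz family": (44_100, 88_200, 176_400),
    "48 kHz family": (48_000, 96_000, 192_000),
}
WORD_LENGTHS = (16, 20, 24)  # supported bits per sample


def max_channels(sample_rate_hz: int) -> int:
    """Channel limit for an audio track at the given sample rate."""
    for rates in SAMPLE_RATE_FAMILIES.values():
        if sample_rate_hz in rates:
            # The top rate of each family is mono or 2-channel only.
            return 2 if sample_rate_hz == rates[-1] else 6
    raise ValueError(f"{sample_rate_hz} Hz is not a listed sample rate")


for rate in (44_100, 96_000, 192_000):
    print(f"{rate} Hz: up to {max_channels(rate)} channels")
```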
High-resolution audio eats up a lot of bandwidth. Using LPCM, for example, six channels of 24-bit/96kHz audio requires a bit rate of 13.824 Mbps, far in excess of DVD-Audio’s audio bandwidth of 9.6 Mbps (the maximum rate at which audio data is read by the player’s drive). This problem is addressed in part by MLP, but the specification also allows producers to allocate bandwidth by using higher resolution for some channels than for others.
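The arithmetic behind these figures is simply channels × word length × sample rate. The short sketch below reproduces the 13.824 Mbps figure and compares it with the 9.6 Mbps ceiling quoted above; the function and constant names are illustrative, and container overhead is ignored.

```python
MAX_AUDIO_BIT_RATE = 9_600_000  # bits per second, as quoted in the text


def lpcm_bit_rate(channels: int, bits: int, sample_rate_hz: int) -> int:
    """Raw linear PCM bit rate in bits per second (no packing overhead)."""
    return channels * bits * sample_rate_hz


six_channel = lpcm_bit_rate(6, 24, 96_000)  # 13_824_000 bps
stereo_192 = lpcm_bit_rate(2, 24, 192_000)  # 9_216_000 bps

for label, rate in (("6 ch, 24-bit/96 kHz", six_channel),
                    ("2 ch, 24-bit/192 kHz", stereo_192)):
    verdict = "fits" if rate <= MAX_AUDIO_BIT_RATE else "exceeds the limit"
    print(f"{label}: {rate / 1e6:.3f} Mbps, {verdict}")
```

The same calculation shows why the highest sample rates are restricted to two channels: stereo at 24-bit/192 kHz already uses 9.216 Mbps of the available 9.6 Mbps.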
Each channel used in a given track is assigned to one of two Channel Groups, with the resolution of Group 2 lower than that of Group 1. Left, center and right across the front, for instance, might be in Group 1 at 24-bit/96 kHz, while left and right surrounds and a low-frequency (subwoofer) channel are assigned to Group 2 at 16-bit/48 kHz. Twenty-one different Channel Group configurations are defined in the specification. The number of channels, their assignment to the two Channel Groups and the resolution of each Channel Group may be changed on a track-by-track basis, though players may briefly mute during such changes.
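To see how the split helps, the example configuration above can be totaled as uncompressed LPCM. This is a sketch only: the `ChannelGroup` class and the channel labels are hypothetical, and the 9.6 Mbps ceiling is the figure quoted earlier in the article.

```python
from dataclasses import dataclass

MAX_AUDIO_BIT_RATE = 9_600_000  # bits per second, as quoted in the article


@dataclass(frozen=True)
class ChannelGroup:
    """Hypothetical model of one Channel Group within a track."""
    channels: tuple[str, ...]
    bits: int
    sample_rate_hz: int

    def bit_rate(self) -> int:
        """Raw LPCM bit rate of this group in bits per second."""
        return len(self.channels) * self.bits * self.sample_rate_hz


group1 = ChannelGroup(("L", "C", "R"), bits=24, sample_rate_hz=96_000)
group2 = ChannelGroup(("Ls", "Rs", "LFE"), bits=16, sample_rate_hz=48_000)

total = group1.bit_rate() + group2.bit_rate()  # 6_912_000 + 2_304_000
print(f"total: {total / 1e6:.3f} Mbps, "
      f"fits: {total <= MAX_AUDIO_BIT_RATE}")
```

At 9.216 Mbps, this six-channel layout fits within the ceiling even before MLP is applied, whereas running all six channels at 24-bit/96 kHz would not.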
While the use of MLP and Channel Groups makes high-resolution multichannel sound feasible within the available bandwidth, it still takes up a lot of disc space. This problem is compounded by the fact that the specification requires that every track on a DVD-Audio be playable on a 2-channel playback system – once again, the idea that all discs are playable on all players.
To avoid forcing producers to include additional 2-channel mixes of songs that are already present as multichannel mixes, the format requires that players support SMART, a system for downmixing to stereo on-the-fly using level, panning and phase “coefficients” that are predefined by the producer during the mix. A SMART “downmix” will only be played if a discrete 2-channel mix of a given program has not been included on the disc. Thus, producers can use separate stereo mixes or use SMART downmixes and have longer available playing time.
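Conceptually, a coefficient-based downmix is a weighted sum of the multichannel signals into left and right outputs. The sketch below illustrates that general idea only; it is not the SMART algorithm or its coefficient format, which this article does not detail. The channel names, the coefficient values and the use of a negative gain to stand in for a phase inversion are all assumptions made for the example.

```python
# Hypothetical per-channel (left gain, right gain) pairs chosen by the
# producer. A negative gain stands in for a phase inversion.
COEFFICIENTS = {
    "L": (1.0, 0.0),
    "R": (0.0, 1.0),
    "C": (0.5, 0.5),    # center shared equally between both sides
    "Ls": (0.5, 0.0),
    "Rs": (0.0, -0.5),  # right surround, inverted, into the right output
}


def downmix_frame(frame: dict[str, float],
                  coefficients: dict[str, tuple[float, float]]
                  ) -> tuple[float, float]:
    """Fold one multichannel sample frame down to a (left, right) pair."""
    left = sum(frame[ch] * coefficients[ch][0] for ch in frame)
    right = sum(frame[ch] * coefficients[ch][1] for ch in frame)
    return left, right


frame = {"L": 0.5, "R": 0.25, "C": 0.5, "Ls": 0.0, "Rs": 0.5}
stereo = downmix_frame(frame, COEFFICIENTS)
print(stereo)
```

Because the coefficients travel with the disc, the producer rather than the player decides how the multichannel mix folds down to two channels.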
The Best Hope?
It’s evident from the foregoing that the DVD-Audio specification allows tremendous flexibility. A full-featured album might include several different playlists (groups) drawn from the underlying audio tracks and accompanied by browsable “still-shows,” with lyrics you can click on to take you to different parts of the songs. And the disc could also include a set of music videos, as well as an interview with the band. A Pure Audio disc, on the other hand, would simply present the music using the producer’s preferred resolution and channel configuration (stereo up through 6-channel).
What’s nice about the format is that the basic elements – high-resolution, multichannel sound, still pictures, text and video – can all be used (or not used) without making the disc incompatible with some types of players. This means that producers’ choices can be driven by creativity (and, admittedly, budget) rather than constrained by technology.
In a world where MP3s at 11:1 compression seem to be an acceptable way to listen to music, the record industry needs more than simply ultrahigh fidelity to rekindle consumer enthusiasm for buying prerecorded music. To the extent that the format’s creative possibilities can stimulate a fresh wave of imaginative entertainment, DVD-Audio may well be the industry’s best hope.