

Surround Encoding and Monitoring


Photo: Courtesy iStock

If one phrase best describes the state of surround audio production today, it would be “ever-changing.” Surround is still trying to find itself across film, television, gaming, audio-only and, ultimately, portable delivery systems. It seems that at the high end, in event spaces such as theaters, the channel count will only go up, while in the home and on mobile devices, the move is toward refining the surround experience and re-creating the immersive field over headworn sets.

Last year, Dolby, the worldwide leader in surround encoding across nearly all platforms, debuted a revolutionary technology called Dolby Atmos, an object-based system that can accommodate up to 62.2 channels for theater playback through a proprietary decoder. Dolby also works continuously with companies like Intel and Nokia, at the chip level, to make surround easily portable. DTS, meanwhile, Dolby’s chief rival on the high end, created some buzz at CES a few weeks ago with DTS Headphone:X, which delivers 11.1 for headworn playback. GenAudio’s AstoundSound is gaining traction as an encode for 2-channel headphone and speaker listeners. And a couple of heavy hitters in the headworn device market have created impressive products that provide a believable playback experience, even when the listener is moving.

For the companies that make physical surround playback products and develop algorithms, a major hurdle has been to provide broad distribution of quality playback. This is especially daunting given the consumer’s love affair with portable devices. For engineers, the frustration has always been how to get all listeners to have the same or similar experiences when playing back their mixes, no matter the format. The past few years have revealed a separation between technology developments for the big theater experience and those for the phone you listen to while riding public transit. Yet both areas have to be addressed in delivery. They also have to be addressed on the front end.

The release of Dolby Atmos has been followed by early adoption on the high end, with Skywalker Sound, Todd-AO and Warner Bros., among others, installing systems. Atmos is often described as allowing up to a 62.2-channel mix, but that isn’t entirely accurate. Those are the limits of the “physical channels.” The system throws out the concept of a purely channel-based mix in favor of a hybrid approach that directs sound as dynamic objects (or sound elements) that envelop the listener, in combination with channels for playback. The flexibility of object-based mixing provides total control over placement and movement of individual sounds or “objects” anywhere within a theater environment.

The rooms used to create the content are equipped with a render mastering unit (RMU), a box the engineer uses to develop and play back the Dolby Atmos mix. Speakers are still the elements used to get the audio out (5.1 or 7.1, for instance), but with Dolby Atmos, sound objects can be placed anywhere in the soundfield, even overhead.

It all starts with a bed mix of 5.1 or 7.1. On top of that bed mix is a hybrid solution that marries in objects that can be placed anywhere within the speaker configuration. At any one time, the objects can be placed in the field between the speakers or made to come directly from them, simulating up to 64 discrete speaker feeds. All this data is stored within the master mix, which is then rendered in real time at the venue to match the speaker placement, size or geometry of the space. This allows an accurate rendition of the mixer’s intent to be carried through to the venue, no matter what gear it has. As long as the venue has the Dolby Atmos decoder, it’s all done via the math. Right now, Dolby is using the same RMU in the mix studio and commercial theaters as it waits for the Dolby Atmos Cinema Processor (CP850), which should be available in spring 2013.
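To make the rendering idea concrete, here is a minimal Python sketch of how one object's position could be turned into speaker gains for whatever layout a venue happens to have. It is not Dolby's renderer, whose math is proprietary. It assumes a flat ring of speakers described only by azimuth and uses ordinary constant-power panning between the two speakers on either side of the object. The function name `pan_gains`, the angle convention (degrees, positive to the right) and the example layouts are all hypothetical.

```python
import math

def pan_gains(object_az, speaker_az):
    """Constant-power pairwise panning of one object onto a ring of speakers.

    object_az  -- object azimuth in degrees (assumed convention: 0 = front,
                  positive = to the right)
    speaker_az -- list of distinct speaker azimuths in degrees
    Returns one gain per speaker, in the same order as speaker_az.
    """
    n = len(speaker_az)
    if n < 2:
        raise ValueError("need at least two speakers to pan between")

    # Walk around the ring in order of increasing azimuth.
    order = sorted(range(n), key=lambda i: speaker_az[i] % 360.0)
    az = object_az % 360.0
    gains = [0.0] * n

    for k in range(n):
        a = order[k]
        b = order[(k + 1) % n]          # next speaker around the ring
        start = speaker_az[a] % 360.0
        span = (speaker_az[b] - speaker_az[a]) % 360.0
        offset = (az - start) % 360.0
        if offset <= span:
            # Object sits between speakers a and b: split power between them.
            frac = offset / span
            gains[a] = math.cos(frac * math.pi / 2)
            gains[b] = math.sin(frac * math.pi / 2)
            break
    return gains

# Hypothetical layouts: a 5.0 ring and a plain stereo pair.
five_oh = [-110.0, -30.0, 0.0, 30.0, 110.0]   # Ls, L, C, R, Rs
stereo = [-30.0, 30.0]                         # L, R

# The same object metadata (15 degrees right of center) rendered twice.
gains_five_oh = pan_gains(15.0, five_oh)
gains_stereo = pan_gains(15.0, stereo)
```

The point of the sketch is the one the article makes: the mix stores where the object is, not which speaker it goes to, so the same stored position yields different gains in the 5.0 room and the stereo room while the total power stays the same.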

While a mixer may have a good shot at upholding integrity with basic stereo mixes playing back over quality speakers or headworn devices, surround content complicates the matter. With all the different surround encoders and decoders, plus a rash of software and hardware “helper” devices and upmixers meant to enhance the quality of a listener’s experience, there’s no telling how your mix will sound when it gets to the end of the line. It took years to achieve normalcy in television. Today there’s no guarantee that a Verizon phone will act the same as an AT&T phone.

This is where DTS thinks it has the answer with its new Headphone:X system, fresh out of CES 2013. Headphone:X takes advantage of the advanced properties of the DTS-HD audio codec and is said to precisely simulate the experience of being in any movie theater or on the movie’s mixing stage, and to fine-tune that experience to an exact seat location within the room.

DTS went to great pains at CES to demo its system, collaborating with Focal to create a unique 11.1 speaker playback system as a reference, then quickly A/B’ing it against Headphone:X. Michael Farino from DTS explains: “The system is not a matrixed upmix but a re-creation of the discrete playback in a particular space.” So for content creators, nothing changes. Farino continued: “An engineer would mix the music/film as they would in any space, then the mixes can be played back from the perspective of the sampled environment.”

The ability to make all this happen lies in the Headphone:X metadata, which is captured using a proprietary system. Right now only DTS can produce this room-capture data, but the company is working on simple user tools so creators can sample their own spaces. For DTS and surround playback, this is a groundbreaking product that, if it delivers as promised, can provide high-quality surround playback and assure broad distribution with little effect on the engineer. Plus it’s scalable from stereo to 11.1, meaning existing mixes from stereo on up can be played back in the modeled environments.

For GenAudio’s AstoundSound encoding process, Greg Morgenstein, GenAudio’s senior mix engineer, wants the pure surround mix as a source for his encode. GenAudio AstoundSound is a 4-D sound localization cue technology intended for professional and consumer software and hardware product integration. Morgenstein’s “playground” is Astound Studios, a 5.1 studio built specifically for AstoundSound.

When Morgenstein creates the AstoundSound encode, he would rather have the full surround mix, or start one from scratch, as his intention is to accurately re-create the original surround listening experience over two speakers and headphones. To check the finished product, the 5.1 original and a fold-down created with a Waves plug-in are checked against the Astound encode, which is created by running parts of the original mix through plug-ins. The center goes through a single plug-in, and the front and rear stereo pairs each have their own plug-ins. This gives Morgenstein the ability to customize the encode for each application; it’s not a one-size-fits-all encode.
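For readers who have not built one by hand, the kind of fold-down used as a comparison reference can be sketched in a few lines of Python. This is not the Waves plug-in's algorithm, which the article does not describe; it is the common textbook stereo downmix with the center and surrounds mixed in at about -3 dB (a gain of 0.707) and the LFE channel left out. The function name `fold_down` and its parameters are hypothetical.

```python
import math

MINUS_3_DB = math.sqrt(0.5)  # about 0.707

def fold_down(l, r, c, lfe, ls, rs,
              center_gain=MINUS_3_DB, surround_gain=MINUS_3_DB):
    """Fold one 5.1 sample frame down to a stereo (Lo, Ro) pair.

    The LFE sample is accepted so the call matches a 5.1 frame, but it is
    deliberately discarded, as is common in simple stereo downmixes.
    """
    lo = l + center_gain * c + surround_gain * ls
    ro = r + center_gain * c + surround_gain * rs
    return lo, ro

# A frame with signal only in the center channel lands equally in both sides.
center_only = fold_down(0.0, 0.0, 1.0, 0.0, 0.0, 0.0)

# A frame with signal only in the left surround stays entirely on the left.
left_surround_only = fold_down(0.0, 0.0, 0.0, 0.0, 1.0, 0.0)
```

A fold-down like this simply sums channels, which is why it is useful only as a baseline: any sense of sound behind the listener is lost, and that is the part a localization encode is trying to preserve.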

So in answer to the opening questions, engineers can pretty much carry on with business as usual, unless of course they find themselves on the Skywalker Sound stage and get to mix a jet-by with a new ceiling speaker. But all mix engineers would be wise to brush up on the wide range of options out there in surround delivery and develop techniques to make sure the integrity of their mixes follows through to the AMC Theater or the iPhone on the train.