A scene from Gounod’s Roméo et Juliette, one of many Metropolitan Opera performances broadcast in HD
Photo: Ken Howard/Metropolitan Opera
Coloratura performances might not have changed much since Mozart’s time, but the opera experience itself is certainly going through a sea change. Technological advances have allowed operas to be broadcast on PBS and other outlets in high definition and surround sound, and this season the Metropolitan Opera is also showing eight of its productions live in theaters across the world in HD. It’s the latest and boldest step in what has been a long evolution for opera on television.
For more than 25 years, Jay David Saks has been involved in the tech end of the Met’s broadcasts, first as engineer, then audio producer. Once an aspiring conductor, Saks got involved in producing classical and Broadway cast albums (and has earned nine Grammy Awards in the process) while working with the Met on the side; when the company’s musical director, James Levine, became dissatisfied with the quality of the Met’s famous Saturday radio broadcasts, he brought in Saks to handle music production. Untrained as an engineer, Saks developed his technique as a mixer under fire. “I decided that I wanted to mix live myself rather than work with an engineer at the board,” he says. “At that time, there was no post-production at all.” Saks mixes from a permanent control room at the Met with a large window that provides a clear view of the entire stage. “This is extremely useful for opera, where singers constantly move about the stage.
“I used to mix a lot of my recordings back in the 1980s, and even today, when I’m working in post with Ken Hahn at Sync Sound, I function as a co-mixer,” he adds. The Met records dress rehearsals and the live broadcasts of all its HD transmissions, and repurposes material culled from these performances for later television replay on PBS and DVD release.
“We do post-production on all of the HD shows, and the audio work is executed at Sync Sound, my home away from home for the last 20 or 25 years,” Saks says. “The original HD shows themselves are live. Whatever happens — missed notes, mix or video mistakes, warts and all — that’s how it goes out. When we’re working in post, though, we have options. If the soprano missed a high C but nailed it in rehearsal, we’ll insert the good take to fix the spot.”
At the most recent AES show, Saks and several Met engineers looked for a digital console to replace their aging Studer board. “Recall was never that critical to our process when we were simply broadcasting live on Saturdays. These days, we have multiple productions alternating at the same time. We’re constantly juggling; sometimes we’ll rehearse a show two weeks before its production. And as time goes on, our productions have gotten more and more complicated. They require more microphones and other resources — including reverb — and we find ourselves writing down dozens of notes, drowning in millions of tape strips telling us where the trims, EQ and reverb settings are.”
Wait, did Saks just say that the Met uses reverb in its broadcasts? What would Mozart say? “Here’s the thing: For all those years before I got here,” Saks replies, “the broadcasts sounded dry and constricted because the house — while a comfortable listening environment when you’re sitting there experiencing a production — yields a dry recorded sound. I use fairly close-miking, along with more distant miking, and without reverb the sound is simply not right and surprisingly doesn’t sound the way it actually does live. From the beginning, I started using reverb, even though we only had a spring chamber in the early days!
“After the first season or two, we got a digital reverb,” he continues. “We’re currently using a Lexicon 480L. I try to make reverb sound like it isn’t artificial. I also use compression, filtering and EQ — anything to re-create in someone’s mind the sense that they’re experiencing an excellent live performance. The irony is, to do that I have to use these processing tools!
“There’s a big difference between recording an opera and an orchestral concert. When one records an orchestra in a concert hall, you’re able to pretty much place microphones wherever you need to. But opera is a visual art, so I don’t have the freedom to hang a tree or place microphones on stands where they’d be in full view of audiences both in the opera house and in HD. As a result, all of my miking is either too close to the orchestra or too far away from them, and usually not in ideal locations. Same goes for the singers. If I were making a recording of an opera, I’d mike completely differently.”
Surround sound has had no impact on the Met’s live radio broadcasts. “We’re still working in stereo,” Saks says. “However, for HD transmissions I can’t handle both stereo and 5.1 simultaneously, and the Met doesn’t want someone else doing a separate surround mix. Our live stereo mix signal is fed to a truck sitting out on Amsterdam Avenue. The guys in the truck use Dolby to up-convert to a 5.1 stream. Prior to our first HD transmission last season, I went down to Dolby with samples of Met stereo broadcast recordings, and we very carefully set the parameters that would work best. We then went over to Digital Cinema, Sync Sound’s cinema studio, to verify the quality of the signal. Ken Hunold, a Dolby employee, sits in the truck on Amsterdam and checks the up-conversion in real time. Ken brings the converter box, and our production mixer, Tom Holmes, sits with him. In preparation for the post work to come, we use David Hewitt’s multitrack truck. David records everything to Pro Tools.”
Once the elements of a show — including intermission interviews, PBS fund pitches and other ancillary material, in addition to the multitrack — have been assembled, Saks moves over to Sync Sound. “I start out editing with John Bowen, fixing music and performance errors and making inserts. We might spend just a day on this part of the process, or it could take up to three. Then I move over to another room to work with Ken. Over the years, the Met’s productions have become more complicated, and I’ll know that there are spots that need additional work. Ken and I often work for three or four days together.
“Once the stereo mix is completed, we’ll make a real discrete 5.1 mix, not an up-converted one, mostly for DVD release, PBS broadcast and archival purposes. On occasion though, we will just up-convert the stereo version. The technical complexity of the production’s recording is a factor. We always ask ourselves if a 5.1 mix built from the stems and tracks will really yield a product superior to an up-conversion. If we’re convinced that it’s worth the time, we’ll go ahead and make the 5.1 ourselves. As far as the theatrical broadcasts, I have to say that we’ve gotten great response to the up-conversion, even from audio pros.”
Delays are an unavoidable part of the live-transmission process. The task of minimizing any disjunction between the audio and video streams falls to Mark Schubin, who holds the title of engineer in charge, Media Department at the Met.
“There are several issues to consider,” says Schubin. “The first is not transmission-related; it’s based on the way theaters are constructed. Every loudspeaker you see in a theater carries surround material; the left, center and right speakers are all behind the screen. In a theater, the audience hears the bulk of the surround sound coming from in front of them. In the home, the majority of the surround information comes from speakers in the rear. Jay has to construct his 5.1 post mixes keeping this distinction in mind.
“Time delays are interesting from two standpoints. First is absolute time delay itself. We distribute our surround sound via AC3 encoding. AC3 has a significant delay on the order of six television frames for the encoder and one more for the decoder. That would be an absolute nightmare were it not for the fact that video encoding and decoding takes even longer! We can dial in the specific delay we want for the audio, and to make sure that everyone gets it right we do extensive lip synching prior to a show so that the theaters can adjust the audio and video streams. For two hours we send out signals.
“The second issue is a bit trickier. It involves the way human beings establish an audio perspective. The speed of sound is roughly 1,100 feet per second. At that rate, 37 feet is one television frame in the U.S., more or less. If a singer is standing 37 feet back from the lip of the stage, where Jay has established his microphones, the singer’s sound will, therefore, be one frame late. In a theater, someone might be sitting 37 feet from the screen as well, adding another frame of delay.
“This isn’t necessarily a problem, though — at least not yet. In television, a lot depends on what the director is doing. If he or she is showing a wide shot, no sweat — the audience expects the sound to be delayed. If a close up is being presented, however, there can be a problem: I’m seeing a big face and my brain says that it should be accompanied by immediate sound. We get the occasional complaint that lip sync has changed during the transmission of a show. That hasn’t happened; there’s really just been the introduction of an acoustic perception issue. There’s not much we can do, though, and by and large the audiences have been happy with the link we’ve created between the audio and video that we deliver.”
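Schubin's figures can be checked with a little arithmetic. The sketch below is illustrative only and is not part of the Met's workflow. It assumes the U.S. television frame rate of roughly 29.97 frames per second and uses the rounded speed of sound quoted above; the function names are invented for the example.

```python
# Rough check of the delay figures quoted in the article.
# Assumptions (not from the Met): NTSC frame rate of 30000/1001 fps,
# and the article's rounded speed of sound of 1,100 feet per second.

SPEED_OF_SOUND_FT_PER_S = 1100.0
NTSC_FRAME_RATE = 30000 / 1001  # about 29.97 frames per second


def feet_per_frame(frame_rate=NTSC_FRAME_RATE, speed=SPEED_OF_SOUND_FT_PER_S):
    """Distance sound travels during one video frame."""
    return speed / frame_rate


def acoustic_delay_frames(distance_ft, frame_rate=NTSC_FRAME_RATE):
    """Delay, in video frames, for sound to cover distance_ft."""
    return distance_ft / feet_per_frame(frame_rate)


def frames_to_ms(frames, frame_rate=NTSC_FRAME_RATE):
    """Convert a count of video frames to milliseconds."""
    return frames * 1000.0 / frame_rate


if __name__ == "__main__":
    # Singer 37 feet upstage of the microphones, listener 37 feet from the screen.
    total_frames = acoustic_delay_frames(37) + acoustic_delay_frames(37)
    print(f"Sound travels {feet_per_frame():.1f} ft per frame")
    print(f"Stage plus theater delay: {total_frames:.2f} frames")
    # AC3 codec delay as described: six frames to encode, one to decode.
    print(f"AC3 encode plus decode: {frames_to_ms(6 + 1):.1f} ms")
```

Under those assumptions, sound covers about 36.7 feet per frame, which matches Schubin's "37 feet, more or less." A singer 37 feet upstage plus a listener 37 feet from the screen adds roughly two frames of acoustic delay, and the seven frames he cites for AC3 encoding and decoding come to about 234 milliseconds.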
Gary Eskow is a Mix contributing writer.