Which is the more important sense: sight or sound? Those of us who spend our waking lives working with audio have little doubt that the aural dimensions of reality are more interesting than the visual. And many in the video industry agree, if a little backhandedly: One of the oft-repeated “truisms” heard in video circles is “Television without picture is radio, but television without sound is technical difficulties.” Not to mention the variations on the end of that sentence floating around, like “…furniture,” “…surveillance” or — my favorite — “…unemployment.”
They all point to one thing: Grabbing the eyeballs may be good for getting someone’s attention, but most of the really important information coming from the television set is going to your ears. By many measures, our hearing is more acute than our sight: The range of our aural perception (at least when we’re young) covers 10 octaves, while our response to visible light barely covers one octave. The dynamic range of the human ear is about 120 dB, and we can go from the lowest extreme of that range to the highest pretty much instantaneously; the dynamic range of the ocular system, taking into account both the physical and chemical changes the eye undergoes to adjust to varying light conditions — some of which can take as long as several minutes — is a mere 60 dB.
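For anyone who wants to check the arithmetic behind those comparisons, here is a minimal sketch. The band limits are assumptions, not figures from this column: roughly 20 Hz to 20 kHz for young ears and roughly 380 to 750 nm for visible light. The decibel figures are treated as power ratios.

```python
import math

def octaves(low, high):
    """Number of doublings between two positive values (frequency or wavelength)."""
    return math.log2(high / low)

def db_to_power_ratio(db):
    """Convert a level difference in decibels to a power ratio."""
    return 10 ** (db / 10)

# Assumed limits, for illustration only.
HEARING_HZ = (20.0, 20_000.0)   # nominal range of young human hearing
VISIBLE_NM = (380.0, 750.0)     # nominal visible wavelengths; the ratio is what matters

hearing_octaves = octaves(*HEARING_HZ)   # just under 10
vision_octaves = octaves(*VISIBLE_NM)    # just under 1

# How much wider a 120 dB range is than a 60 dB range, as a power ratio.
range_advantage = db_to_power_ratio(120) / db_to_power_ratio(60)

if __name__ == "__main__":
    print(f"Hearing spans about {hearing_octaves:.2f} octaves")
    print(f"Vision spans about {vision_octaves:.2f} octaves")
    print(f"120 dB vs. 60 dB is a factor of {range_advantage:,.0f} in power")
```

Under those assumed limits, a 60 dB gap works out to a factor of a million in power, which is why the difference is anything but "mere."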
Audio people know this both from instinct and from experience. Yet most of us still have to fight the media powers-that-be over the value of good audio. And on the output side, networks, duplication houses and other distribution channels can be abominably careless with sound, whether it’s dropping sync, screwing up phase relationships, slamming tracks through brick-wall limiters, grossly misadjusting surround encoders or collapsing your beautiful 5.1 mix into comb-filtered mono.
So we beat our heads against the wall trying to get the video people we work with to understand how to make at least decent audio; we write articles decrying the situation until we’re blue in the face (personally, I’m past blue, all the way to indigo); and we make nuisances of ourselves on online forums where videographers and editors hang out.
Or we can become video-makers and push them out of the picture altogether. But despite the fact that a surprising number of audio pros have taken up video as a hobby or even a sideline, most of us don’t have the inclination to change our primary profession. If you can’t join the video-makers, however, you can at least work within the medium to understand the rules of their game.
Getting your hands dirty with video is probably the best way to learn what you need to know to talk intelligently with your clients. And it’s a whole lot easier than it used to be. When I first started working with video in the late 1980s, there were a half-dozen tape formats in use, and the machines to play most of them cost as much as I had spent on my entire studio. I had to beg and plead with the producers I worked with to find a VHS hi-fi machine so that they could make dubs for me. Even worse, I had to try to convince them to put timecode on one of the audio tracks so I could lock to it.
But things got better, and in recent years, since computers got fast enough to handle it, I’ve been working exclusively with QuickTime video. On those occasions when I need to digitize a tape (or sometimes even a DVD), I have a cute little Canopus FireWire video bridge with composite and S-video inputs that does a good job of converting the video and keeping the audio in sync. I also have Apple’s Final Cut Express, and between the hardware and software I have improved my ability to communicate with the world of video immensely: Not only can I now talk the talk, but when necessary I can show the people with whom I’m working what I’m talking about by fixing their mistakes right in front of them — in a nice way, of course.
One thing you need to realize about digital video, if you don’t already know this, is that it’s almost always compressed. In the early days, computer video-editing systems compressed the data so much that their actual video output was unusable for anything except rough cuts; instead, the systems created an edit decision list (EDL) that was then exported to a dedicated hardware editor, which (or who) worked long into the night assembling the actual program from the original tapes.
It’s really only in the past six years or so that our desktop computers have been powerful enough to work on the actual video program material and produce an acceptable output. But it’s still compressed: The most common format is called DV (its proper name is actually DV25), which uses a data-compression ratio of 5:1, what is called “light” compression in the video world. (Have a look at YouTube to see what’s considered “heavy.”) DV arrived just around the time when computers were ready for it — and thus we saw the explosion of DV-based software like Final Cut, Premiere 6 and iMovie.
But while we’ve been settling into the DV comfort zone, the game has changed. Nowadays, we have to learn how to deal with high definition, known to its friends — and enemies — as HD. HD presents a huge leap forward in both quality and complexity. Data rates for compressed HD start at 100 Mbps and go up — way up — from there. Standard FireWire connections, which will happily carry 16 channels of high-res audio, have only enough room for a single HD channel, and sometimes have trouble with that. Instead, IEEE 1394b, the 800Mbps version of FireWire, is the preferred way to connect an HD peripheral with a computer.
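A rough back-of-the-envelope sketch puts those numbers side by side. The 100 Mbps HD figure and the 800 Mbps 1394b figure come from the text above. The 400 Mbps nominal rate for standard FireWire and the reading of "high-res" as 24-bit/96 kHz are assumptions. Protocol overhead is ignored, and real sustained throughput on the bus is lower than the nominal figure.

```python
def pcm_mbps(channels, sample_rate_hz, bits_per_sample):
    """Raw data rate of uncompressed PCM audio, in megabits per second."""
    return channels * sample_rate_hz * bits_per_sample / 1_000_000

def megabytes_per_minute(mbps):
    """Storage consumed by one minute of a stream at the given bit rate."""
    return mbps / 8 * 60

# Assumed nominal bus rates (Mbps); usable throughput is lower in practice.
FIREWIRE_400 = 400
FIREWIRE_800 = 800

# 16 channels of audio, assuming "high-res" means 24-bit/96 kHz.
audio_mbps = pcm_mbps(16, 96_000, 24)

# Entry-level compressed HD rate quoted in the text.
hd_mbps = 100

if __name__ == "__main__":
    print(f"16-channel audio: {audio_mbps:.1f} Mbps "
          f"({audio_mbps / FIREWIRE_400:.0%} of nominal FireWire 400)")
    print(f"One HD stream:    {hd_mbps} Mbps "
          f"({hd_mbps / FIREWIRE_400:.0%} of nominal FireWire 400, "
          f"{hd_mbps / FIREWIRE_800:.1%} of 1394b)")
    print(f"HD storage:       {megabytes_per_minute(hd_mbps):.0f} MB per minute")
```

Even at the bottom of the HD range, one video stream needs nearly three times the bandwidth of all 16 audio channels put together, and the higher HD data rates only widen that gap.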
The fact that there are three primary but incompatible forms of HDTV makes things even more difficult. Factor in the number of frame rates from which you can choose, and you end up with 13 different formats. There is also no audio standard for HD: You can theoretically use just about any sample rate, word length or number of channels, although HD tape formats have a fixed number of channels — either four or eight.
Fortunately, the price of a stake at the HD table has gone down considerably just in the past year or so. Once you get past the horsepower and throughput challenges, finding software tools that work in HD is almost ludicrously simple, especially if you’re a Mac owner. Other software makers are following Apple’s lead into low-cost HD editing programs, and the hardware manufacturers are right there, too, with some pretty astonishing new tape- and disk-based HD camcorders now available for less than $1,000.
There’s still the issue of getting the stuff into your computer. Capturing and spitting out HD video without compromising its quality is a very different issue from doing it with SD. You need a new generation of hardware, and it’s not going to be cheap. But it may come from some unexpected places.
Don’t even think of sending a composite video signal into your computer; you have to have real component inputs. Consumer-level equipment will have analog component jacks, but professional gear uses the Serial Digital Interface (SDI), and you will probably have to deal with both. For outputs, you will need DVI (for computer monitors) and HDMI (for large-screen televisions). It will probably also help you to have external blackburst, word clock and timecode inputs to make sure everything stays locked correctly. And audio? To cover all the bases, you’ll need eight channels, preferably in both analog and digital.
There are a number of manufacturers that make HD capture cards and breakout boxes. Usable HD interfaces start at less than $1,000, but to get a full feature set you have to spend more. One of the more intriguing boxes, which is becoming available just as you read this, is the V3HD HD interface ($2,500) from Mark of the Unicorn (MOTU).
Although it may strike some of us as odd that an audio company is making a foray into video (we usually think of it as going the other way around, as for example when Sony bought MCI or Avid sucked up Digidesign), it’s actually not a new phenomenon. Back in the early ’80s, not long after Fairlight revolutionized the audio world with its CMI, the company came out with the low-cost CVI (the “V” for video), a real-time video mixer and effects generator that found favor among video artists and music video producers.
A decade later, Sonic Foundry, the original maker of Sound Forge, came out with Vegas Video; Sony, which acquired Sonic Foundry’s desktop software line in 2003, still supports it. Passport Designs, a pioneer in MIDI sequencing, never got into digital audio the way its competitors MOTU, Steinberg and Opcode did, but instead went after the video-editing world with Producer. It got excellent reviews and a TEC Award nomination, but was too far ahead of its time to keep the company afloat. More recently, Solid State Logic joined forces with Broadcast Devices Ltd. to develop an all-in-one mixing, editing and asset-management system called Gravity.
The V3HD isn’t MOTU’s first video product. In the late ’80s, it came out with the Video Timepiece, a brilliant, if under-appreciated, $1,200 box that saved me from countless disasters having to do with the previously mentioned lousy tape dubs I often had to work with. It could re-generate and jam sync SMPTE timecode, lock to external video clock and do window burn. Best of all, it could let me see what was going on with the timecode on the dubs so well that I could go back to the director and tell him what he was doing wrong, and often help him with his sync problems — in a nice way.
The V3HD has everything you could ever need in an HD interface: It has 32 audio channels in several different digital and analog formats, and RS-422 machine control. It allows for simultaneous ingesting (that’s the new buzzword, get used to it) and output of multiple video formats so you can run your SD and HD sources without changing cables, and you can see what your edits look like on SD and HD monitors at the same time. It comes with control software and lets you adjust the delay times on the audio to compensate for the built-in video delays that HD processing systems and monitors introduce. It will do frame-rate, HD format and audio sample-rate conversion, to and from every conceivable standard on the fly.
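To get a feel for what that audio-delay adjustment involves, here is a minimal sketch of the arithmetic: converting a video delay measured in frames into milliseconds and audio samples. This is not MOTU's software or API. The function names and the two-frame delay are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

# NTSC-derived frame rate, kept as an exact fraction (about 29.97 fps).
NTSC_FPS = Fraction(30000, 1001)

def video_delay_seconds(frames, fps):
    """Exact duration of a delay of `frames` video frames at `fps`."""
    return Fraction(frames) / Fraction(fps)

def audio_delay_ms(frames, fps):
    """The same delay in milliseconds, as a float for display."""
    return float(video_delay_seconds(frames, fps) * 1000)

def audio_delay_samples(frames, fps, sample_rate_hz):
    """Delay rounded to the nearest whole audio sample."""
    return round(video_delay_seconds(frames, fps) * sample_rate_hz)

if __name__ == "__main__":
    # Hypothetical case: a monitor that displays the picture two frames late.
    frames_late = 2
    print(f"{audio_delay_ms(frames_late, NTSC_FPS):.2f} ms")
    print(f"{audio_delay_samples(frames_late, NTSC_FPS, 48_000)} samples at 48 kHz")
```

Keeping the frame rate as a fraction rather than 29.97 avoids small rounding errors that can add up over a long program.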
MOTU is investing a lot in video: “The video market is now at the point where audio was 10 years ago when we did the 2408,” says lead hardware designer Paul Sullivan, referring to the company’s first PCI audio system, “or where MIDI was 20 years ago.” Recall that a little more than 20 years ago, the most sophisticated piece of video hardware for the computer-based music studio was the Roland SBX-80 SMPTE-to-MIDI converter, which had a clumsy interface, no computer programmability and no memory. But a lot of audio people who were hungry to get into the brave new world of video bought it anyway. Now that world is exponentially bigger.
In today’s post world, it’s not enough just to know more about audio than your clients do; you also need to be conversant in video, and that means HD. So shelling out the bucks for some HD hardware and learning what goes into and comes out of it might be a wise move. After all, knowing what your clients are doing better than they do is an awfully good way to show them how valuable your services are.
Paul Lehrman is coordinator of music technology for Tufts University and thanks Don Schechter for his help with this column. His Insider Audio Bathroom Reader is available from Thomson Course Technology and