There’s a priceless moment on the Firesign Theatre’s third album when an authority figure (a prosecutor who is somehow also an auctioneer) bellows, “What do I hear?” and a stoned voice from the back of the room responds, “That’s metaphysically absurd, man. How can I know what you hear?”
This brings to mind two questions. First of all, as we’re professionals who depend on our hearing to produce sounds that will appeal to other people’s ears, how do we know what our audience is actually hearing? And second, for that matter, how do we know what we’re hearing? These two questions are becoming even more pressing today, as most music listeners are “enjoying” sounds on lo-fi playback systems or headphones — far from the quality of studio monitors.
When it comes to our audience, you might as well ask, “What do you mean by ‘green’?” Physicists can agree on a range of wavelengths for “green,” while everyone else can point to different objects and get a general consensus from those around them that said objects are or are not the color in question. But no one can possibly put themselves into someone else’s mind to see exactly how they experience “green.” As conscious beings, our perceptions are ours alone. Lily Tomlin’s character Trudy the Bag Lady, in The Search for Signs of Intelligent Life in the Universe, put it perfectly when she said, “Reality is nothing but a collective hunch.”
Similarly with sound, we can measure its volume, look at its spectrum, see how it changes over time and analyze the impulse response of the space in which it’s produced. But there’s that subjective response to the sound that’s within our heads that can’t be measured — at least not without a sensor on every brain cell and synapse involved.
Because we’re in the business of shaping the reality of sounds, it’s fairly important that our “hunches” be correct. And it’s our ears that we trust. No amount of visual or data analysis will allow us to decide that a sound is “right” without hearing it.
How do we make that decision? A crucial part of the act of hearing is making comparisons between what our ears are telling us at the moment and the models that live in our memory of what we’ve heard before. From the moment our auditory faculties first kick in, those memories are established and baselines are formed. The first sounds all humans hear are their mothers’ voices, and then they hear other family members, then domestic sounds, and gradually they take in the larger world outside. It’s a safe bet that for most of us in this business, among those earliest aural experiences were the sounds of singing and musical instruments. Not only did these sounds intrigue and inspire us, but they provided us with the context in which we would listen to and judge the sounds we would work with in our professional lives.
So we know what things are supposed to sound like. As professionals, we learn something else: What we’re hearing through the studio monitors isn’t the same as what we hear when there’s a direct acoustic path from the sound source to our ears. Ideally, speakers would be totally flat with no distortion or phase error and with perfect dispersion, but even the best monitors are still far from being totally “transparent.” In addition, every indoor space that’s not an anechoic chamber has its peculiar colorations, which are different from any other space. We need to be able to compensate for these distortions, consciously or unconsciously, and block out the sound of the speakers and the room as we listen. Our experience and training as professionals teach us how to eliminate the medium and concentrate on the source.
But this weird thing has happened in the past hundred or so years, and the trend is accelerating: The proportion of musical sounds that people are exposed to throughout their lives that are produced by “organic” means has been decreasing and is quickly approaching zero. This means that the baselines that we, and our audiences, need to determine what sounds “real” and what doesn’t are disappearing.
Before the end of the 19th century, the only music anyone heard was performed live. The sound that reached an audience member’s ears was that of the instruments and the singers, with nothing mediating between the mechanism of production — whether it was a stick hitting a dried goatskin, the plucking of a taut piece of feline intestine or the vibrations of a set of vocal cords — and the mechanism of perception.
But with the invention of the radio and the phonograph, all of that changed. People could now listen to music 24 hours a day, every day, if they wanted and be nowhere near actual musicians. Compared to real instruments, wax cylinders and crystal sets sounded dreadful, but the convenience of hearing a huge variety of music at any time without leaving home more than made up for the loss in quality for most people.
The “hi-fi” boom that started in the 1950s improved things, as listeners began to appreciate better sound reproduction and the price of decent-sounding equipment fell to where even college students — who soon became the music industry’s most important market — could afford it. Today’s high-end and even medium-priced home audio equipment sounds better than ever.
But as the media for music delivery have blossomed — from wax cylinders to XM Radio — fewer people ever hear acoustic music firsthand. Symphony orchestras all over America are cutting back seasons or going out of business altogether, and school music programs, which traditionally have given students the precious opportunity to hear what real instruments sound like from both a player’s and a listener’s perspective, are in the toilet. While there are certainly parts of the “live” music scene that are still healthy, they depend on sound systems that, as they get bigger and more complex to project to the farthest reaches of a large venue, serve to isolate the audiences even more from what’s happening onstage acoustically.
And, as electronic sources of music have proliferated, another thing has happened: Because it is now so easy to listen to music, people actually listen to it less, and it has become more of an environmental element — aural wallpaper. Because audiences aren’t focusing so much on the music, the quality of the systems that many listen to has been allowed to slip backward. Personal stereos have been a major factor in this: From the Sony Walkman to the iPod, people are listening to crummy sound reproduction at top volume, screening out any kind of sonic reality and replacing it with a lo-fi sound. Everyone can now have their own private soundtrack, as if they were perpetually walking alone through a theme park, without any other aural distractions, with a 15dB dynamic range and nothing below 100 Hz.
I remember this hitting me like a ton of bricks one day in the summer of 1979. I had been out of the country for a few months, and soon after I returned to the U.S., I was walking in New York City’s Central Park and came upon an amazing picture: On a patch of blacktop were several dozen gyrating disco-dancing roller skaters, but the only sound I could hear was that of the skate wheels on the pavement. Each of the dancers was sporting a pair of headphones with little antennae coming out of them. Inside each of the headphones, I soon realized, was an FM radio, and they were all dancing to music that I couldn’t hear. But it became obvious — after I watched them for a few minutes — that they weren’t all dancing to the same music; each was tuned to a different station.
The “multimedia” speaker systems that people now plug into their computers so they can listen to MP3 streams have taken us further down the same road. Companies that decades ago revolutionized speaker designs — such as Advent, KLH and Altec Lansing — have had their brands swallowed up by multinational electronics foundries that slap those once-revered names on tinny little underpowered speakers connected to “subwoofers” that produce a huge hump at 120 Hz so that consumers think they’re getting something for their money.
More recently, the tools of personal audio wallpaper have entered the production chain. Again, one incident sticks out in my mind that showed me clearly where this was going: A couple of years ago, I went into a hip coffeehouse — where the blaring post-punk music makes it impossible to hold a normal conversation — and sat down at a table near a young man wearing earbuds and peering intently into a PowerBook. I glanced over, and to my amazement, I realized he was working on something in Digital Performer.
How many composers live in apartment buildings where they work late into the night and, for fear of disturbing their neighbors, never turn on their monitors but only mix on headphones? How many of your colleagues, or even you, boast of doing some of your best audio editing on a transcontinental plane flight?
A pessimist looking at this might conclude we are approaching a kind of “perfect storm” in which we completely lose control over what our audience hears. No one ever finds out what a real instrument sounds like, and the systems that we use to reproduce and disseminate music are getting worse. And because most people don’t even listen closely to music anymore, they don’t care.
In my own teaching, I’ve seen how the lack of proper aural context results in an inability to discriminate between good and bad, real and not-real sound. In one of my school’s music labs, I use a 14-year-old synth that, although I really like it as a teaching tool, I’ll be the first to admit has a factory program set that is a little dated. But one of my students recently said, “The sounds are so realistic, why would anyone need to use anything else?”
There are nine workstations in that lab, which means the students have to work on headphones. We use pretty decent closed-ear models, and the students generally don’t have any complaints. That is, until we play back their assignments on the room’s powered speakers. “Why does it sound so incredibly different?” one will invariably ask. I take this as a splendid opportunity to teach them something about acoustics: how reflections and room modes affect bass response, the role of head effects in stereo imaging and so on. They dutifully take it in, but then they say, “Yes, but why does it sound so incredibly different?” The idea of the music and the medium being separate from each other sometimes just doesn’t sink in.
If you’re looking for an answer or even a conclusion here, I haven’t got one. But I do know that the next generation of audio engineers and mixers — if there’s going to be one — will have a hard time if they don’t have more exposure than the average young person to natural, unamplified and unprocessed sound. If every sound we ever hear comes through a medium (and most of them suck), then how are we ever going to agree on what we hear?
Which means that our ears and our judgment are still all we have. Try to take care of both of them. And keep listening and keep learning.
Paul D. Lehrman has only heard one MP3 he’s liked, which was on a collection of Goon Shows.