Scott Gershin is in a unique position on the audio end of the videogame industry. As executive creative director and sound supervisor/designer at Soundelux Design Music Group, he oversees the development of a steady stream of high-end videogames by a wide variety of publishers.
As a leading feature film sound supervisor/designer for more than two decades, he’s worked on many Hollywood films — American Beauty, Shrek, The Chronicles of Riddick, the ribald marionette comedy Team America, the dazzling werewolf flick Underworld: Evolution and Curtis Hanson’s forthcoming Las Vegas gambling saga, Lucky You. Additionally, he’s done sound design work for museums, theme park rides, commercials — the man has his hand in a lot of pots.
Gershin is also an outspoken and articulate advocate for game audio pros, promoting the field at trade shows and in magazines and, as of this month, helping to spearhead a new organization called IESD: Interactive Entertainment Sound Developers. “It’s for sound professionals,” he says, “for music, design and integration people. It’s a branch of G.A.N.G. [Game Audio Network Guild], but it will also be its own thing.” Partnering with Gershin are Sony’s Dave Murrant, Scott Selfon from Microsoft and Gene Semel of High Moon Studios. “We’ve been nicknamed ‘The Four Horsemen,’” he says with a laugh.
Recently, we chatted with Gershin to get his perspective on the evolution of game audio and its rapid pace of change.
You had already been working in film sound for many years when you first got involved with game audio. How did that come about?
I started in games about 16 years ago. I had always enjoyed games, and around that time, I started seeing [games] move away from FM [synthesis] sounds and into audio samples, although they were being done as 11k/8-bit. Before I even did film, I was working as a synthesist — we’re talking about the E-mu [Emulator] I and early Akais, 8-bit, 12-bit/companded technology. So when I started in gaming a decade or so later, I had already lived through that technology and its creative and technological challenges. I knew a lot of tricks going in [to working on games] from what I learned in the music business — how to get a decent sound out of samplers. Sixteen years ago when I was designing and suping just for movies [at Soundelux], I thought we could utilize our talents in other industries. I started a new company within Soundelux with Wylie Stateman and Lon Bender, which focused on sound design and music composition for interactive entertainment, theme parks, music videos and commercials. It was a really interesting time for us in audio, with all these new sound-with-picture industries coming into existence. We had to make it up and figure it out as we went.
Our first gig came from a friend of a friend of a friend — you know how it goes — who heard about some people who had just bought a company that was called Activision. They were brand new and I had set up a meeting with them to discuss my philosophies and approach to how audio could be used in gaming. I wasn’t scared of the technology because I’d been there before with music. It was great fun because at that time it really was guerrilla audio. We ended up doing a bunch of projects out of the chute with Activision, including Pitfall: The Mayan Adventure for the PC, MechWarrior II, Spycraft and Zork Nemesis. They were all really different and each required a different creative and technological approach. Since then, we’ve worked on over 150 games.
Did the audio limitations of those early games bug you?
I love technology. I’ve always been really into gear. It became a technological puzzle. The challenge was how to use limited elements to create new and unique sounds. When I first started doing film, I was a sampler guy using Synclaviers, Fairlights, ADAPs and synchronizers to lay back my sounds to multitrack. At first that was the only technology available. Then disk recorders came out, but they were only 8-tracks. A lot of times I was using way more elements than that to create my designs. So at that time, disk recording, before plug-ins, was limited compared to what I could do with a sampler. With samplers, I was able to play the sounds using different, often exotic MIDI controllers to give each design a human feel. It really allowed me to mangle the sounds into something new and interesting.
In the beginning of gaming, there were definitely limitations: really low sample rates and bit depths, as well as codecs that affected the audio in unflattering ways. If you were lucky, the best format available was 22 [kHz]/16[-bit]; the high ends were rolled off, but it was definitely manageable. Eight-bit was a whole other challenge that wasn’t a whole lot of fun. But I took some of the tricks I knew from the music business and applied them. A lot of it was trying to figure out how to solve problems. It was hard just doing the simplest of things, but that was part of the fun, too. There were no rules and everyone was kind of making it up as we went along. Also, I really liked the people in the gaming world.
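To put those formats in perspective, here is a rough back-of-the-envelope sketch of the arithmetic behind them. It assumes that "11k" and "22" refer to the common 11,025 Hz and 22,050 Hz sample rates and that the audio is uncompressed mono PCM; the interview does not state either, and the function names are purely illustrative.

```python
def bytes_per_second(sample_rate_hz, bit_depth, channels=1):
    """Uncompressed PCM data rate in bytes per second."""
    return sample_rate_hz * (bit_depth // 8) * channels


def nyquist_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2


# Assumed rates: the interview only says "11k/8-bit" and "22/16".
formats = {
    "11k/8-bit mono": (11_025, 8, 1),
    "22k/16-bit mono": (22_050, 16, 1),
    "CD-quality stereo": (44_100, 16, 2),
}

for name, (rate, bits, channels) in formats.items():
    print(
        f"{name}: {bytes_per_second(rate, bits, channels)} bytes/s, "
        f"top frequency about {nyquist_hz(rate):.0f} Hz"
    )
```

Under these assumptions, a 22 kHz file cannot carry anything above roughly 11 kHz, which is consistent with the rolled-off high end described above.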
When we first started, one of the main limitations on each platform, as with a sampler, was how much RAM we had access to. And that’s still an issue. So there’s that dynamic: How many sounds can we use in the game? We had loop lengths that had to be divisible by a certain number of samples. All the sounds within a certain game level had to fit in packets of a certain size. Each level would retain certain sounds, such as your character’s weapon and fighting moves [packet one], while introducing new sounds to support the new characters and challenges, such as an adversary’s sounds and weapons [packet two]. Every sound had to fit into one or the other of the packets being used for that level. There were all these strange little formulas.
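The kind of "strange little formulas" described here can be sketched roughly as follows. This is only an illustration: the block size, packet size, sound names and byte counts are all hypothetical, since the interview gives no actual figures for any platform.

```python
def pad_loop_length(length_samples, block_samples):
    """Round a loop length up to the next multiple of block_samples."""
    remainder = length_samples % block_samples
    if remainder == 0:
        return length_samples
    return length_samples + (block_samples - remainder)


def fits_in_packet(sound_sizes, packet_bytes):
    """Return True if every sound in the packet fits within its byte budget."""
    return sum(sound_sizes.values()) <= packet_bytes


# Hypothetical numbers, chosen only for illustration.
BLOCK_SAMPLES = 16          # loop lengths must be a multiple of this
PACKET_BYTES = 128 * 1024   # budget for each packet

# Packet one: sounds kept on every level (the player's character).
resident_packet = {"player_weapon": 48_000, "player_moves": 60_000}
# Packet two: sounds swapped in for the current level (the adversary).
level_packet = {"enemy_voice": 90_000, "enemy_weapon": 50_000}

print(pad_loop_length(10_001, BLOCK_SAMPLES))
print(fits_in_packet(resident_packet, PACKET_BYTES))
print(fits_in_packet(level_packet, PACKET_BYTES))
```

With these made-up sizes, the level packet is over budget, so something in it would have to be shortened, downsampled or cut.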
Another reason game audio was difficult back then was the codecs. It wasn’t just the 11 kHz, 8-bit [format]. The audio had to go through data reduction, and that alone created some very strange artifacts. There were times you’d do a whoosh or some kind of sound, and then you’d put it through the codec and what came out was almost nothing like what you put in.
And things improved as new formats came along?
Yes. I think a couple of things happened. We got more RAM and higher-capacity game storage with discs compared to cartridges; the codecs started sounding better; and we were able to stream audio, which was very big. Then the next technology was a combination of streaming and RAM, where we could get sounds with low-latency availability through RAM while the rest of it streamed off the hard drive or disc, similar to what a [Tascam] GigaStudio does today. So that was important.
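The combination of RAM and streaming described here can be sketched in a few lines. This is a minimal illustration of the general idea, not the implementation of any console or of GigaStudio: the `HybridSound` class and its parameters are hypothetical, and an in-memory buffer stands in for the disc.

```python
import io


class HybridSound:
    """Keep the start of a sound in RAM and stream the rest on demand."""

    def __init__(self, source, head_bytes):
        self.source = source                 # file-like object standing in for the disc
        self.head = source.read(head_bytes)  # preloaded so playback can start at once
        self.head_bytes = len(self.head)

    def play(self, chunk_bytes):
        """Yield the RAM-resident head first, then chunks read from the source."""
        yield self.head
        self.source.seek(self.head_bytes)
        while True:
            chunk = self.source.read(chunk_bytes)
            if not chunk:
                break
            yield chunk


# Fake "audio data" so the sketch is self-contained.
data = bytes(range(256)) * 4
sound = HybridSound(io.BytesIO(data), head_bytes=256)
chunks = list(sound.play(chunk_bytes=300))
print(len(chunks), sum(len(c) for c in chunks))
```

The trade-off is the size of the preloaded head: a larger head hides more disc latency but uses more of the scarce RAM.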
And then some consoles were able to stream multiple tracks. With next-gen platforms, we’re able to stream large numbers of tracks and support plug-in-style technology. Now all of a sudden the ability to do bigger and better audio has become available. Graphics have gotten much more sophisticated and are starting to be compared to other media, such as film or television. The pressure is on for a more cinematic experience.
Well, you want the realism of the sound to match the realism and dimensionality of the visuals.
That’s right. There’s been an interesting marriage between the film world and the gaming world, where you want to create the coolest, biggest sounds you can. [In games,] we weren’t necessarily constricted by some of the technological restrictions of broadcast TV — limited dynamic range, mixing for TV speakers; we were more like DVDs.
[Photo caption: Gershin was sound designer/sound supervisor for Capcom’s Devil May Cry.]
We’re at a time now when 5.1 home theaters are becoming commonplace. Even for those using only computer speakers, most have subs and bass management. In sound design, we have the chance to use the effect of the sub to have more impact, turning it into an E-ticket ride. Of course, a lot of the stuff we’re doing in games is multichannel — 5.1 is now the standard. I think it used to be considered something that was mostly in theaters or obtainable only by the select few. I get the sense that if you look at the slice of America that’s into gaming and movies, a lot of them have invested in 5.1 systems.
Are there more parallels or differences between working on sound for film and sound for games?
There are a lot of similarities and some fundamental differences. I don’t want to sound corny, but it really is all about storytelling and enhancing the experience, whether you get to be characters in a game or you get to watch them in a movie. I always look at the medium I work in and try to do the best I can within that industry.
There are some differences. In gaming you hear the same sounds over and over as you play, so we have to be very careful to make sure that each sound still has its impact and emotional fortitude when it is repeated. Also, our clients hear and approve each sound individually before it is put in the game, so each sound is put under the microscope. Conversely, in movies each sound doesn’t live on its own but plays a role in the linear flow of the story. That doesn’t mean we don’t spend time on the smaller sounds; we focus the listener’s ears, which allows us to prioritize the roles different sounds play within a scene.
Gaming is not strictly a random medium; it has linear phrases and flows: where you have to solve a problem, where you’ve got to get from point A to point B, where you’ve got to get through your enemies. Whatever the dynamic is, there’s still an emotional arc to each scenario. So whether I’m on a movie or a game, I identify those emotional arcs and try to support those with sound.
Are you allowed to use effects that you gather for a film on a videogame? Could you use werewolf noises from Underworld or swords from Kill Bill, for example?
We go out as audio photographers and capture life: sounds, events, whatever. Then we use them as our palette of elements that we can draw on later in any medium to create something new and interesting. We will never take the exact same sound from one project and use it in another. We will combine and manipulate elements and use them in new and exciting ways.
How come movies of videogames are usually so terrible? You’d think it would be a good marriage.
They’re different: Gaming is a medium you experience by participating; movies are a medium you experience while watching. What’s happening, though, is games are becoming more story-based and much more episodic. They’re becoming franchises — you can play part 1, part 2 and part 3, and follow a long, complicated storyline with all these adventures along the way. There are so many wonderful games out there that have in-depth stories, whether you create them [within the game] or the game designers create them. Something like The Chronicles of Riddick is a movie, an anime and a game. All of them span different time lines within the universe of the Riddick chronicles. So if you see all three, you get more of the story. This approach really expands the art of storytelling and character development.
Are there pieces of technology that have come out in the past year or two that you think are advancing the state-of-the-art in game sound?
Sure. Gaming is a very technological industry. I actually equate it more to movie animation than movies themselves because the dev cycles are longer. There are different technological issues that have to be understood and dealt with compared to film. I see more of a parallel between high-end CG animated movies and games.
Probably the biggest recent technological development is the advent of what’s called ‘middleware tools,’ which allow [the games you develop] to play on multiple platforms, sometimes competing platforms. I think that’s huge. It used to be that when you developed a project, you developed it for Sony or for Microsoft or for PC. The developer would make it for one platform and send it to another company, which would then rebuild the game to make it work on another platform.
There are two interesting tools in what we call middleware, Wwise [from Audiokinetic] and FMOD [from Firelight Technologies], that have enabled audio content to be somewhat more easily ported between multiple platforms, so we don’t have to keep reinventing the wheel. In the past, you had to create for the lowest common denominator. If you were creating for, say, two platforms, where one platform would only stream two to four channels at best and the other platform could stream 30 to 40 channels, you always had to design for the more limited platform.
What’s a recent game you’ve worked on that you’re excited about?
We just finished [Capcom’s] Lost Planet, which I’m very proud of. Heading up the sound for the project were Tomoya Kishi from Capcom, Peter Zinda and I. We and our crew worked on it for about two years. It was one of those projects where everything jelled and clicked. It was nice to have the time to work on something and experiment a lot; that really makes a difference.
Are the sound budgets for big games getting to be comparable to big films?
It’s apples and oranges. The whole business paradigm is very different. On something like Lost Planet, we were on and off the project for a long period and there were different phases. On a movie, most of the time when you’re on it, you’re on it. In between phases, I did four or five movies during that two-year period.
Obviously, you still like doing both.
Oh, yes. I have a passion for both film and games. I’ve been incredibly fortunate to be able to participate in both and work on some really great projects in both industries. If I could stay this way for the next 20 or 30 years, I’d be a happy camper.
Blair Jackson is Mix’s senior editor.