Tech's Files: Start Your Engines
UNDERSTANDING THE BASICS OF GAME AUDIO
3/01/2010 7:00 AM Eastern
I have a friend whose business card reads, “If I don't know it, I know someone who does.” That's my story in this edition of “Tech's Files.” My friend and fellow geek Damian Kastbauer is an audio-for-gaming insider.
We'd like to provide an overview for peeps like me who are completely unfamiliar with game audio, but who might benefit from knowing some of the nuts and bolts of the process. For example, DAW plug-ins are very graphics-intensive and look very much like their hardware counterparts, even though sliders are more mouse-friendly than knobs. By contrast, audio tools for games tend to be parameter-based: A slider is a newcomer, and virtual knobs don't even exist!
On the surface, game development is often compared to the process of making a film, with the need for a storyboard, screenplay, set and sound design. There is a common discipline between film and videogame creation, but some aspects remain distinctly different. While it might seem absurd to reinvent a DAW or a camera each time a new project is initiated, that's what often happens in the gaming industry when it comes time to improve upon previous technology. For each new game, the logic, visual and sonic programmers must create a brand-spankin' new engine. The insider's coding game is all about “playing well with others” — for example, sharing memory and DSP/CPU capabilities — so that the behind-the-scenes technology is transparent to the gamers.
Of course, game designers want a great first impression; the new release must look, feel and sound more realistic than the previous generation. Processing requirements, platform variations (computer hardware) and time to market are all moving targets made more dramatic by projected deadlines that don't account for the nebulous “fun factor” necessary to make a good game great.
Anyone familiar with more than one DAW knows that each has its own feel; just like a car's engine, gear train and suspension contribute to its personality, the game engine and toolset add feeling to the creation of the game. When engineers are programming a game, a customized processing engine is built and stocked with only the tools necessary for various tasks that need to be done. Hardcore PC users take a similar approach when installing an operating system by loading only the most essential applications required to achieve maximum efficiency and speed. Understanding the needs of the game — and the hardware specifications of the target platform — allows programmers to leave out anything they won't be using.
Once everything has been assessed — software features defined and resources for the audio portion of the engine allocated — it's time to start building. Some programmers prefer to “roll their own,” but there are also several available run-time audio libraries (also known as middleware) with which an engineer can kick-start the audio development. For the uninitiated, this is the customized programming toolkit that provides solutions to game-specific feature sets, such as hardware specifications, and has the potential to meet the needs of cross-platform development.
The game audio engine can be thought of as a multitimbral sampler that plays back audio samples at the request of the game engine. In addition to playing back requested sounds, the engine will also be asked to do many tasks on the fly — pitch shift, volume randomizing, shuffling and sequencing of playlists — because sounds are constantly changing.
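To make the on-the-fly tasks above a little more concrete, here is a minimal Python sketch of a sound cue that shuffles its variations and randomizes pitch and volume on every trigger. Everything in it is an assumption made for illustration: the SoundCue class, its parameter names and the default ranges do not come from any particular engine or middleware, and the sketch assumes each variation name is unique.

```python
import random


class SoundCue:
    """Hypothetical cue: several variations of one sound plus randomization ranges."""

    def __init__(self, variations, pitch_range=(-2.0, 2.0),
                 volume_range_db=(-3.0, 0.0), seed=None):
        if not variations:
            raise ValueError("a cue needs at least one variation")
        self.variations = list(variations)      # assumed to be unique names
        self.pitch_range = pitch_range          # semitones
        self.volume_range_db = volume_range_db  # decibels
        self._rng = random.Random(seed)
        self._bag = []
        self._last = None

    def _next_variation(self):
        # Shuffle bag: play every variation once before repeating any of them.
        if not self._bag:
            self._bag = self.variations[:]
            self._rng.shuffle(self._bag)
            # The next pick comes off the end of the bag, so swap it away
            # if it would repeat the sound that was just played.
            if len(self._bag) > 1 and self._bag[-1] == self._last:
                self._bag[0], self._bag[-1] = self._bag[-1], self._bag[0]
        self._last = self._bag.pop()
        return self._last

    def trigger(self):
        """Return the playback settings the engine would use for this request."""
        semitones = self._rng.uniform(*self.pitch_range)
        gain_db = self._rng.uniform(*self.volume_range_db)
        return {
            "file": self._next_variation(),
            "playback_rate": 2.0 ** (semitones / 12.0),  # semitones to rate
            "gain": 10.0 ** (gain_db / 20.0),            # dB to linear gain
        }
```

Calling trigger() on a cue built from, say, three footstep files returns a different file, playback rate and gain each time, which is the kind of variation that keeps a repeated sound from feeling like a loop.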
To modify and adjust the values for tasks like these, audio-specific toolsets are “created” to harness the engine's functionality. If the user interface for these tools is not given ample consideration, then control over features and functionality will be out of reach when needed; they will also require programmer assistance, which is often hard to come by.
Populating a mechanical toolbox is analogous to the task of compiling a software DSP toolset. Any tool in the box can be used, but choosing the most appropriate tool can minimize any detours to the mechanic. For example, you wouldn't open a 7-band EQ if only a highpass filter were needed.
Hundreds of simultaneous sound voices may be requested during any given frame or moment of game play. To make traffic flow smoothly and avoid pileups, it's necessary to establish memory limits and priorities based on the target platform. A sound may either be loaded into Random Access Memory (RAM) in full, or streamed on demand from the DVD or HDD (if available). Through the use of a streaming buffer, only a small part of the sound needs to be loaded so it can start playing as soon as it is requested. In the background, the rest of the sound file is in the queue, ready to be “streamed” from the media. Multiple on-demand variations (or instances) of sounds may be loaded into RAM to ensure proper playback. All of these requirements can add up to serious issues with memory management.
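The streaming idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: StreamedSound, chunk_size and chunks_ahead are hypothetical names, the "media" is any binary file-like object, and a real engine would refill its buffer from a background thread rather than inline.

```python
import io
from collections import deque


class StreamedSound:
    """Hypothetical streamed voice: keeps only a few chunks in RAM at a time."""

    def __init__(self, source, chunk_size=4096, chunks_ahead=2):
        self._source = source              # any binary file-like object
        self._chunk_size = chunk_size
        self._chunks_ahead = chunks_ahead
        self._queue = deque()
        self._refill()                     # preload so playback can start at once

    def _refill(self):
        # Top the queue up to the read-ahead limit, stopping at end of file.
        while len(self._queue) < self._chunks_ahead:
            chunk = self._source.read(self._chunk_size)
            if not chunk:
                break
            self._queue.append(chunk)

    def buffered_bytes(self):
        """Memory currently held by this voice's streaming buffer."""
        return sum(len(chunk) for chunk in self._queue)

    def next_chunk(self):
        """Return the next chunk for the mixer, or None when the sound is done."""
        if not self._queue:
            return None
        chunk = self._queue.popleft()
        self._refill()
        return chunk


# Usage: a 10 KB "file" never occupies more than 2 KB of buffer.
sound = StreamedSound(io.BytesIO(bytes(10240)), chunk_size=1024, chunks_ahead=2)
```

The trade-off is the one described above: the buffer costs a small, fixed amount of RAM per streamed voice, in exchange for not holding the whole file in memory.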
Imagine mixing an action scene in Pro Tools and being limited to six mono tracks. We're talking explosions, multiple vehicles, crowds screaming and, by the way, no bouncing tracks! There are a couple of ways to prioritize important sounds in the scene to avoid traffic jams and mud.
You can take a broad brush and prioritize based on sound categories using a scale of 0 to 100. We might decide that it's more important to hear weapons as opposed to footsteps (or a voice instead of explosions), which may further lead to separating an important voice from grunts and groans. These priorities are applied to the metadata of the sounds being played back and are then used to prioritize the sounds, based on the maximum number of voices allowable at a given time. Priorities can be dynamically adjusted or modified based on the distance, or amplitude, of the sound from the position of the “listener.”
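As a rough sketch of that scheme, the Python below ranks requested sounds by a category priority on the 0-to-100 scale, lowers the priority of distant sounds, and keeps only as many voices as the platform allows. The category values, the linear distance penalty and all of the names (Voice, CATEGORY_PRIORITY, effective_priority, select_voices) are assumptions chosen for illustration, not values from a shipping game.

```python
from dataclasses import dataclass


@dataclass
class Voice:
    name: str
    category: str
    distance: float  # distance from the listener, in metres


# Hypothetical category priorities on the 0-to-100 scale.
CATEGORY_PRIORITY = {"dialogue": 90, "weapon": 70, "explosion": 60, "footstep": 20}


def effective_priority(voice, max_distance=50.0, distance_penalty=30.0):
    """Category priority, reduced linearly as the sound moves away from the listener."""
    base = CATEGORY_PRIORITY.get(voice.category, 0)
    remoteness = min(max(voice.distance / max_distance, 0.0), 1.0)
    return base - distance_penalty * remoteness


def select_voices(requested, max_voices):
    """Split requests into (playing, dropped) according to the voice limit."""
    ranked = sorted(requested, key=effective_priority, reverse=True)
    return ranked[:max_voices], ranked[max_voices:]


requests = [
    Voice("steps", "footstep", 1.0),
    Voice("blast", "explosion", 50.0),
    Voice("guide", "dialogue", 5.0),
    Voice("rifle_far", "weapon", 100.0),
    Voice("rifle_near", "weapon", 10.0),
]
playing, dropped = select_voices(requests, max_voices=3)
```

With a limit of three voices, the mission guide and both rifles play, while the distant explosion and the footsteps are dropped for this frame.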
One of the emerging trends vying for our valuable processing resources concerns the availability of DSP plug-ins that have been commonplace in pro audio since the mid-'90s. Recently, companies including Waves, WaveArts and McDSP have developed cross-platform-compatible versions of some of the popular effects that are CPU-efficient enough to run on today's consoles.
The developer has the ability to process sound in games at run time and maintain a level of quality similar to that heard across other media types. So during gameplay, effects can be applied to sounds to modify their playback based on values coming from the game engine. For example, applying distortion and EQ to a critical voice file, depending on whether it's being delivered in person or via headset communicator, is something that could change based on whether the player decides to stick around while being barked at by the mission guide.
Real-time effects also allow developers to adjust output dynamics, in essence “mastering” the final mix so they can, for example, avoid clipping during a pileup of sounds and ensure that quiet sounds are heard across different playback devices. Although this kind of processing has been available in music production for years, it has remained on the fringe of game audio due to a lack of processing power, embedded workflow and authoring. Because most game consoles on the market can also play back movies and music, audio quality comparisons are inevitable. Initiatives like these will continue to help raise the bar for interactive audio.
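For a feel of what avoiding clipping during a pileup means, here is a deliberately crude Python sketch: it sums one block of samples from each voice and, if the peak would exceed the ceiling, scales the whole block down. The function name mix_and_limit and the block-at-a-time approach are assumptions for illustration; a production limiter would smooth its gain changes with attack and release times rather than jumping from block to block.

```python
def mix_and_limit(voices, ceiling=1.0):
    """Sum equal-length sample blocks, then scale the block if its peak exceeds the ceiling."""
    mixed = [sum(samples) for samples in zip(*voices)]
    peak = max((abs(sample) for sample in mixed), default=0.0)
    if peak <= ceiling:
        return mixed                # quiet enough: leave the mix untouched
    gain = ceiling / peak           # one gain value for the whole block
    return [sample * gain for sample in mixed]


# Two loud voices that would clip if simply added together.
loud = mix_and_limit([[0.5, 0.9, -0.8], [0.4, 0.6, -0.7]])
# Two quiet voices pass through unchanged.
quiet = mix_and_limit([[0.1, 0.2], [0.1, 0.1]])
```

Scaling by a single gain keeps the relative balance of the sounds within the block intact, which is why the quiet mix is returned exactly as summed.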
This is just a simple overview of the many challenges and techniques that affect the creative and technical sound design process for games. Each new generation is delivering a more pleasing and realistically represented soundtrack while any technical decisions (and limitations) are transparent to the user; things sound as expected, with no distractions. Applying several strategies for memory management and optimization can not only guarantee sounds are heard when needed, but also allow room to cram in more sound and features to suit the gameplay, style and scope of the project.
Damian Kastbauer is a freelance technical sound designer working with the Bay Area Sound Department. His contributions to the art of game audio implementation can be heard in Conan, Star Wars: The Force Unleashed and The Saboteur, among others. Eddie Ciletti is learning to translate Italian and gamer geek speak at www.tangible-technology.com.