Before the new Playstation2 game console was even on sale in the U.S., it was already clear that Sony had a hit on its hands. People camped out overnight to be first in line for PS2s, and sold-out retailers were back-ordered for months. Following in the footsteps of the original Playstation, which knocked Nintendo off the top of the game-system heap, PS2 was an instant success.
Part of PS2’s appeal is that it launched with solid support from game publishers. As many as 26 titles were available at launch time, with more than 50 scheduled for release in time for the recent holiday season.
Another element is the technology Sony has packed into the machine. PS2 sports a new central processor, dubbed the “Emotion Engine,” that Sony says is capable of peak calculation performance “comparable to that of high-end graphics workstations used in motion picture production.” PS2 also supports DVD (in addition to CD) as a medium for game titles, making far more disc space available for game designers to work with. Even when a CD is used, the speed of the built-in DVD drive means that bandwidth from the disc to the electronics is vastly improved over the original Playstation. And because PS2 plays DVD-Video titles, it makes a nice dual-purpose purchase for $299 (list).
Despite all of these advances, however, audio for PS2 is not radically different from audio for the original Playstation, either in its preparation or in the way it is used during playback. “Most video game consoles have restrictions in terms of available storage and memory for audio,” says Stan Weaver, lead sound designer on Star Wars Starfighter, a PS2 title due out this spring from LucasArts Entertainment Company in San Rafael, Calif. “The sound designers and programmers must make creative decisions in response to these concerns, such as downsampling, limiting the size and quantity of sound files, and optimizing their use. Sony has made some improvements with the PS2, but these considerations are still a factor.”
AUDIO FORMATS
Most gamers will experience PS2 audio through analog stereo line outputs. In addition to standard stereo signals, these outputs may be used to deliver sound in Dolby Surround (left, center, right and surround channels). “If the user has a Pro Logic decoder, it will decode the Lt/Rt signal and put certain sounds behind you and in the center,” says Edwin Dolinski, audio operations director at the Vancouver, B.C., game-development arm of Electronic Arts. “Over the last few years, EA has done dozens of Playstation titles like that.” As for PS2, the company’s titles so far include FIFA 2001 Major League Soccer, Madden NFL 2001, NASCAR 2001, NHL 2001 and the snowboarding title SSX.
PS2’s support for DVD-Video means that the box is also required to support audio in PCM and Dolby Digital formats (support for DTS is optional in DVD), and the console includes an optical digital output that can feed a surround system. But don’t look for those capabilities to be exploited much for gaming. “A lot of people think that since PS2 can play DVD titles, the audio on the games could be in 5.1 surround,” Dolinski says. “But Sony only included those audio capabilities for DVD playback. In a practical sense, I highly doubt anyone would ever try to do it for a game.”
Dolinski sees two obstacles to trying to leverage PS2’s DVD-based audio capabilities for games. “First,” he says, “you wouldn’t be able to use the SPU2 sound chips to do your audio processing, so you’d have to do it on the Emotion Engine. And, at that point, you’re in a big fight for processing power with the other elements of the game: the visuals and the AI [artificial intelligence, or game logic]. So nobody would design a game that way.”
Another problem is that there is no programming support for using any audio type other than a proprietary Sony format, called VAG. “At this point,” Dolinski says, “Sony doesn’t make any tool or give you any function calls to enable the real-time rendering of Dolby Digital or DTS. Everything that you play off of the sound chips in the machine has to be pre-encoded into VAG.”
VAG applies data compression of about 3.5:1 to the input signal, which may vary in resolution. In terms of word-length, Dolinski says that “for games, you’re pretty much working in 16-bit.” As for sample rate, he says, “The Playstation2 hardware is set up to output digital sound at 48 kHz. But it’s not required that the source files be at 48 kHz.” Deciding on the sample rates of various source files is part of prioritizing the use of bandwidth based on the different roles that audio plays in a game.
“Chances are that you want to cram as much sound as possible into the available RAM or to store more sounds on disc,” Dolinski explains. “For example, there’s no point in sampling speech at 48 kHz, so you would tend to sample it at quite a bit lower rate. Frequently, you’ll mix sample rates depending on the quality of the sound effect that’s playing: Do you need any high end on a low-frequency explosion, for instance? So maybe you can save some bits there and save some bits on speech. With music, however, it’s going to be more noticeable, so you try to keep the sample rate up.”
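The trade-off Dolinski describes can be put into rough numbers. The sketch below is only back-of-the-envelope arithmetic, not anything from Sony's tools: it assumes the compression ratio is exactly 3.5:1, that the source audio is 16-bit mono, and that all two MB of sound RAM is free for samples (in practice some is set aside for other uses). The function names are hypothetical.

```python
# Rough storage arithmetic for compressed game audio.
# Assumptions (not from Sony documentation): the ratio is exactly 3.5:1,
# source audio is 16-bit, and the full 2 MB of sound RAM is usable.

SOUND_RAM_BYTES = 2 * 1024 * 1024
COMPRESSION_RATIO = 3.5


def compressed_bytes_per_second(sample_rate_hz, channels=1, bytes_per_sample=2):
    """Bytes consumed per second of audio after compression."""
    raw = sample_rate_hz * channels * bytes_per_sample
    return raw / COMPRESSION_RATIO


def seconds_that_fit(ram_bytes, sample_rate_hz, channels=1):
    """How many seconds of compressed audio fit in ram_bytes."""
    return ram_bytes / compressed_bytes_per_second(sample_rate_hz, channels)


if __name__ == "__main__":
    for rate in (48_000, 24_000, 12_000):
        secs = seconds_that_fit(SOUND_RAM_BYTES, rate)
        print(f"{rate} Hz mono: about {secs:.1f} s")
```

Under these assumptions, halving the sample rate doubles the playing time that fits in the same RAM, which is why speech and low-frequency effects are the first candidates for downsampling.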
Dolinski cautions that even if you use a lower sample rate on some sounds to save disc space or RAM, PS2 will actually upsample everything to 48 kHz at the output. “It’s nice to work in simple factors of that rate—24 kHz, for example—when you downsample,” he says. “That way, you have the least artifacts when your sounds are upsampled during playback.”
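One way to read “simple factors” is as rates that divide evenly into the 48 kHz output rate. The sketch below checks candidate source rates on that reading; the interpretation and the helper name are assumptions, and the code says nothing about how the hardware actually resamples.

```python
# Check whether a source sample rate divides evenly into the 48 kHz output.
# Treating "simple factors" as integer divisors is an assumption.

OUTPUT_RATE_HZ = 48_000


def upsample_factor(source_rate_hz):
    """Return the integer upsampling factor, or None if the rate does not divide evenly."""
    if source_rate_hz <= 0 or OUTPUT_RATE_HZ % source_rate_hz != 0:
        return None
    return OUTPUT_RATE_HZ // source_rate_hz


if __name__ == "__main__":
    for rate in (24_000, 16_000, 12_000, 22_050, 44_100):
        print(rate, upsample_factor(rate))
```

On this reading, 24 kHz, 16 kHz and 12 kHz are clean choices, while the CD-derived rates of 22.05 kHz and 44.1 kHz are not.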
INSIDE THE BOX
The VAG format was used in Playstation, but Sony has made some enhancements to the audio setup for PS2. “They’ve given you two of the SPU2 chips instead of one,” Dolinski says. “That gives you 48 channels to work with instead of 24. And they’ve upped the sound RAM from 500 KB to two MB. In game design, there’s generally a fight for RAM between visual needs, game logic and audio. But, because PS2 gives you two MB of dedicated RAM attached to the SPU2 chips, you don’t have to fight for it. There’s 32 MB of main RAM for the other elements, and they access the main processor directly, so visuals and AI aren’t going to try to plunder the sound RAM.”
The sound RAM is allocated between short sounds that are stored in RAM to be triggered as needed and buffers that are used for continuous sound that streams from disc during game play. “You might stream your background ambience or stream a music track so you don’t have to get a 40 MB file to fit into the two MB of sound RAM,” explains Dolinski. “Instead, you use a 100 KB buffer for the streamed sound data from disc. We’ve tended to stream stereo music tracks ever since we moved to CDs with Playstation.”
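To get a feel for this split, the following sketch subtracts stream buffers from the sound RAM and estimates how much playing time one buffer holds. It is illustrative arithmetic only: the 3.5:1 ratio, the 16-bit source and the 48 kHz stereo stream are assumptions, and the function names are hypothetical.

```python
# Split sound RAM between stream buffers and resident samples.
# Assumed figures: 2 MB sound RAM, 3.5:1 compression, 16-bit source audio.

SOUND_RAM_BYTES = 2 * 1024 * 1024
COMPRESSION_RATIO = 3.5
BYTES_PER_SAMPLE = 2


def resident_budget(stream_buffer_sizes):
    """Bytes left for RAM-resident samples after reserving stream buffers."""
    return SOUND_RAM_BYTES - sum(stream_buffer_sizes)


def buffer_seconds(buffer_bytes, sample_rate_hz, channels):
    """Playing time held by a buffer of compressed audio."""
    raw_bytes_per_second = sample_rate_hz * channels * BYTES_PER_SAMPLE
    return buffer_bytes * COMPRESSION_RATIO / raw_bytes_per_second


if __name__ == "__main__":
    buffers = [100 * 1024, 100 * 1024]  # e.g. one music stream, one ambience stream
    print("left for samples:", resident_budget(buffers), "bytes")
    print("one buffer holds about", round(buffer_seconds(100 * 1024, 48_000, 2), 2), "s")
```

Under these assumptions a 100 KB buffer holds a little under two seconds of stereo audio, so the game has to return to the disc regularly to keep the stream fed.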
The streaming is made possible because the drive can deliver data from the disc faster than the game needs to use it. “We stream audio from the CD while the computer graphics are being rendered from the game code stored in the main RAM,” Dolinski explains. “If you need additional game elements, you read the disc in a few places while your audio buffer is running down, and then go and refresh the audio buffer before it runs out.”
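The juggling act can be modeled with a toy simulation: the audio buffer drains at a constant rate, and the drive either tops it up or is free to read game data. This is a simplified sketch, not EA's implementation. It ignores seek time, and the rates, the low-water threshold and the function name are all made-up values for illustration.

```python
# Toy model of streaming audio from disc while also reading game data.
# All numbers are illustrative; seek time is ignored.

def simulate_stream(duration_ms, capacity, drain_per_ms, read_per_ms, low_water):
    """Return (underrun, game_ms): whether the buffer ran dry, and how many
    milliseconds the drive was free to read non-audio game data."""
    level = capacity      # buffer starts full
    filling = False       # True while the drive is refreshing the audio buffer
    game_ms = 0
    for _ in range(duration_ms):
        if level <= low_water:
            filling = True
        if filling:
            level = min(capacity, level + read_per_ms)
            if level == capacity:
                filling = False
        else:
            game_ms += 1  # drive is available for other game elements
        level -= drain_per_ms  # playback consumes audio every millisecond
        if level < 0:
            return True, game_ms
    return False, game_ms


if __name__ == "__main__":
    print(simulate_stream(10_000, 100_000, 55, 500, 25_000))  # fast drive
    print(simulate_stream(10_000, 100_000, 55, 40, 25_000))   # drive too slow
```

When the drive reads much faster than playback drains the buffer, most of each cycle is left over for game data. When it cannot keep up, the buffer eventually runs out and the stream would drop out.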
Dolinski says this basic approach has been “par for the course” since EA started developing games on CD, but PS2’s higher bandwidth means that more elements can be juggled at the same time. “It’s now quite feasible to stream stereo music and at the same time stream continuous stereo ambience, and still have time to read the disc for the game information,” he says. “So if you want a crowd chanting in a stadium, for example, you don’t have to use a two-second loop, you can use a 30-second loop streaming continuously off the disc, which doesn’t sound as monotonous. These capabilities are factored into deciding which sounds should be samples in RAM and which should be streamed.”
One of the most profound differences between sound for games and other applications of sound (music industry, sound for picture, etc.) is that there really is no “final” mix. “Think of it more as a sound bank that is loaded into RAM,” Dolinski says. “There may be 120 sounds sitting there, and when a certain event happens, it triggers a sound. We can attach parameters to each of the samples, things like amplitude, pan and frequency. The parameters may or may not be controlled by the game code; we’ll program in certain variations so that you don’t get that repetitive thing of an identical sound each time something similar happens in the game. Even the mix between the sound effects is going to vary dynamically, depending on what’s happening in the game.”
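The idea of a sample carrying its own playback parameters, with some programmed-in variation, might be sketched as follows. The class, field names and jitter ranges are hypothetical and chosen only for illustration; they do not reflect any actual EA or Sony data format.

```python
# A bank entry with per-sample parameters and small random variation,
# so that repeated triggers do not sound identical. Names are illustrative.

import random
from dataclasses import dataclass


@dataclass
class Sample:
    name: str
    volume: float = 1.0         # amplitude, 0.0 to 1.0
    pan: float = 0.0            # -1.0 (left) to 1.0 (right)
    pitch: float = 1.0          # playback-rate multiplier (frequency)
    pitch_jitter: float = 0.0   # e.g. 0.05 means plus or minus 5 percent
    volume_jitter: float = 0.0


def trigger(sample, rng):
    """Return the parameters for one playback of the sample, with variation."""
    pitch = sample.pitch * (1.0 + rng.uniform(-sample.pitch_jitter, sample.pitch_jitter))
    volume = sample.volume * (1.0 + rng.uniform(-sample.volume_jitter, sample.volume_jitter))
    volume = max(0.0, min(1.0, volume))  # keep amplitude in range
    return {"name": sample.name, "volume": volume, "pan": sample.pan, "pitch": pitch}


if __name__ == "__main__":
    rng = random.Random()
    hit = Sample("body_check", volume=0.8, pitch_jitter=0.05, volume_jitter=0.1)
    for _ in range(3):
        print(trigger(hit, rng))
```

In a real title the game code could also override these parameters directly, for instance panning a sound to follow an object on screen.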
Another difference is that playback is event-based rather than timeline-based. “There’s no timeline,” Dolinski says, “because the time at which any sound is fired depends entirely on the game code. Everything sits as a list of sounds in RAM, and there are hooks to the game code. So when a certain event happens, it will fire off a particular sound or start a new stream or duck a stream, because you are going to insert some speech. The mix changes on-the-fly, and you don’t know the durations between events. So it has nothing whatsoever to do with linear time.”
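The “hooks” Dolinski mentions can be pictured as a table that maps game events to audio actions. The sketch below is a minimal illustration of that event-driven pattern, with hypothetical event, action and sound names; a real engine would call into the sound hardware rather than append to a log.

```python
# Event-driven audio: game events trigger actions; there is no timeline.
# Event names, actions and targets here are made up for illustration.

class AudioEvents:
    def __init__(self):
        self.hooks = {}  # event name -> list of (action, target)
        self.log = []    # stands in for calls to the sound hardware

    def on(self, event, action, target):
        """Register an audio action to run whenever the event fires."""
        self.hooks.setdefault(event, []).append((action, target))

    def fire(self, event):
        """Called by game code; runs every action hooked to the event."""
        actions = self.hooks.get(event, [])
        for action, target in actions:
            self.log.append((action, target))
        return len(actions)


if __name__ == "__main__":
    audio = AudioEvents()
    audio.on("goal_scored", "duck", "music_stream")       # lower the music...
    audio.on("goal_scored", "play", "announcer_goal")     # ...to make room for speech
    audio.on("period_end", "start_stream", "intermission_music")
    audio.fire("goal_scored")
    print(audio.log)
```

Nothing in the table says when an event will occur. The order and spacing of sounds come entirely from what happens in the game.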
The event-based and streamed elements are mixed by the SPU2 chips before being sent to the outputs. Regarding how the 48 channels are generally used, Dolinski says it’s “a question of which sounds need priority—who’s going to get cut off if you’ve got more sounds than available channels. Certain channels may be set aside for the streams, certain channels for speaking voices and others for effects such as collisions. You can choose to reserve channels for specific classes of sounds, or you can leave the whole thing up for grabs. But it’s obviously nicer to have 48 voices to work with than 24.”
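The question of “who’s going to get cut off” is commonly handled with a priority scheme. The sketch below shows one simple possibility, in which a new sound takes a free voice if there is one and otherwise replaces the lowest-priority voice, provided the new sound outranks it. This policy and the class name are assumptions for illustration, not a description of the SPU2 or of EA's code.

```python
# Priority-based voice allocation over a fixed number of channels.
# The stealing policy is one simple option among many.

class VoicePool:
    def __init__(self, num_voices=48):
        # Each slot holds (priority, name) while busy, or None when free.
        self.voices = [None] * num_voices

    def play(self, name, priority):
        """Return the voice index used, or None if the sound was dropped."""
        for i, voice in enumerate(self.voices):
            if voice is None:
                self.voices[i] = (priority, name)
                return i
        # No free voice: find the lowest-priority sound currently playing.
        victim = min(range(len(self.voices)), key=lambda i: self.voices[i][0])
        if self.voices[victim][0] < priority:
            self.voices[victim] = (priority, name)  # cut off the weaker sound
            return victim
        return None

    def stop(self, index):
        """Free a voice when its sound finishes."""
        self.voices[index] = None


if __name__ == "__main__":
    pool = VoicePool(num_voices=2)
    print(pool.play("crowd_murmur", priority=1))
    print(pool.play("commentary", priority=5))
    print(pool.play("collision", priority=3))  # replaces crowd_murmur
    print(pool.play("footstep", priority=1))   # dropped
```

Reserving channels for a class of sounds, such as streams or speech, could be modeled by giving each class its own smaller pool.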
PRODUCTION PROCESS
Dolinski sees the overall production process of sound for a game as involving three major phases. “There’s the planning,” he says, “and there’s making and delivering all the sounds. Then after you wait for the sounds to be implemented in the game, there’s listening back and tweaking the results.”
Because sound for a PS2 game can involve hundreds or even thousands of dialog, effects and music elements, making the sounds and incorporating them into a game is a big job that requires a lot of teamwork. “For each title in development, we typically assign a lead sound designer and a lead composer, in addition to the voice director and voice editor,” says Jeff Kliment, sound department manager at LucasArts. “We also have two assistants in our department who get assigned to projects as needed.”
At LucasArts, the voice director gets involved early, working closely with the game designers in creating the script, casting the actors and supervising the voice recordings. “Once the recording is done,” Kliment says, “the voice editors will create the master set of high-resolution voice files. These then go to the sound department for additional processing, which may include compression, equalization and creative processing, such as radio effects or reverb. For some files, the final step is a sample-rate conversion. The lead sound designer then hands the final voice set to the project team to be incorporated into the game.”
Weaver says that for a Star Wars game, the sound design process starts with a library of sound effects from the movies. “We work with Skywalker Sound to make sure we have the raw materials we’ll need to get started,” he says. “But that’s just the beginning. Since our games contain characters and locations not seen in any of the movies, we have to do a lot of original sound design to supplement the existing material. Rather than relying too heavily on off-the-shelf SFX libraries, we try to do as much original recording as possible for each game. This includes field recording, Foley and electronic composition with synthesizers and samplers. Over the years, we’ve amassed an extensive library of original material. In the end, the sound design will be a combination of all these elements.”
Music, Weaver says, is handled similarly to effects. “Again, if we’re talking about a Star Wars title, then it’s usually based on the John Williams scores, but lately we’ve been adding our own twists with additional composition and/or remixing. Our non-Star Wars titles get original musical treatment. The score will generally consist of a blend of MIDI-based tracks with live players added for sweetening.”
Live sessions for both music and sound design take place in LucasArts’ main recording studio. “The current setup in that room,” Kliment says, “is a Pro Tools system routed to a Soundcraft Ghost 32×8 console, with Meyer HD-1 monitors, plus a small complement of outboard gear and software tools. We have a handful of Neumann and AKG microphones, as well.”
The rest of the voice editing and sound design work at LucasArts is handled in soundproof offices, each equipped with a Pro Tools system, a small mixing desk and monitors. “Each of the workstations is equipped with a set of software tools and plug-ins,” Kliment says, “and we have an array of samplers, synthesizers and outboard gear that is moved around as needed.”
As files for a game are completed, they are converted to VAG. “If you are a game developer and you have Sony’s developer kit,” Dolinski says, “you get a Sony tool to convert your audio into that format. If you are an outside contractor, you are probably going to deliver your sounds to the game developer as uncompressed digital audio files in one of the common formats, like .AIFF, .WAV or SDII.”
Once compressed, the files are handed off to programmers for incorporation into the game. “Depending on the state of the game code, the sounds are not always implemented right away,” says Dolinski. “So sometimes you have things finished, but you don’t really get the satisfaction of hearing how it’s working out in the game until weeks later.”
When the sounds are incorporated, the next phase of production begins. “You start doing a lot of listening back,” Dolinski says, “and trying to correct the mix. You’ll notice that certain things sound different in the game than they did on your desktop. There might be a piece of code that fires the same sound twice, or you’re getting the wrong sound when a certain action happens. So there’s a lot of final massage and tweaking once you’ve delivered all these sounds and they’ve been put into the game. That’s really when most of the mixing work begins.”
The tweaking phase continues for as many iterations as necessary to create a soundtrack that fully supports the game designer’s intended gaming experience. “Our main concern,” says Weaver, “is that the sound and music help make the game fun to play. If the soundtrack puts a smile on your face and gets your feet tapping, then we’re doing okay.”
Philip De Lancie is Mix’s new-technologies editor.