Rikki Don't Lose That Number
MOVING TOWARD ARCHIVAL STANDARDS
5/01/2005 8:00 AM Eastern
Consider our collective audio heritage — those uncounted hours of recordings on tape, acetate and disc we've accumulated as a species since Valdemar Poulsen built the first wire recorder and Edison made his first cylinder. Mixed in somewhere among digital media standards for bitstream formats and media file interchange, you'd think there would be some standards as to how we keep this priceless historical record from vanishing. How do you ensure these documents will last and that someone will eventually be able to play them back and enjoy them? What kind of standards or recommended practices are out there for preserving audio material?
A lot of thought has been given to these issues by some very smart people, and there are guidelines in place from well-respected organizations. The activities of standards organizations are important in guiding the choices made by these and other groups that have a mandate to preserve this information. The AES SC-03 Subcommittee on the Preservation and Restoration of Audio Recording says that its scope “includes test methods, practices and specifications pertaining to the life expectancy and retrieval of audio information recorded on mechanical, optical and magnetic systems, including their respective media. It includes coordination with other organizations concerned with this scope and with those concerned with preservation and restoration of recorded images.” Other organizations, such as the Society of Motion Picture and Television Engineers (SMPTE, www.smpte.org) and the European Broadcasting Union (EBU, www.ebu.ch), have also published recommendations (available online). Find a resource list at www.mixonline.com. As these groups push for acceptance of their recommended practices, there are some common practices that you should consider, whether you're storing your old high school band tapes, a collection of vintage acetates or vault masters from top recording artists.
AUDIO IS KING
In any archive, there will be the media itself, generally referred to as the “essence” — such as digital audio or video — and the metadata, or “data about the data.” Archivists and engineers from several major collections provided insights into which formats are used to preserve audio and metadata. Let's start with the audio side.
One of the most important collections of recorded material on the planet is managed by the U.S. Library of Congress. According to digital conversion specialist Peter Alyea and Larry Appelbaum, supervisor/senior studio engineer of the LOC's recorded sound laboratory, “Our current baseline format for our master files is .WAV/Broadcast .WAV at 24-bit/96 kHz. We refer to this as our baseline because the majority of our work is archiving analog sources. We have also archived a smaller number of analog sources at 24-bit/192 kHz.”
This sentiment is echoed by Alan Stoker, recorded sound and moving image curator for the Country Music Hall of Fame in Nashville. The Hall of Fame has more than 200,000 disc recordings in its collection. Around 15,000 of them are 16-inch acetate or metal “transcription discs”: recordings made directly to disc with no tape intermediate. Stoker explains, “We are beginning a digital archive where we are starting to transfer those deteriorating acetates to 96kHz, 24-bit Broadcast .WAV files.”
The Broadcast .WAV format is widely recognized as the preferred method for archiving digital audio and is now supported as an output format by most professional DAWs. In 2003, the Producers and Engineers Wing of the Recording Academy (NARAS) released “The Delivery Recommendations for Master Recording.” This document, written by committee co-chairs George Massenburg and Kyle Lehning, sets out a number of recommended practices for preserving audio masters. According to Massenburg, “Knowing there were a lot of ways the audio and data could be stored, we felt it was important to try and establish some best practices for delivering and preserving audio so there would be better communication within the industry.” This document describes the methods commonly in use to deliver master recordings and is designed to be periodically updated to reflect new methods as they evolve.
The NARAS document makes a firm recommendation for delivery of master recordings as “flattened” Broadcast .WAV files — single-track continuous recordings without edits. Massenburg says, “The recommendation to use the Broadcast .WAV format reflects the establishment of this data format for audio as a standard defined by the EBU and referenced in AES standards. It is a very flexible format that allows for a variety of bit depths and sample rates and has a provision for storing metadata in the header files.”
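To make the 24-bit/96 kHz baseline concrete, here is a minimal Python sketch that writes a WAV file at those settings and checks that a file meets them. It relies only on Python's standard `wave` module, which writes plain PCM WAV and does not add the Broadcast .WAV metadata chunk; the function names and the choice of digital silence as test audio are assumptions made for illustration, not anything the archives quoted above use.

```python
import io
import wave

BASELINE_RATE_HZ = 96_000   # 96 kHz, the sample rate quoted above
BASELINE_SAMPLE_BYTES = 3   # 24 bits per sample


def write_silent_master(target, seconds=1, channels=2):
    """Write a 24-bit/96 kHz PCM WAV of digital silence to a path or file object."""
    frame_count = BASELINE_RATE_HZ * seconds
    with wave.open(target, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(BASELINE_SAMPLE_BYTES)
        wav.setframerate(BASELINE_RATE_HZ)
        # bytes(n) yields n zero bytes, i.e. silence in signed PCM
        wav.writeframes(bytes(frame_count * channels * BASELINE_SAMPLE_BYTES))


def meets_baseline(source):
    """Return True if the WAV at `source` is 24-bit/96 kHz PCM."""
    with wave.open(source, "rb") as wav:
        return (wav.getframerate() == BASELINE_RATE_HZ
                and wav.getsampwidth() == BASELINE_SAMPLE_BYTES)


buffer = io.BytesIO()
write_silent_master(buffer, seconds=1)
buffer.seek(0)
print(meets_baseline(buffer))  # True
```

A check like this is the sort of thing an archive might run on every incoming transfer before accepting it as a master.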
I NEVER METADATA I DIDN'T LIKE
Any recording project generates metadata: information about the recording, such as technical specifications, names of artists and engineers, titles of pieces, transcriptions of lyrics or spoken word and many other types of information.
There are several approaches to capturing the metadata. Peter Alyea from the LOC notes that the direct approach to capturing the metadata entirely in the header has its limitations. “We store a small amount of metadata in the file header,” he says. “We have found a lack of robust tools to manipulate and update this data in a batch mode, so we've limited the amount of metadata to core fields for the time being. With better tool integration with our database, this will hopefully change.” The Hall of Fame's Stoker adds, “We are creating metadata files that are married to the audio files. This is in the form of an XML file that is stored in the header area of the Broadcast .WAV file itself. By using a UMID, or Unique Material Identifier, we can ensure the data is preserved along with the recorded audio.”
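For readers curious what "a small amount of metadata in the file header" looks like in practice, the sketch below packs and parses the fixed-length part of a Broadcast .WAV "bext" chunk using Python's `struct` module. The field order and sizes follow our reading of version 1 of the EBU Tech 3285 specification and should be checked against the spec before any real use; the function names and the sample values are hypothetical, and neither archive quoted above has described its tooling.

```python
import struct

# Fixed-length part of a version 1 "bext" chunk, per our reading of EBU Tech 3285:
# Description, Originator, OriginatorReference, OriginationDate, OriginationTime,
# TimeReference (low, high), Version, UMID, Reserved.
BEXT_FIXED = struct.Struct("<256s32s32s10s8sIIH64s190s")


def pack_bext(description, originator, reference, date, time, umid=b""):
    """Build a bext chunk (ID + size + body) that holds only core fields."""
    body = BEXT_FIXED.pack(
        description.encode("ascii"), originator.encode("ascii"),
        reference.encode("ascii"), date.encode("ascii"), time.encode("ascii"),
        0, 0,    # time reference, unused in this sketch
        1,       # bext version
        umid,    # struct pads short values with zero bytes
        b"",     # reserved area, left as zeros
    )
    return b"bext" + struct.pack("<I", len(body)) + body


def parse_bext(chunk):
    """Read the core text fields and the UMID back out of a bext chunk."""
    chunk_id, size = struct.unpack_from("<4sI", chunk, 0)
    if chunk_id != b"bext" or size < BEXT_FIXED.size:
        raise ValueError("not a bext chunk")
    (description, originator, reference, date, time,
     _low, _high, version, umid, _reserved) = BEXT_FIXED.unpack_from(chunk, 8)

    def text(raw):
        # Text fields are zero-padded; keep everything before the first zero byte
        return raw.split(b"\0", 1)[0].decode("ascii")

    return {
        "description": text(description),
        "originator": text(originator),
        "reference": text(reference),
        "date": text(date),
        "time": text(time),
        "version": version,
        "umid": umid,
    }


# Hypothetical values, for illustration only
chunk = pack_bext("Transcription disc, side A", "Example Archive",
                  "EX-0001", "2005-05-01", "08:00:00")
print(parse_bext(chunk)["reference"])  # EX-0001
```

The fixed-width fields show why the header suits a handful of core values better than a full catalog record: the description field, for instance, tops out at 256 bytes.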
Stoker's comment points out an important distinction between using the BWF header as the direct repository for the metadata and using it as a reference to an external repository. Rather than trying to capture all of the metadata entirely in the BWF file header, the Hall of Fame project uses the header to capture a reference to an external database record that can be maintained apart from the audio file itself. This approach was recommended by Bridge Media Solutions (www.bridgemediasolutions.com), a consulting company that has carved out a niche providing advice and technical assistance to organizations engaged in audio archival projects.
Bridge Media Solutions was a principal consultant on the Country Music Hall of Fame archive and many other similar projects. According to company president John Spencer, a number of best practices have emerged fairly universally, such as the use of the Broadcast .WAV format for storing the audio. Storing the metadata has found a wider variety of implementations, depending on the sophistication of the archive, funding and other considerations. Spencer says, “There are two principles we believe in very strongly: metadata evolves over time and it is too big to attach to the media.”
The practice followed by Bridge Media Solutions is to store a SMPTE UMID in the header of the BWF file; think of it as a unique serial number for that piece of audio. This UMID then acts as a pointer to the file's record in a SQL database, where the metadata is stored as XML data. This database, kept separate from the audio, acts as the repository for the metadata. According to Spencer, “This has the double benefit of keeping a small header on the essence file itself and it allows the metadata to be available offline in a standardized format where it can be cataloged, searched and updated separately from the media file without losing the link back to the audio essence.”
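The pointer idea can be sketched in a few lines of Python. The example below uses the standard `sqlite3` and `xml.etree.ElementTree` modules to keep one XML metadata record per identifier in a database that lives apart from the audio. Everything specific here is an assumption for illustration: the table and column names, the XML element names and the placeholder identifier, which is a stand-in string and not a valid SMPTE UMID. Bridge Media Solutions has not published its schema in this article.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Stand-in identifier for illustration; NOT a valid SMPTE UMID
PLACEHOLDER_UMID = "ab" * 32


def open_catalog(path=":memory:"):
    """Open (or create) the metadata catalog, kept separate from the audio files."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS recordings ("
        "umid TEXT PRIMARY KEY, metadata_xml TEXT NOT NULL)"
    )
    return db


def save_metadata(db, umid_hex, fields):
    """Store the fields as an XML document keyed by the identifier in the header."""
    root = ET.Element("recording", {"umid": umid_hex})
    for name, value in fields.items():
        ET.SubElement(root, name).text = value
    xml_text = ET.tostring(root, encoding="unicode")
    db.execute(
        "INSERT OR REPLACE INTO recordings (umid, metadata_xml) VALUES (?, ?)",
        (umid_hex, xml_text),
    )
    db.commit()


def load_metadata(db, umid_hex):
    """Follow the pointer from an audio file's identifier to its metadata record."""
    row = db.execute(
        "SELECT metadata_xml FROM recordings WHERE umid = ?", (umid_hex,)
    ).fetchone()
    if row is None:
        return None
    return {child.tag: child.text for child in ET.fromstring(row[0])}


catalog = open_catalog()
save_metadata(catalog, PLACEHOLDER_UMID, {"title": "Untitled transcription"})
# Metadata evolves: update the record later without touching the audio file
save_metadata(catalog, PLACEHOLDER_UMID,
              {"title": "Radio broadcast, side A", "artist": "Unknown"})
print(load_metadata(catalog, PLACEHOLDER_UMID)["title"])  # Radio broadcast, side A
```

Note that the second save replaces the record while the identifier, and therefore the audio file's header, stays the same, which is the point of Spencer's first principle.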
And even once the archive is built, it has to be maintained. According to Spencer, “An archive should be designed to be refreshed periodically to maintain data integrity and to keep up with retrieval technology.” In other words, merely dumping everything onto a gold writable CD and storing the metadata as a FileMaker file won't ensure a long-term future for your archive. Audio playback devices come and go, as do data formats tied to a specific platform or OS. Using more universal standards such as BWF, XML and SQL database structures ensures the archive is relatively independent of particular applications, devices and manufacturers.
Choice of parameters for an archive depends on what format was used for the original recording. If the existing master is analog (analog tape or acetate or vinyl disc), general practice is to create a first-generation archive that's as close to the source as possible, without using any filtering or noise-reduction technologies. These can be later applied to the archived master for producing access copies of the material for public consumption. According to Stoker, “[The Hall of Fame] felt that adherence to capturing and preserving the original is of primary importance for the master archive, so we elected to use no noise reduction in acquiring the master audio from the acetates.” Alyea notes that the Library of Congress follows this same purist approach: “As for processing, like equalization along with using noise-reduction tools, we use none of it. We will add some gain if the source is very low, but other than that, no processing is used to make the master file.”
WHAT'S THE FREQUENCY?
The use of 96kHz/24-bit sampling seems to allow for balancing the need for sufficient quality and resolution with the practicalities of storage and acquisition technologies. It provides enough bandwidth and dynamic range to capture the essence of the original analog source. For materials originally acquired in digital format, of course you can't add resolution beyond what is there, so these are typically maintained in the existing bit depth and sample rate. The LOC's Alyea notes, “Physical digital media is captured to a file at its native bit depth and sample rate, with no up-sampling or other modifications.”
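The storage side of that balance is simple arithmetic: uncompressed PCM takes sample rate times bytes per sample times channels times duration. The short Python sketch below works it through for one hour of audio. The stereo channel count and the 400 GB capacity figure are assumptions chosen for illustration, and file headers are ignored.

```python
def pcm_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Size in bytes of uncompressed PCM audio, ignoring file headers."""
    return sample_rate_hz * (bit_depth // 8) * channels * seconds


HOUR = 3600  # seconds

baseline = pcm_bytes(96_000, 24, 2, HOUR)    # stereo is an assumption
high_rate = pcm_bytes(192_000, 24, 2, HOUR)
cd_quality = pcm_bytes(44_100, 16, 2, HOUR)

print(baseline)    # 2073600000  (about 2.07 GB per hour)
print(high_rate)   # 4147200000  (about 4.15 GB per hour)
print(cd_quality)  # 635040000   (about 0.64 GB per hour)

# Whole hours of 24-bit/96 kHz stereo that fit in a hypothetical 400 GB of storage
print(400_000_000_000 // baseline)  # 192
```

At roughly 2 GB per stereo hour, the baseline costs a little over three times the space of CD-quality audio, and doubling the sample rate to 192 kHz doubles it again.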
One of the most important requirements in preservation is to get the material onto a medium for which playback devices are likely to be available in the future. According to Bridge Media Solutions' Spencer, “We suggest capture to a hard drive medium, then offloading once there is a large data load to an offline format such as LTO tape, which can store hundreds of gigabytes at very low cost.” These are then checked periodically for data integrity and transferred to current technology to maintain the archive's long-term viability.
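The article does not say how the periodic integrity check is performed. One common approach, shown here purely as an assumption, is to record a checksum for every file when it enters the archive and recompute it on each pass. This minimal Python sketch uses SHA-256 from the standard `hashlib` module; the function names and the manifest-as-dictionary design are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, block_size=1 << 20):
    """Hash a file in 1 MB blocks so large audio files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file under `root` (by relative path) to its checksum."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def failed_files(root, manifest):
    """List files that are missing or no longer match the recorded checksum."""
    root = Path(root)
    failures = []
    for name, expected in manifest.items():
        path = root / name
        if not path.is_file() or sha256_of(path) != expected:
            failures.append(name)
    return failures
```

In use, `build_manifest` would run once when material is offloaded, with the result stored alongside the metadata; `failed_files` would run on each scheduled check and after every migration to new media, and any name it returns would be restored from another copy.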
An archive is a living thing. Once you've given birth to it by getting the essence and metadata into secure digital form, it will require care and feeding to remain viable. Archiving and preservation is not just a one-time job; it's an ongoing adventure.
Ron Franklin is director of media products for Tarari Inc.; his musical identity on the Web is at www.ronfranklinmusic.com.
A number of organizations are involved in research and promotion of guidelines for best practices in preserving audio.