This month, I'm taking a look at an increasingly viable alternative to bulky linear PCM sound files and lossy-compressed, but compact, audio files. That alternative is a lossless codec, which trims the fat without sacrificing aural satisfaction.
Let's start with my “viable alternative” descriptor. In some parts of the world, notably Western Europe, the Pacific Rim and Canada, Fiber-to-the-Home and Fiber-to-the-Curb (FTTH and FTTC, respectively) have arrived in urban neighborhoods with little or no fanfare. Essentially the same service with either a domestic or commercial marketing slant, FTTH and FTTC are names given to utilities that bring optical fiber directly to a home or building rather than delivering converged data services through a lower-bandwidth copper connection.
In the old-school scenario that FTTC replaces, an individual subscriber's copper, usually an unshielded twisted pair (UTP), is eventually aggregated into existing fiber-optic trunks that serve an entire neighborhood. The more copper involved, the lower the potential bandwidth that any subscriber can expect. Cable modems or digital subscriber line (DSL) services “piggyback” data on the existing copper connections that we already use. Though there are small-scale exceptions, this is the typical approach taken in the U.S. of A. Elsewhere in the universe, FTTH services are normally subsidized by less myopic governments to bootstrap their national information technology (IT) infrastructure, and as a result, end-user costs are kept quite low. This is what was done to create our original phone system, but the move away from government support of public programs in the 1980s has resulted in unchained former monopolies and short-sighted profiteers, aka “market forces,” dictating our current public IT utilities build-out. Nowadays, for what I pay for very low-rate ADSL in San Francisco, my buds in Tokyo receive service that is several tiers faster, but I digress.
Suffice to say, now that broadband data services are available to most locations in the States, audio geeks can pass sound files over public networks without too much of a transit time penalty. Still, it would be nice to speed things up a bit, and that's where lossless codecs come in. Through the miracle of mathematics, it's possible to reduce a file's size by about 40 percent to 60 percent without discarding any information. This can be done simply (Huffman Coding) or sublimely (MLP), and as a result, there are far too many codecs to choose from. Unless you have, for example, Dolby Labs' marketing muscle, your product can get lost in the shuffle. Claude Cellier, president of Merging Technologies and maker of the LRC lossless codec, opines, “While probably being the audio company that ‘invented’ lossless audio compression in its modern form, filing for a seminal U.S. patent as [far] back as April 1995, we are not even a tenth as active at marketing and promoting it. If building, enhancing, fine-tuning and improving Pyramix wasn't drawing on most of our resources, time and energy, we'd surely be more active in promoting LRC, but days [only] have 24 hours.”
Among the scrum of competing offerings are a few that have been standardized or mandated, such as the lossless codecs in Windows Media 9 and Philips' Direct Stream Transfer process built into SACD. Dolby controls MLP, the lossless codec mandated for use in the DVD-Audio format, and the company's sometimes-strident promotion of the format keeps licensing fees flowing in. But there are many other codecs available, including several Open Source choices. FLAC, or Free Lossless Audio Codec, competes with the less-developed Monkey's Audio and WavPack for developers' attention. There are also closed or proprietary applications, such as La, LPAC, Shorten and OptimFROG. Shorten, in particular, is quite popular with hobbyist music traders. In keeping with its vision of a universal media standard, the MPEG machine is also adding a lossless option: Audio Lossless Coding, or MPEG-4 ALS. According to Tilman Liebchen, one of the authors of the LPAC codec, “An improved version of the LPAC algorithm was recently chosen as [a] reference model” for MPEG-4 ALS.
Modern lossless codecs are rather complicated affairs, but basically, they often work by applying carefully selected prediction filters to the audio and then storing only those filters' coefficients and the residual, the usually small difference between the filters' output and the actual audio. There are, however, some simple forms of lossless compression that you probably use every day to streamline your work, and you may not even know it. One is the Zip file format, which was created by PKWARE and is used everywhere to reduce file sizes for transmission over the Net. Another is run-length coding, which is built into data tape drives as their so-called “hardware compression” option. Run-length coding comprises searching for repeated runs of a single symbol in a data stream or file and replacing each run with a single instance of that symbol and a run count. Here's an example: At the end of a song recorded at 44.1 kHz, let's assume that there are two seconds of digital silence. So instead of explicitly writing 88,200 identical samples to tape, a backup would write the equivalent of “zero amplitude sample 8.82E4 times.” You can imagine that this would save a fair amount of space on the medium.
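For the code-curious, the digital-silence example can be sketched in a few lines of Python. This is only an illustration of run-length coding in general, not how any particular tape drive implements it; the function names and the three non-zero "music" samples are made up for the example.

```python
from itertools import groupby

def rle_encode(samples):
    """Collapse each run of identical samples into a (value, run_count) pair."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(samples)]

def rle_decode(pairs):
    """Expand (value, run_count) pairs back into the original sample list."""
    out = []
    for value, count in pairs:
        out.extend([value] * count)
    return out

SAMPLE_RATE = 44_100  # samples per second, one channel

# Hypothetical song ending: three non-zero samples, then two seconds of silence.
tail = [12, -7, 3] + [0] * (2 * SAMPLE_RATE)

encoded = rle_encode(tail)
assert rle_decode(encoded) == tail  # lossless: decoding restores every sample

print(encoded[-1])  # (0, 88200)
print(len(tail), "samples ->", len(encoded), "pairs")
```

Note that the three non-zero samples each cost a pair of their own, which is why run-length coding pays off on silence and does little for actual music.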
Last month, I mentioned some products that I've been listening to, and I want to discuss one in this column. The folks at Shure have a new series of consumer in-ear monitors that essentially re-brands the transducer component of its pro personal monitoring systems — same products, different package. At last year's Home Entertainment [June 2003, San Francisco] show, I briefly listened to the E3c and E5c and “understood” the timbre immediately. Shure's marketing crew subsequently provided me with a pair of E3cs for evaluation, and I must say, if you want to replace your iPod ear buds with something much better, buy an E3c. After a short break-in, I was surprised by their low distortion, lack of resonance and unhyped character. They're comfortable enough for extended wear, and as they provide very good isolation, they allow you to monitor at lower volumes. Also, if you want something with even lower distortion and extended frequency response, audition the E5c.
That's all for this month's techno-babble. Next month, I'll head back to my “Pedants In a Big Box” IT glossary. In the meantime, grab your pocket protectors and continue to rock!
This month's column, my 50th installment of “Bitstream” (Oy!), was written while multitasking at O'Reilly's ETech conference in southerly San Diego, where I filled other geeks in on UWB basics.
Pedant In a Box
Huffman Coding: A member of the entropy coding family of algorithms, Huffman Coding employs a simple concept with a wide range of applications in data storage and transmission. Entropy encoding uses statistical symbol substitution to quickly and effectively reduce file size. This is equivalent to taking dictation with shorthand rather than writing out each and every word. Here's how it works: First, you build a probability table for all of the “symbols” in a particular lexicon. These symbols could be the letters of the alphabet or, in our case, the 2^24, or 16,777,216, amplitude values of 24-bit AES/EBU audio. So for each available sample value in the millions of choices, you have to decide the probability of any specific value appearing in a particular sound file. If you scan the entire file, you can assign “custom” probabilities for that file. Once you have those assignments, you then take the two lowest-probability symbols, the least-likely values, and join them under a single placeholder whose probability is the sum of the pair. This replaces two table entries with one. Then, repeat this process with the next two least-likely entries, placeholders included, until only one entry remains.
After repeating the merging cycle many times, you will end up with a binary tree in which every original symbol is a leaf. Reading off the branches from root to leaf, with a 0 for one direction and a 1 for the other, gives each symbol its code word: Common values sit near the root and receive short codes, while rare values sit deep in the tree and receive long ones, which removes much of the redundancy. When it is time to reconstitute or decode the compressed data, simply walk the same tree bit by bit from the root until you reach a leaf, write out that leaf's original sample value and start again at the root. Huffman Coding is often used to further compress the data from the filter bank processing.
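As a rough illustration of entropy coding applied to sample values, here is a minimal Python sketch of the textbook Huffman construction: count how often each value occurs, repeatedly merge the two least-frequent entries, and read each value's code word off the resulting tree. The function names and the tiny 16-sample "file" are invented for the example, and codes are kept as strings of "0" and "1" for readability rather than packed into real bits.

```python
import heapq
from collections import Counter

def huffman_codes(samples):
    """Return {sample_value: bit_string} built from this file's own statistics."""
    counts = Counter(samples)
    if not counts:
        return {}
    if len(counts) == 1:  # a single distinct value still needs a 1-bit code
        return {next(iter(counts)): "0"}
    # Heap entries: (count, unique tie-breaker, {symbol: code built so far}).
    heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        n0, _, left = heapq.heappop(heap)   # least-frequent entry
        n1, _, right = heapq.heappop(heap)  # next least-frequent entry
        # Merging pushes both groups one level deeper in the tree.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (n0 + n1, next_id, merged))
        next_id += 1
    return heap[0][2]

def encode(samples, codes):
    return "".join(codes[s] for s in samples)

def decode(bits, codes):
    """Walk the bit string; codes are prefix-free, so the first match is right."""
    lookup = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:
            out.append(lookup[current])
            current = ""
    return out

# Hypothetical "file": mostly silence, a few small values, one rare peak.
samples = [0] * 8 + [1] * 4 + [-1] * 3 + [250]
codes = huffman_codes(samples)
bits = encode(samples, codes)
assert decode(bits, codes) == samples  # lossless round trip

print(len(bits), "bits instead of", 2 * len(samples), "at a fixed 2 bits per value")
```

With only four distinct values, a fixed-length code would need 2 bits apiece, or 32 bits in total. The Huffman table gets the same data into 28 bits by spending 1 bit on the common zero and 3 bits on the rare values. A real codec would also have to store or transmit the code table itself, which this sketch ignores.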
David Huffman wrote his original paper on optimizing Shannon-Fano coding in 1952. Claude Shannon, one of the authors of the earlier work, was the same fellow who provided insight into the information theory that forms the basis for much of digital audio's current implementations.
MLP: Meridian Lossless Packing, the mandated method for lossless storage of linear PCM data in the DVD-Audio standard. MLP applies several individual tools to the original data, each one providing additional compaction of the data set. Though an explanation requires most of an article, let's just say that MLP reduces a data payload by at least 4 bits per sample, and often more than twice that. MLP has modest computational needs, which is good from a consumer electronics (CE) manufacturer's perspective, and is also able to deal with constant or variable bit rate (CBR and VBR, respectively) and mixed sample rates in a multichannel data stream. For a more in-depth explanation, check the “Papers” section in Seneschal's Info Annex at www.seneschal.net.