Computer, Do the Math!

2/01/2003 7:00 AM Eastern

This month, I'm delving into a topic that generates its share of enmity in the two audio camps: the Fixed and the Floats. Like the prodigal McCoys and Hatfields, these fellers have been mixin' it up since “real engineers” wore 'coon skin caps, but nobody's yet come out as the winner. I've got friends, and you may be one of them, who are adamant about what flavor of arithmetic is used in their DAW. While there are circumstances when it's more appropriate to employ fixed or floating point, it's a bit like arguing whether bidirectional mics are better than cardioid. They're two different beasts; one is neither more accurate nor prettier than the other, just different. But that doesn't seem to matter to those folks who are feudin'.

Let's start with one inherent aspect of this discussion: the word length war of 24 vs. 32. In audio circles, “24 bits” refers to 24-bit, fixed-point, or integer arithmetic, while “32 bits” generally refers to 32-bit, floating-point arithmetic. Before I lay out my argument's clear, crystalline lines, I must digress. If your high school math from days gone by has gone fuzzy, skip to “The Fixed and the Floats” sidebar for a brief rehash.

All digital audio systems are, at heart, little silicon math majors. AES/EBU PCM data usually starts life as a sampled representation of some acoustical or electrical event and is stored as a 24-bit data word. Thus, AES/EBU audio is 24-bit, fixed-point data by definition. Yet many hardware DAWs use 32-bit, floating-point arithmetic to process what was once your AES/EBU data. Questions arise here: First, because 32 is larger than 24, are 32 bits better? Also, what happens when you convert a fixed-point sample to its floating-point equivalent? Well, my opinion is “no” and “not much,” respectively, but read on and decide for yourself.

Strictly speaking, there is no difference between expressing a number — in our case, an audio sample — as either fixed or floating-point. Given sufficient precision, they are equivalent, but therein lies the rub. I'll dig into the subject of sufficient precision in a future column, but for now, let's stick with the 24 vs. 32 discussion. In audio circles, 24 bits are the AES-mandated word length, so some products use 24-bit, fixed-point number crunching. On the other hand, floating-point arithmetic lends itself to simple digital signal processes like gain change and mixing, so 32-bit, floating-point processing is commonly used throughout the audio industry. In these cases, the 24-bit fixed standard is equivalent to the 24-bit mantissa, plus 8-bit exponent used in the 32-bit, floating-point version.
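To make the “not much” answer concrete, here is a minimal Python sketch of the round trip from a 24-bit fixed-point sample to a 32-bit float and back. It assumes the float is the usual IEEE 754 single-precision format and that full scale is mapped to ±1.0; the helper names (`to_float32`, `sample_to_float`, `float_to_sample`) are made up for this illustration and don't come from any particular DAW.

```python
import struct

def to_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

# Assumption: 24-bit signed samples span -2**23 .. 2**23 - 1,
# and full scale is mapped to the range -1.0 .. 1.0.
FULL_SCALE = 1 << 23

def sample_to_float(sample):
    """Convert a 24-bit integer sample to a 32-bit float."""
    return to_float32(sample / FULL_SCALE)

def float_to_sample(value):
    """Convert a 32-bit float back to a 24-bit integer sample."""
    return int(round(value * FULL_SCALE))

test_samples = [-(1 << 23), -1, 0, 1, 12345, (1 << 23) - 1]
round_trip_ok = all(
    float_to_sample(sample_to_float(s)) == s for s in test_samples
)
print(round_trip_ok)  # every 24-bit sample survives the trip

# A 25-bit value no longer fits in the 24-bit mantissa and gets rounded.
too_wide = (1 << 24) + 1
print(to_float32(float(too_wide)) == too_wide)
```

The first check passes because a 24-bit sample never needs more significant bits than the float's mantissa provides; the second fails for exactly the opposite reason.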

There's a saying that the devil's in the details, and low-level detail is what many engineers work very hard to preserve. James Moorer, former tech chieftain at Sonic Solutions and now with Adobe Systems, is a longtime proponent of fixed-point arithmetic. In an AES paper (“48-Bit Integer Processing Beats 32-Bit Floating-Point for Professional Audio Applications,” presented at the 107th AES Convention, September 24-27, 1999, Preprint Number 5038 [L-3]) discussing the advantages of double-precision fixed-point vs. single-precision floating-point DSP, he states that “…there is an advantage to using integer arithmetic in general, in that most integer (24-bit, fixed-point) arithmetic units have very wide accumulators, such as 48 to 56 bits, whereas 32-bit floating-point arithmetic units generally have only 24 or 32 bits of mantissa precision in the accumulator. This can lead to serious signal degradation, especially with low-frequency filters.”

The signal degradation mentioned translates into a loss of 3 or 4 bits of precision, by the way. That's not much in the grand scheme, but when multiple operations are performed on a signal, small errors can quickly add up. The accumulator that Moorer mentions is a temporary memory location, or register, that stores the result of an addition or multiplication operation. Any bits that don't fit in the accumulator must be thrown out, usually via a rounding operation.
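A toy Python sketch of the accumulator issue, under loud assumptions: the “narrow” accumulator rounds to a 24-bit mantissa (IEEE 754 single precision) after every addition, and the “wide” one is a plain integer with 40 fractional bits, a made-up format that fits comfortably inside a 48-bit register. Neither models a specific chip; the point is only what happens to bits that fall off the end.

```python
import struct

def to_float32(x):
    """Round a Python float to the nearest 32-bit float value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

SMALL = 2.0 ** -25   # a contribution below half an LSB of a 24-bit mantissa at 1.0
COUNT = 1024         # number of times it is added

# Narrow accumulator: the result is rounded to 24 mantissa bits after each add.
narrow = 1.0
for _ in range(COUNT):
    narrow = to_float32(narrow + SMALL)

# Wide accumulator: fixed-point integer with 40 fractional bits (hypothetical).
FRACTION_BITS = 40
wide = 1 << FRACTION_BITS                    # represents 1.0
small_fixed = 1 << (FRACTION_BITS - 25)      # represents 2**-25
for _ in range(COUNT):
    wide += small_fixed
wide_as_float = wide / (1 << FRACTION_BITS)

print(narrow)         # the small additions were rounded away each time
print(wide_as_float)  # the small additions piled up into something audible-sized
```

In the narrow case, each small addition is rounded away on the spot, so a thousand of them add up to nothing. The wide accumulator keeps them until the very end, when the total is large enough to survive rounding back to 24 bits.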

Moorer is talking specifically about DSP implementations; that is, the manner in which integrated circuit designers chose to build their chip-level products. In this case, he's referring to the Motorola 56k family of DSPs: 24-bit, fixed-point machines with 56-bit accumulators. A common choice for floating-point DSP is Analog Devices' SHARC family, which is a 32- or 40-bit, floating-point device with a 32-bit mantissa accumulator.

The 56k and SHARC are two common hardware examples, but host-based, software-only DAWs largely use the CPU's built-in fixed- or floating-point processing. Because personal computers are general-purpose devices, they can perform most arithmetic operations they are called upon to do, though it may not happen as quickly as a purpose-built hardware device. By the way, SHARCs have some interesting register features, but, for simplicity, I'm gonna skip their trick stuff and stick with the basic concept.

So, the bottom line: First, carry “enough” significant digits from one DSP operation to the next. Second, when you have to throw out extra “low-order” bits, do so sensibly so that residual low-amplitude information will not be lost. Finally, when it comes time to down-res that 24- or 32-bit master to a 16-bit consumer format, carefully redither it. If done properly, the conversion from a long word length file to a shorter word length distribution master will carry most of that quiet information, even though the “extra” bits are gone.
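Here is a rough Python sketch of that last point, assuming plain TPDF (triangular) dither with a peak of one 16-bit LSB. That's one common choice, not necessarily what any given product does, and noise shaping and clipping are left out to keep it short. The function name `requantize` and the test signal are made up for the illustration.

```python
import math
import random

DROPPED_BITS = 8            # 24-bit word -> 16-bit word
STEP = 1 << DROPPED_BITS    # one 16-bit LSB, measured in 24-bit units

def requantize(sample_24, rng=None):
    """Reduce a 24-bit sample to 16-bit units; add TPDF dither if rng is given."""
    dither = 0.0
    if rng is not None:
        # The difference of two uniform values has a triangular distribution.
        dither = (rng.random() - rng.random()) * STEP
    return math.floor((sample_24 + dither) / STEP + 0.5)

QUIET = 64        # a steady level of one quarter of a 16-bit LSB
N = 20000
rng = random.Random(2003)   # fixed seed so the run is repeatable

rounded = [requantize(QUIET) for _ in range(N)]
dithered = [requantize(QUIET, rng) for _ in range(N)]

rounded_mean = sum(rounded) / N
dithered_mean = sum(dithered) / N
print(rounded_mean)    # plain rounding: the quiet signal vanishes
print(dithered_mean)   # with dither: the average sits close to 0.25 LSB
```

Plain rounding turns the quiet signal into digital silence. With dither, the individual 16-bit samples are noisy, but their average still carries the quarter-LSB level, which is the “quiet information” that outlives the missing bits.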

One question you may be asking now is why designers choose one processor architecture over another. I'm not sure, but methinks it has something to do with parts' costs and programming complexity. An example is the SHARC family, which has less-than-stellar “development tools,” as programming aids are called, but is inexpensive and easy to hook together when an application calls for many DSPs; hence, their seeming ubiquity in low-cost digital audio gear or where gazillions are needed, as in a digital console. Also, once a DSP choice has been made, the corporate culture tends to discount other architectures due to familiarity and a wealth of in-house wisdom about the chosen part.

Through all of this, realize that microphone choice and placement, which preamp and converter you use, gain staging, signal path and circuit topologies, along with redithering choices, usually have far more effect on the final sound than the arithmetic used in a professional DSP product. Also, I feel that all this fussing is moot if you're working on pop music with no dynamic range and way too much processing. However, once an analog signal is sampled, then quality issues are dictated by, among other things, subtle product-design trade-offs, including how “excess” data is handled. So, the 24 vs. 32 argument really comes down to implementation, either in hardware or software. If your gear “does the math” carefully, that is, conservatively performs the DSP, then it will produce a higher-quality result, which is why we're all in this business.

This column was written while under the influence of reruns of Buffy: The Musical and the cool jazz grooves of Stan Getz's Focus. For links to DAW manufacturers, both fixed and floats, head on over to


The Fixed and the Floats

Computers can perform their computations in one of two ways: Either the math is fixed-point, as you or I would do in long-hand arithmetic, or it's floating-point, which in high school is called scientific notation. Fixed-point notation is a method of expressing a value by having an arbitrary number of digits before and/or after a decimal point: 0.0079, 3.1415 and 8,654.63 are all fixed-point expressions. Floating-point takes another tack by using a “mantissa” and “exponent.” The mantissa provides the significant information, or digits, and the exponent provides a scaling factor that shows how big the number is. Some examples:

Fixed-point    Floating-point         Scientific notation
0.0079         7.9 times 10^-3        7.9EE-3
3.1415         3.1415 times 10^0      3.1415EE0
865,426.3      8.654263 times 10^5    8.654263EE5

Notice that the floating-point versions have a single digit, a decimal point, then the rest of the significant digits. Also, grok that any number raised to the zeroth power equals 1, so multiplying anything by 10^0 is the same as multiplying by 1. So, 75 times 10^0 equals 75. Finally, notice that the exponent, or “power” to which the number 10 is raised, is equal to the number of places that the decimal point has been moved from the fixed-point version: To get back to the fixed-point version, positive exponents move the decimal point to the right, and negative exponents move it to the left. By the way, scientific notation is a geekspeak way of writing a floating-point number compactly, with “EE” standing in for “times 10 to the power of.”
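If you'd rather let the computer do the rehash, here is a small Python sketch using the language's built-in `e` format, which writes a lowercase “e” where a calculator shows “EE.” The variable names are just for illustration.

```python
# Format each fixed-point value as mantissa and power-of-ten exponent.
tiny = format(0.0079, ".1e")       # one digit after the decimal point
pi_ish = format(3.1415, ".4e")     # four digits after the decimal point
large = format(865426.3, ".6e")    # six digits after the decimal point

print(tiny)     # 7.9e-03
print(pi_ish)   # 3.1415e+00
print(large)    # 8.654263e+05

# Going the other way: Python reads the same notation back in.
print(float("7.9e-3") == 0.0079)   # True
```

Note that the exponent for 3.1415 is zero, since its decimal point doesn't need to move at all.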
