Into the New Millennium With…MIDI?

Keeping an old friend young — MIDI in the age of USB and Firewire, part 1

Well, this time it’s for real. The new millennium. No more arguments
about when it begins – it’s here. The Y2K warnings and the Y2K jokes
can be put away forever; the census has been taken, the presidential
election circus (God help us!) is over and we’re now in 2001,
indisputably the 21st century.

Arthur Clarke and Stanley Kubrick were a bit overoptimistic about
what we were supposed to have by this year: There’s no permanent space
station above the Earth, and no Howard Johnson’s restaurant in it (or
anywhere else for that matter), and no one’s building a nuclear-powered
HAL/IBM-controlled starship to take humans to Jupiter’s moons, or even
our own. In his earlier writings, Clarke was right on the money about
communications satellites covering the Earth, but he was dead wrong
about the role of garbage at the turn of the 21st century: He saw it as
a potential fuel source, not something to be delivered over microwave,
twisted copper pair, coaxial cable and optical fiber to every home and
office in every corner of the globe by the petabyte. And he never
foresaw the culture of the personal computer, the ubiquity of the
Internet, the dynamic forces that are pulling the entertainment
industry apart and putting it (maybe) back together in an entirely new
way…or MIDI.

MIDI? “Why would anyone want to talk about MIDI in 2001?” I hear you
cry. “I thought MIDI was like so last century!” Well, it was. But it’s
a little early to be digging its grave and dancing (no doubt using
downloaded 124bpm loops) on it.

I’ll admit it. I’m a MIDI-holic. Yes, my name is Paul, and I use
MIDI. A lot. I compose with it, perform with it, mix with it, process
with it, teach it and, yes, write about it. And except for the last one
or two items, I’ll bet most of you do many of the same things. It’s
become so commonplace, so mundane that we don’t even think about it
anymore. And it’s true that it’s not very exciting, compared with the
tools we now have for manipulating real audio. But even though messing
around with MIDI data may not be as immediately gratifying as running
old Rick James samples through Acid, it’s still an important part of
what our industry is all about (especially among those apparently
dwindling numbers of us who value originality).

But MIDI is old stuff, right? And nothing’s happening with it,
right? Yes, it is old stuff (the MIDI Specification, after literally
hundreds of changes in its 17-plus years, is still referred to as
“1.0”), but to say that it’s moribund is to ignore some very important
work that’s being done today to keep it useful in the age of digital
video, T3 Internet connections and gigahertz desktop computers.

A lot of the work, not surprisingly, has been on the consumer side
of things, where marketers and manufacturers see potential numbers that
exceed by orders of magnitude the size of the professional audio and
music markets. MIDI is still viewed as an efficient and highly flexible
way of handling music for games, Web sites and similar applications
that require either low bandwidth or a high degree of interactivity.
It’s still a lot easier – and more convincing, if it’s done right – to
make a MIDI file instantaneously change the mood of a piece of
background music in a game than it is with digital recordings, no
matter how many tracks you might have to play with. And when a game
designer has used up all available CPU speed and RAM on polygon
generation and has forgotten to leave any room for audio, there’s
always enough space to slip in a MIDI file. As for the Internet, when
you are dealing with typical dial-up connections (which most people
still have), any audio file, even after you’ve crunched it through the
compression algorithms of MP3 or Real Audio, goes down the pipeline way
slower than a MIDI file.
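
A back-of-the-envelope calculation shows the scale of that gap. The
sketch below is only illustrative: the 56 kbit/s modem rate, the
three-minute song length, the 128 kbit/s MP3 rate and the 30KB MIDI
file size are all assumptions chosen for the example, not measurements.

```python
def download_seconds(size_bytes: float, link_bits_per_second: float) -> float:
    """Ideal transfer time, ignoring protocol overhead and line noise."""
    return size_bytes * 8 / link_bits_per_second

# Every constant below is an assumption made for illustration.
MODEM_BPS = 56_000          # best-case dial-up rate
SONG_SECONDS = 180          # a three-minute piece of music
MP3_BPS = 128_000           # MP3 encoding rate
MIDI_FILE_BYTES = 30_000    # a fairly dense Standard MIDI File

mp3_bytes = MP3_BPS / 8 * SONG_SECONDS
mp3_wait = download_seconds(mp3_bytes, MODEM_BPS)
midi_wait = download_seconds(MIDI_FILE_BYTES, MODEM_BPS)

print(f"MP3:  {mp3_bytes / 1e6:.2f} MB, about {mp3_wait / 60:.1f} minutes to download")
print(f"MIDI: {MIDI_FILE_BYTES / 1e3:.0f} KB, about {midi_wait:.1f} seconds to download")
```

Under these assumptions the compressed audio takes longer to download
than it does to play, while the MIDI file arrives in a few seconds.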

While many consumers still associate MIDI with the cheesy FM sounds
of early PC sound cards, even the cheapest “wavetable”-based chipsets
of today sound a lot more respectable than that. (Wavetable is actually
a misnomer for these devices, because they are, in fact, sample-based,
and true wavetable synthesis is something completely different. But I
won’t get into that now.)

Much of the credit for the improvements goes to the MIDI
industry’s adoption – through its administrative body, the MIDI
Manufacturers Association (MMA, www.midi.org) – of Downloadable Samples
Level 1 (DLS-1). The significance of DLS-1, which is now almost four
years old, is that instead of being stuck with the sounds a
manufacturer puts into a synthesizer chip’s ROM, or the 128 sounds in
the original General MIDI specification, a composer or sound designer
can create custom sounds in the form of samples. These can be
downloaded as a block into dedicated RAM on the chip and then called up
quickly and polyphonically from a MIDI file. In many ways, this makes
for the best of both worlds: A 2MB sound set and a few hundred
kilobytes of MIDI data can provide literally hours of high-quality,
completely interactive music. (Another technology that follows the same
general idea is Beatnik, Thomas Dolby Robertson’s contribution to music
on the Web.)

But DLS-1 didn’t solve everybody’s problems. Even before it was
developed, Creative Technology, the parent company of E-mu and Ensoniq,
was working on its own version of this concept, calling it “Sound
Fonts,” which was similar to DLS-1 but with more advanced performance
features.

DLS-1 and Sound Fonts threatened to cancel each other out, until
Creative and the rest of the MIDI industry (as well as the MIT Media
Lab and some other interested parties) came up with a higher
functioning standard that was acceptable to everyone, and not
proprietary to anyone (as Sound Fonts was). This is now known, not
surprisingly, as DLS Level 2. The major improvements in DLS-2 are
dynamic filters and matrix-based modulation, two features that are
essential to any professional-level sampler or synthesizer. DLS-2 was
formally adopted by the MMA in the summer of 1999 and has reached
beyond the MIDI community to become part of the MPEG-4 standard, where
it is called “Structured Audio Sample Bank Format.”

The first DLS-2 chips are about to hit the market, and one
manufacturer claims that by the end of 2001, 40 to 60% of all computers
being made will have DLS-2 sounds built right into the motherboard. On
the game side, Microsoft is supporting the new standard in its upcoming
Xbox platform.

Running parallel to DLS and DLS-2 has been the adoption of General
MIDI Level 2. Before the ink was even dry on the original General MIDI
Specification, which was supposed to ensure a high degree of file
compatibility across many synthesis platforms (again, mainly in the
consumer realm), Roland and Yamaha announced “extensions” to GM that
were, of course, incompatible with each other. These extensions gave
their devices more polyphony, effects like reverb and chorus, and an
expanded sound palette. Other manufacturers of domestic keyboards and
low-cost sound modules wanted to be able to improve the capabilities of
their products, as well, but didn’t want to have to license technology
from Roland or Yamaha, or invent their own. So, they clamored for a
nonproprietary expansion to GM.

GM Level 2 (formally adopted in November 1999) raises the minimum
polyphony of an instrument from 24 to 32 voices, and it defines more
controllers, and defines them more precisely, than the original spec.
For example, the new spec includes a formula for mapping MIDI volume
controller values to amplitude in dB. (This is largely in response to a
survey I designed in the early ’90s on behalf of the MMA, in which it
was found that controllers were being used very differently by
different manufacturers.)
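
The column does not quote the formula itself. The form usually cited
from the MMA's recommendation is gain in dB = 40 × log10(value / 127),
and the sketch below assumes that form; check it against the published
spec before relying on it. The function name is made up for this
example.

```python
import math

def cc7_to_db(volume: int) -> float:
    """Gain in dB for a MIDI volume (controller 7) value of 0-127.

    Assumes the commonly cited curve 40 * log10(value / 127).
    """
    if not 0 <= volume <= 127:
        raise ValueError("MIDI controller values run from 0 to 127")
    if volume == 0:
        return float("-inf")   # a value of 0 is treated as silence
    return 40.0 * math.log10(volume / 127.0)

for value in (127, 100, 64, 32, 1):
    print(f"CC7 = {value:3d} -> {cc7_to_db(value):7.2f} dB")
```

With this curve, full scale (127) is 0 dB and the midpoint value of 64
comes out at roughly 12 dB down, so every instrument that follows the
formula fades by the same amount for the same controller value.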

GM Level 2 also mandates and defines effects and significantly
increases the number of available sounds, both instrumental and
“rhythm,” or percussive, using Bank Change commands to augment the 128
program changes. The advantage of a GM-2 instrument over a DLS-1
device is simply that there is no sound set at all to download. So in
applications where there isn’t time or RAM for a downloadable sound
set, music can play instantly; even at dial-up connection speeds, a
MIDI file playing over the Internet is indistinguishable from one
playing over a MIDI cable.
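
For readers who have never looked at the bytes, here is a minimal
sketch of how a bank change extends the 128 program numbers. Bank
Select is sent as two Control Change messages (controller 0 for the
high part, controller 32 for the low part) followed by a Program
Change. The helper name is hypothetical, and the bank value of 121 in
the usage line is an assumption about GM-2's melodic bank that should
be checked against the spec.

```python
def select_sound(channel: int, bank_msb: int, bank_lsb: int, program: int) -> bytes:
    """Build Bank Select (CC 0 and CC 32) followed by a Program Change."""
    if not 0 <= channel <= 15:
        raise ValueError("channel must be 0-15")
    for value in (bank_msb, bank_lsb, program):
        if not 0 <= value <= 127:
            raise ValueError("MIDI data bytes must be 0-127")
    control_change = 0xB0 | channel
    program_change = 0xC0 | channel
    return bytes([
        control_change, 0x00, bank_msb,   # Bank Select, high part (controller 0)
        control_change, 0x20, bank_lsb,   # Bank Select, low part (controller 32)
        program_change, program,          # Program Change within that bank
    ])

# Assumed values: bank 121/1, program 0, on the first MIDI channel.
message = select_sound(channel=0, bank_msb=121, bank_lsb=1, program=0)
print(message.hex(" "))

# Three 7-bit numbers address far more than 128 sounds.
print(128 * 128 * 128, "addressable bank/program combinations")
```

The whole selection costs eight bytes, which is why a GM-2 file can
call up a much larger palette without downloading anything.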

The first units to adopt GM Level 2 come from two companies. Roland,
interestingly enough, has put it in the latest models of its Sound
Canvas line, which started the whole General MIDI movement; and Korg,
surprisingly, has put it in its high-end Triton rack. More are expected
to follow. Be sure to check out the product intros at winter NAMM this
month.

But things are happening at the other end of MIDI, too – the
professional end. The lowly MIDI cable, with its 31,250-bit/second
speed, is ridiculously slow compared to today’s networking and busing
capabilities, and that fact has not been lost on the MIDI developer
community. While MIDI over SCSI never was practical (SCSI is fast, but
it works in spurts, which is okay for buffered digital audio, but not
okay for the real-time control that MIDI requires), there have been
strong efforts to incorporate MIDI with the newest networking
protocols: USB and IEEE-1394, or FireWire.
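
To see just how slow 31,250 bits per second is, it helps to work out
what a single message costs. On a MIDI cable each byte is framed with a
start bit and a stop bit, so it occupies 10 bit times. The sketch below
does the arithmetic; the ten-note chord is an arbitrary example and
assumes no running status.

```python
MIDI_BAUD = 31_250       # bits per second on a standard MIDI cable
BITS_PER_BYTE = 10       # 8 data bits plus one start bit and one stop bit

def transmit_time_ms(num_bytes: int) -> float:
    """Milliseconds needed to send num_bytes down one MIDI cable."""
    return num_bytes * BITS_PER_BYTE / MIDI_BAUD * 1000.0

bytes_per_second = MIDI_BAUD / BITS_PER_BYTE
note_on_ms = transmit_time_ms(3)         # status byte, note number, velocity
chord_ms = transmit_time_ms(3 * 10)      # ten Note On messages in a row

print(f"{bytes_per_second:.0f} bytes per second")
print(f"one Note On: {note_on_ms:.2f} ms")
print(f"ten-note chord: {chord_ms:.1f} ms")
```

A single Note On takes just under a millisecond, so the last note of a
ten-note chord leaves the cable almost 10 ms after the first.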

USB MIDI interfaces have been around since early 1999. After Apple
released the first USB Macintoshes, manufacturers like Emagic, Roland,
Steinberg and Mark of the Unicorn scrambled to put out USB-compatible
MIDI interfaces. Now there are a dozen or more on the market, from
simple palm-sized 1-in, 1-out boxes to rackmount multicable interfaces
with SMPTE and audio I/O. Happily, a standard method for putting MIDI
on a USB cable is defined by the USB Implementers Forum (USB-IF,
www.usb.org). Unhappily, the MIDI Manufacturers Association never
endorsed the USB MIDI spec – and you’ll see why in a moment.

USB has been very successful in replacing, or at least displacing,
many of the disparate computer-networking formats like serial,
parallel, PCI or SCSI ports. Printers, modems, scanners, removable
media drives and gadgets we didn’t even know we needed just a couple of
years ago are now using USB cables. There are great advantages to USB,
such as the ability to connect up to 127 devices of all kinds to a
single computer (using bridges and hubs), automatic configuration (no
more IRQ or SCSI ID nightmares), the ability to “hot-swap” devices, and
higher potential throughput than any of the formats it replaces, with
the exception of SCSI.

So what’s the problem with MIDI? According to Jim Wright at IBM
Research, a longtime member of the MMA Technical Standards Board and
chairman of the organization’s working group concerned with new
transports, USB has timing problems that make it a poor fit for MIDI.
He has conducted tests comparing “classic” (i.e., serial, parallel, PCI
or PCMCIA) interfaces against USB interfaces, looking at their
round-trip latency (the amount of time it takes for a MIDI event to get
in and out of the interface) and their jitter (the variation in the
latency). He found the latency in the USB interfaces to be between
seven and eight milliseconds, about three times that of the classic
interfaces. This is not in itself an insurmountable problem, because
musicians adjust to small latencies in sound sources quite well – a
bass player and a lead guitarist standing seven feet away from each
other usually have no trouble staying together.

But the jitter in USB interfaces was also much higher than in the older
interfaces – about twice as high, meaning (to continue our analogy)
that the two players could at any given moment be five feet away from
each other, and the next moment be 10 feet away – and constantly
moving. In another analogy, which Wright likes to use, imagine playing
a slightly arpeggiated guitar chord: The jitter could make it sound as
if one of your fingers jerked slightly while you were playing the
chord. And for tight grooves and thick MIDI data streams with lots of
aftertouch or controllers, this level of jitter is really unacceptable.
Wright also found that when you add audio to the USB stream, the jitter
goes up another 50% – so it’s three times what MIDI musicians have had
to deal with in the past.
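
The two quantities in those tests are easy to compute once you have a
list of round-trip times. The sketch below is not Wright's test code.
It is a minimal illustration that takes latency as the mean of the
measurements and reports jitter two ways (peak-to-peak and standard
deviation), because the column does not say which measure he used. The
sample numbers are invented purely to exercise the function, and the
speed of sound is rounded to 1.125 feet per millisecond.

```python
import statistics

SPEED_OF_SOUND_FT_PER_MS = 1.125   # roughly 1,125 ft/s in room-temperature air

def summarize(latencies_ms):
    """Mean latency, two jitter measures and the 'feet apart' equivalent."""
    mean = statistics.mean(latencies_ms)
    return {
        "mean_ms": mean,
        "jitter_peak_to_peak_ms": max(latencies_ms) - min(latencies_ms),
        "jitter_stdev_ms": statistics.pstdev(latencies_ms),
        "equivalent_distance_ft": mean * SPEED_OF_SOUND_FT_PER_MS,
    }

# Invented sample data, for illustration only; not measured values.
classic_ms = [2.4, 2.6, 2.5, 2.7, 2.3]
usb_ms = [7.0, 8.1, 7.4, 6.9, 8.0]

for label, samples in (("classic", classic_ms), ("USB", usb_ms)):
    result = summarize(samples)
    print(label, {name: round(value, 2) for name, value in result.items()})
```

The distance figure is the same analogy in reverse: it converts a
latency into how far apart two players would have to stand to hear each
other that late.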

Why is this the case? Well, the USB developers, according to Wright,
came to the MIDI community very late in their development stage, and
thus the MMA and its Japanese counterpart, AMEI, didn’t have much of a
chance to give their input about how MIDI on USB was going to be
handled (although Roland, acting on its own, got involved much
earlier). On a USB cable, MIDI uses asynchronous timing (that is,
there’s no underlying clock as there is with, say, AES/EBU digital
audio), which means if there’s a lot of traffic on the line, then the
MIDI data will be delivered in fits and starts, and there’s no
guaranteed delivery time, even under the best of circumstances. (The
same is true for a standard MIDI cable, but preventing this is what
multiport interfaces are for!)

Audio on USB, on the other hand, uses isochronous timing, which
means the delivery time is guaranteed. So the problem is further
compounded by the fact that because they use different timing schemes,
MIDI and audio data on the same USB cable can easily lose sync with
each other. Getting MIDI and audio to work together in perfect sync is
something software and hardware developers have labored hard for years
to achieve, and now we’re potentially seeing all those efforts being
tossed away.

The interface manufacturers are not unaware of these problems – it’s
this very issue that’s behind the huge advertising campaign that MOTU
has been running promoting its “MTS,” a proprietary system of
time-stamping MIDI events as they enter the USB cable to overcome USB’s
timing problems. Time-stamping of MIDI events has never really been
necessary before, because the latency and jitter of the synthesizers
themselves have been greater than that of any delays in the MIDI
network (or the resolution of MIDI itself, for that matter), but that’s
no longer true with USB. Emagic has followed MOTU’s lead and is using
its own version of time-stamping, and Steinberg is reportedly planning
something similar.
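
The general idea behind time-stamping can be sketched in a few lines.
This is not MOTU's MTS or Emagic's scheme, both of which are
proprietary. It is a generic illustration with hypothetical names and
made-up timings: each event is stamped when it is captured, and the
receiving end plays it at the stamp plus a fixed delay instead of the
moment it happens to arrive.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class StampedEvent:
    stamp_ms: float   # when the event was captured at the sending end
    data: bytes       # the raw MIDI message

def playback_times(events, arrival_ms, fixed_delay_ms):
    """Schedule each event at stamp + fixed delay, never before it arrives."""
    return [max(event.stamp_ms + fixed_delay_ms, arrived)
            for event, arrived in zip(events, arrival_ms)]

# Three notes of a strummed chord, captured 5 ms apart (illustrative).
events = [
    StampedEvent(0.0, bytes([0x90, 60, 100])),
    StampedEvent(5.0, bytes([0x90, 64, 100])),
    StampedEvent(10.0, bytes([0x90, 67, 100])),
]
# Hypothetical arrival times after an unevenly delayed trip down the cable.
arrivals = [7.0, 14.5, 16.0]

with_stamps = playback_times(events, arrivals, fixed_delay_ms=10.0)

def gaps(times):
    return [later - earlier for earlier, later in zip(times, times[1:])]

print("gaps without stamps:", gaps(arrivals))
print("gaps with stamps:   ", gaps(with_stamps))
```

The trade is a little extra fixed latency in exchange for steady
spacing, and it only works while the fixed delay is longer than the
worst delay the cable introduces.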

But it’s the same old song: None of these solutions are compatible
with each other, which negates the entire philosophy of MIDI and USB.
MOTU’s MTS works only if you have the company’s software and hardware
and not with Emagic’s hardware or Steinberg’s software, and vice versa,
et cetera, ad infinitum.

It’s the computer manufacturers who are potentially in the best
position to do something about this, and perhaps they will. Mac OS X
might include time-stamping in its MIDI drivers, according to some
sources. Doug Wyatt, the developer of the Opcode MIDI System, the best
software driver for multiport MIDI on the Macintosh (and the primary
casualty in the train wreck Gibson has made of that poor company – more
on this next month), is reportedly leading the OS X MIDI team, but
Apple isn’t saying much about it just yet. (And, sad to say, their
corporate track record on MIDI support has been consistently pretty
miserable.)

Similarly, according to Jim Wright, the Windows Streaming MIDI API
has a 1ms time-stamping feature already built in, but it only works on
output, not on input. Microsoft’s DirectMusic supports time-stamping
(at a far greater resolution: 100 nanoseconds!), but apparently none of
the hardware interface makers are taking advantage of this yet.

Last April, the USB-IF (led by Intel) announced USB 2.0, in which
throughput is increased by a factor of (take a deep breath) 40. Will it
solve the timing problems? Until someone comes out with a USB 2.0
computer and a USB 2.0 MIDI interface and someone (else) tests them, we
won’t know.

IEEE-1394, though it’s more expensive, seems to hold a lot more
promise for the future of MIDI. I’ll talk about that next month, as
well as some new and proposed enhancements to the Standard MIDI File
spec and (dare it be whispered) the possibility of MIDI 2.0.

Starting early in the new year, Mix’s Web site, mixonline.com, will
have a whole new look. It’s going to be much more tightly integrated
with the sites of our sister magazines under the Intertec/Primedia
corporate umbrella, including Millimeter, Sound & Video Contractor,
Broadcast Engineering, Video Systems, Entertainment Design and, of
course, Electronic Musician, as well as other services like
Digibid.com. There will be some new features, and some current features
will be discontinued. I’ll continue to work with Mix Online, but I may
not be as visible a presence. I’ll still be writing this column, and I
can still be reached at mixonline@gis.net, so please stay in touch. And
thanks to the many thousands of you who have made developing and
running Mix Online so much fun these last three years.
