Alias-Intermodulation Distortion and a2i

 

 

 

Alias-Intermodulation Distortion is a relatively recently discovered form of distortion, first positively identified and named by Richard Black, founder and proprietor of Musaeus. It is a subtle but apparently very common form of distortion inherent in most digital recording/replay chains, but is unusual in that it effectively relies on a ‘two-point failure’ for its existence - to put it another way, it is the result of two simultaneous nonlinear mechanisms.

It is well known that aliasing is a possible distortion in digital recording and replay, effectively ‘reflecting’ frequencies about an axis at half the sampling rate. So well known is it that all manufacturers of conventional digital recording and replay equipment guard against it by using very sharp filters which prevent practically all ‘aliasable’ frequencies from getting out and causing trouble.
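The ‘reflection’ described above is simple enough to express in a few lines. The following is a minimal sketch, not taken from the original paper: the function name `alias_frequency` and the example frequencies are illustrative assumptions.

```python
def alias_frequency(f_hz, fs_hz):
    """Frequency at which a tone at f_hz appears after sampling at fs_hz.

    Sampling cannot distinguish f from f + k*fs, and anything left above
    fs/2 is mirrored ('reflected') about fs/2.
    """
    f = f_hz % fs_hz
    return fs_hz - f if f > fs_hz / 2 else f


CD_RATE = 44_100  # CD sampling rate in Hz

# A 23kHz component that slips past the filter reappears at 21.1kHz.
print(alias_frequency(23_000, CD_RATE))  # 21100
# Anything already below 22.05kHz is left where it is.
print(alias_frequency(19_000, CD_RATE))  # 19000
```

Note that the reflected component at 21.1kHz is still above 20kHz, which is why it is so often dismissed as harmless.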

Astonishingly, though, almost all the filters used by the hundreds of equipment manufacturers worldwide do in fact allow some aliasing. It is normally swept under the carpet (when its existence is recognised at all) by assuring the user that all the aliasing involves frequencies (including distortion products) above 20kHz, which are inaudible. Leaving aside the sizeable can of worms inherent in any discussion of whether information above 20kHz is strictly inaudible, Richard Black’s important realisation was that in practice the aliasing distortion products will intermodulate (‘beat’) with legitimate audio frequencies as a result of the rather poor linearity of most loudspeakers, pretty much irrespective of price. The result of this two-stage distortion process typically falls in the region from DC to a few kHz, right across the region of the ear’s maximum sensitivity, and is completely unrelated to the original audio harmonically, implying that it may be audible in extremely small amounts. Clearly the amount of AID produced will depend strongly on the amount of very high frequency content in the music in the first place, but estimates suggest that with some programme material the distortion could be as little as 75dB below full level. At such levels, distortion products are unlikely to be distinctly audible, but experience and anecdotal evidence suggest that even as low as -100dB certain forms of distortion may result in a general blurring of detail in good recordings.
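The second stage of the process can be illustrated numerically. The sketch below is a toy model under stated assumptions, not a measurement of any real loudspeaker: the ‘speaker’ is a bare second-order nonlinearity, and the coefficient `K2`, the tone frequencies and the amplitudes are all arbitrary illustrative values. It shows only that a legitimate 19kHz component and a 21.1kHz alias product, both nominally harmless, produce a difference tone at 2.1kHz once they pass through a nonlinear device.

```python
import cmath
import math

SIM_RATE = 192_000    # simulation rate in Hz, standing in for the analogue domain
NUM_SAMPLES = 19_200  # 0.1 s, so every tone below completes a whole number of cycles

F_AUDIO = 19_000      # legitimate audio component in Hz (illustrative)
F_ALIAS = 21_100      # alias product in Hz (illustrative)
A_AUDIO = 0.5         # arbitrary amplitudes, full scale taken as 1.0
A_ALIAS = 0.005
K2 = 0.01             # assumed second-order coefficient of the toy 'speaker'


def speaker(x, k2):
    """Toy loudspeaker: linear term plus a small second-order term."""
    return x + k2 * x * x


def simulate(k2):
    """Feed the two tones through the toy speaker and return the output samples."""
    out = []
    for n in range(NUM_SAMPLES):
        t = n / SIM_RATE
        x = (A_AUDIO * math.sin(2 * math.pi * F_AUDIO * t)
             + A_ALIAS * math.sin(2 * math.pi * F_ALIAS * t))
        out.append(speaker(x, k2))
    return out


def tone_amplitude(signal, freq_hz, rate):
    """Amplitude of one frequency component, by correlation with a complex tone."""
    w = 2 * math.pi * freq_hz / rate
    acc = sum(s * cmath.exp(-1j * w * n) for n, s in enumerate(signal))
    return 2 * abs(acc) / len(signal)


difference_hz = F_ALIAS - F_AUDIO  # 2100 Hz, well inside the audio band
beat = tone_amplitude(simulate(K2), difference_hz, SIM_RATE)
print(f"{difference_hz} Hz product: {20 * math.log10(beat / A_AUDIO):.1f} dB re the 19kHz tone")
```

For this particular model the difference-tone amplitude is simply `K2 * A_AUDIO * A_ALIAS`, so the product scales directly with both the amount of aliased energy and the degree of nonlinearity. With `K2` set to zero the 2.1kHz component vanishes, which is the ‘two-point failure’ in miniature.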

There are various cures for AID. Recordings made with an analogue-to-digital convertor (ADC) which incorporates a sufficiently rapid anti-alias filter will not suffer from it: ‘sufficiently rapid’ means in practice at least 40dB down by 21.5kHz for the case of a CD (44.1kHz sampling) system. Likewise, CD players incorporating a similarly rapid anti-imaging filter will prevent AID with any recording. Better loudspeakers would be useful, too, though that is harder to implement.

A fourth possibility is to use a filter at the mastering stage in preparing a recording to cut sharply above 20kHz, which at least can be done with software and avoids the necessity for new and expensive hardware. This is the basis of the Anti-Alias-Intermod (AAI or a2i) process. It is nothing more than an extremely sharp low-pass filter with a cutoff frequency just above 20kHz, and in its current version is implemented using Cool Edit, a commercially available digital audio editing and processing package. If you know what you are doing you can process recordings yourself: if you want us to do it for you (for a fee), get in touch.
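For readers who want a feel for what such a filter involves, here is a minimal sketch of a very long linear-phase low-pass FIR design. It is emphatically not the a2i filter itself, which is implemented in Cool Edit and whose exact design is not given here. The Blackman-windowed sinc method, the 20.5kHz cutoff (`CUTOFF_HZ`) and the function names are all assumptions made for illustration; the tap count is chosen to be close to the 24 thousand samples quoted elsewhere in this article.

```python
import cmath
import math

FS = 44_100         # CD sampling rate in Hz
CUTOFF_HZ = 20_500  # assumed; the article says only 'just above 20kHz'
NUM_TAPS = 24_001   # odd length, close to the 24 thousand samples quoted


def design_lowpass(num_taps, cutoff_hz, fs):
    """Blackman-windowed sinc low-pass FIR, normalised to unity gain at DC."""
    mid = (num_taps - 1) // 2
    fc = cutoff_hz / fs  # cutoff as a fraction of the sampling rate
    taps = []
    for n in range(num_taps):
        k = n - mid
        # Ideal (infinitely long) low-pass impulse response, truncated here.
        ideal = 2 * fc if k == 0 else math.sin(2 * math.pi * fc * k) / (math.pi * k)
        # Blackman window to tame the ripple caused by truncation.
        window = (0.42
                  - 0.5 * math.cos(2 * math.pi * n / (num_taps - 1))
                  + 0.08 * math.cos(4 * math.pi * n / (num_taps - 1)))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]


def gain_db(taps, freq_hz, fs):
    """Magnitude response of the filter at one frequency, in dB."""
    w = 2 * math.pi * freq_hz / fs
    response = sum(t * cmath.exp(-1j * w * n) for n, t in enumerate(taps))
    return 20 * math.log10(max(abs(response), 1e-300))


taps = design_lowpass(NUM_TAPS, CUTOFF_HZ, FS)
print(f"impulse response length: {NUM_TAPS / FS:.3f} s")
print(f"gain at 20.0kHz: {gain_db(taps, 20_000, FS):.4f} dB")
print(f"gain at 21.5kHz: {gain_db(taps, 21_500, FS):.1f} dB")
```

Applying a filter of this length by direct convolution is impractically slow in pure Python; in practice one would use FFT-based convolution or an audio editor's own filtering facilities.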

There are several implications of Richard Black’s work on AID and a2i, some of them considerably at odds with many currently accepted armchair theories about digital audio. The first and most solid is that theories which seek to explain any claimed deficiencies in digital audio in terms of the ‘dispersion’ or ‘ringing’ of anti-alias/anti-image filters are wide of the mark. The filter used in the current implementation of a2i has an impulse response which is 24 thousand samples (0.54 seconds) long. If the effects of a typical FIR filter some 50 samples long (an impulse response of about 1ms) were audible, surely the a2i filter should practically obliterate the music. In fact, it makes things somewhere between considerably clearer and barely perceptibly different, depending on programme material and the speakers or headphones used for listening. (Note: it has been pointed out by the late Julian Dunn that this use of the term ‘dispersion’ is wrong. Dispersion is strictly a function of the passband ripple amplitude and periodicity of a filter. However, many articles have already been written which use dispersion to refer to the ‘energy smearing’ which occurs due to a steep filter slope, and that is the sense in which I use it here, if only for consistency with the very hypotheses I am disputing.)

A second implication is that those who complained so bitterly and for so long about an inherent ‘digital sound’ may after all have had a point, since practically every system they have ever heard will exhibit AID. One of the very few commercially available products on the consumer side which is largely free of AID is the PMD100 oversampling filter from Pacific Microsonics, as incorporated in HDCD-compatible CD players and DACs. Significantly, a perusal of reviews of HDCD-compatible replay equipment reveals a remarkable convergence of opinion among reviewers on the high standard of such players with non-HDCD encoded material. (Professional users seeking an AID-free ADC should look at the 2402 from Digital Audio Denmark, or the ADA2496 from TC Electronic.)

A third implication may be that sampling rates higher than 44.1kHz are not after all justified. Actually, far stronger support for that comes from one of the pieces of anecdotal evidence which sparked Richard Black’s investigations of AID, a report by leading mastering engineer Bob Katz on an experiment in which he and colleagues auditioned material sampled at 96kHz before and after bandlimiting (without changing sample rate) to 20kHz and found no difference.

Fourth, highly noise-shaped flavours of dither (yes, I know there is an important distinction between dither and noise-shaping, but what is important here is the fact that both add noise which may be of carefully tailored spectral distribution) must be called into question. The fundamental premise of noise shaping is that the ear is much less sensitive to noise in the top few kHz of the audio band, which is true, but the same intermodulation process that causes aliasing to become audible as AID can in principle produce signal-related downband noise of marginally audible magnitude - effectively, a novel form of modulation noise. Just how much noise-shaping is permissible before the intermodulation products become audible? This may make the case for longer words (independent of sampling rate) stronger. Note that I am not invoking any kind of aliasing here, simply the intermodulation that loudspeakers tend to produce.

Fifth, AID is effectively completely absent in all systems using data reduction, such as MP3, MPEG2, ATRAC (MiniDisc) and so on. This is perhaps a trivial observation in view of the fact that such systems have many other distortion mechanisms of potentially greater seriousness, but anyone who has worked with them in detail will readily confirm that in practice they all curtail the audio band to well below 20kHz with any real music programme, whatever upper limit they may claim with sine wave stimulus. Ironic, at least!

Sixth, given the same considerations that apply in the case of highly noise-shaped dither (above), what of SACD? This appears to have rather high levels of noise in the not-so-far ultrasonic band. Anyone have any comments?

A more detailed technical analysis of AID and a2i was presented by Richard Black in a paper to the 106th Convention of the Audio Engineering Society in Munich, Germany, in May 1999. The paper is available on this site. There is also a good paper on AID on the Digital Audio Denmark site.

Of course, there is much more to a good digital recording and replay chain than just absence of AID. One of the best sources of information about digital audio in general (and an encyclopaedic source of relevant links) is Bob Katz’ excellent Digital Domain website. Several papers of relevance and interest can also be found on the website of Julian Dunn’s Nanophon.

Here is a little demonstration of the kind of ultrasonic levels that can exist on a CD. At top is a spectrum plot (actually a screen grab from the excellent Cool Edit software package) showing the presence of a fairly strong harmonic of a solo violin on a commercially available (and generally rather good) recording at about 21.6kHz. Below it is the same spot in the same recording, after a2i filtering.