What it is and How it Works

ATRAC Digital Audio Compression Technology

In order to provide approximately 74 minutes of music on the 2.5-inch MiniDisc, a digital audio compression technology called “ATRAC” (Adaptive Transform Acoustic Coding) has been newly developed. This technology compresses information down to approximately one fifth of the amount of data usually required.

In 16-bit linear encoding, currently used in the CD and DAT formats, with a sampling frequency of 44.1kHz, the analog signal is sampled approximately once every 0.02 milliseconds. Each sample is quantized at 16-bit resolution into one of 65536 possible values. Therefore, with CD and DAT, when the analog signal is converted to digital data in real time, 16 bits of data are used every 0.02 milliseconds, regardless of the amplitude of the signal and whether or not a signal is present at all.
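As a rough check of these figures, the sampling interval, the number of quantization levels and the resulting data rate can be worked out directly. This is only a sketch: the two-channel (stereo) count is an assumption not stated in the text, and the one-fifth ratio is applied as the approximate value quoted above.

```python
SAMPLE_RATE_HZ = 44_100   # sampling frequency given in the text
BITS_PER_SAMPLE = 16      # 16-bit linear encoding
CHANNELS = 2              # assumption: stereo, not stated in the text
COMPRESSION_RATIO = 5     # "approximately one fifth" of the original data

# Time between two consecutive samples, in milliseconds.
sample_period_ms = 1000 / SAMPLE_RATE_HZ

# Number of distinct values a 16-bit sample can take.
quantization_levels = 2 ** BITS_PER_SAMPLE

# Linear PCM data rate, and the approximate rate after compression.
pcm_bits_per_second = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS
compressed_bits_per_second = pcm_bits_per_second / COMPRESSION_RATIO

print(f"sample period: {sample_period_ms:.4f} ms")
print(f"quantization levels: {quantization_levels}")
print(f"linear PCM rate: {pcm_bits_per_second} bit/s")
print(f"approx. compressed rate: {compressed_bits_per_second:.0f} bit/s")
```

The sample period comes out slightly above the rounded 0.02 milliseconds used in the text.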

[Figure: Waveform analysis. A segment of 512 samples (level against time, one sample every 0.02 msec, approx. 11.6 msec in total) is analyzed into frequency components F1, F4, ... Fn, each with its own level.]

ATRAC starts with the same 16-bit digital data but analyzes segments of the data for waveform content every 11.6 msec. Based on this analysis, ATRAC extracts and encodes only those frequency components that are actually audible to the human ear.
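The segment analysis can be illustrated with a small sketch. It uses a plain discrete Fourier transform as a stand-in, which is an assumption made for illustration and not ATRAC's actual transform; the 512-sample segment length is taken from the figure, and the function names and test tone are hypothetical.

```python
import cmath
import math

SAMPLE_RATE_HZ = 44_100
SEGMENT_SAMPLES = 512  # segment length shown in the waveform figure


def segment_duration_ms(n_samples=SEGMENT_SAMPLES, rate_hz=SAMPLE_RATE_HZ):
    """Duration of one analysis segment in milliseconds."""
    return 1000 * n_samples / rate_hz


def dft_magnitudes(samples):
    """Magnitude of each frequency bin from 0 Hz up to half the sample rate.

    A naive DFT, used here only as a stand-in for the real transform.
    """
    n = len(samples)
    magnitudes = []
    for k in range(n // 2 + 1):
        acc = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        magnitudes.append(abs(acc) / n)
    return magnitudes


# Hypothetical test tone chosen to fall exactly on bin 32 of the segment.
tone_hz = 32 * SAMPLE_RATE_HZ / SEGMENT_SAMPLES
segment = [
    math.sin(2 * math.pi * tone_hz * t / SAMPLE_RATE_HZ)
    for t in range(SEGMENT_SAMPLES)
]

magnitudes = dft_magnitudes(segment)
peak_bin = max(range(len(magnitudes)), key=magnitudes.__getitem__)
peak_hz = peak_bin * SAMPLE_RATE_HZ / SEGMENT_SAMPLES

print(f"segment duration: {segment_duration_ms():.2f} ms")
print(f"strongest component near {peak_hz:.1f} Hz")
```

Each bin of the result corresponds to one frequency component and its level, which is the form the psychoacoustic steps operate on.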

This method of encoding is far more efficient than the linear coding technique used for CD and DAT, yet sound quality remains comparable. The following underlying psychoacoustic principles are used during this conversion.

DADC Austria

Threshold of Hearing:

As sound level diminishes, there is a level below which the human ear cannot detect it. This threshold varies with frequency. The threshold of audibility is lowest for sounds with a frequency of approximately 4kHz; that is, sounds close to this frequency are most easily detected by the ear. By analyzing the frequency components of an audio signal, it is possible to identify those components that lie below the threshold of hearing. Such components can be removed from the original signal without affecting perceived sound quality.
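A minimal sketch of this step, assuming each frequency component is already available as a (frequency, level) pair in dB SPL. The threshold curve used here is Terhardt's widely quoted approximation of the threshold in quiet; it is not taken from this text and is not necessarily the curve ATRAC uses. Its minimum lies near 3.3 kHz, in line with the "approximately 4kHz" stated above. The function names and sample values are hypothetical.

```python
import math


def threshold_in_quiet_db(freq_hz):
    """Approximate threshold of hearing in dB SPL (Terhardt's formula).

    Assumption: this curve is a common textbook approximation, not a value
    taken from the ATRAC description.
    """
    khz = freq_hz / 1000
    return (
        3.64 * khz ** -0.8
        - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
        + 1e-3 * khz ** 4
    )


def audible_components(components):
    """Keep only (freq_hz, level_db) pairs at or above the threshold."""
    return [
        (freq_hz, level_db)
        for freq_hz, level_db in components
        if level_db >= threshold_in_quiet_db(freq_hz)
    ]


# Hypothetical components: (frequency in Hz, level in dB SPL).
components = [(50, 30), (400, 30), (4000, 0), (18000, 40)]
kept = audible_components(components)
print(kept)
```

With these sample values the 50 Hz and 18 kHz components fall below the curve and are dropped, while the much quieter 4 kHz component is kept because the ear is most sensitive in that region.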

Masking Effect:

If two sounds, one loud and the other soft, are produced simultaneously and they are close to one another in frequency, the softer sound becomes difficult or even impossible to hear. Therefore, when an audio signal has a high level component and a low level component at neighbouring frequencies, the latter can be removed without affecting perceived sound quality. Moreover, with increasing overall signal amplitude, it becomes possible to remove a greater number of components without audible effect.
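The masking rule can be sketched in a deliberately simplified form. Both constants below are hypothetical values chosen for illustration: a component counts as masked when a neighbour within one third of an octave is at least 20 dB louder. Real masking models are considerably more detailed, and this sketch does not model the dependence on overall signal amplitude.

```python
import math

NEIGHBOUR_OCTAVES = 1 / 3  # hypothetical: how close counts as "neighbouring"
MASK_MARGIN_DB = 20.0      # hypothetical: how much louder the masker must be


def is_masked(target, components):
    """True if some nearby component is louder by at least the margin."""
    freq_hz, level_db = target
    for other_hz, other_db in components:
        is_close = abs(math.log2(other_hz / freq_hz)) <= NEIGHBOUR_OCTAVES
        if is_close and other_db - level_db >= MASK_MARGIN_DB:
            return True
    return False


def remove_masked(components):
    """Drop every (freq_hz, level_db) pair that a louder neighbour masks."""
    return [c for c in components if not is_masked(c, components)]


# Hypothetical components: a loud 1 kHz tone, a soft neighbour at 1.1 kHz,
# and a soft tone at 4 kHz that is too far away to be masked.
components = [(1000, 80), (1100, 50), (4000, 50)]
unmasked = remove_masked(components)
print(unmasked)
```

The soft 1.1 kHz component is removed because of its loud neighbour, while the equally soft 4 kHz component survives because nothing loud lies close to it in frequency.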

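[Figure: Psychoacoustic principles. Two panels plot level against frequency (Hz), with axis marks at 50, 400, 4k and 20k. "Sampling Distribution and Acoustic Effect" shows the components F1, F4, F6, ... Fn together with the threshold of hearing and the masking effect. "Sampling from ATRAC and its Level" shows the components F1, F4, F6, ... Fn and their levels as sampled by ATRAC.]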