What it is and how it works, Atrac digital audio compression technology, Psychoacoustic principles – Sony MINIDISC User Manual

Page 8


Threshold of Hearing:

As sound level diminishes, it eventually falls below
a level at which the human ear can no longer detect
it. This threshold varies with frequency. The
threshold of audibility is lowest for sounds with a
frequency of approximately 4 kHz; that is, sounds
close to this frequency are most easily detected by
the ear. By analyzing the frequency components of
an audio signal, it is possible to identify those
components that lie below the threshold of hearing.
Such components can be removed from the original
signal without affecting perceived sound quality.
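
As a rough illustration of the idea above, the sketch below discards components that fall under a threshold-of-hearing curve. The curve is a commonly cited closed-form approximation of the threshold in quiet (often attributed to Terhardt), whose minimum falls in the 3 to 4 kHz region; it is an assumption made for this example and is not taken from the manual, and the component levels are made-up values in dB SPL.

```python
import math

def threshold_of_hearing_db(freq_hz):
    """Approximate threshold in quiet, in dB SPL (assumed formula, not ATRAC's)."""
    khz = freq_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)

def drop_inaudible(components):
    """Keep only (freq_hz, level_db) pairs at or above the threshold curve."""
    return [(f, lvl) for f, lvl in components
            if lvl >= threshold_of_hearing_db(f)]

# Hypothetical components: (frequency in Hz, level in dB SPL).
components = [(50.0, 30.0), (400.0, 20.0), (4000.0, 0.0), (18000.0, 40.0)]
kept = drop_inaudible(components)
print(kept)
```

With these example values the 50 Hz and 18 kHz components are dropped even though their levels are higher than that of the 4 kHz component, because the ear is far less sensitive at the extremes of the audible range.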

Masking Effect:

If two sounds, one loud and the other soft, are
produced simultaneously and they are close to
one another in frequency, the softer sound becomes
difficult or even impossible to hear.
Therefore, when an audio signal has a high-level
component and a low-level component at neighbouring
frequencies, the latter can be removed
without affecting perceived sound quality.
Moreover, with increasing overall signal amplitude,
it becomes possible to remove a greater number
of components without audible effect.
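
The masking rule described above can be sketched with a deliberately simple toy model: a component counts as masked when some other component lies within a fixed frequency distance and is louder by at least a fixed margin. The function name and the two parameters (`max_octaves`, `min_gap_db`) are hypothetical choices for this example; real codecs use far more detailed masking curves, and the level dependence mentioned in the text is not modelled here.

```python
import math

def masked(components, max_octaves=0.5, min_gap_db=20.0):
    """Return the (freq_hz, level_db) pairs hidden by a louder neighbour.

    Toy rule (an assumption, not ATRAC's model): a component is masked if
    another component within max_octaves is at least min_gap_db louder.
    """
    hidden = []
    for f, lvl in components:
        for f2, lvl2 in components:
            close = abs(math.log2(f / f2)) <= max_octaves
            if close and lvl2 - lvl >= min_gap_db:
                hidden.append((f, lvl))
                break
    return hidden

# Hypothetical components: a loud 1 kHz tone, a soft neighbour at 1.1 kHz,
# and an equally soft tone two octaves away at 4 kHz.
components = [(1000.0, 80.0), (1100.0, 45.0), (4000.0, 45.0)]
hidden = masked(components)
print(hidden)
```

Only the 1.1 kHz component is reported as masked; the 4 kHz component has the same level but is too far from the loud tone to be hidden by it under this rule.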

What it is and How it Works


[Figure: Waveform analysis. Level versus time: samples are taken every 0.02 msec, and a block of 512 samples (approx. 11.6 msec) is analyzed into frequency components F1, F4, ... Fn, each with its own level.]

ATRAC Digital Audio Compression Technology

In order to provide approximately 74 minutes of
music on the 2.5-inch MiniDisc, a digital audio
compression technology called “ATRAC” (Adaptive
Transform Acoustic Coding) has been newly developed.
This technology compresses information
down to approximately one fifth of the amount of
data usually required.
In 16-bit linear encoding, currently used in the CD
and DAT formats, with a sampling frequency of
44.1 kHz, the analog signal is sampled approximately
once every 0.02 milliseconds. Each sample is
quantized at 16-bit resolution into one of 65536
possible values. Therefore, with CD and DAT, when
the analog signal is converted to digital data in real
time, 16 bits of data are used every 0.02 milliseconds,
regardless of the amplitude of the signal and
whether or not a signal is present at all.
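
The figures quoted above follow directly from the sampling parameters. The short calculation below reproduces them; the per-channel framing and the variable names are assumptions made for this example, and the "one fifth" line simply applies the stated compression ratio to the linear data rate.

```python
SAMPLE_RATE_HZ = 44_100   # CD / DAT sampling frequency from the text
BITS_PER_SAMPLE = 16      # 16-bit linear encoding

# Time between samples: about 0.0227 ms, quoted as roughly 0.02 ms.
sample_period_ms = 1000 / SAMPLE_RATE_HZ

# Number of distinct values a 16-bit sample can take.
levels = 2 ** BITS_PER_SAMPLE

# Linear data rate for one channel, and one fifth of it (assumed framing).
linear_bits_per_second = SAMPLE_RATE_HZ * BITS_PER_SAMPLE
one_fifth_bits_per_second = linear_bits_per_second / 5

print(round(sample_period_ms, 4), levels,
      linear_bits_per_second, one_fifth_bits_per_second)
```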

ATRAC starts with the same 16-bit digital data but
analyzes segments of the data for waveform content
every 11.6 msec. Based on this analysis, ATRAC
extracts and encodes only those frequency components
that are actually audible to the human ear.
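
To make the block analysis concrete, the sketch below takes one block of 512 samples at 44.1 kHz, which lasts about 11.6 msec, and splits it into frequency components. A plain discrete Fourier transform is used here purely as a stand-in for illustration; it is not claimed to be the transform ATRAC itself applies, and the test tone and function name are hypothetical.

```python
import cmath
import math

SAMPLE_RATE_HZ = 44_100
BLOCK_SIZE = 512  # samples per analysis block, as in the waveform figure

def block_spectrum(block):
    """Magnitude of each DFT bin from 0 Hz up to half the sampling rate."""
    n = len(block)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(block))
        mags.append(abs(acc) / n)
    return mags

block_ms = 1000 * BLOCK_SIZE / SAMPLE_RATE_HZ  # about 11.6 ms per block
bin_hz = SAMPLE_RATE_HZ / BLOCK_SIZE           # about 86 Hz between bins

# Test tone placed exactly on bin 46 (just under 4 kHz) so that all of
# its energy lands in a single frequency component.
tone_bin = 46
block = [math.sin(2 * math.pi * tone_bin * i / BLOCK_SIZE)
         for i in range(BLOCK_SIZE)]
mags = block_spectrum(block)
peak_bin = max(range(len(mags)), key=mags.__getitem__)
print(round(block_ms, 1), peak_bin, round(peak_bin * bin_hz))
```

Each entry of `mags` plays the role of one "Level at frequency Fn" value; an encoder would then decide, per component, whether it is audible enough to keep.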

This method of encoding is far more efficient than
the linear coding technique used for CD and DAT,
yet sound quality remains comparable. The following
underlying psychoacoustic principles are used during
this conversion.

Psychoacoustic principles:

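[Figure: Sampling Distribution and Acoustic Effect. Level versus frequency in Hz (axis marks at 50, 400, 4k and 20k), showing components F1, F4, F6 and Fn against the threshold of hearing and the masking effect.]

[Figure: Sampling from ATRAC and its Level. Level versus frequency in Hz (axis marks at 50, 400, 4k and 20k), showing components F1, F4, F6 and Fn.]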
