Psychoacoustics has long enjoyed a symbiotic relationship with
computer science. Internet pioneers
J. C. R. Licklider and
Bob Taylor both completed graduate-level work in psychoacoustics, while
BBN Technologies originally specialized in consulting on acoustics issues before it began building the first
packet-switched network. Licklider wrote a paper entitled "A duplex theory of pitch perception". Psychoacoustics is applied within many fields of software development, where developers map proven and experimental mathematical patterns in digital signal processing. Many audio compression codecs such as
MP3 and
Opus use a psychoacoustic model to increase compression ratios. The success of
conventional audio systems for the reproduction of music in theatres and homes can be attributed to psychoacoustics, and psychoacoustic considerations have given rise to novel audio systems, such as psychoacoustic
sound field synthesis. Furthermore, scientists have experimented, with limited success, with creating new acoustic weapons, which emit frequencies that may impair, harm, or kill. Psychoacoustics is also leveraged in
sonification to make multiple independent data dimensions audible and easily interpretable. This enables auditory guidance without the need for spatial audio, both in
sonification computer games and in other applications, such as
drone flying and
image-guided surgery. Psychoacoustics is also applied today within music, where musicians and artists continue to create new auditory experiences by masking unwanted frequencies of instruments, causing other frequencies to be enhanced. Yet another application is in the design of small or lower-quality loudspeakers, which can use the phenomenon of
missing fundamentals to give the effect of bass notes at lower frequencies than the loudspeakers are physically able to produce (see references). Automobile manufacturers engineer their engines and even doors to have a certain sound.

== Perceptual audio coding ==
The psychoacoustic model provides for high-quality
lossy signal compression by describing which parts of a given digital audio signal can be removed or reproduced with reduced quality without significant loss in the perceived quality of the sound. This greatly benefits the overall compression ratio: psychoacoustic analysis routinely leads to compressed music files that are one-tenth to one-twelfth the size of high-quality masters, with a proportionally much smaller loss in perceived quality. Such compression is a feature of nearly all modern lossy audio compression formats. Some of these formats include
Dolby Digital (AC-3),
MP3,
Opus,
Ogg Vorbis,
AAC,
WMA,
MPEG-1 Layer II (used for
digital audio broadcasting in several countries), and
ATRAC, the compression used in
MiniDisc and some
Walkman models. Psychoacoustics is based heavily on
human anatomy, especially the ear's limitations in perceiving sound as outlined previously. To summarize, the main limitations are:
• High-frequency limit
• Absolute threshold of hearing
• Temporal masking (forward masking, backward masking)
• Simultaneous masking (also known as spectral masking)

A compression algorithm can assign a lower priority to sounds outside the range of human hearing and reduce the precision of different frequencies according to the predicted masking level. By carefully shifting bits away from the unimportant components and toward the important ones, the algorithm ensures that the sounds a listener is most likely to perceive are most accurately represented.

Audio encoders analyse audio using a perceptual model (psychoacoustic model), in order to compute the required precision per frequency band or temporal section. Results of this computation are then used to adjust coding precision as a function of frequency and time through a set of coding tools that are dependent on the audio encoding format, as different formats support different coding tools. Examples of such coding tools are:
• Frequency filtering (
lowpass, highpass)
• Transform window selection (size and model)
• Joint stereo coding
• Parametric stereo
• Sample requantization
• Non-linear quantization
• Vector quantization
• Temporal noise shaping (TNS)
• Perceptual noise substitution (PNS)
• Spectral band replication (SBR)

In many encoders, a rate control algorithm ensures that the resulting bitrate of the coded audio is within defined limits. If
transparent coding cannot be achieved at the target bitrate, the rate control algorithm reduces coding precision (and thus introduces distortion) in various parts of the sound spectrum, using guidance from data computed by the psychoacoustic model, until the target bitrate is met.

==See also==