
Spectral band replication

Spectral band replication (SBR) is a technology that enhances audio or speech codecs, especially at low bit rates, and is based on harmonic redundancy in the frequency domain.

History and use
The Swedish company Coding Technologies (acquired by Dolby in 2007) developed and pioneered the use of SBR in its MPEG-2 AAC-derived codec called aacPlus, which first appeared in 2001. This codec was submitted to MPEG and formed the basis of MPEG-4 High-Efficiency AAC (HE-AAC), standardized in 2003. Lars Liljeryd, Kristofer Kjörling, and Martin Dietz received the IEEE Masaru Ibuka Consumer Electronics Award in 2013 for their work developing and marketing HE-AAC. Coding Technologies' SBR method has also been used with WMA 10 Professional to create WMA 10 Pro LBR, and with MP3 to create mp3PRO.

HE-AAC, which uses SBR, is used in broadcast systems such as DAB+, Digital Radio Mondiale (including xHE-AAC), HD Radio, and XM Satellite Radio.

If a player cannot use the side information transmitted alongside the "normal" compressed audio data, it may still be able to play the "baseband" data (e.g. sampled at 22.05 kHz instead of 44.1 kHz) as usual. The result sounds dull, since the high frequencies are missing, but is otherwise mostly acceptable. This is the case, for example, when an mp3PRO file is played back with MP3 software that cannot use the SBR information.

Opus's CELT part performs spectral folding on the MDCT bin level, making it a far less advanced but lower-delay technique compared to SBR. Dolby Digital Plus (E-AC3) performs Spectral Extension (SPX). SPX reduces high-frequency components to metadata and is similar to the E-AC3 multichannel coupling calculation. Dolby AC-4 expands the technique to Advanced Spectral Extension (A-SPX), with the option of interleaving with regular, non-extended data in the time or frequency domain. As a result, spectral extension can be selectively disabled for difficult portions.
Methods
Encoding of SBR produces a downsampled (usually 2:1) audio signal and guidance information. In an early publication, the guidance data is described as being produced by quadrature mirror filter (QMF) analysis and an envelope estimator. Decoding of SBR requires transposing harmonics, a case of audio time stretching and pitch scaling.
• A traditional approach starts with the discrete Fourier transform (DFT) of small intervals, followed by phase adjustments and the inverse DFT (IDFT), and ends with overlap-add. This method is sensitive to transient signals, which can cause echoes, so some padding (50% in USAC) is required in the DFT.
• A newer approach is the QMF. A single filter bank can perform a whole time-stretch and pitch-scale operation at lower computational complexity.
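The split between a reduced-bandwidth signal and envelope guidance described above can be illustrated with a toy sketch. It is not the standardized SBR algorithm: it assumes a single frame, substitutes a naive DFT for the QMF analysis, and replaces harmonic transposition with a plain copy of the low band into the high band. The names `encode`, `decode`, `band_energies`, and the `num_bands` parameter are hypothetical and chosen only for illustration.

```python
import cmath
import math


def dft(frame):
    """Naive O(n^2) discrete Fourier transform of a real-valued frame."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def band_energies(bins, num_bands):
    """Sum |X[k]|^2 over equal-width groups of bins (a coarse spectral envelope)."""
    size = len(bins) // num_bands
    return [
        sum(abs(b) ** 2 for b in bins[i * size:(i + 1) * size])
        for i in range(num_bands)
    ]


def encode(frame, num_bands=4):
    """Toy encoder (assumption, not the real SBR encoder).

    Keeps the lower half of the spectrum below Nyquist as the "baseband"
    and reduces the upper half to a few band energies (the guidance data).
    """
    spectrum = dft(frame)
    nyquist = len(frame) // 2
    split = nyquist // 2
    low = spectrum[:split]
    high = spectrum[split:nyquist]
    return low, band_energies(high, num_bands)


def decode(low, envelope):
    """Toy decoder: copy the low band upward, then match the envelope."""
    num_bands = len(envelope)
    size = len(low) // num_bands
    high = list(low)  # replicate the low band into the high band
    for i, target in enumerate(envelope):
        segment = high[i * size:(i + 1) * size]
        current = sum(abs(b) ** 2 for b in segment)
        # Scale the copied bins so the band energy equals the transmitted one.
        gain = math.sqrt(target / current) if current > 0 else 0.0
        high[i * size:(i + 1) * size] = [b * gain for b in segment]
    return low + high
```

A decoder that ignores the envelope can still use `low` on its own, which corresponds to the dull but acceptable "baseband" playback mentioned earlier. The copy-and-scale step only restores the coarse energy distribution of the high band, not its exact content.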