The main principle behind CELP is called analysis-by-synthesis (AbS) and means that the encoding (analysis) is performed by perceptually optimizing the decoded (synthesis) signal in a closed loop. In theory, the best CELP stream would be produced by trying all possible bit combinations and selecting the one that produces the best-sounding decoded signal. This is obviously not possible in practice for two reasons: the required complexity is beyond any currently available hardware and the “best sounding” selection criterion implies a human listener.

In order to achieve real-time encoding using limited computing resources, the CELP search is broken down into smaller, more manageable, sequential searches using a simple perceptual weighting function. Typically, the encoding is performed in the following order:
* Linear prediction coefficients (LPC) are computed and quantized, usually as line spectral pairs (LSPs).
* The adaptive (pitch) codebook is searched and its contribution removed.
* The fixed (innovation) codebook is searched.
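
The sequential search described above can be sketched roughly as follows. This is a toy illustration rather than the procedure of any particular codec: the function names (`synthesize`, `best_entry`, `encode_subframe`), the random codebook and the first-order predictor are all hypothetical, the LPC coefficients are assumed to be already computed, gains are left unquantized, the synthesis filter starts from zero state in every subframe, and the error is the plain squared error with no perceptual weighting.

```python
import random


def synthesize(excitation, lpc):
    """Run the excitation through the all-pole filter 1/A(z), zero initial state.

    lpc holds a_1..a_p for A(z) = 1 + a_1 z^-1 + ... + a_p z^-p.
    """
    out = []
    for n, x in enumerate(excitation):
        acc = x
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out.append(acc)
    return out


def best_entry(target, candidates, lpc):
    """Closed-loop search: decode every candidate and keep the closest one.

    Returns (error, index, gain, scaled_output) for the best candidate.
    """
    best = None
    for index, candidate in enumerate(candidates):
        y = synthesize(candidate, lpc)
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue  # a silent candidate cannot match anything
        # Least-squares gain for this candidate (left unquantized here).
        gain = sum(t * v for t, v in zip(target, y)) / energy
        scaled = [gain * v for v in y]
        error = sum((t - s) ** 2 for t, s in zip(target, scaled))
        if best is None or error < best[0]:
            best = (error, index, gain, scaled)
    if best is None:
        raise ValueError("all candidates are silent")
    return best


def encode_subframe(target, past_excitation, fixed_codebook, lpc, min_lag, max_lag):
    """Sequential search: adaptive (pitch) codebook first, then fixed codebook."""
    n = len(target)
    history = len(past_excitation)  # must be at least max_lag samples long
    # Adaptive codebook: the past excitation delayed by each candidate lag,
    # repeated periodically when the lag is shorter than the subframe.
    adaptive = [
        [past_excitation[history - lag + (i % lag)] for i in range(n)]
        for lag in range(min_lag, max_lag + 1)
    ]
    pitch_error, lag_index, pitch_gain, pitch_part = best_entry(target, adaptive, lpc)
    # Remove the pitch contribution before searching the fixed codebook.
    remaining = [t - p for t, p in zip(target, pitch_part)]
    error, index, gain, _ = best_entry(remaining, fixed_codebook, lpc)
    return {
        "lag": min_lag + lag_index,
        "pitch_gain": pitch_gain,
        "pitch_error": pitch_error,
        "index": index,
        "gain": gain,
        "error": error,
    }


# Toy data: random signals stand in for real speech and a trained codebook.
rng = random.Random(0)
lpc = [-0.9]  # assumed first-order predictor, A(z) = 1 - 0.9 z^-1
past = [rng.uniform(-1, 1) for _ in range(64)]
codebook = [[rng.uniform(-1, 1) for _ in range(40)] for _ in range(16)]
target = [rng.uniform(-1, 1) for _ in range(40)]
result = encode_subframe(target, past, codebook, lpc, min_lag=20, max_lag=60)
print(result)
```

Each stage is itself a small analysis-by-synthesis loop: every candidate is decoded and compared against the target. Because the fixed codebook is searched only after the pitch contribution has been removed, the result is in general not the jointly optimal pair of entries, which is the price paid for breaking the search into sequential steps.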

===Noise weighting===
Most (if not all) modern audio codecs attempt to shape the coding noise so that it appears mostly in the frequency regions where the ear cannot detect it. For example, the ear is more tolerant of noise in the louder parts of the spectrum and less tolerant in the quieter parts. For this reason, instead of minimizing the simple quadratic error, CELP minimizes the error in the perceptually weighted domain. The weighting filter W(z) is typically derived from the LPC filter A(z) by the use of bandwidth expansion:

:<math>W(z) = \frac{A(z/\gamma_1)}{A(z/\gamma_2)}</math>

where <math>\gamma_1 > \gamma_2</math>.

==See also==