When the data points are equally spaced, an
analytical solution to the least-squares equations can be found. The polynomial of degree k is defined as

Y = a_0 + a_1 z + a_2 z^2 + \dots + a_k z^k.

The coefficients a_0, a_1, etc. are obtained by solving the normal equations (bold \mathbf{a} represents a vector, bold \mathbf{J} represents a matrix):

\mathbf{a} = \left( \mathbf{J}^\mathsf{T} \mathbf{J} \right)^{-1} \mathbf{J}^\mathsf{T} \mathbf{y},

where \mathbf{J} is a Vandermonde matrix, that is, the i-th row of \mathbf{J} has the values 1, z_i, z_i^2, \dots. For example, for a cubic polynomial fitted to m = 5 points at z = −2, −1, 0, 1, 2, the normal equations are solved as follows.

\mathbf{J} = \begin{pmatrix} 1 & -2 & 4 & -8 \\ 1 & -1 & 1 & -1 \\ 1 & 0 & 0 & 0 \\ 1 & 1 & 1 & 1 \\ 1 & 2 & 4 & 8 \end{pmatrix}

\begin{align} \mathbf{J}^\mathsf{T}\mathbf{J} &= \begin{pmatrix} m & \sum z & \sum z^2 & \sum z^3 \\ \sum z & \sum z^2 & \sum z^3 & \sum z^4 \\ \sum z^2 & \sum z^3 & \sum z^4 & \sum z^5 \\ \sum z^3 & \sum z^4 & \sum z^5 & \sum z^6 \end{pmatrix} = \begin{pmatrix} m & 0 & \sum z^2 & 0 \\ 0 & \sum z^2 & 0 & \sum z^4 \\ \sum z^2 & 0 & \sum z^4 & 0 \\ 0 & \sum z^4 & 0 & \sum z^6 \end{pmatrix} \\[1ex] &= \begin{pmatrix} 5 & 0 & 10 & 0 \\ 0 & 10 & 0 & 34 \\ 10 & 0 & 34 & 0 \\ 0 & 34 & 0 & 130 \end{pmatrix} \end{align}

(the sums of odd powers of z vanish because the points are placed symmetrically about zero). Now the normal equations can be factored into two separate sets of equations by rearranging rows and columns, with

\mathbf{J}^\mathsf{T}\mathbf{J}_\text{even} = \begin{pmatrix} 5 & 10 \\ 10 & 34 \end{pmatrix} \quad \text{and} \quad \mathbf{J}^\mathsf{T}\mathbf{J}_\text{odd} = \begin{pmatrix} 10 & 34 \\ 34 & 130 \end{pmatrix}

Expressions for the inverse of each of these matrices can be obtained using
Cramer's rule:

\left(\mathbf{J}^\mathsf{T}\mathbf{J}\right)^{-1}_\text{even} = \frac{1}{70} \begin{pmatrix} 34 & -10 \\ -10 & 5 \end{pmatrix} \quad \text{and} \quad \left(\mathbf{J}^\mathsf{T}\mathbf{J}\right)^{-1}_\text{odd} = \frac{1}{144} \begin{pmatrix} 130 & -34 \\ -34 & 10 \end{pmatrix}

The normal equations become

\begin{pmatrix} a_0 \\ a_2 \end{pmatrix}_j = \frac{1}{70} \begin{pmatrix} 34 & -10 \\ -10 & 5 \end{pmatrix} \begin{pmatrix} 1 & 1 & 1 & 1 & 1 \\ 4 & 1 & 0 & 1 & 4 \end{pmatrix} \begin{pmatrix} y_{j-2} \\ y_{j-1} \\ y_j \\ y_{j+1} \\ y_{j+2} \end{pmatrix}

and

\begin{pmatrix} a_1 \\ a_3 \end{pmatrix}_j = \frac{1}{144} \begin{pmatrix} 130 & -34 \\ -34 & 10 \end{pmatrix} \begin{pmatrix} -2 & -1 & 0 & 1 & 2 \\ -8 & -1 & 0 & 1 & 8 \end{pmatrix} \begin{pmatrix} y_{j-2} \\ y_{j-1} \\ y_j \\ y_{j+1} \\ y_{j+2} \end{pmatrix}

Multiplying out and removing common factors,

\begin{align} a_{0,j} &= \frac{1}{35} \left(-3y_{j-2} + 12y_{j-1} + 17y_j + 12y_{j+1} - 3y_{j+2}\right) \\[1ex] a_{1,j} &= \frac{1}{12} \left(y_{j-2} - 8y_{j-1} + 8y_{j+1} - y_{j+2}\right) \\[1ex] a_{2,j} &= \frac{1}{14} \left(2y_{j-2} - y_{j-1} - 2y_j - y_{j+1} + 2y_{j+2}\right) \\[1ex] a_{3,j} &= \frac{1}{12} \left(-y_{j-2} + 2y_{j-1} - 2y_{j+1} + y_{j+2}\right) \end{align}

The coefficients of y in these expressions are known as convolution coefficients. They are elements of the matrix

\mathbf{C} = \left(\mathbf{J}^\mathsf{T} \mathbf{J}\right)^{-1} \mathbf{J}^\mathsf{T}.

In general,

(\mathbf{C} \otimes \mathbf{y})_j = Y_j = \sum_{i=-s}^s C_i\, y_{j+i}, \qquad s+1 \le j \le n-s, \quad s = \frac{m-1}{2},

where n is the total number of data points. In matrix notation this example is written as

\begin{pmatrix} Y_3 \\ Y_4 \\ Y_5 \\ \vdots \end{pmatrix} = \frac{1}{35} \begin{pmatrix} -3 & 12 & 17 & 12 & -3 & 0 & 0 & \cdots \\ 0 & -3 & 12 & 17 & 12 & -3 & 0 & \cdots \\ 0 & 0 & -3 & 12 & 17 & 12 & -3 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ y_6 \\ y_7 \\ \vdots \end{pmatrix}

Tables of convolution coefficients, calculated in the same way for m up to 25, were published for the Savitzky–Golay smoothing filter in 1964. Polynomials of degree (0 and 1), (2 and 3), (4 and 5), etc. give the same coefficients for smoothing and even derivatives; polynomials of degree (1 and 2), (3 and 4), etc. give the same coefficients for odd derivatives.
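The coefficients need not be taken from tables; they can be generated numerically from the matrix formula \mathbf{C} = \left(\mathbf{J}^\mathsf{T}\mathbf{J}\right)^{-1}\mathbf{J}^\mathsf{T}. The following minimal sketch (Python with NumPy; variable names are illustrative) reproduces the 5-point cubic case derived above:

```python
import numpy as np

m, k = 5, 3                         # window length and polynomial degree
s = (m - 1) // 2
z = np.arange(-s, s + 1)            # z = -2, -1, 0, 1, 2

# Vandermonde matrix: row i contains 1, z_i, z_i^2, z_i^3
J = np.vander(z, k + 1, increasing=True)

# C = (J^T J)^{-1} J^T; row r holds the convolution coefficients for a_r
C = np.linalg.solve(J.T @ J, J.T)

print(C[0] * 35)    # smoothing coefficients: [-3. 12. 17. 12. -3.]
print(C[1] * 12)    # 1st-derivative coefficients: [ 1. -8.  0.  8. -1.]

# Apply the smoothing kernel to the interior points of a data series
rng = np.random.default_rng(0)
y = np.sin(np.linspace(0, 3, 25)) + 0.05 * rng.normal(size=25)
Y = np.convolve(y, C[0][::-1], mode='valid')   # Y_j for s+1 <= j <= n-s
```

The kernel is reversed before the call because np.convolve flips its second argument, whereas Y_j = \sum_i C_i y_{j+i} is a correlation; for the symmetric smoothing coefficients the reversal makes no difference, but it matters for the antisymmetric derivative kernels.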
=== Algebraic expressions ===

It is not always necessary to use the Savitzky–Golay tables. The summations in the matrix \mathbf{J}^\mathsf{T}\mathbf{J} can be evaluated in closed form,

\begin{align} \sum_{z=-s}^s z^2 &= \frac{m(m^2-1)}{12}, & s = \frac{m-1}{2} \\ \sum_z z^4 &= \frac{m(m^2-1)(3m^2-7)}{240} \\ \sum_z z^6 &= \frac{m(m^2-1)(3m^4-18m^2+31)}{1344} \end{align}

so that algebraic formulae can be derived for the convolution coefficients. Functions that are suitable for use with a curve that has an
inflection point are:

• Smoothing, polynomial degree 2,3:

C_{0i} = \frac{\left(3m^2 - 7 - 20i^2\right)/4}{m\left(m^2 - 4\right)/3}; \qquad \frac{1-m}{2} \le i \le \frac{m-1}{2}

(the range of values for i also applies to the expressions below)

• 1st derivative, polynomial degree 3,4:

C_{1i} = \frac{5\left(3m^4 - 18m^2 + 31\right)i - 28\left(3m^2 - 7\right)i^3}{m\left(m^2 - 1\right)\left(3m^4 - 39m^2 + 108\right)/15}

• 2nd derivative, polynomial degree 2,3:

C_{2i} = \frac{12mi^2 - m\left(m^2 - 1\right)}{m^2\left(m^2 - 1\right)\left(m^2 - 4\right)/30}

• 3rd derivative, polynomial degree 3,4:

C_{3i} = \frac{-\left(3m^2 - 7\right)i + 20i^3}{m\left(m^2 - 1\right)\left(3m^4 - 39m^2 + 108\right)/2520}

Each set of derivative coefficients includes the factorial s!, so that convolution with C_{si} yields the value of the s-th derivative of the fitted polynomial at the central point directly; for example, with m = 5 the second-derivative coefficients are (2, −1, −2, −1, 2)/7, that is, twice the a_2 coefficients obtained above. Simpler expressions that can be used with curves that do not have an inflection point are:

• Smoothing, polynomial degree 0,1 (moving average):

C_{0i} = \frac{1}{m}

• 1st derivative, polynomial degree 1,2:

C_{1i} = \frac{i}{m\left(m^2 - 1\right)/12}

Higher derivatives can be obtained. For example, a fourth derivative can be obtained by performing two passes of a second derivative function.
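As a sketch, these closed-form expressions can be coded directly (the function names below are illustrative) and checked against the tabulated 5-point values:

```python
import numpy as np

def smooth_coeffs(m):
    """Quadratic/cubic smoothing coefficients C_0i in closed form."""
    s = (m - 1) // 2
    i = np.arange(-s, s + 1)
    return ((3 * m**2 - 7 - 20 * i**2) / 4) / (m * (m**2 - 4) / 3)

def first_deriv_coeffs(m):
    """Cubic/quartic 1st-derivative coefficients C_1i in closed form."""
    s = (m - 1) // 2
    i = np.arange(-s, s + 1)
    num = 5 * (3 * m**4 - 18 * m**2 + 31) * i - 28 * (3 * m**2 - 7) * i**3
    return num / (m * (m**2 - 1) * (3 * m**4 - 39 * m**2 + 108) / 15)

print(smooth_coeffs(5) * 35)        # [-3. 12. 17. 12. -3.]
print(first_deriv_coeffs(5) * 12)   # [ 1. -8.  0.  8. -1.]
```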
=== Use of orthogonal polynomials ===

An alternative to fitting m data points by a simple polynomial in the subsidiary variable z is to use orthogonal polynomials:

Y = b_0 P^0(z) + b_1 P^1(z) + \cdots + b_k P^k(z),

where P^0, \dots, P^k is a set of mutually orthogonal polynomials of degree 0, \dots, k. Full details on how to obtain expressions for the orthogonal polynomials, and on the relationship between the coefficients b and a, are given by Guest. This approach is also equivalent to fitting the first (m + 1)/2 points with the same polynomial, and similarly for the last points.
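As a brief numerical sketch of why this is convenient (here the orthogonal basis is built with a QR factorization of the Vandermonde matrix, which is one possible construction, not necessarily Guest's), the normal-equations matrix becomes trivial and each coefficient b_r is a simple dot product:

```python
import numpy as np

z = np.arange(-2, 3)                    # 5 equally spaced points
J = np.vander(z, 4, increasing=True)    # columns: 1, z, z^2, z^3

# The columns of Q are values of mutually orthogonal (here orthonormal)
# polynomials of degree 0..3 evaluated at the points z
Q, R = np.linalg.qr(J)

print(np.round(Q.T @ Q, 12))            # identity matrix: no inversion needed

y = np.array([1.0, 2.0, 0.5, 3.0, 2.5]) # example data
b = Q.T @ y                             # each b_r is found independently
Y = Q @ b                               # fitted (smoothed) values
```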
=== Weighting the data ===

It is implicit in the above treatment that the data points are all given equal weight. Technically, the objective function

U = \sum_i w_i \left(Y_i - y_i\right)^2

being minimized in the least-squares process has unit weights,
w_i = 1. When the weights are not all the same, the normal equations become

\mathbf{a} = \left( \mathbf{J}^\mathsf{T} \mathbf{W} \mathbf{J} \right)^{-1} \mathbf{J}^\mathsf{T} \mathbf{W} \mathbf{y},

where \mathbf{W} is a diagonal matrix of weights with, in general, W_{ii} \ne 1.

[Figure: With suitably chosen weights, the quality of the smoothed curve is improved compared to an unweighted (uniformly weighted) Savitzky–Golay filter with the same window length and polynomial degree: less residual noise is observed while higher data fidelity is obtained, except at the boundaries, where the weights favour larger deviations.]

If the same set of diagonal weights is used for all data subsets, \mathbf{W} = \operatorname{diag}(w_1, w_2, \dots, w_m), an analytical solution to the normal equations can be written down. For example, with a quadratic polynomial,

\mathbf{J}^\mathsf{T}\mathbf{W}\mathbf{J} = \begin{pmatrix} \sum w_i & \sum w_i z_i & \sum w_i z_i^2 \\ \sum w_i z_i & \sum w_i z_i^2 & \sum w_i z_i^3 \\ \sum w_i z_i^2 & \sum w_i z_i^3 & \sum w_i z_i^4 \end{pmatrix}.

An explicit expression for the inverse of this matrix can be obtained using Cramer's rule. A set of convolution coefficients may then be derived as

\mathbf{C} = \left( \mathbf{J}^\mathsf{T} \mathbf{W} \mathbf{J} \right)^{-1} \mathbf{J}^\mathsf{T} \mathbf{W}.
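A short sketch under the same conventions as before (Python with NumPy; the Gaussian weight profile below is purely illustrative, as the text does not prescribe one) computes such a set of weighted convolution coefficients:

```python
import numpy as np

m, k = 7, 2                          # window length, polynomial degree
s = (m - 1) // 2
z = np.arange(-s, s + 1)

# Illustrative weights that decrease towards the ends of the window
w = np.exp(-0.5 * (z / s) ** 2)
W = np.diag(w)

J = np.vander(z, k + 1, increasing=True)

# C = (J^T W J)^{-1} J^T W; row 0 holds the weighted smoothing coefficients
C = np.linalg.solve(J.T @ W @ J, J.T @ W)

print(C[0])          # weighted smoothing kernel
print(C[0].sum())    # ~1.0: the polynomial fit reproduces constant data
```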
Alternatively, the coefficients \mathbf{C} could be calculated in a spreadsheet, employing a built-in matrix inversion routine to obtain the inverse of the normal equations matrix. This set of coefficients, once calculated and stored, can be used with all calculations in which the same weighting scheme applies; a different set of coefficients is needed for each weighting scheme. It has been shown that the Savitzky–Golay filter can be improved by introducing weights that decrease at the ends of the fitting interval.

== Two-dimensional convolution coefficients ==