The general regression model with n observations and k explanators, the first of which is a constant unit vector whose coefficient is the regression intercept, is y = X \beta + e, where y is an n × 1 vector of dependent variable observations, each column of the n × k matrix X is a vector of observations on one of the k explanators, \beta is a k × 1 vector of true coefficients, and e is an n × 1 vector of the true underlying errors.
The ordinary least squares estimator for \beta minimizes the sum of squared residuals and therefore satisfies the normal equations:

\begin{align}
& X^\operatorname{T} X \hat \beta = X^\operatorname{T} y \\[1ex]
\iff {} & \hat \beta = \left(X^\operatorname{T} X\right)^{-1} X^\operatorname{T} y.
\end{align}

The residual vector is \hat e = y - X \hat \beta = y - X \left(X^\operatorname{T} X\right)^{-1} X^\operatorname{T} y, so the residual sum of squares is

\operatorname{RSS} = \hat e^\operatorname{T} \hat e = \left\| \hat e \right\|^2

(equivalent to the square of the Euclidean norm of the residuals).
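As a quick numerical illustration, the estimator and the residual sum of squares can be computed directly. The following is a minimal NumPy sketch; the simulated data, seed, and dimensions are illustrative assumptions, not part of the article.

<syntaxhighlight lang="python">
import numpy as np

# Illustrative data: n observations, k explanators (first column is the constant).
rng = np.random.default_rng(0)
n, k = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
beta = np.array([2.0, 1.5, -0.5])                    # "true" coefficients (assumed)
y = X @ beta + rng.normal(scale=0.3, size=n)         # dependent variable with error e

# beta_hat solves the normal equations X'X beta_hat = X'y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

e_hat = y - X @ beta_hat                             # residual vector
rss = e_hat @ e_hat                                  # RSS = squared norm of residuals
print(beta_hat, rss)
</syntaxhighlight>

Solving the normal equations with numpy.linalg.solve avoids forming the explicit inverse of X^\operatorname{T} X, which is numerically preferable to applying the closed-form inverse directly.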
In full:

\begin{align}
\operatorname{RSS} &= y^\operatorname{T} y - y^\operatorname{T} X \left(X^\operatorname{T} X\right)^{-1} X^\operatorname{T} y \\[1ex]
&= y^\operatorname{T} \left[I - X \left(X^\operatorname{T} X\right)^{-1} X^\operatorname{T}\right] y \\[1ex]
&= y^\operatorname{T} \left[I - H\right] y,
\end{align}

where H = X \left(X^\operatorname{T} X\right)^{-1} X^\operatorname{T} is the hat matrix, or the projection matrix in linear regression.
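The equivalence of this quadratic form with the direct definition of RSS, and the projection properties of H, can be checked numerically. Again a minimal, self-contained NumPy sketch with illustrative simulated data:

<syntaxhighlight lang="python">
import numpy as np

# Illustrative check that y'(I - H)y equals e_hat' e_hat,
# where H = X (X'X)^{-1} X' is the hat (projection) matrix.
rng = np.random.default_rng(1)
n, k = 40, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

H = X @ np.linalg.solve(X.T @ X, X.T)                # hat matrix
e_hat = y - H @ y                                    # residuals: (I - H) y
rss_direct = e_hat @ e_hat
rss_quadratic_form = y @ (np.eye(n) - H) @ y

print(np.allclose(rss_direct, rss_quadratic_form))   # True: the two forms agree
print(np.allclose(H, H @ H), np.allclose(H, H.T))    # H is idempotent and symmetric
</syntaxhighlight>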
== Relation with Pearson's product-moment correlation ==