=== Fixed effects estimator ===
Since \alpha_{i} is not observable, it cannot be directly controlled for. The FE model eliminates \alpha_{i} by de-meaning the variables using the within transformation:
:y_{it}-\overline{y}_{i}=\left(X_{it}-\overline{X}_{i}\right) \beta+ \left( \alpha_{i} - \overline{\alpha}_{i} \right ) + \left( u_{it}-\overline{u}_{i}\right) \implies \ddot{y}_{it}=\ddot{X}_{it} \beta+\ddot{u}_{it}
where \overline{y}_{i}=\frac{1}{T}\sum\limits_{t=1}^{T}y_{it}, \overline{X}_{i}=\frac{1}{T}\sum\limits_{t=1}^{T}X_{it}, and \overline{u}_{i}=\frac{1}{T}\sum\limits_{t=1}^{T}u_{it}. Since \alpha_{i} is constant, \overline{\alpha}_{i}=\alpha_{i} and hence the effect is eliminated. The FE estimator \hat{\beta}_{FE} is then obtained by an OLS regression of \ddot{y} on \ddot{X}.
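As an illustration, the following is a minimal sketch of the within estimator in Python with NumPy. It assumes a balanced panel stored as arrays y of shape (N, T) and X of shape (N, T, K); the function name and array layout are illustrative conventions, not part of any standard library, and are reused by the later sketches in this section.

```python
import numpy as np

def fe_within(y, X):
    """Fixed effects estimate via the within transformation."""
    # De-mean each individual's series over time: y_it - ybar_i, X_it - Xbar_i.
    y_dd = (y - y.mean(axis=1, keepdims=True)).reshape(-1)        # (N*T,)
    X_dd = (X - X.mean(axis=1, keepdims=True)).reshape(-1, X.shape[2])  # (N*T, K)
    # OLS of the demeaned y on the demeaned X; no intercept, since it is
    # wiped out along with alpha_i.
    beta_hat, *_ = np.linalg.lstsq(X_dd, y_dd, rcond=None)
    return beta_hat

# Simulated usage: regressors deliberately correlated with alpha_i.
rng = np.random.default_rng(0)
N, T, K = 200, 5, 2
alpha = rng.normal(size=(N, 1))
X = rng.normal(size=(N, T, K)) + alpha[..., None]
y = X @ np.array([1.5, -0.5]) + alpha + rng.normal(size=(N, T))
print(fe_within(y, X))   # close to [1.5, -0.5]
```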
At least three alternatives to the within transformation exist, with variations:
• One is to add a dummy variable for each individual i>1 (omitting the first individual because of multicollinearity). This is numerically, but not computationally, equivalent to the fixed effects model and only works if the sum of the number of series and the number of global parameters is smaller than the number of observations. The dummy variable approach is particularly demanding of computer memory, and it is not recommended for problems larger than the available RAM and the applied program compilation can accommodate (see the sketch after this list).
• A second alternative is a consecutive reiterations approach to the local and global estimations. This approach is well suited to low-memory systems, on which it is much more computationally efficient than the dummy variable approach.
• The third approach is a nested estimation, whereby the local estimation for the individual series is programmed in as part of the model definition. This approach is the most computationally and memory efficient, but it requires proficient programming skills and access to the model programming code, although it can be programmed even in SAS.
Finally, each of the above alternatives can be improved if the series-specific estimation is linear (within a nonlinear model), in which case the direct linear solution for the individual series can be programmed in as part of the nonlinear model definition.
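To make the first alternative concrete, here is a minimal sketch of the dummy variable (LSDV) approach under the same illustrative array layout as above. It includes one dummy per individual and no global intercept, which is an equivalent parameterization to omitting the first individual's dummy; the (N·T) × N dummy block it builds is exactly the memory burden noted above. The slopes match the within estimator to machine precision.

```python
import numpy as np

def fe_lsdv(y, X):
    """Least squares dummy variable (LSDV) version of the FE estimator."""
    N, T, K = X.shape
    # One dummy column per individual; with no global intercept, none is dropped.
    D = np.kron(np.eye(N), np.ones((T, 1)))              # (N*T, N) dummy block
    Z = np.hstack([X.reshape(-1, K), D])
    coef, *_ = np.linalg.lstsq(Z, y.reshape(-1), rcond=None)
    return coef[:K], coef[K:]                            # slopes beta, effects alpha_i

# Numerical check against the within transformation:
rng = np.random.default_rng(0)
N, T, K = 50, 4, 2
alpha = rng.normal(size=(N, 1))
X = rng.normal(size=(N, T, K)) + alpha[..., None]
y = X @ np.array([1.5, -0.5]) + alpha + rng.normal(size=(N, T))
y_dd = (y - y.mean(axis=1, keepdims=True)).reshape(-1)
X_dd = (X - X.mean(axis=1, keepdims=True)).reshape(-1, K)
beta_within = np.linalg.lstsq(X_dd, y_dd, rcond=None)[0]
print(np.allclose(fe_lsdv(y, X)[0], beta_within))        # True
```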
=== First difference estimator ===
An alternative to the within transformation is the first difference transformation, which produces a different estimator. For t=2,\dots,T:
:y_{it}-y_{i,t-1}=\left(X_{it}-X_{i,t-1}\right) \beta+ \left( \alpha_{i} - \alpha_{i} \right ) + \left( u_{it}-u_{i,t-1}\right) \implies \Delta y_{it}=\Delta X_{it} \beta+ \Delta u_{it}.
The FD estimator \hat\beta_{FD} is then obtained by an OLS regression of \Delta y_{it} on \Delta X_{it}. When T=2, the first difference and fixed effects estimators are numerically equivalent; for T>2, they are not. If the error terms u_{it} are homoskedastic with no serial correlation, the fixed effects estimator is more efficient than the first difference estimator. If u_{it} follows a random walk, however, the first difference estimator is more efficient.

==== Equality of fixed effects and first difference estimators when T=2 ====
For the special two-period case (T=2), the fixed effects (FE) estimator and the first difference (FD) estimator are numerically equivalent. This is because the FE estimator effectively "doubles the data set" used in the FD estimator. To see this, start from the fixed effects estimator:
:{FE}_{T=2}= \left[\sum_{i=1}^{N} (x_{i1}-\bar x_{i}) (x_{i1}-\bar x_{i})' + (x_{i2}-\bar x_{i}) (x_{i2}-\bar x_{i})' \right]^{-1}\left[\sum_{i=1}^{N} (x_{i1}-\bar x_{i}) (y_{i1}-\bar y_{i}) + (x_{i2}-\bar x_{i}) (y_{i2}-\bar y_{i})\right]
Since each (x_{i1}-\bar x_{i}) can be rewritten as \left(x_{i1}-\dfrac{x_{i1}+x_{i2}}{2}\right)=\dfrac{x_{i1}-x_{i2}}{2}, and likewise (x_{i2}-\bar x_{i})=\dfrac{x_{i2}-x_{i1}}{2}, this becomes:
:{FE}_{T=2}= \left[\sum_{i=1}^{N} \dfrac{x_{i1}-x_{i2}}{2} \dfrac{(x_{i1}-x_{i2})'}{2} + \dfrac{x_{i2}-x_{i1}}{2} \dfrac{(x_{i2}-x_{i1})'}{2} \right]^{-1} \left[\sum_{i=1}^{N} \dfrac{x_{i1}-x_{i2}}{2} \dfrac{y_{i1}-y_{i2}}{2} + \dfrac{x_{i2}-x_{i1}}{2} \dfrac{y_{i2}-y_{i1}}{2} \right]
:= \left[\sum_{i=1}^{N} 2 \dfrac{x_{i2}-x_{i1}}{2} \dfrac{(x_{i2}-x_{i1})'}{2} \right]^{-1} \left[\sum_{i=1}^{N} 2 \dfrac{x_{i2}-x_{i1}}{2} \dfrac{y_{i2}-y_{i1}}{2} \right]
:= 2\left[\sum_{i=1}^{N} (x_{i2}-x_{i1})(x_{i2}-x_{i1})' \right]^{-1} \left[\sum_{i=1}^{N} \frac{1}{2} (x_{i2}-x_{i1})(y_{i2}-y_{i1}) \right]
:= \left[\sum_{i=1}^{N} (x_{i2}-x_{i1})(x_{i2}-x_{i1})' \right]^{-1} \sum_{i=1}^{N} (x_{i2}-x_{i1})(y_{i2}-y_{i1}) ={FD}_{T=2}
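The following is a minimal numerical sketch of the FD estimator, together with a check of the T=2 equivalence just derived, under the same illustrative (N, T) / (N, T, K) array layout used above.

```python
import numpy as np

def fe_within(y, X):
    """Within (FE) estimator: OLS on the demeaned data."""
    y_dd = (y - y.mean(axis=1, keepdims=True)).reshape(-1)
    X_dd = (X - X.mean(axis=1, keepdims=True)).reshape(-1, X.shape[2])
    return np.linalg.lstsq(X_dd, y_dd, rcond=None)[0]

def fd_estimator(y, X):
    """First difference (FD) estimator: OLS on first-differenced data."""
    dy = np.diff(y, axis=1).reshape(-1)                  # y_it - y_{i,t-1}
    dX = np.diff(X, axis=1).reshape(-1, X.shape[2])      # X_it - X_{i,t-1}
    return np.linalg.lstsq(dX, dy, rcond=None)[0]

# With T = 2 the two estimators coincide numerically, as derived above:
rng = np.random.default_rng(1)
N, T, K = 500, 2, 2
alpha = rng.normal(size=(N, 1))
X = rng.normal(size=(N, T, K)) + alpha[..., None]
y = X @ np.array([1.0, 2.0]) + alpha + rng.normal(size=(N, T))
print(np.allclose(fe_within(y, X), fd_estimator(y, X)))  # True
```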
=== Chamberlain method ===
Gary Chamberlain's method, a generalization of the within estimator, replaces \alpha_{i} with its linear projection onto the explanatory variables. Writing the linear projection as:
:\alpha_{i} = \lambda_0 + X_{i1} \lambda_1 + X_{i2} \lambda_2 + \dots + X_{iT} \lambda_T + e_i
this results in the following equation:
:y_{it} = \lambda_0 + X_{i1} \lambda_1 + X_{i2} \lambda_2 + \dots + X_{it}(\lambda_t + \beta) + \dots + X_{iT} \lambda_T + e_i + u_{it}
which can be estimated by minimum distance estimation.
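As one deliberately simplified illustration, the sketch below treats a scalar regressor: it estimates the unrestricted projection coefficients period by period, then fits them to the restriction above by minimum distance with an identity weighting matrix (an efficient minimum distance estimator would instead weight by the inverse covariance of the unrestricted estimates). The function name and array layout are illustrative assumptions.

```python
import numpy as np

def chamberlain_beta(y, x):
    """Identity-weighted minimum distance fit of the Chamberlain restriction."""
    N, T = y.shape
    W = np.hstack([np.ones((N, 1)), x])                  # (N, 1+T): intercept, x_1..x_T
    # Unrestricted projection coefficients Pi_t: regress y_t on (1, x_1, ..., x_T).
    Pi = np.vstack([np.linalg.lstsq(W, y[:, t], rcond=None)[0] for t in range(T)])
    # Restriction: Pi_{t, 0} = lambda_0 and Pi_{t, j} = lambda_j + beta * 1{j == t}.
    # Build the design G mapping theta = (lambda_0, ..., lambda_T, beta) to vec(Pi).
    G = np.zeros((T * (1 + T), T + 2))
    for t in range(T):
        for j in range(1 + T):
            row = t * (1 + T) + j
            G[row, j] = 1.0                              # lambda_j (j = 0 is intercept)
            if j == t + 1:
                G[row, -1] = 1.0                         # beta enters on x_t in period t
    theta, *_ = np.linalg.lstsq(G, Pi.reshape(-1), rcond=None)
    return theta[-1]                                     # beta
```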
=== Hausman–Taylor method ===
The Hausman–Taylor method requires more than one time-variant regressor (X) and time-invariant regressor (Z), and at least one X and one Z that are uncorrelated with \alpha_{i}. Partition the X and Z variables such that
:\begin{array}[c]{c} X=[\underset{TN\times K1}{X_{1it}}\vdots\underset{TN\times K2}{X_{2it}}]\\ Z=[\underset{TN\times G1}{Z_{1it}}\vdots\underset{TN\times G2}{Z_{2it}}] \end{array}
where X_{1} and Z_{1} are uncorrelated with \alpha_{i}; K1>G2 is needed. Estimating \gamma via OLS on \widehat{d_{i}}=Z_{i}\gamma+\varphi_{it}, using X_{1} and Z_{1} as instruments, yields a consistent estimate.
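A minimal sketch of just this instrumental variables step, via two-stage least squares: it assumes d_hat of shape (N,) holds the estimated individual effects \widehat{d_{i}} from a preliminary within regression, Z of shape (N, G1+G2) stacks the time-invariant regressors, and the instrument set is the individual means of X_{1} together with Z_{1}. All names and the array layout are illustrative assumptions, not a full Hausman–Taylor implementation.

```python
import numpy as np

def ht_gamma(d_hat, Z, X1bar, Z1):
    """2SLS for gamma, instrumenting Z with [X1bar, Z1]."""
    # The text's condition K1 > G2 guarantees enough instruments for Z2.
    W = np.hstack([X1bar, Z1])                           # (N, K1 + G1) instruments
    # First stage: fitted values of Z from the instruments.
    Z_hat = W @ np.linalg.lstsq(W, Z, rcond=None)[0]     # (N, G1 + G2)
    # Second stage: regress the estimated effects on the fitted Z.
    gamma, *_ = np.linalg.lstsq(Z_hat, d_hat, rcond=None)
    return gamma
```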
=== Generalization with input uncertainty ===
When there is input uncertainty for the y data, \delta y, the \chi^2 value, rather than the sum of squared residuals, should be minimized. This can be achieved directly from substitution rules:
:\frac{y_{it}}{\delta y_{it}} = \beta\frac{X_{it}}{\delta y_{it}}+\alpha_{i}\frac{1}{\delta y_{it}}+\frac{u_{it}}{\delta y_{it}},
then the values and standard deviations for \beta and \alpha_{i} can be determined via classical ordinary least squares analysis and the variance–covariance matrix.
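A minimal sketch of this weighting in the same illustrative array layout: each observation is scaled by 1/\delta y_{it}, then \beta, the \alpha_{i}, and their variance–covariance matrix are recovered by ordinary least squares on the scaled system. The dummy variable formulation is used here purely for compactness.

```python
import numpy as np

def fe_weighted(y, X, dy):
    """Chi-squared (weighted OLS) fit of y_it = X_it beta + alpha_i + u_it."""
    N, T, K = X.shape
    w = 1.0 / dy.reshape(-1)                             # weights 1 / delta_y_it
    D = np.kron(np.eye(N), np.ones((T, 1)))              # dummies for the alpha_i
    Z = np.hstack([X.reshape(-1, K), D])
    Zw, Yw = Z * w[:, None], y.reshape(-1) * w           # scale each row by 1/dy_it
    coef, *_ = np.linalg.lstsq(Zw, Yw, rcond=None)
    # Variance-covariance matrix under the stated weighting; standard
    # deviations are the square roots of its diagonal.
    cov = np.linalg.inv(Zw.T @ Zw)
    return coef[:K], coef[K:], cov                       # beta, alpha_i, covariances
```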
== Use to test for consistency ==