Beginning with Ronald Fisher, the intraclass correlation has been regarded within the framework of
analysis of variance (ANOVA), and more recently in the framework of
random effects models. A number of ICC estimators have been proposed. Most of the estimators can be defined in terms of the random effects model
: Y_{ij} = \mu + \alpha_j + \varepsilon_{ij},
where Y_{ij} is the ith observation in the jth group, μ is an unobserved overall mean, α_j is an unobserved random effect shared by all values in group j, and ε_{ij} is an unobserved noise term. For the model to be identified, the α_j and ε_{ij} are assumed to have expected value zero and to be uncorrelated with each other. Also, the α_j are assumed to be identically distributed, and the ε_{ij} are assumed to be identically distributed. The variance of α_j is denoted σ_α^2 and the variance of ε_{ij} is denoted σ_ε^2. The population ICC in this framework is
: \frac{\sigma_\alpha^2}{\sigma_\alpha^2+\sigma_\varepsilon^2}.
With this framework, the ICC is the
correlation of two observations from the same group. To see this, consider the one-way random effects model, written here with i indexing the group and j the observation within it:
: Y_{ij} = \mu + \alpha_i + \varepsilon_{ij}, \quad \alpha_i \sim N(0,\sigma_\alpha^2), \quad \varepsilon_{ij} \sim N(0,\sigma_\varepsilon^2),
where the α_i are mutually independent, the ε_{ij} are mutually independent, and the α_i are independent of the ε_{ij}. The variance of any observation is
: \text{Var}(Y_{ij}) = \sigma_\varepsilon^2 + \sigma_\alpha^2.
The covariance of two observations from the same group i (for j \neq k) is
: \begin{align}
\text{Cov}(Y_{ij}, Y_{ik}) &= \text{Cov}(\mu + \alpha_i + \varepsilon_{ij}, \mu + \alpha_i + \varepsilon_{ik}) \\
&= \text{Cov}(\alpha_i + \varepsilon_{ij}, \alpha_i + \varepsilon_{ik}) \\
&= \text{Cov}(\alpha_i, \alpha_i) + \text{Cov}(\alpha_i, \varepsilon_{ik}) + \text{Cov}(\varepsilon_{ij}, \alpha_i) + \text{Cov}(\varepsilon_{ij}, \varepsilon_{ik}) \\
&= \text{Cov}(\alpha_i, \alpha_i) \\
&= \text{Var}(\alpha_i) \\
&= \sigma^2_\alpha .
\end{align}
Here we have used properties of the covariance: adding the constant μ does not change it, it is bilinear, and it is zero for independent variables, which removes both cross terms and the term \text{Cov}(\varepsilon_{ij}, \varepsilon_{ik}). Put together we get
: \text{Cor}(Y_{ij}, Y_{ik}) = \frac{\text{Cov}(Y_{ij}, Y_{ik})}{\sqrt{\text{Var}(Y_{ij})\text{Var}(Y_{ik})}} = \frac{\sigma^2_\alpha}{\sigma_\varepsilon^2 + \sigma_\alpha^2}.

An advantage of this ANOVA framework is that different groups can have different numbers of data values, which is difficult to handle using the earlier ICC statistics. This ICC is always non-negative, allowing it to be interpreted as the proportion of total variance that is "between groups." This ICC can be generalized to allow for covariate effects, in which case the ICC is interpreted as capturing the within-class similarity of the covariate-adjusted data values. Because this expression can never be negative (unlike Fisher's original formula), ICCs computed from samples of a population whose ICC is 0 cannot fall below the population value and will therefore tend to be higher than it.

A number of different ICC statistics have been proposed, not all of which estimate the same population parameter. There has been considerable debate about which ICC statistics are appropriate for a given use, since they may produce markedly different results for the same data.

==Relationship to Pearson's correlation coefficient==
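Under the one-way random effects model, the population ICC \sigma_\alpha^2 / (\sigma_\alpha^2 + \sigma_\varepsilon^2) is itself the Pearson correlation between two observations from the same group. The following is a minimal simulation sketch of that identity, not a recommended estimator. The function names (simulate_pairs, pearson, population_icc), the normality of both terms, and the parameter values are illustrative assumptions.

```python
import math
import random


def simulate_pairs(n_groups, sigma_alpha, sigma_eps, mu=0.0, seed=0):
    """Draw two observations per group from Y_ij = mu + alpha_i + eps_ij.

    Assumption for this sketch: alpha_i and eps_ij are independent normals.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_groups):
        alpha = rng.gauss(0.0, sigma_alpha)          # effect shared within the group
        y1 = mu + alpha + rng.gauss(0.0, sigma_eps)  # first observation in the group
        y2 = mu + alpha + rng.gauss(0.0, sigma_eps)  # second observation in the group
        pairs.append((y1, y2))
    return pairs


def pearson(pairs):
    """Sample Pearson correlation of the (first, second) observation pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


def population_icc(sigma_alpha, sigma_eps):
    """Population ICC: sigma_alpha^2 / (sigma_alpha^2 + sigma_eps^2)."""
    return sigma_alpha ** 2 / (sigma_alpha ** 2 + sigma_eps ** 2)


# Illustrative values: sigma_alpha = 2 and sigma_eps = 1 give an ICC of 4 / 5.
pairs = simulate_pairs(n_groups=20000, sigma_alpha=2.0, sigma_eps=1.0, seed=1)
print(population_icc(2.0, 1.0))  # 0.8
print(pearson(pairs))            # should be near the population ICC for a sample this large
```

With many groups, the sample correlation of the simulated pairs should settle near the population value. The value of mu does not matter, because a constant shift changes neither variances nor covariances.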