The sample mean is the average of the values of a variable in a sample, which is the sum of those values divided by the number of values. Using mathematical notation, if a sample of N observations on variable X is taken from the population, the sample mean is:

: \bar{X}=\frac{1}{N}\sum_{i=1}^{N}X_{i}.

Under this definition, if the sample (1, 4, 1) is taken from the population (1, 1, 3, 4, 0, 2, 1, 0), then the sample mean is \bar{x} = (1+4+1)/3 = 2, as compared to the population mean of \mu = (1+1+3+4+0+2+1+0)/8 = 12/8 = 1.5. Even if a sample is random, it is rarely perfectly representative, and other samples would have other sample means even if the samples were all from the same population. The sample (2, 1, 0), for example, would have a sample mean of 1.

If the statistician is interested in
K variables rather than one, each observation having a value for each of those K variables, the overall sample mean consists of K sample means for individual variables. Let x_{ij} be the i-th independently drawn observation (i=1,...,N) on the j-th random variable (j=1,...,K). These observations can be arranged into N column vectors, each with K entries, with the K×1 column vector giving the i-th observations of all variables being denoted \mathbf{x}_i (i=1,...,N).

The sample mean vector \mathbf{\bar{x}} is a column vector whose j-th element \bar{x}_{j} is the average value of the N observations of the j-th variable:

: \bar{x}_{j}=\frac{1}{N} \sum_{i=1}^{N} x_{ij},\quad j=1,\ldots,K.

Thus, the sample mean vector contains the average of the observations for each variable, and is written

: \mathbf{\bar{x}}=\frac{1}{N}\sum_{i=1}^{N}\mathbf{x}_i = \begin{bmatrix} \bar{x}_1 \\ \vdots \\ \bar{x}_j \\ \vdots \\ \bar{x}_K \end{bmatrix}

==Definition of sample covariance==
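The sample mean and the sample mean vector defined in the preceding section can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `sample_mean` and `sample_mean_vector` and the two-variable data set `data` are assumptions made for this example, while the sample (1, 4, 1) and the population are the ones used in the text.

```python
from typing import List, Sequence


def sample_mean(values: Sequence[float]) -> float:
    """Return the sum of the values divided by their count N."""
    if not values:
        raise ValueError("sample must contain at least one observation")
    return sum(values) / len(values)


def sample_mean_vector(observations: Sequence[Sequence[float]]) -> List[float]:
    """Return the K per-variable means of N observations.

    Each inner sequence is one observation x_i with K entries.
    """
    if not observations:
        raise ValueError("need at least one observation")
    k = len(observations[0])
    if any(len(obs) != k for obs in observations):
        raise ValueError("every observation must have the same number of variables")
    n = len(observations)
    # Element j is the average of the j-th entry over all N observations.
    return [sum(obs[j] for obs in observations) / n for j in range(k)]


# Values taken from the worked example in the text.
population = [1, 1, 3, 4, 0, 2, 1, 0]
sample = [1, 4, 1]

print(sample_mean(sample))      # 2.0
print(sample_mean(population))  # 1.5

# Hypothetical data (not from the text): N = 3 observations on K = 2 variables.
data = [[1.0, 10.0], [4.0, 20.0], [1.0, 30.0]]
print(sample_mean_vector(data))  # [2.0, 20.0]
```

Storing each observation as one inner list mirrors the N column vectors \mathbf{x}_i, and averaging position j across them gives \bar{x}_j.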