The n-th raw moment (i.e., moment about zero) of a random variable X with density function f(x) is defined by

\mu'_n = \langle X^{n} \rangle ~\overset{\mathrm{def}}{=}~ \begin{cases} \sum_i x^n_i f(x_i), & \text{discrete distribution} \\[1.2ex] \int x^n f(x) \, \mathrm{d}x, & \text{continuous distribution} \end{cases}

The n-th moment of a real-valued continuous random variable with density function f(x) about a value c is the integral

\mu_n = \int_{-\infty}^\infty (x - c)^n\,f(x)\,\mathrm{d}x.

It is possible to define moments for
random variables in a more general fashion than moments for real-valued functions – see
moments in metric spaces. The moment of a function, without further explanation, usually refers to the above expression with c=0. For the second and higher moments, the
central moments (moments about the mean, with c being the mean) are usually used rather than the moments about zero, because they provide clearer information about the distribution's shape. Other moments may also be defined. For example, the n-th inverse moment about zero is \operatorname{E}\left[X^{-n}\right] and the n-th logarithmic moment about zero is \operatorname{E}\left[\ln^n(X)\right]. The n-th moment about zero of a probability density function f(x) is the
expected value of X^n and is called a
raw moment or
crude moment. The moments about its mean \mu are called
central moments; these describe the shape of the function, independently of
translation. If f is a
probability density function, then the value of the integral above is called the n-th moment of the
probability distribution. More generally, if
F is a
cumulative probability distribution function of any probability distribution, which may not have a density function, then the n-th moment of the probability distribution is given by the Riemann–Stieltjes integral

\mu'_n = \operatorname{E} \left[X^n\right] = \int_{-\infty}^\infty x^n\,\mathrm{d}F(x)

where
X is a
random variable that has this cumulative distribution
F, and \operatorname{E} is the expectation operator or mean. When

\operatorname{E}\left[ \left|X^n \right| \right] = \int_{-\infty}^\infty \left|x^n\right|\,\mathrm{d}F(x) = \infty

the moment is said not to exist. If the n-th moment about any point exists, so does the (n - 1)-th moment (and thus, all lower-order moments) about every point. The zeroth moment of any
probability density function is 1, since the area under any
probability density function must be equal to one.
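
The definitions above can be illustrated with a minimal Python sketch for the discrete case. The function name `moment` and the fair-die example are assumptions made for illustration and do not come from the text; exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction

def moment(values, probs, n, c=0):
    """n-th moment about c of a discrete distribution: sum_i (x_i - c)^n f(x_i)."""
    return sum((Fraction(x) - c) ** n * p for x, p in zip(values, probs))

# Illustrative example (an assumption for this sketch): a fair six-sided die.
faces = [1, 2, 3, 4, 5, 6]
pmf = [Fraction(1, 6)] * 6

zeroth = moment(faces, pmf, 0)            # zeroth moment: total probability
mean = moment(faces, pmf, 1)              # first raw moment (c = 0)
variance = moment(faces, pmf, 2, c=mean)  # second central moment (c = mean)
```

Passing `c=0` gives raw moments and `c=mean` gives central moments, mirroring the role of c in the integral definition.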

=== Standardized moments ===
The normalised n-th central moment or standardised moment is the n-th central moment divided by \sigma^n; the normalised n-th central moment of the random variable X is

\frac{\mu_n}{\sigma^n} = \frac{\operatorname{E}\left[(X - \mu)^n\right]}{\sigma^n} = \frac{\operatorname{E}\left[(X - \mu)^n\right]}{\operatorname{E}\left[(X - \mu)^2\right]^\frac{n}{2}} .

These normalised central moments are
dimensionless quantities, which represent the distribution independently of any linear change of scale.
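
As a rough sketch of this scale independence, the Python snippet below treats a small list of made-up numbers as an equally weighted empirical distribution and compares a standardized moment before and after a positive linear rescaling. The helper names and the data are hypothetical. A negative scale factor would flip the sign of odd-order standardized moments, so the sketch uses a positive one.

```python
import math

def central_moment(data, n):
    """n-th central moment of equally weighted observations (divides by N)."""
    mu = sum(data) / len(data)
    return sum((x - mu) ** n for x in data) / len(data)

def standardized_moment(data, n):
    """n-th central moment divided by sigma^n."""
    sigma = math.sqrt(central_moment(data, 2))
    return central_moment(data, n) / sigma ** n

data = [1.0, 2.0, 2.0, 3.0, 10.0]          # made-up values for illustration
rescaled = [3.0 * x + 5.0 for x in data]   # positive linear change of scale

s3 = standardized_moment(data, 3)
s3_rescaled = standardized_moment(rescaled, 3)  # agrees with s3 up to rounding
```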

=== Notable moments ===
==== Mean ====
The first raw moment is the
mean, usually denoted \mu \equiv \operatorname{E}[X].

==== Variance ====
The second
central moment is the
variance. The positive
square root of the variance is the
standard deviation \sigma \equiv \left(\operatorname{E}\left[(X - \mu)^2\right]\right)^\frac{1}{2}.

==== Skewness ====
The third central moment is a measure of the lopsidedness of the distribution; any symmetric distribution will have a third central moment, if defined, of zero. The normalised third central moment is called the skewness, often denoted \gamma. A distribution that is skewed to the left (the tail of the distribution is longer on the left) will have a negative skewness. A distribution that is skewed to the right (the tail of the distribution is longer on the right) will have a positive skewness. For distributions that are not too different from the normal distribution, the median will be somewhere near \mu - \gamma\sigma/6; the mode about \mu - \gamma\sigma/2.
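
The sign convention for skewness can be checked with a small Python sketch. The three four-point distributions below are invented purely for illustration, and the function names are hypothetical.

```python
from fractions import Fraction

def central_moment(values, probs, n):
    """n-th central moment of a discrete distribution."""
    mu = sum(x * p for x, p in zip(values, probs))
    return sum((x - mu) ** n * p for x, p in zip(values, probs))

def skewness(values, probs):
    """Normalised third central moment, mu_3 / sigma^3, as a float."""
    m2 = central_moment(values, probs, 2)
    m3 = central_moment(values, probs, 3)
    return float(m3) / float(m2) ** 1.5

probs = [Fraction(1, 4)] * 4
right_tailed = [0, 1, 2, 10]               # long tail on the right
left_tailed = [-x for x in right_tailed]   # mirror image: long tail on the left
symmetric = [-2, -1, 1, 2]                 # symmetric about zero

skew_right = skewness(right_tailed, probs)   # positive
skew_left = skewness(left_tailed, probs)     # negative
skew_sym = skewness(symmetric, probs)        # zero
```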

==== Kurtosis ====
The fourth central moment is a measure of the heaviness of the tail of the distribution. Since it is the expectation of a fourth power, the fourth central moment, where defined, is always nonnegative; and except for a point distribution, it is always strictly positive. The fourth central moment of a normal distribution is 3\sigma^4. The kurtosis \kappa is defined to be the standardized fourth central moment. (Equivalently, as in the next section, excess kurtosis is the fourth cumulant divided by the square of the second cumulant.) If a distribution has heavy tails, the kurtosis will be high (sometimes called leptokurtic); conversely, light-tailed distributions (for example, bounded distributions such as the uniform) have low kurtosis (sometimes called platykurtic). The kurtosis can be positive without limit, but \kappa must be greater than or equal to \gamma^2 + 1, where \gamma is the skewness; equality only holds for binary distributions. For unbounded skew distributions not too far from normal, \kappa tends to be somewhere in the area of \gamma^2 and 2\gamma^2. The inequality can be proven by considering

\operatorname{E}\left[\left(T^2 - aT - 1\right)^2\right]

where T = (X - \mu)/\sigma. This is the expectation of a square, so it is non-negative for all a; however it is also a quadratic polynomial in a. Since \operatorname{E}[T] = 0, \operatorname{E}[T^2] = 1, \operatorname{E}[T^3] = \gamma and \operatorname{E}[T^4] = \kappa, it expands to a^2 - 2\gamma a + \kappa - 1. Its discriminant 4\gamma^2 - 4(\kappa - 1) must be non-positive, which gives the required relationship \kappa \geq \gamma^2 + 1.
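
The lower bound on kurtosis (kurtosis at least skewness squared plus one) can be spot-checked numerically. The Python sketch below uses two invented discrete distributions and a hypothetical helper `kurtosis_and_bound`. It works with exact fractions and compares the kurtosis with \mu_3^2/\mu_2^3 + 1, which equals the squared skewness plus one without requiring a square root.

```python
from fractions import Fraction

def central_moment(values, probs, n):
    """n-th central moment of a discrete distribution."""
    mu = sum(x * p for x, p in zip(values, probs))
    return sum((x - mu) ** n * p for x, p in zip(values, probs))

def kurtosis_and_bound(values, probs):
    """Return (kurtosis, skewness^2 + 1) as exact fractions."""
    m2 = central_moment(values, probs, 2)
    m3 = central_moment(values, probs, 3)
    m4 = central_moment(values, probs, 4)
    kurtosis = m4 / m2 ** 2
    skew_squared = m3 ** 2 / m2 ** 3   # (mu_3 / sigma^3)^2 without a square root
    return kurtosis, skew_squared + 1

# Binary (two-point) distribution: the bound is attained exactly.
binary = kurtosis_and_bound([0, 1], [Fraction(7, 10), Fraction(3, 10)])

# Three-point distribution: the kurtosis lies strictly above the bound.
three_point = kurtosis_and_bound(
    [0, 1, 5], [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
)
```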

=== Higher moments ===
High-order moments are moments beyond 4th-order moments. As with variance, skewness, and kurtosis, these are
higher-order statistics, involving non-linear combinations of the data, and can be used for description or estimation of further
shape parameters. The higher the moment, the harder it is to estimate, in the sense that larger samples are required in order to obtain estimates of similar quality. This is due to the excess
degrees of freedom consumed by the higher orders. Further, they can be subtle to interpret, often being most easily understood in terms of lower order moments – compare the higher-order derivatives of
jerk and
jounce in
physics. For example, just as the 4th-order moment (kurtosis) can be interpreted as "relative importance of tails as compared to shoulders in contribution to dispersion" (for a given amount of dispersion, higher kurtosis corresponds to thicker tails, while lower kurtosis corresponds to broader shoulders), the 5th-order moment can be interpreted as measuring "relative importance of tails as compared to center (
mode and shoulders) in contribution to skewness" (for a given amount of skewness, higher 5th moment corresponds to higher skewness in the tail portions and little skewness of mode, while lower 5th moment corresponds to more skewness in shoulders).
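
The claim that higher moments need larger samples can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data are standard normal draws from Python's `random` module, the sample size, repetition count and seed are arbitrary choices, and the function names are hypothetical. It compares how much repeated estimates of the 2nd and 6th moments scatter relative to their average.

```python
import random
import statistics

def raw_moment_estimate(sample, n):
    """Sample estimate of the n-th moment about zero."""
    return sum(x ** n for x in sample) / len(sample)

def relative_spread(order, sample_size=100, repetitions=500, seed=0):
    """Standard deviation of repeated moment estimates divided by their mean."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    estimates = [
        raw_moment_estimate(
            [rng.gauss(0.0, 1.0) for _ in range(sample_size)], order
        )
        for _ in range(repetitions)
    ]
    return statistics.stdev(estimates) / abs(statistics.mean(estimates))

spread_2 = relative_spread(2)   # scatter of 2nd-moment estimates
spread_6 = relative_spread(6)   # scatter of 6th-moment estimates (larger)
```

With the same sample size, the 6th-moment estimates scatter noticeably more than the 2nd-moment estimates, so matching the quality of the lower-order estimate would require a larger sample.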

=== Mixed moments ===
Mixed moments are moments involving multiple variables. The value \operatorname{E}[X^k] is called the moment of order k (moments are also defined for non-integral k). The moments of the joint distribution of random variables X_1, \ldots, X_n are defined similarly. For any integers k_i \geq 0, the mathematical expectation \operatorname{E}[X_1^{k_1} \cdots X_n^{k_n}] is called a mixed moment of order k (where k = k_1 + \cdots + k_n), and \operatorname{E}[(X_1 - \operatorname{E}[X_1])^{k_1} \cdots (X_n - \operatorname{E}[X_n])^{k_n}] is called a central mixed moment of order k. The mixed moment \operatorname{E}[(X_1 - \operatorname{E}[X_1])(X_2 - \operatorname{E}[X_2])] is called the covariance and is one of the basic characteristics of dependency between random variables. Some examples are
covariance,
coskewness and
cokurtosis. While there is a unique covariance, there are multiple co-skewnesses and co-kurtoses.

== Properties of moments ==