The
logarithm transformation and
square root transformation are commonly used for positive data, and the
multiplicative inverse transformation (
reciprocal transformation) can be used for non-zero data. The
power transformation is a family of transformations parameterized by a value λ that includes the logarithm, square root, and multiplicative inverse transformations as special cases (the multiplicative inverse corresponds to λ = −1, so λ must be allowed to be negative). To approach data transformation systematically, it is possible to use
statistical estimation techniques to estimate the parameter λ in the power transformation, thereby identifying the transformation that is approximately the most appropriate in a given setting. Since the power transformation family also includes the identity transformation, this approach can also indicate whether it would be best to analyze the data without a transformation. In regression analysis, this approach is known as the
Box–Cox transformation. The reciprocal transformation, some power transformations such as the
Yeo–Johnson transformation, and certain other transformations such as applying the
inverse hyperbolic sine, can be meaningfully applied to data that include both positive and negative values (the power transformation is invertible over all real numbers if λ is an odd integer). However, when both negative and positive values are observed, it is common to begin by adding a constant to all values, producing a set of non-negative data to which any power transformation can be applied.

To assess whether normality has been achieved after transformation, any of the standard
normality tests may be used. A graphical approach is usually more informative than a formal statistical test and hence a
normal quantile plot is commonly used to assess the fit of a data set to a normal population. Alternatively, rules of thumb based on the sample
skewness and
kurtosis have also been proposed.
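As a minimal sketch of this workflow, assuming SciPy is available, the λ of a Box–Cox power transformation can be estimated by maximum likelihood and the transformed data then checked for normality with a formal test and the skewness/kurtosis rules of thumb:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Positive, right-skewed illustrative data (lognormal, so log is the "true" fix).
x = rng.lognormal(mean=0.0, sigma=1.0, size=500)

# Estimate lambda by maximum likelihood; with lmbda=None, boxcox returns
# both the transformed data and the fitted lambda.
y, lam = stats.boxcox(x)

# Formal check: D'Agostino-Pearson normality test on the transformed data.
stat, p = stats.normaltest(y)

# Rule-of-thumb checks: sample skewness and excess kurtosis should be near 0.
print(lam, p, stats.skew(y), stats.kurtosis(y))
```

For lognormal data the fitted λ should be close to 0, i.e., the procedure recovers the logarithm as the appropriate member of the family. A normal quantile plot of `y` (e.g., via `stats.probplot`) would complement the formal test, as noted above.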
==Transforming to a uniform distribution or an arbitrary distribution==

If we observe a set of
n values
X1, ...,
Xn with no ties (i.e., there are
n distinct values), we can replace
Xi with the transformed value
Yi =
k, where
k is defined such that
Xi is the
kth largest among all the
X values. This is called the
rank transform, and creates data with a perfect fit to a
uniform distribution. This approach has a
population analogue. Using the
probability integral transform, if
X is any
random variable, and
F is the
cumulative distribution function of
X, then as long as
F is invertible, the random variable
U =
F(
X) follows a uniform distribution on the
unit interval [0,1]. From a uniform distribution, we can transform to any distribution with an invertible cumulative distribution function. If
G is an invertible cumulative distribution function, and
U is a uniformly distributed random variable, then the random variable
G−1(
U) has
G as its cumulative distribution function. Putting the two together, if
X is any random variable,
F is the invertible cumulative distribution function of
X, and
G is an invertible cumulative distribution function then the random variable
G−1(
F(
X)) has
G as its cumulative distribution function.
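As a sketch of both steps, assuming SciPy is available (the exponential F and standard-normal target G are illustrative choices, not part of the construction): the rank transform places the sample on a uniform grid, and composing G⁻¹ with F carries the data to the target distribution.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# X ~ F, with F the exponential CDF (known and invertible here).
x = rng.exponential(scale=2.0, size=1000)
n = len(x)

# Rank transform: replace X_i by its rank k (scaled by n + 1 to keep the
# values strictly inside (0, 1)); ranking ascending or descending both give
# a perfect fit to a discrete uniform distribution.
u_rank = stats.rankdata(x, method="ordinal") / (n + 1)

# Population analogue (probability integral transform): U = F(X) is
# Uniform(0, 1) when F is invertible.
u = stats.expon.cdf(x, scale=2.0)

# Transform to an arbitrary target G: with G the standard normal CDF,
# G^{-1}(F(X)) has (approximately, in-sample) a standard normal distribution.
z = stats.norm.ppf(u)

print(u.min(), u.max(), z.mean(), z.std())
```

The same composition works for any invertible target CDF G by swapping `stats.norm.ppf` for the corresponding quantile function.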
==Variance stabilizing transformations==

Many types of statistical data exhibit a "
variance-on-mean relationship", meaning that the variability is different for data values with different
expected values. As an example, in comparing different populations in the world, the variance of income tends to increase with mean income. If we consider a number of small area units (e.g., counties in the United States) and obtain the mean and variance of incomes within each county, it is common that the counties with higher mean income also have higher variances. A
variance-stabilizing transformation aims to remove a variance-on-mean relationship, so that the variance becomes constant relative to the mean. Examples of variance-stabilizing transformations are the
Fisher transformation for the sample correlation coefficient, the
square root transformation or
Anscombe transform for
Poisson data (count data), the
Box–Cox transformation for regression analysis, and the
arcsine square root transformation or angular transformation for proportions (
binomial data). Although the arcsine square root transformation has long been used for the statistical analysis of proportions, it is no longer recommended: logistic regression (for binomial proportions) or a logit transformation (for non-binomial proportions) is more appropriate, in part because these alternatives reduce the type II error rate.

==Transformations for multivariate data==