Exchangeable random variables

In statistics, an exchangeable sequence of random variables is a sequence X1, X2, X3, ... whose joint probability distribution does not change when the positions in the sequence in which finitely many of them appear are altered. In other words, the joint distribution is invariant under finite permutations. Thus, for example, the sequences X1, X2, X3, X4, X5, X6 and X3, X6, X1, X5, X2, X4 both have the same joint probability distribution.

Definition

Formally, an exchangeable sequence of random variables is a finite or infinite sequence X1, X2, X3, … of random variables such that for any finite permutation σ of the indices 1, 2, 3, … (a permutation that moves only finitely many indices and leaves the rest fixed), the joint probability distribution of the permuted sequence

X_{\sigma(1)}, X_{\sigma(2)}, X_{\sigma(3)}, \dots

is the same as the joint probability distribution of the original sequence.

A sequence E1, E2, E3, … of events is said to be exchangeable if and only if the sequence of its indicator functions is exchangeable. The distribution function F_{X_1,\ldots,X_n}(x_1,\ldots,x_n) of a finite sequence of exchangeable random variables is symmetric in its arguments. Olav Kallenberg provided an appropriate definition of exchangeability for continuous-time stochastic processes.
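For a finite sequence with finitely many outcomes, the definition can be checked directly by brute force. The following is a minimal sketch, not a standard library routine: the function name `is_exchangeable` and the two example distributions are hypothetical and chosen only for illustration. It represents a joint probability mass function as a dictionary from outcome tuples to probabilities and tests invariance under every permutation of the coordinates.

```python
from fractions import Fraction
from itertools import permutations


def is_exchangeable(pmf):
    """Return True if a finite joint pmf is invariant under coordinate permutations.

    `pmf` maps outcome tuples (x_1, ..., x_n) to probabilities; outcomes that
    are missing from the dictionary are treated as having probability 0.
    """
    n = len(next(iter(pmf)))
    for outcome, p in pmf.items():
        for perm in permutations(range(n)):
            permuted = tuple(outcome[i] for i in perm)
            if pmf.get(permuted, 0) != p:
                return False
    return True


# Two draws without replacement from an urn with one red (1) and one blue (0)
# marble: exchangeable, although the two draws are clearly dependent.
without_replacement = {(1, 0): Fraction(1, 2), (0, 1): Fraction(1, 2)}

# Illustrative counterexample: swapping the coordinates changes the probabilities.
skewed = {(1, 0): Fraction(3, 4), (0, 1): Fraction(1, 4)}

print(is_exchangeable(without_replacement))
print(is_exchangeable(skewed))
```

The check costs n! permutations per outcome, so it is only practical for very short sequences.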

History

The concept was introduced by William Ernest Johnson in his 1924 book Logic, Part III: The Logical Foundations of Science. Exchangeability is equivalent to the concept of statistical control introduced by Walter Shewhart, also in 1924.

Exchangeability and the i.i.d. statistical model

The property of exchangeability is closely related to the use of independent and identically distributed (i.i.d.) random variables in statistical models. A sequence of random variables that are i.i.d., conditional on some underlying distributional form, is exchangeable. This follows directly from the structure of the joint probability distribution generated by the i.i.d. form.

Mixtures of exchangeable sequences (in particular, of sequences of i.i.d. variables) are exchangeable. The converse can be established for infinite sequences through an important representation theorem by Bruno de Finetti (later extended by other probability theorists such as Halmos and Savage). The extended versions of the theorem show that in any infinite sequence of exchangeable random variables, the random variables are conditionally i.i.d., given the underlying distributional form. This theorem is stated briefly below. (De Finetti's original theorem showed this only for random indicator variables; it was later extended to encompass all sequences of random variables.) Put another way, de Finetti's theorem characterizes exchangeable sequences as mixtures of i.i.d. sequences: an exchangeable sequence need not itself be unconditionally i.i.d., but it can be expressed as a mixture of underlying i.i.d. sequences.

The representation theorem: This statement is based on the presentation in O'Neill (2009). Given an infinite sequence of random variables \mathbf{X}=(X_1,X_2,X_3,\ldots), define the limiting empirical distribution function F_\mathbf{X} by

F_\mathbf{X}(x) = \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^n I(X_i \le x).

(This is the Cesàro limit of the indicator functions. Where the Cesàro limit does not exist, this function can instead be defined as the Banach limit of the indicator functions, which is an extension of this limit. The latter limit always exists for sums of indicator functions, so the empirical distribution is always well-defined.) If the sequence \mathbf{X} is exchangeable, its elements are conditionally i.i.d. given F_\mathbf{X}. This means that for any vector of random variables in the sequence, the joint distribution function is given by

\Pr (X_1 \le x_1,X_2 \le x_2,\ldots,X_n \le x_n) = \int \prod_{i=1}^n F_\mathbf{X}(x_i)\,dP(F_\mathbf{X}).

If the distribution function F_\mathbf{X} is indexed by another parameter \theta then (with densities appropriately defined) we have

p_{X_1,\ldots,X_n}(x_1,\ldots,x_n) = \int \prod_{i=1}^n p_{X_i}(x_i\mid\theta)\,dP(\theta).

These equations show the joint distribution or density characterised as a mixture distribution based on the underlying limiting empirical distribution (or a parameter indexing this distribution).

Note that not all finite exchangeable sequences are mixtures of i.i.d. sequences. To see this, consider sampling without replacement from a finite set until no elements are left. The resulting sequence is exchangeable but not a mixture of i.i.d. sequences: conditioned on all other elements in the sequence, the remaining element is known.
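The mixture form of the joint density can be made concrete with a small worked case. The sketch below is an illustration under stated assumptions rather than part of the theorem: it assumes \theta is uniform on (0, 1) and that, given \theta, the X_i are i.i.d. Bernoulli(\theta). Under that assumption the mixture integral has the closed form k!(n-k)!/(n+1)!, where k is the number of ones among x_1, …, x_n. The names `mixture_prob` and `joint` are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def mixture_prob(xs):
    """P(X_1 = x_1, ..., X_n = x_n) for a uniform mixture of i.i.d. Bernoulli sequences.

    Assumes theta ~ Uniform(0, 1) and X_i | theta ~ Bernoulli(theta), so the
    mixture integral of theta^k (1 - theta)^(n - k) equals k!(n - k)!/(n + 1)!.
    """
    n, k = len(xs), sum(xs)
    return Fraction(factorial(k) * factorial(n - k), factorial(n + 1))


# Joint pmf of the first three variables.
joint = {xs: mixture_prob(xs) for xs in product((0, 1), repeat=3)}

total = sum(joint.values())
p_x1 = sum(p for xs, p in joint.items() if xs[0] == 1)
p_x1_x2 = sum(p for xs, p in joint.items() if xs[0] == 1 and xs[1] == 1)

print(total)     # 1
print(p_x1)      # 1/2
print(p_x1_x2)   # 1/3
```

The probability of an outcome depends only on the number of ones, so the sequence is exchangeable. It is not unconditionally i.i.d., since P(X_1 = 1, X_2 = 1) = 1/3 differs from P(X_1 = 1) P(X_2 = 1) = 1/4.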

Covariance and correlation

Exchangeable sequences have some basic covariance and correlation properties which mean that they are generally positively correlated. For infinite sequences of exchangeable random variables, the covariance between any two distinct random variables in the sequence is equal to the variance of the mean of the underlying distribution function.
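The covariance identity for infinite sequences can be sketched from the conditional i.i.d. representation and the law of total covariance. The derivation below assumes finite second moments and writes \mu(F_\mathbf{X}) for the mean of the limiting empirical distribution; that symbol is introduced here only for illustration.

```latex
\begin{align*}
\operatorname{Cov}(X_i, X_j)
  &= \mathbb{E}\bigl[\operatorname{Cov}(X_i, X_j \mid F_\mathbf{X})\bigr]
   + \operatorname{Cov}\bigl(\mathbb{E}[X_i \mid F_\mathbf{X}],\, \mathbb{E}[X_j \mid F_\mathbf{X}]\bigr) \\
  &= 0 + \operatorname{Cov}\bigl(\mu(F_\mathbf{X}),\, \mu(F_\mathbf{X})\bigr) \\
  &= \operatorname{Var}\bigl(\mu(F_\mathbf{X})\bigr) \;\ge\; 0,
  \qquad i \ne j,
  \qquad \mu(F_\mathbf{X}) = \int x \, dF_\mathbf{X}(x).
\end{align*}
```

The first term vanishes because X_i and X_j are conditionally independent given F_\mathbf{X}, and the second reduces to a variance because they share the same conditional mean. As a check, if X_i given \theta is Bernoulli(\theta) with \theta uniform on (0, 1), then \operatorname{Var}(\theta) = 1/12, which agrees with \mathbb{E}[X_1 X_2] - \mathbb{E}[X_1]\mathbb{E}[X_2] = 1/3 - 1/4.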

Examples

• Any convex combination or mixture distribution of i.i.d. sequences of random variables is exchangeable. A converse proposition is de Finetti's theorem.
• Suppose an urn contains n red and m blue marbles. Suppose marbles are drawn without replacement until the urn is empty. Let X_i be the indicator random variable of the event that the i-th marble drawn is red. Then \left\{ X_i \right\}_{i=1, \dots, n+m} is an exchangeable sequence. This sequence cannot be extended to any longer exchangeable sequence.
• Suppose an urn contains n red and m blue marbles. Further suppose a marble is drawn from the urn and then replaced, with an extra marble of the same colour. Let X_i be the indicator random variable of the event that the i-th marble drawn is red. Then \left\{ X_i \right\}_{i\in \N} is an exchangeable sequence. This model is called Pólya's urn.
• Let (X, Y) have a bivariate normal distribution with parameters \mu = 0, \sigma_x = \sigma_y = 1 and an arbitrary correlation coefficient \rho\in (-1, 1). The random variables X and Y are then exchangeable, but independent only if \rho=0. The density function is p(x, y) = p(y, x) \propto \exp\left[-\frac{1}{2(1-\rho^2)}(x^2+y^2-2\rho xy)\right].
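The Pólya's urn example can be verified numerically for short sequences. The sketch below computes the exact probability of a given colour sequence and confirms that every reordering of that sequence has the same probability. The function name `polya_sequence_prob` and the starting counts (2 red, 3 blue) are arbitrary choices made for illustration.

```python
from fractions import Fraction
from itertools import permutations


def polya_sequence_prob(draws, red, blue):
    """Exact probability of a colour sequence (1 = red, 0 = blue) in Polya's urn.

    After each draw the marble is returned together with one extra marble
    of the same colour, so the matching count grows by one.
    """
    prob = Fraction(1)
    for x in draws:
        total = red + blue
        if x == 1:
            prob *= Fraction(red, total)
            red += 1
        else:
            prob *= Fraction(blue, total)
            blue += 1
    return prob


draws = (1, 1, 0, 1)
probs = {
    perm: polya_sequence_prob(perm, red=2, blue=3)
    for perm in set(permutations(draws))
}

# Every ordering of three reds and one blue has the same probability.
print(set(probs.values()))  # {Fraction(3, 70)}
```

The draws are not independent: with these counts P(X_1 = 1) = 2/5, while P(X_1 = 1, X_2 = 1) = 1/5, which differs from (2/5)^2.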

Applications

The von Neumann extractor is a randomness extractor that depends on exchangeability: it gives a method to take an exchangeable sequence of 0s and 1s (Bernoulli trials), with some probability p of 0 and q=1-p of 1, and produce a (shorter) exchangeable sequence of 0s and 1s in which each value has probability 1/2. Partition the sequence into non-overlapping pairs: if the two elements of the pair are equal (00 or 11), discard it; if the two elements of the pair are unequal (01 or 10), keep the first. This yields a sequence of Bernoulli trials with p=1/2, as, by exchangeability, the odds of a given pair being 01 or 10 are equal.

Exchangeable random variables arise in the study of U-statistics, particularly in the Hoeffding decomposition.

Exchangeability is a key assumption of the distribution-free inference method of conformal prediction.
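The pairing procedure of the von Neumann extractor can be sketched in a few lines. This is a minimal illustration of the steps described above, not a hardened implementation; the function name `von_neumann_extract` and the sample input are hypothetical, and a trailing unpaired bit is simply ignored.

```python
def von_neumann_extract(bits):
    """Apply the von Neumann extractor to a sequence of 0s and 1s.

    The input is split into non-overlapping pairs. Equal pairs (00, 11) are
    discarded; for unequal pairs (01, 10) the first element is kept.
    """
    out = []
    for i in range(0, len(bits) - 1, 2):
        first, second = bits[i], bits[i + 1]
        if first != second:
            out.append(first)
    return out


# Pairs: 00 (drop), 01 (keep 0), 10 (keep 1), 11 (drop), 10 (keep 1)
example = von_neumann_extract([0, 0, 0, 1, 1, 0, 1, 1, 1, 0])
print(example)  # [0, 1, 1]
```

The output is shorter than the input, and how much shorter depends on p: the more biased the source, the more pairs are equal and therefore discarded.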

Source: Wikipedia
