Independence (probability theory)

Independence is a fundamental notion in probability theory, as in statistics and the theory of stochastic processes. Two events are independent, statistically independent, or stochastically independent if, informally speaking, the occurrence of one does not affect the probability of occurrence of the other or, equivalently, does not affect the odds. Similarly, two random variables are independent if the realization of one does not affect the probability distribution of the other. Conversely, two events are dependent if the occurrence of one does affect the probability of the other.

Definition
For events

Two events

Two events A and B are independent (often written as A \perp B or A \perp\!\!\!\perp B, where the latter symbol is also often used for conditional independence) if and only if their joint probability equals the product of their probabilities:

:\mathrm{P}(A \cap B) = \mathrm{P}(A)\,\mathrm{P}(B).

More than two events

A finite set of events \{A_1,\ldots,A_n\} is pairwise independent if every pair of events is independent—that is, if and only if for all distinct pairs of indices m,k,

:\mathrm{P}(A_m \cap A_k) = \mathrm{P}(A_m)\,\mathrm{P}(A_k).

A finite set of events is mutually independent if every event is independent of any intersection of the other events—that is, if and only if for every k \le n and every k-element subset of indices \{i_1,\ldots,i_k\},

:\mathrm{P}\left(A_{i_1} \cap \cdots \cap A_{i_k}\right) = \mathrm{P}(A_{i_1}) \cdots \mathrm{P}(A_{i_k}).

For random variables

Two random variables

Two random variables X and Y are independent if and only if the events \{X \le x\} and \{Y \le y\} are independent for every x and y, that is, if and only if their joint cumulative distribution function factors:

:F_{X,Y}(x,y) = F_X(x)\,F_Y(y) \quad \text{for all } x,y.

More generally (and equivalently, when X and Y are real valued), if the pair of random variables (X,Y) has values in \mathcal X \times \mathcal Y with joint probability distribution P_{X,Y} and marginals P_X and P_Y, independence means the equality of measures

:P_{X,Y}(d(x,y)) = P_X(dx)\,P_Y(dy),

i.e. for every Borel set A \subseteq \mathcal X \times \mathcal Y we have

:P_{X,Y}(A) = \int_A P_{X,Y}(d(x,y)) = \int_A P_X(dx)\,P_Y(dy),

where P_{X,Y}(A) = P\left((X,Y) \in A\right). If X and Y are discrete valued this simplifies to

:P_{X,Y}(x_i, y_j) = P_X(x_i)\,P_Y(y_j) \quad \text{for all } i = 1,\ldots,|\mathcal X|,\; j = 1,\ldots,|\mathcal Y|,

while if X and Y are real valued and have probability densities p_X(x) and p_Y(y) and joint probability density p_{X,Y}(x,y) it becomes

:p_{X,Y}(x,y) = p_X(x)\,p_Y(y) \quad \text{for almost all } (x,y) \in \mathbb R^2,

where "almost all" means all except for a set of measure zero.

More than two random variables

A finite set of n random variables \{X_1,\ldots,X_n\} is pairwise independent if and only if every pair of random variables is independent. Even if the set of random variables is pairwise independent, it is not necessarily mutually independent as defined next.

A finite set of n random variables \{X_1,\ldots,X_n\} is mutually independent if and only if for any sequence of numbers \{x_1,\ldots,x_n\}, the events \{X_1 \le x_1\},\ldots,\{X_n \le x_n\} are mutually independent events (as defined above). This is equivalent to the following condition on the joint cumulative distribution function F_{X_1,\ldots,X_n}(x_1,\ldots,x_n): the set is mutually independent if and only if

:F_{X_1,\ldots,X_n}(x_1,\ldots,x_n) = F_{X_1}(x_1) \cdots F_{X_n}(x_n) \quad \text{for all } x_1,\ldots,x_n.

For random vectors

Two random vectors \mathbf{X} = (X_1,\ldots,X_m)^\mathsf{T} and \mathbf{Y} = (Y_1,\ldots,Y_n)^\mathsf{T} are called independent if

:F_{\mathbf{X,Y}}(\mathbf{x,y}) = F_{\mathbf{X}}(\mathbf{x})\,F_{\mathbf{Y}}(\mathbf{y}) \quad \text{for all } \mathbf{x},\mathbf{y},

where F_{\mathbf{X}}(\mathbf{x}) and F_{\mathbf{Y}}(\mathbf{y}) denote the cumulative distribution functions of \mathbf{X} and \mathbf{Y} and F_{\mathbf{X,Y}}(\mathbf{x,y}) denotes their joint cumulative distribution function. Independence of \mathbf{X} and \mathbf{Y} is often denoted by \mathbf{X} \perp\!\!\!\perp \mathbf{Y}. Written component-wise, \mathbf{X} and \mathbf{Y} are called independent if

:F_{X_1,\ldots,X_m,Y_1,\ldots,Y_n}(x_1,\ldots,x_m,y_1,\ldots,y_n) = F_{X_1,\ldots,X_m}(x_1,\ldots,x_m) \cdot F_{Y_1,\ldots,Y_n}(y_1,\ldots,y_n) \quad \text{for all } x_1,\ldots,x_m,y_1,\ldots,y_n.

For stochastic processes

For one stochastic process

The definition of independence may be extended from random vectors to a stochastic process. An independent stochastic process is one for which the random variables obtained by sampling the process at any n times t_1,\ldots,t_n are independent random variables, for any n. Formally, a stochastic process \left\{ X_t \right\}_{t\in\mathcal{T}} is called independent if and only if for all n\in \mathbb{N} and for all t_1,\ldots,t_n\in\mathcal{T}

:F_{X_{t_1},\ldots,X_{t_n}}(x_1,\ldots,x_n) = F_{X_{t_1}}(x_1) \cdots F_{X_{t_n}}(x_n) \quad \text{for all } x_1,\ldots,x_n,

where F_{X_{t_1},\ldots,X_{t_n}}(x_1,\ldots,x_n) = \mathrm{P}(X(t_1) \leq x_1,\ldots,X(t_n) \leq x_n). Independence of a stochastic process is a property within a single stochastic process, not between two stochastic processes.
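The basic factorization criteria above, for events and for discrete random variables, can be checked directly on small finite examples. Below is a minimal Python sketch assuming an illustrative joint table for two fair coin flips; the table and variable names are not from the article.

```python
from itertools import product

# Joint probability mass function of two discrete random variables X and Y;
# here, two independent fair coin flips (an illustrative choice).
joint = {(x, y): 0.25 for x, y in product([0, 1], repeat=2)}

# Marginal distributions, obtained by summing the joint pmf over the other variable.
p_x = {x: sum(p for (xx, _), p in joint.items() if xx == x) for x in [0, 1]}
p_y = {y: sum(p for (_, yy), p in joint.items() if yy == y) for y in [0, 1]}

# X and Y are independent iff the joint pmf equals the product of the marginals
# at every pair of values (the discrete factorization condition above).
independent = all(
    abs(joint[(x, y)] - p_x[x] * p_y[y]) < 1e-12 for (x, y) in joint
)
print(independent)  # True for this joint table
```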
For two stochastic processes

Independence of two stochastic processes is a property between two stochastic processes \left\{ X_t \right\}_{t\in\mathcal{T}} and \left\{ Y_t \right\}_{t\in\mathcal{T}} that are defined on the same probability space (\Omega,\mathcal{F},P). Formally, two stochastic processes \left\{ X_t \right\}_{t\in\mathcal{T}} and \left\{ Y_t \right\}_{t\in\mathcal{T}} are said to be independent if for all n\in \mathbb{N} and for all t_1,\ldots,t_n\in\mathcal{T}, the random vectors (X(t_1),\ldots,X(t_n)) and (Y(t_1),\ldots,Y(t_n)) are independent, i.e. if

:F_{X_{t_1},\ldots,X_{t_n},Y_{t_1},\ldots,Y_{t_n}}(x_1,\ldots,x_n,y_1,\ldots,y_n) = F_{X_{t_1},\ldots,X_{t_n}}(x_1,\ldots,x_n) \cdot F_{Y_{t_1},\ldots,Y_{t_n}}(y_1,\ldots,y_n) \quad \text{for all } x_1,\ldots,x_n,y_1,\ldots,y_n.

Independent σ-algebras

The definitions above are both generalized by the following definition of independence for σ-algebras. Let (\Omega, \Sigma, \mathrm{P}) be a probability space and let \mathcal{A} and \mathcal{B} be two sub-σ-algebras of \Sigma. \mathcal{A} and \mathcal{B} are said to be independent if, whenever A \in \mathcal{A} and B \in \mathcal{B},

:\mathrm{P}(A \cap B) = \mathrm{P}(A) \mathrm{P}(B).

Likewise, a finite family of σ-algebras (\tau_i)_{i\in I}, where I is an index set, is said to be independent if and only if

:\forall \left(A_i\right)_{i\in I} \in \prod\nolimits_{i\in I}\tau_i \ : \ \mathrm{P}\left(\bigcap\nolimits_{i\in I}A_i\right) = \prod\nolimits_{i\in I}\mathrm{P}\left(A_i\right)

and an infinite family of σ-algebras is said to be independent if all its finite subfamilies are independent.

The new definition relates to the previous ones very directly:

• Two events are independent (in the old sense) if and only if the σ-algebras that they generate are independent (in the new sense). The σ-algebra generated by an event E \in \Sigma is, by definition,

::\sigma(\{E\}) = \{ \emptyset, E, \Omega \setminus E, \Omega \}.

• Two random variables X and Y defined over \Omega are independent (in the old sense) if and only if the σ-algebras that they generate are independent (in the new sense). The σ-algebra generated by a random variable X taking values in some measurable space S consists, by definition, of all subsets of \Omega of the form X^{-1}(U), where U is any measurable subset of S.

Using this definition, it is easy to show that if X and Y are random variables and Y is constant, then X and Y are independent, since the σ-algebra generated by a constant random variable is the trivial σ-algebra \{ \varnothing, \Omega \}. Probability-zero events cannot affect independence, so independence also holds if Y is only Pr-almost surely constant.
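For a finite sample space, the σ-algebra characterization can be verified exhaustively. Below is a minimal Python sketch assuming an illustrative four-point uniform sample space and two events encoding independent coin flips; the encoding is not from the article.

```python
from fractions import Fraction

# Finite sample space with uniform probability (illustrative choice):
# outcomes 0..3 encode the results of two fair coin flips.
omega = frozenset(range(4))

def prob(s):
    """Uniform probability of an event (a subset of omega)."""
    return Fraction(len(s), len(omega))

def sigma_of_event(event):
    """σ-algebra generated by a single event: {∅, E, complement of E, Ω}."""
    return {frozenset(), event, omega - event, omega}

A = frozenset({0, 1})   # "first flip is heads"
B = frozenset({0, 2})   # "second flip is heads"

sigma_A, sigma_B = sigma_of_event(A), sigma_of_event(B)

# The σ-algebras are independent iff P(S ∩ T) = P(S) P(T)
# for every S in σ(A) and every T in σ(B).
independent = all(
    prob(S & T) == prob(S) * prob(T) for S in sigma_A for T in sigma_B
)
print(independent)  # True: the events A and B (and their σ-algebras) are independent
```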
Properties
Self-independence

Note that an event is independent of itself if and only if

:\mathrm{P}(A) = \mathrm{P}(A \cap A) = \mathrm{P}(A) \cdot \mathrm{P}(A) \iff \mathrm{P}(A) = 0 \text{ or } \mathrm{P}(A) = 1.

Thus an event is independent of itself if and only if it almost surely occurs or its complement almost surely occurs; this fact is useful when proving zero–one laws. Similarly, a random variable is independent of itself if and only if it is almost surely constant.

Expectation, covariance, variance, and correlation

If X and Y are statistically independent random variables, then:

- The expected value of the product is the product of the expected values, and more generally, for any powers n and m for which the moments exist,

:\operatorname{E}[X^n Y^m] = \operatorname{E}[X^n] \operatorname{E}[Y^m]

- The covariance \operatorname{Cov}[X,Y] is zero:

:\operatorname{Cov}[X,Y] = \operatorname{E}[X Y] - \operatorname{E}[X] \operatorname{E}[Y] = 0

- The variance of the sum is the sum of the variances:

:\operatorname{V}[X+Y] = \operatorname{V}[X] + \operatorname{V}[Y] + 2 \operatorname{Cov}[X,Y] = \operatorname{V}[X] + \operatorname{V}[Y]

- The correlation is zero:

:\rho_{X,Y} = \dfrac{\operatorname{Cov}[X,Y]}{\sigma_X \sigma_Y} = 0

The converse does not hold: none of these properties implies independence. For instance, two random variables may have covariance 0 and still fail to be independent. Similarly, for two stochastic processes \left\{ X_t \right\}_{t\in\mathcal{T}} and \left\{ Y_t \right\}_{t\in\mathcal{T}}: if they are independent, then they are uncorrelated.

Characteristic function

Two random variables X and Y are independent if and only if the characteristic function of the random vector (X,Y) satisfies

:\varphi_{(X,Y)}(t,s) = \varphi_{X}(t)\cdot \varphi_{Y}(s).

In particular, the characteristic function of their sum is the product of their marginal characteristic functions:

:\varphi_{X+Y}(t) = \varphi_X(t)\cdot\varphi_Y(t),

though the reverse implication is not true. Random variables that satisfy the latter condition are called subindependent.
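A concrete instance of the uncorrelated-but-dependent case mentioned above is the standard textbook choice (not taken from this article) of X uniform on \{-1, 0, 1\} with Y = X^2. A minimal Python sketch:

```python
from fractions import Fraction

# X uniform on {-1, 0, 1}; Y = X^2 is a deterministic function of X,
# so X and Y are certainly not independent.
support_x = [-1, 0, 1]
p = Fraction(1, 3)

e_x  = sum(p * x for x in support_x)            # E[X]  = 0
e_y  = sum(p * x * x for x in support_x)        # E[Y]  = 2/3
e_xy = sum(p * x * (x * x) for x in support_x)  # E[XY] = E[X^3] = 0

cov = e_xy - e_x * e_y
print(cov)  # 0 -> X and Y are uncorrelated

# Independence would require P(X=x, Y=y) = P(X=x) P(Y=y) for all x, y;
# it fails, for example, at x = 0, y = 1:
p_joint = Fraction(0)           # P(X=0 and Y=1) = 0
p_prod  = p * Fraction(2, 3)    # P(X=0) * P(Y=1) = 1/3 * 2/3
print(p_joint == p_prod)        # False -> not independent
```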
Examples
Rolling dice

The event of getting a 6 the first time a die is rolled and the event of getting a 6 the second time are independent. By contrast, the event of getting a 6 the first time a die is rolled and the event that the sum of the numbers seen on the first and second trials is 8 are not independent.

Drawing cards

If two cards are drawn with replacement from a deck of cards, the event of drawing a red card on the first trial and that of drawing a red card on the second trial are independent. By contrast, if two cards are drawn without replacement from a deck of cards, the event of drawing a red card on the first trial and that of drawing a red card on the second trial are not independent, because a deck that has had a red card removed has proportionately fewer red cards.

Pairwise and mutual independence

Consider two probability spaces, each with three events A, B and C. In both cases, \mathrm{P}(A) = \mathrm{P}(B) = 1/2 and \mathrm{P}(C) = 1/4. The events in the first space are pairwise independent because \mathrm{P}(A|B) = \mathrm{P}(A|C) = 1/2 = \mathrm{P}(A), \mathrm{P}(B|A) = \mathrm{P}(B|C) = 1/2 = \mathrm{P}(B), and \mathrm{P}(C|A) = \mathrm{P}(C|B) = 1/4 = \mathrm{P}(C); but the three events are not mutually independent. The events in the second space are both pairwise independent and mutually independent. To illustrate the difference, consider conditioning on two events. In the pairwise independent case, although any one event is independent of each of the other two individually, it is not independent of the intersection of the other two:

:\mathrm{P}(A|BC) = \frac{\frac{4}{40}}{\frac{4}{40} + \frac{1}{40}} = \tfrac{4}{5} \ne \mathrm{P}(A)

:\mathrm{P}(B|AC) = \frac{\frac{4}{40}}{\frac{4}{40} + \frac{1}{40}} = \tfrac{4}{5} \ne \mathrm{P}(B)

:\mathrm{P}(C|AB) = \frac{\frac{4}{40}}{\frac{4}{40} + \frac{6}{40}} = \tfrac{2}{5} \ne \mathrm{P}(C)

In the mutually independent case, however,

:\mathrm{P}(A|BC) = \frac{\frac{1}{16}}{\frac{1}{16} + \frac{1}{16}} = \tfrac{1}{2} = \mathrm{P}(A)

:\mathrm{P}(B|AC) = \frac{\frac{1}{16}}{\frac{1}{16} + \frac{1}{16}} = \tfrac{1}{2} = \mathrm{P}(B)

:\mathrm{P}(C|AB) = \frac{\frac{1}{16}}{\frac{1}{16} + \frac{3}{16}} = \tfrac{1}{4} = \mathrm{P}(C)

Triple-independence but no pairwise-independence

It is possible to create a three-event example in which

:\mathrm{P}(A \cap B \cap C) = \mathrm{P}(A)\mathrm{P}(B)\mathrm{P}(C),

and yet no two of the three events are pairwise independent (and hence the set of events is not mutually independent). This shows that mutual independence involves requirements on the products of probabilities of all combinations of events, not just on the product over all three events as here.
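The card-drawing contrast above can be verified exactly by elementary counting on a standard 52-card deck. A minimal Python sketch; the helper function and constant names are illustrative.

```python
from fractions import Fraction

RED, BLACK = 26, 26          # standard 52-card deck
TOTAL = RED + BLACK

def prob_red_then_red(with_replacement):
    """Probability that both the first and second cards drawn are red."""
    p_first = Fraction(RED, TOTAL)
    if with_replacement:
        p_second_given_first = Fraction(RED, TOTAL)
    else:
        p_second_given_first = Fraction(RED - 1, TOTAL - 1)
    return p_first * p_second_given_first

p_red = Fraction(RED, TOTAL)  # marginal probability of a red card on either trial

# With replacement: the joint probability equals the product of the marginals -> independent.
print(prob_red_then_red(True) == p_red * p_red)    # True

# Without replacement: the joint probability (25/102) differs from 1/4 -> dependent.
print(prob_red_then_red(False) == p_red * p_red)   # False
```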
Conditional independence
For events

The events A and B are conditionally independent given an event C when

:\mathrm{P}(A \cap B \mid C) = \mathrm{P}(A \mid C) \cdot \mathrm{P}(B \mid C).

For random variables

Intuitively, two random variables X and Y are conditionally independent given Z if, once Z is known, the value of Y does not add any additional information about X. For instance, two measurements X and Y of the same underlying quantity Z are not independent, but they are conditionally independent given Z (unless the errors in the two measurements are somehow connected).

The formal definition of conditional independence is based on the idea of conditional distributions. If X, Y, and Z are discrete random variables, then we define X and Y to be conditionally independent given Z if

:\mathrm{P}(X \le x, Y \le y \mid Z = z) = \mathrm{P}(X \le x \mid Z = z) \cdot \mathrm{P}(Y \le y \mid Z = z)

for all x, y and z such that \mathrm{P}(Z=z)>0. On the other hand, if the random variables are continuous and have a joint probability density function f_{XYZ}(x,y,z), then X and Y are conditionally independent given Z if

:f_{XY|Z}(x, y \mid z) = f_{X|Z}(x \mid z) \cdot f_{Y|Z}(y \mid z)

for all real numbers x, y and z such that f_Z(z)>0.

If discrete X and Y are conditionally independent given Z, then

:\mathrm{P}(X = x \mid Y = y, Z = z) = \mathrm{P}(X = x \mid Z = z)

for any x, y and z with \mathrm{P}(Z=z)>0. That is, the conditional distribution for X given Y and Z is the same as that given Z alone. A similar equation holds for the conditional probability density functions in the continuous case.

Independence can be seen as a special kind of conditional independence, since probability can be seen as a kind of conditional probability given no events.
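The measurement example above can be made concrete with a small discrete model: a hidden fair coin Z and two noisy readings X and Y that, given Z, are each correct with probability 0.8 independently (the 0.8 accuracy is an illustrative assumption, not from the article). A minimal Python sketch checking both unconditional dependence and conditional independence:

```python
from itertools import product

# Hidden fair coin Z and two noisy readings X, Y of Z that,
# given Z, are each correct with probability 0.8, independently.
p_z = {0: 0.5, 1: 0.5}

def p_read_given_z(r, z):
    """P(reading = r | Z = z): correct with probability 0.8."""
    return 0.8 if r == z else 0.2

# Joint distribution P(X=x, Y=y, Z=z) built from that factorization.
joint = {
    (x, y, z): p_z[z] * p_read_given_z(x, z) * p_read_given_z(y, z)
    for x, y, z in product([0, 1], repeat=3)
}

def marg(**fixed):
    """Sum the joint distribution over the variables not specified in `fixed`."""
    return sum(p for (x, y, z), p in joint.items()
               if all({'x': x, 'y': y, 'z': z}[k] == v for k, v in fixed.items()))

# Unconditionally, X and Y are dependent: P(X=1, Y=1) != P(X=1) P(Y=1).
print(abs(marg(x=1, y=1) - marg(x=1) * marg(y=1)) < 1e-12)   # False

# Given Z = 1 they are conditionally independent:
# P(X=1, Y=1 | Z=1) == P(X=1 | Z=1) * P(Y=1 | Z=1).
lhs = marg(x=1, y=1, z=1) / marg(z=1)
rhs = (marg(x=1, z=1) / marg(z=1)) * (marg(y=1, z=1) / marg(z=1))
print(abs(lhs - rhs) < 1e-12)                                # True
```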
History
Before 1933, independence, in probability theory, was defined in a verbal manner. For example, de Moivre gave the following definition: "Two events are independent, when they have no connexion one with the other, and that the happening of one neither forwards nor obstructs the happening of the other". If there are n independent events, the probability that all of them happen was computed as the product of the probabilities of these n events. Apparently, there was the conviction that this formula was a consequence of the above definition (sometimes it was called the Multiplication Theorem); of course, a proof of this assertion cannot work without further, more formal, tacit assumptions.

The definition of independence given in this article became the standard definition (now used in all books) after it appeared in 1933 as part of Kolmogorov's axiomatization of probability. Kolmogorov credited it to S. N. Bernstein, and quoted a publication which had appeared in Russian in 1927. Unfortunately, neither Bernstein nor Kolmogorov was aware of the work of Georg Bohlmann. Bohlmann had given the same definition for two events in 1901 and for n events in 1908. In the latter paper, he studied his notion in detail; for example, he gave the first example showing that pairwise independence does not imply mutual independence. Even today, Bohlmann is rarely quoted. More about his work can be found in On the contributions of Georg Bohlmann to probability theory by Ulrich Krengel.