Probability axioms

The standard probability axioms are the foundations of probability theory introduced by Russian mathematician Andrey Kolmogorov in 1933. Like all axiomatic systems, they outline the basic assumptions underlying the application of probability to fields such as pure mathematics and the physical sciences, while avoiding logical paradoxes.

Kolmogorov axioms

In order to state the Kolmogorov axioms, the following pieces of data must be specified:

• The sample space, \Omega, which is the set of all possible outcomes or elementary events.
• The space of all events, each of which is taken to be a set of outcomes (i.e. a subset of \Omega). The event space, F, must be a \sigma-algebra on \Omega.
• The probability measure P, which assigns to each event E \in F its probability, P(E).

Taken together, these assumptions mean that (\Omega, F, P) is a measure space. It is additionally assumed that P(\Omega)=1, making this triple a probability space.

Quasiprobability distributions in general relax the third axiom (countable additivity, in the usual numbering of Kolmogorov's axioms).
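For a finite sample space, the three pieces of data above can be written down and checked directly. The following is a minimal sketch rather than a standard implementation: it assumes a fair six-sided die as the sample space, takes the power set as the event space, and uses the hypothetical helper names `power_set`, `is_sigma_algebra`, `make_measure` and `is_probability_space`. On a finite \Omega, closure under pairwise unions is enough to check the \sigma-algebra property, and additivity only needs to be checked on disjoint pairs.

```python
from fractions import Fraction
from itertools import chain, combinations


def power_set(omega):
    """Return every subset of omega as a frozenset (the largest sigma-algebra on a finite omega)."""
    items = list(omega)
    subsets = chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))
    return {frozenset(s) for s in subsets}


def is_sigma_algebra(omega, events):
    """Finite case: contains omega, closed under complement and under pairwise union."""
    omega = frozenset(omega)
    if omega not in events:
        return False
    if any(omega - e not in events for e in events):
        return False
    return all(a | b in events for a in events for b in events)


def make_measure(weights):
    """Build P from per-outcome weights: P(E) is the sum of the weights of the outcomes in E."""
    def P(event):
        return sum((weights[w] for w in event), Fraction(0))
    return P


def is_probability_space(omega, events, P):
    """Check the event space, non-negativity, unit total mass, and additivity on disjoint pairs."""
    omega = frozenset(omega)
    if not is_sigma_algebra(omega, events):
        return False
    if any(P(e) < 0 for e in events):
        return False
    if P(omega) != 1:
        return False
    return all(P(a | b) == P(a) + P(b) for a in events for b in events if not (a & b))


# Illustrative assumption: a fair six-sided die.
omega = frozenset(range(1, 7))
F = power_set(omega)
P = make_measure({w: Fraction(1, 6) for w in omega})

print(is_probability_space(omega, F, P))  # True
```

Exact rational arithmetic (`Fraction`) is used so that the equality checks are not affected by floating-point rounding.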

Elementary consequences

In order to demonstrate that the theory generated by the Kolmogorov axioms corresponds with classical probability, some elementary consequences are typically derived.

• Since P is finitely additive and A and A^c are disjoint, we have P(A) + P(A^c) = P(A \cup A^c) = P(\Omega) = 1, so P(A^c) = 1 - P(A).
• In particular, taking A = \Omega, it follows that P(\emptyset) = 0. The empty set is interpreted as the event that "no outcome occurs", which is impossible.
• Similarly, if A \subseteq B, then P(B) = P(A \cup (B \setminus A)) = P(A) + P(B \setminus A) \ge P(A). In other words, P is monotone.
• Since \emptyset \subseteq E \subseteq \Omega for any event E, it follows that 0 \le P(E) \le 1.

By dividing A \cup B into the disjoint sets A \setminus (A \cap B), B \setminus (A \cap B) and A \cap B, one arrives at a probabilistic version of the inclusion-exclusion principle:

: P(A \cup B) = P(A) + P(B) - P(A \cap B).

In the case where \Omega is finite, the two identities are equivalent.

In order to actually do calculations when \Omega is an infinite set, it is sometimes useful to generalize from a finite sample space. For example, if \Omega consists of all infinite sequences of tosses of a fair coin, it is not obvious how to compute the probability of any particular set of sequences (i.e. an event). If the event is "every flip is heads", then it is intuitive that the probability can be computed as:

: P(\text{infinite sequence of heads}) = \lim_{n \to \infty} P(\text{sequence of } n \text{ heads}) = \lim_{n \to \infty} 2^{-n} = 0.

In order to make this rigorous, one has to prove that P is continuous, in the following sense. If A_j, j = 1, 2, \ldots is a sequence of events increasing (or decreasing) to another event A, then

: \lim_{n \to \infty} P(A_n) = P(A).
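The consequences above can be spot-checked numerically on a small example. The sketch below assumes a uniform measure on a fair six-sided die; the events `A`, `B` and `C` are chosen only for illustration. It also lists the first few values of 2^{-n} from the coin-toss limit, which shows the probabilities decreasing toward 0 but is not a proof of continuity.

```python
from fractions import Fraction

# Illustrative assumption: a fair six-sided die with the uniform measure.
omega = frozenset(range(1, 7))


def P(event):
    """Uniform probability: number of outcomes in the event over the size of omega."""
    return Fraction(len(event), len(omega))


A = frozenset({1, 2})        # "the roll is 1 or 2"
B = frozenset({1, 2, 3, 4})  # "the roll is at most 4", so A is a subset of B
C = frozenset({2, 4, 6})     # "the roll is even"

# Complement rule: P(A^c) = 1 - P(A)
complement_rule = P(omega - A) == 1 - P(A)

# The empty event has probability 0
empty_has_measure_zero = P(frozenset()) == 0

# Monotonicity: A subset of B implies P(A) <= P(B)
monotone = A <= B and P(A) <= P(B)

# Inclusion-exclusion for two events
inclusion_exclusion = P(B | C) == P(B) + P(C) - P(B & C)

# P("first n tosses of a fair coin are all heads") = 2^-n for n = 1..10
first_n_heads = [Fraction(1, 2) ** n for n in range(1, 11)]

print(complement_rule, empty_has_measure_zero, monotone, inclusion_exclusion)
print(first_n_heads[-1])  # 1/1024
```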

Simple example: Coin toss

Consider a single coin toss, and assume that the coin will either land heads (H) or tails (T) (but not both). No assumption is made as to whether the coin is fair.

We may define:

: \Omega = \{H,T\}
: F = \{\varnothing, \{H\}, \{T\}, \{H,T\}\}

Kolmogorov's axioms imply that:

: P(\varnothing) = 0
The probability of neither heads nor tails is 0.

: P(\{H,T\}) = 1
The probability of either heads or tails is 1.

: P(\{H\}) + P(\{T\}) = 1
The sum of the probability of heads and the probability of tails is 1.
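The coin-toss space is small enough to enumerate in full. The sketch below uses a hypothetical helper, `coin_measure`, and an arbitrarily chosen bias of 3/10 for heads, since the example makes no assumption that the coin is fair. Any bias between 0 and 1 yields the same three identities.

```python
from fractions import Fraction


def coin_measure(p_heads):
    """Return P for a coin landing heads with probability p_heads (hypothetical helper)."""
    weights = {"H": p_heads, "T": 1 - p_heads}

    def P(event):
        return sum((weights[outcome] for outcome in event), Fraction(0))

    return P


omega = frozenset({"H", "T"})
# The full event space F: all four subsets of omega.
F = [frozenset(), frozenset({"H"}), frozenset({"T"}), omega]

# Illustrative assumption: heads has probability 3/10.
P = coin_measure(Fraction(3, 10))

for event in F:
    print(sorted(event), P(event))
```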

Source: Wikipedia ↗
