A probability mass function of a discrete random variable X can be seen as a special case of two more general measure-theoretic constructions: the
distribution of X and the
probability density function of X with respect to the
counting measure. We make this more precise below.

Suppose that (A, \mathcal A, P) is a probability space and that (B, \mathcal B) is a measurable space whose underlying σ-algebra is discrete, so that in particular it contains every singleton subset of B. In this setting, a random variable X \colon A \to B is discrete provided its image is countable. The pushforward measure X_{*}(P), called the distribution of X in this context, is a probability measure on B whose restriction to singleton sets induces the probability mass function (as mentioned in the previous section) f_X \colon B \to \mathbb R, since f_X(b)=P( X^{-1}( \{b\} ))=P(X=b) for each b \in B.

Now suppose that (B, \mathcal B, \mu) is a measure space equipped with the counting measure \mu. The probability density function f of X with respect to the counting measure, if it exists, is the Radon–Nikodym derivative of the pushforward measure of X (with respect to the counting measure), so f = d X_{*}(P) / d \mu and f is a function from B to the non-negative reals. As a consequence, for any b \in B we have P(X=b)=P( X^{-1}( \{b\} ) ) = X_{*}(P)(\{b\}) = \int_{ \{b\} } f \, d \mu = f(b), where the last equality holds because \mu(\{b\}) = 1. This demonstrates that f is in fact a probability mass function.

When there is a natural order among the potential outcomes x, it may be convenient to assign numerical values to them (or
n-tuples in case of a discrete
multivariate random variable) and to consider also values not in the
image of X. That is, f_X may be defined for all
real numbers and f_X(x)=0 for all x \notin X(S) as shown in the figure.

The image of X has a countable subset on which the total probability mass is one; that is, the values f_X(x) over this subset sum to one. Consequently, the probability mass function is zero for all but a countable number of values of x.

The discontinuity of probability mass functions is related to the fact that the
cumulative distribution function of a discrete random variable is also discontinuous. If X is a discrete random variable, then P(X = x) = 1 means that the random event (X = x) is certain (it occurs in 100% of cases); conversely, P(X = x) = 0 means that the random event (X = x) is impossible. The latter statement does not hold for a continuous random variable X, for which P(X = x) = 0 for every possible x.
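
The points above (a probability mass function extended by zero to all real numbers, total mass one on a countable set, and a cumulative distribution function that jumps) can be checked on a small case. The following is a minimal sketch, assuming a fair six-sided die as the random variable; the names `SUPPORT`, `pmf` and `cdf` are illustrative and not taken from the text.

```python
from fractions import Fraction

# Assumed example: a fair six-sided die, so the image of X is {1, ..., 6}.
SUPPORT = range(1, 7)

def pmf(x):
    """f_X(x), defined for every real x and zero outside the image of X."""
    return Fraction(1, 6) if x in SUPPORT else Fraction(0)

def cdf(x):
    """F_X(x) = P(X <= x): sum of the pmf over image points not exceeding x."""
    return sum((pmf(k) for k in SUPPORT if k <= x), Fraction(0))

# The pmf sums to one over the (countable) image of X.
total_mass = sum(pmf(k) for k in SUPPORT)

# The CDF is a step function: its jump at a point of the image
# equals the probability mass at that point.
jump_at_3 = cdf(3) - cdf(2.999)
```

Here `pmf(2.5)` is zero because 2.5 is not in the image of X, while the jump of `cdf` at 3 equals `pmf(3)`, which is the sense in which the discontinuities of the two functions are related.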
Discretization is the process of converting a continuous random variable into a discrete one.

==Examples==
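
As an illustration of discretization, the sketch below assumes a standard normal random variable X and discretizes it by rounding to the nearest integer (ties have probability zero, so the tie rule does not matter). The resulting variable Y has probability mass function P(Y = k) = P(k - 0.5 < X ≤ k + 0.5). The choice of distribution, the rounding rule, and the helper names `normal_cdf` and `discretized_pmf` are assumptions made for this example only.

```python
import math

def normal_cdf(x):
    """CDF of the standard normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def discretized_pmf(k):
    """P(Y = k) for Y = X rounded to the nearest integer."""
    return normal_cdf(k + 0.5) - normal_cdf(k - 0.5)

# Probability masses on the integers -10..10; the mass outside
# this range is negligible for a standard normal variable.
masses = {k: discretized_pmf(k) for k in range(-10, 11)}
total = sum(masses.values())
```

Although P(X = x) = 0 for every real x, each integer k receives strictly positive mass under Y near the centre of the distribution, and the masses sum to one up to floating-point error.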