Given that X is an i.i.d. source, its time series X_1, \ldots, X_n is i.i.d. with entropy H(X) in the discrete-valued case and differential entropy in the continuous-valued case. The source coding theorem states that for any \varepsilon > 0, i.e. for any rate H(X) + \varepsilon larger than the entropy of the source, there is a large enough n and an encoder that takes n i.i.d. repetitions of the source, X^{1:n}, and maps them to n(H(X) + \varepsilon) binary bits such that the source symbols X^{1:n} are recoverable from the binary bits with probability of at least 1 - \varepsilon.
Proof of Achievability. Fix some \varepsilon > 0, and let

:p(x_1, \ldots, x_n) = \Pr \left[X_1 = x_1, \cdots, X_n = x_n \right].

The typical set, A_n^\varepsilon, is defined as follows:

:A_n^\varepsilon = \left\{(x_1, \cdots, x_n) \ : \ \left|-\frac{1}{n} \log p(x_1, \cdots, x_n) - H(X)\right| < \varepsilon \right\}.

The
asymptotic equipartition property (AEP) shows that for large enough n, the probability that a sequence generated by the source lies in the typical set, A_n^\varepsilon, as defined approaches one. In particular, for sufficiently large n, P((X_1,X_2,\cdots,X_n) \in A_n^\varepsilon) can be made arbitrarily close to 1 and, specifically, greater than 1-\varepsilon (see
AEP for a proof). The definition of typical sets implies that those sequences that lie in the typical set satisfy:

:2^{-n(H(X)+\varepsilon)} \leq p \left (x_1, \cdots, x_n \right ) \leq 2^{-n(H(X)-\varepsilon)}

• The probability of a sequence (X_1,X_2,\cdots,X_n) being drawn from A_n^\varepsilon is greater than 1-\varepsilon.
• \left| A_n^\varepsilon \right| \leq 2^{n(H(X)+\varepsilon)}, which follows from the left-hand side (the lower bound) for p(x_1,x_2,\cdots,x_n), since the total probability of A_n^\varepsilon is at most 1.
• \left| A_n^\varepsilon \right| \geq (1-\varepsilon) 2^{n(H(X)-\varepsilon)}, which follows from the upper bound for p(x_1,x_2,\cdots,x_n) and the lower bound 1-\varepsilon on the total probability of the whole set A_n^\varepsilon.

Since \left| A_n^\varepsilon \right| \leq 2^{n(H(X)+\varepsilon)}, n(H(X)+\varepsilon) bits are enough to point to any string in this set.

The encoding algorithm: the encoder checks whether the input sequence lies within the typical set; if so, it outputs the index of the input sequence within the typical set; if not, the encoder outputs an arbitrary n(H(X)+\varepsilon)-digit number. As long as the input sequence lies within the typical set (which happens with probability at least 1-\varepsilon), the encoder makes no error. So, the probability of error of the encoder is bounded above by \varepsilon.
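The typical-set encoder described above can be sketched by brute force for a toy Bernoulli source (the parameters p, n, and eps below are illustrative assumptions; real block lengths would be far larger):

```python
import math
from itertools import product

# Illustrative toy parameters (assumed, not from the text): a Bernoulli(p) source
p, n, eps = 0.2, 12, 0.2
H = -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def neg_log_prob_rate(xs):
    """-(1/n) log2 p(x_1, ..., x_n) for an i.i.d. Bernoulli(p) sequence."""
    logp = sum(math.log2(p) if x else math.log2(1 - p) for x in xs)
    return -logp / len(xs)

# Brute-force enumeration of the typical set A_n^eps (feasible only for small n)
typical = [xs for xs in product((0, 1), repeat=n)
           if abs(neg_log_prob_rate(xs) - H) < eps]
index = {xs: i for i, xs in enumerate(typical)}

# n(H + eps) bits are enough to index every sequence in the typical set
bits = math.ceil(n * (H + eps))
assert len(typical) <= 2 ** bits

def encode(xs):
    # Typical sequences map to their index; atypical ones to an arbitrary codeword
    return index.get(tuple(xs), 0)

def decode(i):
    return typical[i]

xs = (1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0)  # two ones: typical for these parameters
assert decode(encode(xs)) == xs
print(len(typical), bits)
```

Note that at such a tiny n the typical set covers only part of the probability mass; both the 1-\varepsilon success probability and the compression gain emerge only as n grows, exactly as the AEP step of the proof requires.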
Proof of converse: the converse is proved by showing that any set of size smaller than 2^{nH(X)} (in the sense of exponent) would cover a set of probability bounded away from 1.

== Proof: Source coding theorem for symbol codes ==