Several extensions of Chebyshev's inequality have been developed.
===Selberg's inequality===
Selberg derived a generalization of Chebyshev's inequality to arbitrary intervals. Suppose X is a random variable with mean \mu and variance \sigma^2. Selberg's inequality states that if \beta \geq \alpha \geq 0, then
: \Pr( X \in [\mu - \alpha, \mu + \beta] ) \ge \begin{cases}\frac{ \alpha^2 }{\alpha^2 + \sigma^2} &\text{if } \alpha(\beta-\alpha) \geq 2\sigma^2 \\ \frac{4\alpha\beta - 4\sigma^2}{(\alpha + \beta)^2} &\text{if } 2\alpha\beta \geq 2\sigma^2 \geq \alpha(\beta - \alpha) \\ 0 &\text{if } \sigma^2 \geq \alpha\beta\end{cases}
When \alpha = \beta, this reduces to Chebyshev's inequality. These bounds are known to be the best possible.
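The bound is easy to check numerically. The following is a minimal sketch (the helper name selberg_bound and the choice of an exponential test distribution are illustrative assumptions, not part of Selberg's result) comparing the bound with a Monte Carlo estimate:

<syntaxhighlight lang="python">
import numpy as np

def selberg_bound(alpha, beta, sigma2):
    """Selberg's lower bound on Pr(X in [mu - alpha, mu + beta])."""
    if alpha * (beta - alpha) >= 2 * sigma2:
        return alpha**2 / (alpha**2 + sigma2)
    if 2 * alpha * beta >= 2 * sigma2:   # middle case, given the first test failed
        return (4 * alpha * beta - 4 * sigma2) / (alpha + beta)**2
    return 0.0                           # remaining case: sigma^2 >= alpha * beta

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=1_000_000)  # mean 1, variance 1
mu, sigma2 = 1.0, 1.0
alpha, beta = 0.8, 2.0                          # must satisfy beta >= alpha >= 0
empirical = np.mean((x >= mu - alpha) & (x <= mu + beta))
print(empirical, ">=", selberg_bound(alpha, beta, sigma2))
</syntaxhighlight>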
===Finite-dimensional vector===
Chebyshev's inequality naturally extends to the multivariate setting, where one has n random variables X_i with mean \mu_i and variance \sigma_i^2. Then the following inequality holds:
:\Pr\left(\sum_{i=1}^n (X_i - \mu_i)^2 \ge k^2 \sum_{i=1}^n \sigma_i^2 \right) \le \frac{1}{k^2}
This is known as the Birnbaum–Raymond–Zuckerman inequality after the authors who proved it for two dimensions. This result can be rewritten in terms of vectors X = (X_1, X_2, \ldots) with mean \mu = (\mu_1, \mu_2, \ldots) and standard deviation \sigma = (\sigma_1, \sigma_2, \ldots), using the Euclidean norm \| \cdot \|:
: \Pr(\| X - \mu \| \ge k \| \sigma \|) \le \frac{ 1 }{ k^2 }.
One can also get a similar infinite-dimensional Chebyshev's inequality.
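As a sanity check, the Birnbaum–Raymond–Zuckerman bound can be verified by simulation; the sketch below uses independent zero-mean normal coordinates purely as an example distribution:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(1)
sigma = np.array([1.0, 2.0, 0.5])   # per-coordinate standard deviations
# One million zero-mean vectors with independent coordinates.
x = rng.normal(loc=0.0, scale=sigma, size=(1_000_000, 3))

k = 2.0
event = np.sum(x**2, axis=1) >= k**2 * np.sum(sigma**2)
print(np.mean(event), "<=", 1 / k**2)
</syntaxhighlight>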
A second related inequality has also been derived by Chen. Let n be the dimension of the stochastic vector X and let \operatorname{E}(X) be the mean of X. Let S be the covariance matrix and k > 0. Then
: \Pr \left( ( X - \operatorname{E}(X) )^T S^{-1} (X - \operatorname{E}(X)) \ge k \right) \le \frac{n}{k}
where Y^T is the transpose of Y. The inequality can be written in terms of the Mahalanobis distance as
: \Pr \left( d^2_S(X,\operatorname{E}(X)) \ge k \right) \le \frac{n}{k}
where the Mahalanobis distance based on S is defined by
: d_S(x,y) = \sqrt{ (x - y)^T S^{-1} (x - y) }
Navarro proved that these bounds are sharp; that is, they are the best possible bounds for those regions when only the mean and the covariance matrix of X are known. Stellato et al. showed that this multivariate version of the Chebyshev inequality can be easily derived analytically as a special case of Vandenberghe et al., where the bound is computed by solving a semidefinite program (SDP).
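A short simulation of Chen's bound, using a multivariate normal purely as an illustrative distribution (the bound itself requires only the mean and covariance):

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(2)
n = 3
mean = np.zeros(n)
A = rng.normal(size=(n, n))
S = A @ A.T + n * np.eye(n)          # an arbitrary positive-definite covariance

x = rng.multivariate_normal(mean, S, size=500_000)
Sinv = np.linalg.inv(S)
# Squared Mahalanobis distance of each sample from the mean.
d2 = np.einsum("ij,jk,ik->i", x - mean, Sinv, x - mean)

for k in [5.0, 10.0, 20.0]:
    print(k, np.mean(d2 >= k), "<=", n / k)
</syntaxhighlight>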
===Known correlation===
If the variables are independent this inequality can be sharpened:
:\Pr\left(\bigcap_{i = 1}^n \frac{ | X_i - \mu_i | }{ \sigma_i } \le k_i \right) \ge \prod_{i=1}^n \left(1 - \frac{1}{k_i^2} \right)
Berge derived an inequality for two correlated variables X_1, X_2. Let \rho be the correlation coefficient between X_1 and X_2 and let \sigma_i^2 be the variance of X_i. Then
: \Pr\left( \bigcap_{ i = 1}^2 \left[ \frac{ | X_i - \mu_i | }{ \sigma_i } < k \right] \right) \ge 1 - \frac{ 1 + \sqrt{ 1 - \rho^2 } }{ k^2 }
This result can be sharpened to having different bounds for the two random variables and having asymmetric bounds, as in Selberg's inequality. Olkin and Pratt derived an inequality for n correlated variables:
: \Pr\left(\bigcap_{i = 1}^n \frac{ | X_i - \mu_i | }{ \sigma_i } < k_i \right) \ge 1 - \frac{ 1 }{ n^2 } \left( \sqrt{u} + \sqrt{ n - 1 } \sqrt{ n \sum_{i=1}^n \frac{1}{k_i^2} - u } \right)^2
where the sum is taken over the n variables and
: u = \sum_{i=1}^n \frac{1}{ k_i^2 } + 2 \sum_{i=1}^n \sum_{j < i} \frac{ \rho_{ij} }{ k_i k_j }
where \rho_{ij} is the correlation between X_i and X_j. Olkin and Pratt's inequality was subsequently generalised by Godwin.
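Berge's two-variable bound can likewise be checked by simulation; the bivariate normal below is only an example of a pair of variables with correlation \rho:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(3)
rho, k = 0.6, 2.0
cov = np.array([[1.0, rho], [rho, 1.0]])
x = rng.multivariate_normal([0.0, 0.0], cov, size=1_000_000)

# Probability that both standardized variables stay below k in absolute value.
empirical = np.mean(np.all(np.abs(x) < k, axis=1))
berge = 1 - (1 + np.sqrt(1 - rho**2)) / k**2
print(empirical, ">=", berge)
</syntaxhighlight>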
===Higher moments===
Mitzenmacher and Upfal note that by applying Markov's inequality to the nonnegative variable | X - \operatorname{E}(X) |^n, one can get a family of tail bounds
: \Pr\left(| X - \operatorname{E}(X) | \ge k \operatorname{E}\left(|X - \operatorname{E}(X) |^n \right)^{ \frac{1}{n} }\right) \le \frac{1}{k^n}, \qquad k > 0,\ n \geq 2.
For n = 2 we obtain Chebyshev's inequality. For k \geq 1,\ n > 4 and assuming that the nth moment exists, this bound is tighter than Chebyshev's inequality. This strategy, called the
method of moments, is often used to prove tail bounds.
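The gain from higher moments is easy to see numerically; the sketch below uses a standard normal sample purely as an example and compares the empirical tail with the k^{-n} bound:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(size=2_000_000)      # example distribution: standard normal
dev = np.abs(x - x.mean())

k = 2.0
for n in [2, 4, 6, 8]:
    scale = np.mean(dev**n) ** (1 / n)     # E(|X - E(X)|^n)^(1/n)
    empirical = np.mean(dev >= k * scale)  # empirical tail probability
    print(n, empirical, "<=", k**-n)
</syntaxhighlight>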
===Exponential moment===
A related inequality sometimes known as the exponential Chebyshev's inequality is
: \Pr(X \ge \varepsilon) \le e^{ -t \varepsilon }\operatorname{E}\left( e^{ t X } \right), \qquad t > 0.
Let K(t) be the cumulant generating function,
: K( t ) = \log \left(\operatorname{E}\left( e^{ t X } \right) \right).
Taking the Legendre–Fenchel transformation of K(t) and using the exponential Chebyshev's inequality we have
: -\log( \Pr (X \ge \varepsilon )) \ge \sup_t( t \varepsilon - K( t ) ).
This inequality may be used to obtain exponential inequalities for unbounded variables.
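For a standard normal variable, K(t) = t^2/2, so the supremum equals \varepsilon^2/2 and the bound becomes e^{-\varepsilon^2/2}. A numerical sketch of this computation (the coarse grid over t is an illustrative shortcut for the supremum, not part of the inequality):

<syntaxhighlight lang="python">
import numpy as np

def K(t):
    return t**2 / 2                     # cumulant generating function of N(0, 1)

eps = 2.0
t = np.linspace(0.01, 10.0, 10_000)
sup = np.max(t * eps - K(t))            # Legendre-Fenchel transform at eps
bound = np.exp(-sup)                    # Pr(X >= eps) <= e^{-sup}

x = np.random.default_rng(5).normal(size=1_000_000)
print(np.mean(x >= eps), "<=", bound)   # Monte Carlo tail vs exponential bound
</syntaxhighlight>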
===Bounded variables===
If X has finite support contained in the interval [a, b], let M = \max(|a|, |b|), where |x| is the absolute value of x. If the mean of X is zero then for all k > 0
: \frac{\operatorname{E}(|X|^r ) - k^r }{M^r} \le \Pr( | X | \ge k ) \le \frac{\operatorname{E}(| X |^r ) }{ k^r }.
The second of these inequalities with r = 2 is the Chebyshev bound. The first provides a lower bound for \Pr( | X | \ge k ).

==Finite samples==