Markov’s inequality provides a bound on the upper-tail probability of non-negative random variables. The bound is expressed with the expectation of the random variable.

**[Markov's Inequality]**

Let \(X\) be a non-negative random variable. Then, for any \(\lambda > 0\),

\[

Pr[ X \geq \lambda ] \leq \frac{E[X]}{\lambda}.

\]

Equivalently, for any \(k > 1\),

\[

Pr[ X \geq k E[X] ] \leq \frac{1 }{k}.

\]

**Proof.**

Define a random variable \(Y\) such that

\[

Y = \begin{cases}

\lambda & X\geq \lambda, \\

0 & X< \lambda. \\

\end{cases}

\]

Then \(E[X] \geq E[Y] \geq \lambda\cdot Pr[X\geq \lambda]\).

Markov’s inequality only bounds the upper-tail probability. The following is a similar bound on the lower-tail probability if the random variable is bounded above.

**[Lemma]**

Let \(X\) be a random variable over \([0,1]\). Then, for any positive \(k < 1\),

\[

Pr[ X < k E[X] ] \leq \frac{1 – E[X]}{1 – k E[X]}.

\]

The following are some applications of Markov’s inequality.

**[Borel-Cantelli Lemma]**

Let \(E_1, E_2, \ldots\) be a sequence of events such that \(\sum_n Pr[E_n] <\infty\). Then the probability that no more than \(k\) of the events happen is at least

\[

1 - \frac{ \sum_{n} Pr[E_n] }{ M }.

\]

This implies that, with probability 1, at most finitely many events happen.

**Proof.**

For each event \(E_i\), define binary random variable \(X_i\) such that \(X_i = 1\) if \(E_i\) hold, and \(X_i = 0\) otherwise. Then \(E[X_i] = Pr[E_i]\), and, by linearity of expectation,

\[

E[\sum_i X_i ] = \sum_i E[ X_i ] = \sum_i Pr[ E_i ].

\]

The result follows by applying Markov’s inequality on \(\sum_i X_i\).

**[Chebyshev's Inequality]**

Let \(X\) be a random variable. Then, for any \(\lambda > 0\),

\[

Pr[ |X-\mathbb{E}(X)| \geq \lambda ] \leq \frac{ Var[X] }{\lambda^2}.

\]

Equivalently, for any \(k > 1\),

\[

Pr[ |X-\mathbb{E}(X)| \geq k Var[X] ] \leq \frac{1}{k^2}.

\]

Chebyshev’s inequality follows by applying Markov’s inequality on the random variable \((X – E[X])^2\), and the fact that \(Var[X] = E [ (X - E[X])^2 ]\).