# Azuma’s Inequality

The Chernoff bound gives concentration for sums of *independent* random variables. This concentration result can be generalized to martingales, where independence is not required.

### Martingales and Azuma’s Inequality

A sequence of random variables \(X_0, X_1, \ldots, X_n\) is called a *martingale* if, for all \(i \geq 1\),

\[

{\bf E}[X_i\mid X_{i-1}, \ldots, X_{0}] = X_{i-1}.

\]

Let \(Y_i = X_i - X_{i-1}\); then \({\bf E}[Y_i\mid X_{i-1},\ldots, X_{0}] = 0\).

**Theorem (Azuma’s Inequality)** Let \(X_0, X_1,\ldots, X_n\) be a martingale, and \(Y_i = X_i - X_{i-1}\). If there exist constants \(c_i > 0\) such that \(|Y_i| \leq c_i\) for all \(i\), then for any \(\lambda > 0\),

\[

{\bf Pr}[X_n - X_0 \geq \lambda] \leq \exp\left( -\frac{\lambda^2}{2\sum_{i=1}^n c_i^2 } \right),

\]

and

\[

{\bf Pr}[X_n - X_0 \leq -\lambda] \leq \exp\left( -\frac{\lambda^2}{2\sum_{i=1}^n c_i^2 } \right).

\]

First, we need the following lemma.

**Lemma** Let \(Y\) be a random variable such that \({\bf E}[Y] = 0\) and \(|Y| \leq c\) for some constant \(c > 0\). Then for any \(t > 0\), it holds that \({\bf E}[e^{tY}] \leq e^{t^2c^2/2}\).

The proof follows from the convexity of \(f(x) = e^{tx}\): for any \(|x| \leq c\), the point \((x, e^{tx})\) lies below the chord joining \((-c, e^{-tc})\) and \((c, e^{tc})\), so

\[

e^{tx} \leq \frac{1}{2c} (e^{tc} - e^{-tc})\, x + \frac{1}{2}(e^{tc} + e^{-tc}).

\]

Taking expectations on both sides and using \({\bf E}[Y] = 0\), so that the linear term vanishes, we have

\[

{\bf E}[e^{tY}] \leq \frac{1}{2}(e^{tc} + e^{-tc}) \leq e^{t^2c^2/2}.

\]

To see that \(\frac{1}{2}(e^x + e^{-x}) \leq e^{x^2/2}\), use the Taylor series \(e^{x} = \sum^{\infty}_{n=0} \frac{x^n}{n!}\) together with the bound \((2n)! \geq 2^n n!\):

\[

\frac{1}{2}(e^x + e^{-x}) = \sum_{n=0}^{\infty} \frac{x^{2n}}{(2n)!}

\leq \sum_{n=0}^{\infty} \frac{x^{2n}}{2^n n!} = e^{x^2/2}.

\]
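The lemma can be sanity-checked numerically. The two-point variable \(Y = \pm c\) with equal probability attains the chord bound with equality, so it is the extremal case; the values of `t` and `c` below are arbitrary illustrative choices:

```python
import math

t, c = 0.7, 1.5  # arbitrary illustrative parameters

# For Y = +/-c with equal probability, E[e^{tY}] = (e^{tc} + e^{-tc}) / 2.
mgf = 0.5 * (math.exp(t * c) + math.exp(-t * c))
bound = math.exp(t**2 * c**2 / 2)

print(f"E[e^(tY)] = {mgf:.4f}, bound e^(t^2 c^2 / 2) = {bound:.4f}")
assert mgf <= bound  # the lemma's bound holds
```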

Now we are ready to prove Azuma’s inequality. First, since \({\bf E}[Y_i \mid X_{i-1},\ldots, X_0] = 0\) and \(|Y_i| \leq c_i\), applying the lemma conditionally gives

\[

{\bf E}[e^{tY_i} \mid X_{i-1},\ldots, X_0] \leq e^{t^2c_i^2/2}.

\]

This bound is a constant, independent of the conditioning. By induction on \(i\), using the tower property of conditional expectation,

\[

{\bf E}[e^{tX_i}] = {\bf E}[e^{t(Y_i + X_{i-1})}] =

{\bf E}\left[e^{t X_{i-1}} \cdot {\bf E}[e^{t Y_i}\mid X_{i-1},\ldots, X_0]\right] \\

\leq {\bf E}[e^{t X_{i-1}} ] \cdot e^{t^2c_i^2/2} \leq

e^{t^2 \sum_{j=1}^{i}c_j^2/2}\, {\bf E}[e^{tX_0}].

\]

Therefore, by Markov’s inequality, we have

\[

{\bf Pr} [ X_n - X_0 \geq \lambda ]

= {\bf Pr} [ e^{t(X_n - X_0)} \geq e^{\lambda t} ] \\

\leq e^{-\lambda t}\, {\bf E}[e^{t(X_n - X_0)}]

\leq e^{-\lambda t + t^2 \sum_{j=1}^{n}c_j^2/2},

\]

where the last step applies the bound above to the martingale \(X_i - X_0\), which starts at \(0\) and has the same differences \(Y_i\).

The above holds for any \(t > 0\). Taking \(t = \lambda / \sum_{j=1}^{n}c_j^2\), which minimizes the exponent, yields the first bound; the second bound follows by applying the same argument to the martingale \(-X_0, -X_1, \ldots, -X_n\).
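Substituting \(t = \lambda / \sum_{j=1}^{n}c_j^2\) into the exponent makes the final bound explicit:

\[

-\lambda t + \frac{t^2}{2}\sum_{j=1}^{n}c_j^2 = -\frac{\lambda^2}{\sum_{j=1}^{n}c_j^2} + \frac{\lambda^2}{2\sum_{j=1}^{n}c_j^2} = -\frac{\lambda^2}{2\sum_{j=1}^{n}c_j^2},

\]

which is exactly the exponent in the statement of the theorem.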

### Application: Coin Tossing

Let \(Z_1,\ldots, Z_n\) be a sequence of independent random variables, each taking the values \(-1\) and \(1\) with equal probability (fair coin tosses). Let \(X_0 = 0\) and \(X_i = \sum_{j=1}^i Z_j\) for \(i = 1, \ldots, n\). Then \(X_0,\ldots, X_n\) is a martingale, since \({\bf E}[X_i \mid X_{i-1}, \ldots, X_0] = X_{i-1} + {\bf E}[Z_i] = X_{i-1}\).

By Azuma’s inequality with \(c_i = 1\) (here \(|Y_i| = |Z_i| = 1\)), for any \(\lambda > 0\),

\[

{\bf Pr}[X_n \geq \lambda] \leq \exp\left(-\frac{\lambda^2}{2n}\right).

\]

Note that, in this case, \(X_n\) is a sum of independent random variables, so a similar bound can also be derived from the Chernoff bound; Azuma’s inequality, however, applies even without independence.
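As a sanity check, a short simulation can compare the empirical tail probability against the Azuma bound (a minimal sketch; the values of `n`, `lam`, and `trials` are arbitrary illustrative choices):

```python
import math
import random

def empirical_vs_azuma(n=100, lam=20, trials=20000, seed=0):
    """Estimate Pr[X_n >= lam] for a +/-1 random walk and compare
    with the Azuma bound exp(-lam^2 / (2n))."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # X_n = Z_1 + ... + Z_n with Z_i uniform on {-1, +1}
        x = sum(rng.choice((-1, 1)) for _ in range(n))
        if x >= lam:
            hits += 1
    empirical = hits / trials
    bound = math.exp(-lam**2 / (2 * n))
    return empirical, bound

emp, bound = empirical_vs_azuma()
print(f"empirical = {emp:.4f}, Azuma bound = {bound:.4f}")
assert emp <= bound  # the bound should hold (here it is quite loose)
```

With these parameters the Azuma bound is \(e^{-2} \approx 0.135\), while the empirical frequency is considerably smaller, illustrating that for independent sums the bound is valid but not tight.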