### Fourier Transform

Consider a function \(f\colon \{0,1\}^n \to \{-1, 1\}\) on \(n\) boolean variables. This can be viewed as a transform of a boolean function \(g\colon \{0,1\}^n \to \{0, 1\}\) by letting \(f(x) = (-1)^{g(x)}\). For \(\alpha \in\{0,1\}^n\), define the characteristic function

\[

\chi_\alpha(x) = (-1)^{\alpha x} = (-1)^{\sum_i \alpha_i x_i}.

\]

Here \(\alpha\) can also be viewed as a subset of \([n]\), that is \(\{i\colon \alpha_i = 1\}\).

The Fourier transform of function \(f\) at \(\alpha\) is

\[

\widehat{f}(\alpha) = \frac{1}{2^n} \sum_{x\in \{0,1\}^n} f(x) \chi_\alpha(x).

\]

That is, \(f\) can be uniquely decomposed as a linear combination of the characteristic functions

\[

f(x) = \sum_{\alpha\in \{0,1\}^n} \widehat{f}(\alpha) \chi_\alpha (x).

\]

Note that the Fourier transform \(\widehat{f}\) is a function from \(\{0,1\}^n \to \mathbb{R}\).

Define the *degree* of \(\widehat{f}\) to be the size of the largest \(\alpha\) such that \(\widehat{f}(\alpha) \neq 0\). If a function is computed by a decision tree of depth \(d\), then the degree of its Fourier transform is at most \(d\).

### Restriction

A restriction of a function is to fix a subset of variables by a specific assignment. This can be done in two steps: picking up a subset of variables, and fixing the values of these variables.

Consider function \(f\colon \{0,1\}^n \to \{-1, 1\}\). Divide the input variables into two disjoint subsets and write \(f(x, y)\) where \(x\) has dimension \(m\) and \(y\) has dimension \(n-m\). Then the Fourier coefficients can be written as

\[

\widehat{f}(\alpha, \beta) = \frac{1}{2^n} \sum_{(x, y)\in \{0,1\}^n} f(x,y) \chi_\alpha(x)\chi_\beta(y).

\]

For a fixed \(y\in\{0,1\}^{n-m}\), let \(f_y \equiv f(x,y)\), which is a boolean function on \(m \) inputs. Its Fourier transform is

\[

\widehat{f_y} (\alpha) = \frac{1}{2^m} \sum_{x} f_y (x) \chi_\alpha(x).

\]

Then we have the following relationship between \(\widehat{f}\) and \(\widehat{f_y}\):

\[

\widehat{f}(\alpha, \beta) = \frac{1}{2^{n-m}} \sum_y \widehat{f_y}(\alpha) \chi_\beta(y),

\]

and for each fixed \(\alpha\)

\[

\sum_\beta \widehat{f}(\alpha, \beta)^2 = \frac{1}{2^{n-m}} \sum_y \widehat{f_y}(\alpha)^2.

\]

Note here we applied the fact that \(\chi_\alpha (x) \chi_\beta (x) = \chi_{\alpha\oplus\beta} (x) \) and \(\sum_x \chi_\alpha (x) = 0\) for \(\alpha\neq 0\).

### Random Restriction on DNFs

A random restriction on a function is done in two steps: first randomly pick up a fraction \(p\) of input variables, and then randomly assign 0 or 1 to the remaining variables. Hastad’s Switching Lemma says that, under a random restriction, a \(k\)-DNF formula is simplified dramatically with high probability.

Let \(f\) be a \(k\)-DNF, and \(\rho\) be a random restriction with parameter \(p\). Denote by \(f|_\rho\) the randomly restricted function. Then the probability that the decision tree depth for \(f|_\rho\) is bigger than \(d\) is bounded by

\[

(5pk)^d.

\]

Note that this bound does not depend on the number of clauses in the formula. This implies that with probability \(1- (5pk)^d\), the degree of the Fourier transform \(\widehat{f|_\rho}\) is at most \(d\). Based on this, next we show that the large Fourier coefficients of the orignial \(k\)-DNF \(f\) is bounded.

By the above, we have

\[

{\bf Pr}_{\rho} \left[ deg( \widehat{f|_\rho}) > d \right] \leq (5pk)^d.

\]

This implies

\[

{\bf E}_{\rho} \left[ \sum_{\alpha \colon |\alpha| >d } \widehat{f|_\rho}(\alpha)^2 \right] \leq (5pk)^d.

\]

We consider the restriction applied in two steps: picking up a random subset \(s\) of size \(pn\) and randomly assign value \(y\) to the remaining variables. Then

\[

{\bf E}_{s} {\bf E}_{y} \left[ \sum_{\alpha \colon |\alpha| >d, \alpha\subseteq s } \widehat{f|_\rho}(\alpha)^2 \right]

=

{\bf E}_{s} \left[ \sum_{\alpha \colon |\alpha| >d, \alpha\subseteq s } {\bf E}_{y}[ \widehat{f|_\rho}(\alpha)^2 ] \right]

\]

By the fact that

\[

{\bf E}_{y}[ \widehat{f|_\rho}(\alpha)^2 ] = \sum_\beta \widehat{f}(\alpha,\beta)^2

\]

we have the above equals

\[

{\bf E}_{s} \left[ \sum_{\alpha \colon |\alpha| >d, \alpha\subseteq s } \sum_\beta \widehat{f}(\alpha,\beta)^2 \right]

= \sum_u \widehat{f}(u)^2 {\bf Pr}_{s} [|s\cap u| > d]

\]

Considering the cases that \(|u| \geq 2d/p\). By Chernoff bound, we have

\[

{\bf Pr}_{s} [|s\cap u| > d] \geq 1 – \exp(-d/4) \geq 1/2.

\]

Finally we have

\[

\sum_{u\colon |u| \geq 2d/p } \widehat{f}(u)^2 \leq 2 \cdot (5pk)^d

\]

For \(p = 1/10k\),

\[

\sum_{u\colon |u| \geq 20kd } \widehat{f}(u)^2 \leq 2 \cdot 2^{-d}

\]