Combinatorial Design and Pseudorandom Generator

For a set of size \(l\), choose \(m\) subsets each of size \(n\) such that each pair of subsets has intersection size at most \(d\). This gives a combinatorial design with parameters \((m,l,n,d)\). That is, we have \(I_1,\ldots, I_m \subseteq [l]\) such that \(|I_i| = n\) for all \(i\) and \(|I_i\cap I_j| \leq d\) for all distinct \(i,j\).

Given an \((m,l,n,d)\) combinatorial design and a function \(f\colon \{0,1\}^n \to \{0,1\}\), define a mapping \(G\colon \{0,1\}^l \to \{0,1\}^m\) such that \(G(x_1,\ldots, x_l) = (y_1,\ldots, y_m)\) where \(y_i = f(x_{I_i})\). The mapping \(G\) stretches \(l\) bits into \(m\) bits; it can be viewed as a pseudorandom generator.

Existence and Construction of Combinatorial Design

We first use the probabilistic method to show that, for a good parameter setting, there exists a combinatorial design satisfying the “small intersection property”.

We independently generate \(m\) random subsets. When generating each subset, we independently choose each element in \([l]\) with probability \(2n/l\).

For each random subset \(I\), by Chernoff bounds, its size is at least \(n\) with high probability.
{\bf E}[|I|] = \frac{2n}{l} \cdot l = 2n,
{\bf Pr}[|I| < n ] < (\frac{2}{e})^n. \] For each pair of random subsets [latex]I, J[/latex], by Chernoff bounds, its intersection is at most [latex]d = 8n^2/l[/latex] with high probability. \[ {\bf E}[|I\cap J|] = (\frac{2n}{l})^2 \cdot l = \frac{4n^2}{l}, \] \[ {\bf Pr}[|I| > d = \frac{8n^2}{l} ] < (\frac{e}{4})^{4n^2/l} =(\frac{e}{4})^{d/2}. \] Let the number of subsets [latex]m = 2^{d/8}[/latex]. Then by the union bound (on the [latex]m[/latex] subsets and about [latex]m \choose 2[/latex] pairs), the probability of the following is less than 1:

  • there exists a subset of size less than \(n\), or
  • there exists a pair of subsets of intersection size less than \(n\).

Removing elements from the subsets will not violate the small intersection property.

The above shows there exists a combinatorial design (with a good choice of parameters). Instead of using the random strategy, we can use greedy search to generate subsets one by one, such that each subset has small intersections with the previously generated subsets. This greedy search runs in time \(2^{O(l)}\).

From Hardness to Randomness

With an \((m,l,n,d)\) combinatorial design and a function \(f\colon \{0,1\}^n \to \{0,1\}\), define a mapping \(G\colon \{0,1\}^l \to \{0,1\}^m\) such that \(G(x_1,\ldots, x_l) = (y_1,\ldots, y_m)\) where \(y_i = f(x_{I_i})\). This mapping \(G\) stretches \(l\) bits into \(m\) bits.

If we choose \(f\) to be a “hard-to-compute” function, then \(G\) is a good pseudorandom generator.

For a circuit size \(S\) and \(\delta \in [0,1]\), say a function \(f\colon \{0,1\}^n \to \{0,1\}\) is \((S,\delta)\)-hard if, for any circuit of size at most \(S\),
{\bf Pr}[ C(x) = f(x) ] \leq 1-\delta,
where \(x\) is chosen uniformly from \(\{0,1\}^n\). This defines the average-case hardness, which means a hard function is hard to be approximated by any circuit with the size bound. Note that \(f\) can always be approximated with \(\delta \leq 1/2\) by a constant circuit either outputting 0 or outputting 1.

A mapping \(G\colon \{0,1\}^l \to \{0,1\}^m\) is an \((S, \epsilon)\)-pseudorandom generator if, for any circuit of size at most \(S\),
{\bf Pr}[ C(x) = 1] – {\bf Pr}[ C(G(y)) = 1 ] \leq \epsilon,
where \(x\) is uniform in \(\{0,1\}^m\) and \(y\) is uniform in \(\{0,1\}^l\). Note that, here \(S\) and \(\epsilon\) are usually parameters of \(m\).

Theorem. (Nisan-Widgerson Generator)
Given an \((m,l,n,d)\) combinatorial design and a function \(f\colon \{0,1\}^n \to \{0,1\}\), let \(G\colon \{0,1\}^l \to \{0,1\}^m\) be such that \(G(x_1,\ldots, x_l) = (y_1,\ldots, y_m)\) where \(y_i = f(x_{I_i})\). If \(f\) is \((S, \frac{1}{2}-\frac{\epsilon}{m})\)-hard, then \(G\) is a \((S’, \epsilon)\)-pseudorandom generator for \(S’ = S – m 2^d\).

The proof of this theorem needs the following lemma.
Lemma. (Pseudorandomness = Unpredictability)
For \(G\colon \{0,1\}^l \to \{0,1\}^m\), define a distribution \(Y = G(X)\) where \(X\) is uniform in \(\{0,1\}^l\). If, for any circuit \(C\) of size at most \(S\) and any \(i\in [m]\)
{\bf Pr}[ C (y_1,\ldots, y_{i-1}) = y_i ] \leq \frac{1}{2} + \frac{\epsilon}{m},
then \(G\) is a \((S, \epsilon)\)-pseudorandom generator.

The proof is by contradiction. Suppose that \(G\) is not a \((S’, \epsilon)\)-pseudorandom generator. Then by the lemma above, we get a predict circuit. That is, there is a circuit \(C\) of size \(S’\) such that for some \(i\in [m]\),
{\bf Pr}[ C (y_1,\ldots, y_{i-1}) = y_i ] \geq \frac{1}{2} + \frac{\epsilon}{m},
where each \(y_j = f(x_{I_j})\).

Now we use \(C\) to construct a circuit that approximates the function \(f\). The above probability distribution is uniform over \(\{0,1\}^l\). By the average argument, we can fix a partial assignment \(a\) for the bits indexed by \([l]\setminus I_{i}\) and uniformly choose \(x_{I_i}\) such that
{\bf Pr}[ C (y_1,\ldots, y_{i-1}) = y_i ] \geq \frac{1}{2} + \frac{\epsilon}{m},
where each \(y_j = f(a_{I_j\setminus I_i}, x_{I_i\setminus I_j})\). Here \(y_i = f(x_{I_i})\) still depends on all \(x_{I_i}\), but each \(y_j\) depends on at most \(d\) bits from \(x_{I_i}\) by the definition of the combinatorial design.

Each restriction of \(f\) by the partial assignment \(a\) depends on at most \(d\) inputs; then it can be computed with a circuit of size \(2^d\). Then by composing the circuit \(C\) of size \(S’\) with at most \(m\) circuits each of size \(2^d\), we get a circuit \(D\) of size \(S = S’ + m \cdot 2^d\) which approximates \(f\) well, that is,
{\bf Pr}[ D (x_{I_i} = f(x_{I_i}) ] \geq \frac{1}{2} + \frac{\epsilon}{m}.
This contradicts the hardness assumption of \(f\).



Tagged with: , ,
Posted in Theory