Pseudorandom Generators and Derandomization

Definition of Pseudorandom Generators

Two distributions \(X\) and \(Y\) over \(\{0,1\}^n\) are \((s, \epsilon)\)-indistinguishable if, for any circuit \(C\) of size at most \(s\),
\left| \Pr_X[C(X) = 1] – \Pr_Y[C(Y) = 1] \right| \leq \epsilon.

A pseudorandom generator is a way of generating bits that look random, which means indistinguishable from truly random bits.

A family of functions \(G_n\colon \{0,1\}^{l(n)} \to \{0,1\}^n\) with \(l(n) < n[/latex] is an [latex](s,\epsilon)\)-pseudorandom generator if, for every \(n\), the distribution \(G_n(U_{l(n)})\) is \((s,\epsilon)\)-indistinguishable from \(U_n\). Here \(U_n\) denotes the uniform distribution over \(n\) bits strings.

We assume \(\{G_n\}\) is computable by a deterministic Turing machine taking \(l\) bits of seeds padded with \(1^n\) as inputs and producing \(n\) bits outputs. The running time is a function of the input size \(l+n\). Here we allow generators running in time \(2^{O(l)}\); this is enough for the purpose of derandomization.

The following is a summary of the parameters defining a pseudorandom generator: for output length \(n\),

  • seed length \(l(n)\),
  • distinguisher circuit size \(s(n)\),
  • error bound \(\epsilon(n)\),
  • running time of the generator \(t(n)\).

Derandomization using Pseudorandom Generators

Lemma. Suppose there exists a \((\hbox{poly}, 1/10)\)-pseudorandom generator with seed length \(l(n) = O(\log n)\). Then \({\bf BPP} = {\bf P}\).

The idea is to replace random strings in a probabilistic algorithm with pseudorandomly generated strings.

Let \(L\) be a language in \({\bf BPP}\). Then there exists a polynomial-time deterministic algorithm \(A\) such that \({\bf Pr}_r [A(x, r) = L(x)] \geq 2/3\), where \(r\) is a random string of length \(m\leq \hbox{poly}(n)\). We next construct a deterministic algorithm deciding \(L\).

On input \(x\in \{0,1\}^n\), our algorithm does the following:

  • try all seeds \(z \in \{0,1\}^{l(m)}\), and use the generator to produce strings of length \(m\);
  • for each generated string \(G(z)\), run algorithm \(A\) on inputs \((x, G(z))\);
  • output the majority of \(A(x, G(z))\) for all \(z \in \{0,1\}^{l(m)}\).

We claim that \(A(x, G(z)) = L(x)\) holds with probability at least \(2/3 – 1/10\), and the majority output is correct.

Fix an input \(x\) of length \(n\). We view algorithm \(A(x,\cdot)\) with a fixed \(x\) as an algorithm over strings of length \(m\leq \hbox{poly}(n)\). Since \(L\) is in \({\bf BPP}\), we have
\Pr_{r\sim U_m} [A(x, r) = L(x)] \geq \frac{2}{3}.
Since \(A(x, \cdot)\) is in polynomial time, and can be computed by polynomial-size circuits, by the property of pseudorandom generators, we have
\left| \Pr_{z\sim U_{l(m)}}[A(x, G(z)) = 1] – \Pr_{r\sim U_m}[A(x, r) = 1] \right| \leq \frac{1}{10}.

If \(x\in L\), then \(\Pr_{z\sim U_{l(m)}}[A(x, G(z)) = 1] \geq 2/3 – 1/10 > 1/2 \), and the majority of \(A(x, G(z))\) is 1. If \(x\notin L\), the argument is similar.

This algorithm runs in time \(2^{O(l(n))} \cdot \hbox{poly}(n) = \hbox{poly}(n)\) and thus in \({\bf P}\). This ends the proof.

The following is a generalization of this lemma with parametrized distinguisher circuit size and seed length. The proof is similar.

Lemma. Suppose there is a \((s(n), 1/10)\)-pseudorandom generator with seed length \(l(n)\). Then there is a constant \(c\) such that
{\bf BPTIME}(s(n)) \subseteq {\bf DTIME}(2^{O( l(s(n))}).

The following are some corollaries from this lemma. Let \(s(n) = \hbox{poly}(n)\) and \({\bf BPTIME}(s(n)) = {\bf BPP}\).

  • If \(l(n) = n^{o(1)}\), then \({\bf BPP}\subseteq {\bf SUBEXP} ={\bf DTIME}(2^{n^{o(1)}})\).
  • If \(l(n) = \hbox{polylog}(n)\), then \({\bf BPP}\subseteq {\bf QuasiP} = {\bf DTIME}(2^{\mathop{\mathrm{polylog}}(n)})\).
  • If \(l(n) = O(\log n)\), then \({\bf BPP} = {\bf P}\).