Pseudorandom Generators and Derandomization
Definition of Pseudorandom Generators
Two distributions \(X\) and \(Y\) over \(\{0,1\}^n\) are \((s, \epsilon)\)-indistinguishable if, for any circuit \(C\) of size at most \(s\),
\[
\left| \Pr_X[C(X) = 1] – \Pr_Y[C(Y) = 1] \right| \leq \epsilon.
\]
A pseudorandom generator is a way of generating bits that look random, which means indistinguishable from truly random bits.
A family of functions \(G_n\colon \{0,1\}^{l(n)} \to \{0,1\}^n\) with \(l(n) < n[/latex] is an [latex](s,\epsilon)\)-pseudorandom generator if, for every \(n\), the distribution \(G_n(U_{l(n)})\) is \((s,\epsilon)\)-indistinguishable from \(U_n\). Here \(U_n\) denotes the uniform distribution over \(n\) bits strings.
We assume \(\{G_n\}\) is computable by a deterministic Turing machine taking \(l\) bits of seeds padded with \(1^n\) as inputs and producing \(n\) bits outputs. The running time is a function of the input size \(l+n\). Here we allow generators running in time \(2^{O(l)}\); this is enough for the purpose of derandomization.
The following is a summary of the parameters defining a pseudorandom generator: for output length \(n\),
- seed length \(l(n)\),
- distinguisher circuit size \(s(n)\),
- error bound \(\epsilon(n)\),
- running time of the generator \(t(n)\).
Derandomization using Pseudorandom Generators
Lemma. Suppose there exists a \((\hbox{poly}, 1/10)\)-pseudorandom generator with seed length \(l(n) = O(\log n)\). Then \({\bf BPP} = {\bf P}\).
Proof.
The idea is to replace random strings in a probabilistic algorithm with pseudorandomly generated strings.
Let \(L\) be a language in \({\bf BPP}\). Then there exists a polynomial-time deterministic algorithm \(A\) such that \({\bf Pr}_r [A(x, r) = L(x)] \geq 2/3\), where \(r\) is a random string of length \(m\leq \hbox{poly}(n)\). We next construct a deterministic algorithm deciding \(L\).
On input \(x\in \{0,1\}^n\), our algorithm does the following:
- try all seeds \(z \in \{0,1\}^{l(m)}\), and use the generator to produce strings of length \(m\);
- for each generated string \(G(z)\), run algorithm \(A\) on inputs \((x, G(z))\);
- output the majority of \(A(x, G(z))\) for all \(z \in \{0,1\}^{l(m)}\).
We claim that \(A(x, G(z)) = L(x)\) holds with probability at least \(2/3 – 1/10\), and the majority output is correct.
Fix an input \(x\) of length \(n\). We view algorithm \(A(x,\cdot)\) with a fixed \(x\) as an algorithm over strings of length \(m\leq \hbox{poly}(n)\). Since \(L\) is in \({\bf BPP}\), we have
\[
\Pr_{r\sim U_m} [A(x, r) = L(x)] \geq \frac{2}{3}.
\]
Since \(A(x, \cdot)\) is in polynomial time, and can be computed by polynomial-size circuits, by the property of pseudorandom generators, we have
\[
\left| \Pr_{z\sim U_{l(m)}}[A(x, G(z)) = 1] – \Pr_{r\sim U_m}[A(x, r) = 1] \right| \leq \frac{1}{10}.
\]
If \(x\in L\), then \(\Pr_{z\sim U_{l(m)}}[A(x, G(z)) = 1] \geq 2/3 – 1/10 > 1/2 \), and the majority of \(A(x, G(z))\) is 1. If \(x\notin L\), the argument is similar.
This algorithm runs in time \(2^{O(l(n))} \cdot \hbox{poly}(n) = \hbox{poly}(n)\) and thus in \({\bf P}\). This ends the proof.
The following is a generalization of this lemma with parametrized distinguisher circuit size and seed length. The proof is similar.
Lemma. Suppose there is a \((s(n), 1/10)\)-pseudorandom generator with seed length \(l(n)\). Then there is a constant \(c\) such that
\[
{\bf BPTIME}(s(n)) \subseteq {\bf DTIME}(2^{O( l(s(n))}).
\]
The following are some corollaries from this lemma. Let \(s(n) = \hbox{poly}(n)\) and \({\bf BPTIME}(s(n)) = {\bf BPP}\).
- If \(l(n) = n^{o(1)}\), then \({\bf BPP}\subseteq {\bf SUBEXP} ={\bf DTIME}(2^{n^{o(1)}})\).
- If \(l(n) = \hbox{polylog}(n)\), then \({\bf BPP}\subseteq {\bf QuasiP} = {\bf DTIME}(2^{\mathop{\mathrm{polylog}}(n)})\).
- If \(l(n) = O(\log n)\), then \({\bf BPP} = {\bf P}\).