Randomized Computation
Randomness is used in computation to design algorithms which are simpler, more efficient, or parallelizable. A randomized algorithm produces outputs based on not only the inputs but also random choices.
A probabilistic Turing machine (PTM) is a deterministic Turing machine with an additional read-only tape which contains random bits; the transition at each step is determined by both the input and the random bits. A PTM can also be viewed as a NTM which chooses transitions at each step randomly. For the same input, a PTM may have different outputs for different runs due to the randomness. Randomness is inherent in the machine, and independent of the input.
Definition.
A language \(L\) is in BPP if there exists a polynomial-time Turing machine \(M\) such that for any input \(x\),
\[
{\bf Pr}_r[M(x, r) = L(x)] \geq {2/3},
\]
where \(r\) is a random string from \(\{0,1\}^{p(|x|)}\) for some polynomial \(p\).
The complement of a language in BPP is also in BPP. The class BPP captures languages that can be decided efficiently but with two-sided errors.
The class BPP has a strong “excluded middle” property: any input is accepted either with a high probability 2/3, or with a low probability 1/3 (rejected with a high probability). Since this criteria works on any input, the class BPP captures complexity on worst-case inputs.
It is obvious that P ⊆ BPP ⊆ EXP. Currently it is still unknown whether BPP is properly contained in NEXP. Another open question is whether BPP = P.
A language \(L\) is in RP if there exists a polynomial-time Turing machine \(M\) such that, for any input \(x\),
- if \(x\in L\), then \({\bf Pr}_r[M(x, r) = 1] \geq 1/{2}\),
- if \(x\notin L\), then \({\bf Pr}_r[M(x, r) = 1] =0\)
where \(r\) is a random string in \(\{0,1\}^{p(|x|)}\) for some polynomial \(p\).
Define coRP to be the class of languages of which the complement is in RP. Languages in RP and coRP can be decided efficiently but with one-sided error. The decision for a language in RP could be false negative, but not false positive. Similar as BPP, in the definitions of RP and coRP, the acceptance probability has a gap (excluded middle).
It is easy to see that RP ⊆ BPP, and P ⊆ RP ⊆ NP
A language is in ZPP if there is a polynomial-time Turing machine \(M\) with three outputs \(\{0,1,?\}\) such that \(\Pr_r[M(x,r) = ?] \leq \frac{1}{3}\) and \(M(x,r) = 1\) implies \(x\in L\) and \(M(x,r) = 0\) implies \(x\notin L\).
An equivalent definition is that, a language \(L\) is in ZPP if there is a PTM \(M\) runs in expected polynomial time, and whenever \(M\) halts on input \(x\), we have \(M(x) = L(x)\).
It holds that ZPP = RP ∩ coRP. Note that \(M\) never makes an error, but sometimes may run for a long time.
A language in BPP has a Monte Carlo algorithm, which makes mistakes with low probability. A language in ZPP has a Las Vegas algorithm, which never makes any mistake, but may run for a long time. A Las Vegas algorithm could be a combination of two Monte Carlo algorithms: one gives no false positives, and the other gives no false negatives.
The error probabilities in the definitions of RP and BPP are somehow arbitrary. With error reduction, the error probabilities can be made exponentially small but the complexity classes remain the same. The error probability in RP can be reduced by a simple procedure: repeat the algorithm for \(n\) times, and accept iff there is one acceptance. The fraction \(\frac{1}{2}\) in the definition of RP can be replaced by any number \(\epsilon >0\). Similarly for BPP, the error can be reduced by repeating the algorithm for \(n\) times and accepting iff there are more than half acceptances. The fractions \(\frac{2}{3}\) and \(\frac{1}{3}\) are can be replaced by \(\frac{1}{2} +\epsilon\) and \(\frac{1}{2} -\epsilon\) for any small \(\epsilon > 0\). Notice that \(\epsilon\) could even be an inverse polynomial.
Theorem. (Error-reduction)
Let \(L\) be a language and there is a polynomial-time PTM \(M\) such that
for some polynomial \(q\) and for any \(x\),
\[
\Pr[M(x) = L(x)] \geq \frac{1}{2} + \frac{1}{q(|x|)}.
\]
Then for any polynomial \(p\), there is a polynomial-time PTM \(M’\) such that
for any \(x\),
\[
\Pr[M'(x) = L(x)] \geq 1 – \frac{1}{2^{p(|x|)}}.
\]
Proof.
The machine \(M’\) will simply repeat \(M\), and the error probability will be reduced by the Chernoff bound.
Fix an input \(x\) of length \(n\). We repeat \(M\) on \(x\) for \(t\) times and output the majority. For each repetition, let \(X_i\) be an indicator variable which is 1 iff \(M\) outputs the correct answer \(L(x)\).
Then we have \(X_1,\ldots, X_t\) are independent Boolean variables with \({\bf E}[X_i] = {\bf Pr}[X_i = 1] \geq \frac{1}{2} + \frac{1}{q(n)}\).
Let \(X = \frac{1}{t} \sum_{i=1}^t X_i\), then \(E[X] \geq \frac{1}{2} + \frac{1}{q(n)}\). The probability that we output the wrong answer is
\[
{\bf Pr} [X < \frac{1}{2}] \leq {\bf Pr} [|X- \mu| \geq \epsilon] \leq 2 e^{-t/4q^2} < \frac{1}{2^p}
\]
for [latex]t(n) = 2 p(n) q^2(n)[/latex].
Theorem (Adleman)
BPP ⊆ P/poly.
Proof.
For an arbitrary \(L\) in BPP, there exists \(M\) such that for each \(x\in\{0,1\}^n\), \(Pr_r[M(x,r) \neq L(x)]\leq \frac{1}{2^{n+1}}\) with random strings \(r\in\{0,1\}^m\).
For a fixed \(x\), consider the set of random strings \(r\) such that \(M(x,r) \neq L(x)\). Its cardinality is at most \(\frac{2^m}{2^{n+1}}\). Then the number of \(r\)‘s such that \(M(x,r) \neq L(x)\) for some \(x\) is at most \(\frac{2^m}{2^{n+1}} \times 2^n = 2^{m-1} <2^m[/latex]. Therefore, there exist [latex]r_0[/latex] such that for all [latex]x[/latex], [latex]M(x,r_0) = L(x)[/latex]. The string [latex]r_0[/latex] can be coded into a polynomial-size circuit [latex]C[/latex] such that [latex]C(x)=L(x)[/latex].
Theorem (Sipser-Gacs)
BPP ⊆ [latex]\Sigma_2^{P} \cap \Pi_2^{P}\).
Proof.
Suppose \(L\) in BPP is decidable by TM \(M\) such that (1) the error probability is at most \(\frac{1}{2^n}\), where \(n\) is the input size, and (2) \(M\) uses \(m\) random bits. For a fixed \(x\), consider the set of strings \(S=\{r \mid M(x,r)= 1\}\). Then either \(|S|\geq 2^m(1-\frac{1}{2^n})\), or \(|S| \leq 2^m \frac{1}{2^n}\).
There are two claims here. Let \(k=\lceil \frac{m}{n} \rceil +1\).
- For any \(S\subseteq\{0,1\}^m\) with \(|S| \leq 2^m \frac{1}{2^n}\), for every \(u_1, \ldots,u_k\), \(\cup_{i=1}^{k}(S\oplus u_i) \neq \{0,1\}^m\). The size of the union is strictly smaller than \(2^m\).
- For any \(S\subseteq\{0,1\}^m\) with \(|S|\geq 2^m(1-\frac{1}{2^n})\), there exists \(u_1, \ldots,u_k\), \(\cup_{i=1}^{k}(S\oplus u_i) = \{0,1\}^m\). Randomly choose \(u_1, \ldots,u_k\) from \(\{0,1\}^m\). For any \(r\in \{0,1\}^m\), \(Pr[r \notin S\oplus u_i] = Pr[r\oplus u_i \notin S] = Pr[u_i \notin S] = 1 -\frac{|S|}{2^m} \leq \frac{1}{2^n}\). Then \(Pr[r \notin \cup_{i=1}^{k}(S\oplus u_i)] \leq \frac{1}{2^{nk}} < \frac{1}{2^m}[/latex]. With union bound, [latex]Pr[\exists r_{\in \{0,1\}^m} \notin \cup_{i=1}^{k}(S\oplus u_i)] < 1[/latex]. Then there must be a sequence [latex]u_1, \ldots, u_k[/latex] such that [latex]\cup = \{0,1\}^m[/latex]
Therefore, [latex]x\in L\) iff \(\exists u_1, \ldots, u_k \forall r \bigvee_{i=1}^k M(x,r\oplus u_i)=1\). Thus \(L\in \Sigma_2^{P}\). Since BPP is closed under complement, BPP ⊆ \(\Sigma_2^{P} \cap \Pi_2^{P}\).