DNF Counting and the Monte Carlo Method

A DNF formula is a disjunction (OR) of clauses; a clause is a conjunction (AND) of literals; a literal is a variable or its negation. For example,
\[
(x_1 \wedge x_2 \wedge \overline{x_3}) \vee (x_1 \wedge \overline{x_4}) \vee (x_3 \wedge x_4 \wedge x_5 \wedge \overline{x_6}).
\]

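The DNF counting problem asks for the number of satisfying assignments of a DNF formula. This problem is #P-complete. It is equivalent to the CNF counting problem: the negation of a DNF formula on n variables is a CNF formula, and the two counts sum to \(2^n\). Note that DNF satisfiability is easy (a formula is satisfiable if and only if some clause does not contain both a variable and its negation), while CNF satisfiability is NP-complete.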
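As a reference point, the count can be computed by brute force for small n. The sketch below is illustrative only: the encoding of a clause as a dictionary from variable index to required truth value, and the names `EXAMPLE`, `satisfies_clause` and `count_satisfying`, are assumptions made here and not part of the original text. It enumerates all \(2^n\) assignments, so it is only usable for small formulas.

```python
from itertools import product

# Assumed encoding: a clause maps a variable index (1-based) to the truth
# value its literal requires; a DNF formula is a list of such clauses.
EXAMPLE = [
    {1: True, 2: True, 3: False},           # x1 AND x2 AND NOT x3
    {1: True, 4: False},                    # x1 AND NOT x4
    {3: True, 4: True, 5: True, 6: False},  # x3 AND x4 AND x5 AND NOT x6
]


def satisfies_clause(assignment, clause):
    """Return True if the assignment (a sequence of n booleans) satisfies the clause."""
    return all(assignment[var - 1] == value for var, value in clause.items())


def count_satisfying(formula, n):
    """Count satisfying assignments by enumerating all 2^n assignments."""
    return sum(
        any(satisfies_clause(a, clause) for clause in formula)
        for a in product([False, True], repeat=n)
    )


print(count_satisfying(EXAMPLE, 6))  # 24
```

For the example formula above, the three clauses are satisfied by 8, 16 and 4 assignments respectively, and the first two share 4 assignments, which gives 8 + 16 + 4 - 4 = 24.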

Monte Carlo Algorithms

Let F be a DNF formula with n variables. Consider the following simple randomized algorithm. Generate N assignments independently and uniformly at random, and count how many of them satisfy F; denote this count by M. Then output the estimate
\[
\frac{M}{N} \cdot 2^n.
\]
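A minimal sketch of this simple estimator is given below. The clause encoding (a dictionary from 1-based variable index to required truth value) and the names `naive_estimate` and `EXAMPLE` are assumptions of this sketch, not something fixed by the text.

```python
import random

# Assumed encoding: each clause maps a 1-based variable index to the value
# its literal requires. This is the example formula from above.
EXAMPLE = [
    {1: True, 2: True, 3: False},
    {1: True, 4: False},
    {3: True, 4: True, 5: True, 6: False},
]


def satisfies_formula(assignment, formula):
    """Return True if the assignment satisfies at least one clause."""
    return any(
        all(assignment[var - 1] == value for var, value in clause.items())
        for clause in formula
    )


def naive_estimate(formula, n, num_samples, rng=None):
    """Estimate the number of satisfying assignments as (M / N) * 2^n."""
    rng = rng or random.Random()
    hits = 0
    for _ in range(num_samples):
        # Draw one assignment uniformly at random from all 2^n assignments.
        assignment = [rng.random() < 0.5 for _ in range(n)]
        if satisfies_formula(assignment, formula):
            hits += 1
    return hits / num_samples * 2 ** n


print(naive_estimate(EXAMPLE, 6, 20000, random.Random(0)))
```

On the example formula, 24 of the 64 assignments are satisfying, so the estimate should land close to 24 with a few thousand samples.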

If the total number of satisfying assignments is close to \(2^n\), then the above estimate is good. However, if there are very few satisfying assignments, a uniformly random assignment almost never satisfies F, and the number of samples N needs to be exponentially large in n to get a good estimate. Consider the following improved algorithm.
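Before turning to it, the sample requirement of the simple algorithm can be made concrete with a short variance calculation. This is a sketch: the symbols \(S\) (the set of satisfying assignments of F), \(p = |S| / 2^n\) and the target relative error \(\varepsilon\) are introduced here for illustration. Since each sample satisfies F independently with probability \(p\), the count M is binomial with parameters N and \(p\), so
\[
\mathbb{E}\left[\frac{M}{N} \cdot 2^n\right] = p \cdot 2^n = |S|,
\qquad
\frac{\sqrt{\mathrm{Var}(M)}}{\mathbb{E}[M]} = \sqrt{\frac{1-p}{Np}}.
\]
Keeping the relative standard deviation below \(\varepsilon\) therefore requires
\[
N \ge \frac{1-p}{\varepsilon^2 p} \approx \frac{2^n}{\varepsilon^2 |S|} \quad \text{when } p \text{ is small},
\]
which is exponential in n whenever \(|S|\) is much smaller than \(2^n\).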

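Let \(C_1, \dots, C_m\) be the clauses of F. For each clause \(C_i\), let \(S_i\) be the set of assignments satisfying it; if \(C_i\) contains \(k_i\) distinct variables and no contradictory literals, then \(|S_i| = 2^{n-k_i}\). Then we define a probability
\[
p_i = \frac{|S_i|}{ \sum_{j=1}^{m} |S_j|}.
\]

The algorithm works as follows. We repeat the sampling N times. Each time, pick a clause \(C_i\) with probability \(p_i\), and then pick an assignment uniformly at random from \(S_i\). Every such assignment satisfies F, so instead of checking F we check whether \(C_i\) is the first clause the assignment satisfies, that is, whether it satisfies none of \(C_1, \dots, C_{i-1}\). This way each satisfying assignment is counted for exactly one clause. Finally, count the number of samples passing this check; denote the count by M. Then we output the estimate
\[
\frac{M}{N} \cdot \sum_{j=1}^{m} |S_j|.
\]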
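The improved sampler can be sketched as follows. Several details are assumptions of this sketch rather than statements from the text: clauses are encoded as dictionaries from 1-based variable index to required truth value (so \(|S_i| = 2^{n - k_i}\) for a clause with \(k_i\) literals), the sums run over all clauses of F, and a sample is counted only when the chosen clause is the first clause the sampled assignment satisfies. That last test is the usual coverage check of the Karp-Luby estimator; without it every sample would count, since any assignment drawn from \(S_i\) already satisfies F.

```python
import random

# Assumed encoding: each clause maps a 1-based variable index to the value
# its literal requires. This is the example formula from above.
EXAMPLE = [
    {1: True, 2: True, 3: False},
    {1: True, 4: False},
    {3: True, 4: True, 5: True, 6: False},
]


def satisfies_clause(assignment, clause):
    return all(assignment[var - 1] == value for var, value in clause.items())


def coverage_estimate(formula, n, num_samples, rng=None):
    """Estimate the number of satisfying assignments as (M / N) * sum_j |S_j|."""
    rng = rng or random.Random()
    # |S_i| = 2^(n - k_i): the k_i variables of the clause are fixed, the rest are free.
    sizes = [2 ** (n - len(clause)) for clause in formula]
    total = sum(sizes)
    hits = 0
    for _ in range(num_samples):
        # Pick clause index i with probability |S_i| / total.
        i = rng.choices(range(len(formula)), weights=sizes)[0]
        # Pick a uniform element of S_i: random values everywhere,
        # then force the literals of the chosen clause.
        assignment = [rng.random() < 0.5 for _ in range(n)]
        for var, value in formula[i].items():
            assignment[var - 1] = value
        # Count the sample only if no earlier clause is satisfied, so each
        # satisfying assignment is credited to exactly one clause.
        if not any(satisfies_clause(assignment, formula[j]) for j in range(i)):
            hits += 1
    return hits / num_samples * total


print(coverage_estimate(EXAMPLE, 6, 20000, random.Random(0)))
```

On the example formula the sizes are 8, 16 and 4, so the total is 28 and the estimate should be close to the true count of 24. For a single clause that fixes all of 30 variables, the sketch returns exactly 1.0 after any number of samples, a case where uniform sampling over \(2^{30}\) assignments would almost always report 0.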

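This algorithm remains efficient even when the number of satisfying assignments is small. Each satisfying assignment belongs to at most m of the sets \(S_j\), where m is the number of clauses, so the number of satisfying assignments is at least a \(1/m\) fraction of \(\sum_j |S_j|\). The fraction being estimated is therefore never tiny, and a number of samples polynomial in m suffices for a good relative estimate.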
