# DNF Counting and the Monte Carlo Method

A DNF formula is a disjunction (OR) of clauses; a clause is a conjunction (AND) of literals; a literal is a variable or its negation. For example,
$(x_1 \wedge x_2 \wedge \overline{x_3}) \vee (x_1 \wedge \overline{x_4}) \vee (x_3 \wedge x_4 \wedge x_5 \wedge \overline{x_6}).$

The DNF counting problem is to compute the number of satisfying assignments of a DNF formula. This problem is #P-complete; it is interreducible with the CNF counting problem, since an assignment satisfies a CNF formula exactly when it falsifies the DNF formula obtained by negating it. Note that DNF satisfiability is easy (a DNF formula is satisfiable iff some clause contains no contradictory pair of literals), while CNF satisfiability is NP-complete.
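
As a concrete sanity check, a DNF formula can be represented explicitly and counted by brute force when n is small. The encoding below (signed integers for literals, and the names `satisfies` and `brute_force_count`) is one hypothetical choice, not fixed by the text:

```python
from itertools import product

# Hypothetical encoding (not fixed by the text): a clause is a list of
# signed variable indices (positive = the variable, negative = its
# negation); a formula is a list of clauses. This encodes the example
# formula above.
formula = [[1, 2, -3], [1, -4], [3, 4, 5, -6]]
n = 6  # number of variables

def satisfies(assignment, formula):
    """assignment maps variable index -> bool; a DNF formula is
    satisfied iff some clause has all of its literals true."""
    return any(
        all(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in formula
    )

def brute_force_count(formula, n):
    """Exact count by enumerating all 2^n assignments (small n only)."""
    return sum(
        satisfies(dict(enumerate(bits, start=1)), formula)
        for bits in product([False, True], repeat=n)
    )
```

For the example formula, inclusion-exclusion gives $$8 + 16 + 4 - 4 = 24$$ satisfying assignments out of $$2^6 = 64$$, which the brute-force count confirms.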

### Monte Carlo Algorithms

Let F be a DNF formula with n variables and m clauses, $$F = C_1 \vee \cdots \vee C_m$$. Consider the following simple randomized algorithm. Generate N assignments uniformly at random and count how many of them satisfy F; denote this count by M. Then output the estimate
$\frac{M}{N} \cdot 2^n.$

If the total number of satisfying assignments is close to $$2^n$$, then the above estimate is good. However, if there are very few satisfying assignments, say polynomially many, then the probability that a uniform sample satisfies F is exponentially small, and N must be exponentially large before we even expect to see a single satisfying sample. Consider the following improved algorithm.
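
The naive estimator can be sketched as follows. This is a minimal sketch; the clause encoding as signed variable indices and the function names are assumptions, not from the text:

```python
import random

# Hypothetical encoding: a clause is a list of signed variable indices
# (positive = the variable, negative = its negation); a formula is a
# list of clauses.
def satisfies(assignment, formula):
    return any(all(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in formula)

def naive_estimate(formula, n, num_samples, rng=random.Random(0)):
    """Sample uniform assignments; return (M/N) * 2^n."""
    hits = 0
    for _ in range(num_samples):
        a = {v: rng.random() < 0.5 for v in range(1, n + 1)}
        hits += satisfies(a, formula)
    return hits / num_samples * 2 ** n
```

For the example formula, 24 of the 64 assignments are satisfying, so this estimator concentrates near 24; for a formula with only a handful of satisfying assignments, almost all samples would miss and N would need to grow like $$2^n$$.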

For each clause $$C_i$$, let $$S_i$$ be the set of assignments satisfying it. If $$C_i$$ contains $$k_i$$ distinct variables, then $$|S_i| = 2^{n - k_i}$$, so each $$|S_i|$$ is easy to compute. Writing m for the number of clauses, define the probability
$p_i = \frac{|S_i|}{ \sum_{j=1}^{m} |S_j|}.$

The algorithm works as follows. We draw N samples. Each time, pick a clause $$C_i$$ with probability $$p_i$$, and then pick an assignment uniformly at random from $$S_i$$ (fix the literals of $$C_i$$ and set the remaining variables uniformly). The sample is a success if $$C_i$$ is the first clause that the assignment satisfies, i.e., the assignment satisfies no $$C_j$$ with $$j < i$$; this way each satisfying assignment of F contributes through exactly one clause. Let M be the number of successes among the N samples. Then we output the estimate
$\frac{M}{N} \cdot \sum_{j=1}^{m} |S_j|.$
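
The improved sampler can be sketched as below, again assuming an encoding of clauses as lists of signed variable indices; `karp_luby` is a hypothetical name for the estimator:

```python
import random

# Hypothetical encoding: a clause is a list of signed variable indices
# (positive = the variable, negative = its negation), with no variable
# repeated inside a clause; a formula is a list of clauses.
def clause_satisfied(assignment, clause):
    return all(assignment[abs(lit)] == (lit > 0) for lit in clause)

def karp_luby(formula, n, num_samples, rng=random.Random(0)):
    """Estimate the number of satisfying assignments of a DNF formula."""
    sizes = [2 ** (n - len(c)) for c in formula]  # |S_i| = 2^(n - k_i)
    total = sum(sizes)
    hits = 0
    for _ in range(num_samples):
        # Pick clause C_i with probability p_i = |S_i| / total.
        i = rng.choices(range(len(formula)), weights=sizes)[0]
        # Uniform sample from S_i: fix the clause's literals, set the
        # remaining variables uniformly at random.
        a = {v: rng.random() < 0.5 for v in range(1, n + 1)}
        for lit in formula[i]:
            a[abs(lit)] = lit > 0
        # Success iff C_i is the first clause the assignment satisfies,
        # so each satisfying assignment is counted through one clause.
        if not any(clause_satisfied(a, formula[j]) for j in range(i)):
            hits += 1
    return hits / num_samples * total
```

On the example formula the estimate concentrates near the exact count of 24; note that every step (computing the sizes, weighted clause selection, the first-clause check) takes time polynomial in n and m.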

This estimator remains accurate even when F has few satisfying assignments. Let S be the set of satisfying assignments of F. Each element of S lies in at most m of the sets $$S_i$$, so the success probability of a single sample is $$|S| / \sum_{j=1}^{m} |S_j| \geq 1/m$$. By standard Chernoff bounds, $$N = O(m \log(1/\delta) / \epsilon^2)$$ samples suffice to approximate the count within a factor $$1 \pm \epsilon$$ with probability $$1 - \delta$$; this yields the fully polynomial randomized approximation scheme of Karp, Luby, and Madras.