# Pairwise Independence

A collection of random variables is pairwise independent if every pair of variables in the collection is independent.

Given $$k$$ independent uniform bits, we define $$n = 2^k-1$$ random variables, one for each non-empty subset of the $$k$$ bits, equal to the parity (XOR) of that subset. These new bits are pairwise independent: any two distinct non-empty subsets yield parities that are jointly uniform. This gives a construction of $$n=2^k-1$$ pairwise independent bits from just $$k=\log(n+1)$$ truly random bits. The sample space of the distribution of these $$n$$ bits has size only $$2^k = n+1$$, while the domain $$\{0,1\}^n$$ has size $$2^n$$.
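A minimal sketch of this construction in Python (the function name `parity_bits` is illustrative, not from the source): each non-empty subset of the seed bits is encoded as a bitmask, and the check below exhaustively confirms pairwise independence over the whole sample space for $$k=3$$.

```python
from itertools import product

def parity_bits(seed_bits):
    """Expand k independent bits into n = 2^k - 1 pairwise independent bits.

    Output bit for mask S (a non-empty subset of {0..k-1}) is the XOR of
    the seed bits indexed by S.
    """
    k = len(seed_bits)
    out = []
    for mask in range(1, 2 ** k):  # every non-empty subset
        b = 0
        for i in range(k):
            if mask >> i & 1:
                b ^= seed_bits[i]
        out.append(b)
    return out

# Exhaustive check of pairwise independence: for every pair of output
# positions, each of the 4 outcomes (0,0),(0,1),(1,0),(1,1) occurs for
# exactly 2^k / 4 of the seeds.
k = 3
samples = [parity_bits(list(seed)) for seed in product([0, 1], repeat=k)]
n = 2 ** k - 1
for i in range(n):
    for j in range(i + 1, n):
        counts = {}
        for s in samples:
            key = (s[i], s[j])
            counts[key] = counts.get(key, 0) + 1
        assert all(c == 2 ** k // 4 for c in counts.values())
```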

Consider the MAXCUT problem: given a graph, find a partition of the vertices such that the number of edges between the two parts is maximized. A simple randomized algorithm finds a cut of expected size at least half the number of edges: assign each vertex an independent uniform random bit, and cut between the vertices assigned 0 and those assigned 1. Each edge is in the cut with probability 1/2, so by linearity of expectation the cut contains half of the edges in expectation.
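A sketch of this randomized algorithm in Python (the name `random_cut` and the 4-cycle example are illustrative assumptions):

```python
import random

def random_cut(n_vertices, edges):
    """Assign each vertex an independent uniform bit; return the cut size.

    Each edge crosses the cut with probability 1/2, so the expected cut
    size is half the number of edges.
    """
    side = [random.randint(0, 1) for _ in range(n_vertices)]
    return sum(1 for u, v in edges if side[u] != side[v])

# Example: a 4-cycle has 4 edges, so the expected cut size is 2.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
sizes = [random_cut(4, edges) for _ in range(1000)]
```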

This algorithm also works if we assign pairwise independent bits to the vertices: the probability that any given edge is in the cut is still 1/2, so the expected cut size is still at least half the number of edges. Using the construction above, $$k=\log(n+1)$$ truly random bits generate $$n$$ pairwise independent bits. This derandomizes the algorithm: enumerate all $$2^k = n+1$$ settings of the $$k$$ seed bits and output the best resulting cut, which takes $$O(n)$$ iterations and $$O(\log n)$$ space for the seed.
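A sketch of the derandomization in Python, combining the parity construction with the cut computation (the name `derandomized_maxcut` and the mask choice `v+1` are illustrative assumptions):

```python
from itertools import product

def derandomized_maxcut(n_vertices, edges):
    """Deterministic MAXCUT via pairwise independent bits.

    Enumerate all 2^k seeds of k = ceil(log2(n+1)) truly random bits;
    vertex v gets the parity of the seed bits selected by the non-empty
    mask v+1.  The vertex bits are pairwise independent, so the average
    cut size over all seeds is at least |E|/2, and the best seed
    achieves at least that.
    """
    k = max(1, n_vertices.bit_length())  # ensures 2^k - 1 >= n_vertices
    best = -1
    for seed in product([0, 1], repeat=k):
        side = []
        for v in range(n_vertices):
            mask = v + 1  # non-empty subset of seed bits
            b = 0
            for i in range(k):
                if mask >> i & 1:
                    b ^= seed[i]
            side.append(b)
        best = max(best, sum(1 for u, v in edges if side[u] != side[v]))
    return best
```

On a triangle (3 edges), the guarantee promises a cut of size at least 2, which is also the maximum cut.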

### Pairwise Independent Hashing Functions

A family of functions $$H = \{h\colon N \to M\}$$ is pairwise independent if, when we choose $$h$$ uniformly at random from $$H$$:

• For any $$x \in N$$, the variable $$h(x)$$ is uniformly distributed in $$M$$.
• For any $$x\neq y$$, the variables $$h(x)$$ and $$h(y)$$ are independent.

Equivalently, for any $$x\neq y$$ and $$a, b\in M$$,
$$\Pr[h(x) = a,~ h(y) = b] = \Pr[h(x) = a] \cdot \Pr[h(y) = b] = \frac{1}{|M|^2}$$

By randomly choosing $$h \in H$$, and enumerating over all $$x\in N$$, we get a sequence of $$|N|$$ pairwise independent variables, each of which is uniformly distributed in $$M$$.

Let $$F$$ be a finite field. Let $$H = \{ h_{a,b} \colon F\to F \}$$ where $$h_{a,b}(x) = ax+b$$ and $$a,b\in F$$. Then this family of functions is pairwise independent.
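A sketch verifying this in Python, assuming for concreteness that $$F$$ is the field of integers mod a small prime $$p$$ (this choice of field, and the name `h`, are assumptions for illustration):

```python
# Realize h_{a,b}(x) = ax + b over the field Z_p for a small prime p.
p = 5

def h(a, b, x):
    return (a * x + b) % p

# Exhaustive check of pairwise independence: for every x != y and every
# pair of targets (s, t), exactly one function in the family satisfies
# h(x) = s and h(y) = t, i.e. the probability is 1/p^2 = 1/|F|^2.
for x in range(p):
    for y in range(p):
        if x == y:
            continue
        for s in range(p):
            for t in range(p):
                count = sum(1 for a in range(p) for b in range(p)
                            if h(a, b, x) == s and h(a, b, y) == t)
                assert count == 1
```

The count is 1 because the linear system $$ax+b=s$$, $$ay+b=t$$ has the unique solution $$a=(s-t)/(x-y)$$ (well-defined since $$x\neq y$$), with $$b$$ then determined.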

Suppose $$|F| = n$$. Then the family has $$n^2$$ functions, and each function is specified by $$2\log n$$ bits. A truly random function $$f\colon F\to F$$ would require $$n\log n$$ bits to describe. The sample space of the family has size $$n^2$$, while the space of all functions has size $$n^n$$.

If we take $$F = GF(2^k)$$, whose elements can be represented as strings in $$\{0,1\}^k$$, then each function $$h$$ maps $$k$$ bits to $$k$$ bits.

Suppose a random function $$h\in H$$ is used to map elements of a larger set $$N$$ into a smaller set $$M$$ of buckets. If $$H$$ is pairwise independent, then for any two distinct elements of $$N$$, the probability that they are mapped to the same bucket is $$1/|M|$$ (summing $$\Pr[h(x)=a, h(y)=a] = 1/|M|^2$$ over the $$|M|$$ buckets $$a$$). This collision probability is exactly the same as if each element were placed into a uniformly random bucket independently. However, a pairwise independent family is much cheaper to describe; this is how hash tables are implemented in practice.

Pairwise independence generalizes to k-wise independence. A family of functions $$H = \{h\colon N \to M\}$$ is k-wise independent if, when we choose $$h$$ uniformly at random from $$H$$, for any $$k$$ distinct elements $$x_1,\ldots,x_k \in N$$, the random variables $$h(x_1),\ldots,h(x_k)$$ are uniform and mutually independent.

Let $$F$$ be a finite field. Let $$H = \{ h_{a} \colon F\to F, ~ a=(a_1, \ldots, a_k) \}$$ where $$h_{a}(x) = a_1 + a_2 x + \cdots + a_k x^{k-1}$$. Then this family of functions is k-wise independent.
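A sketch of this polynomial construction in Python for $$k=3$$, again assuming $$F$$ is the integers mod a small prime (an assumption for illustration, as is the name `h`):

```python
from itertools import product

p, k = 5, 3  # field Z_p; 3-wise independence via polynomials of degree < 3

def h(coeffs, x):
    """Evaluate h_a(x) = a_1 + a_2*x + ... + a_k*x^(k-1) mod p."""
    return sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p

# Check: for any k distinct points and any k target values, exactly one
# polynomial of degree < k interpolates them (Lagrange interpolation),
# so each k-tuple of values occurs with probability exactly 1/p^k.
points = (0, 1, 2)
for targets in product(range(p), repeat=k):
    count = sum(1 for coeffs in product(range(p), repeat=k)
                if all(h(coeffs, x) == t for x, t in zip(points, targets)))
    assert count == 1
```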