# k-wise Independence and Linear Codes

A collection of random variables is k-wise independent if any k of them are mutually independent. Consider a distribution $$D$$ on $$n$$ bits. It is k-wise independent and uniform if for every $$S\subseteq [n]$$ of size k and every $$a\in\{0,1\}^k$$,
$\Pr[ x_S = a] = 1 / 2^k.$
Here we will only consider k-wise independent and uniform distributions over boolean domains.
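As a quick illustration (a minimal Python sketch, using only the standard library), the classic small example is the 3-bit distribution formed by two uniform bits and their XOR: it is 2-wise independent but supported on only 4 sample points instead of 8.

```python
from itertools import product

# Two uniform bits u1, u2 together with u1 XOR u2 give a 3-bit
# distribution that is 2-wise (pairwise) independent but not 3-wise
# independent. The sample space has only 4 points, not 2^3 = 8.
samples = [(u1, u2, u1 ^ u2) for u1, u2 in product([0, 1], repeat=2)]

# Check: every pair of coordinates is uniform over {0,1}^2,
# i.e., Pr[x_S = a] = 1/2^2 for every size-2 set S and every a.
for i in range(3):
    for j in range(i + 1, 3):
        for a in product([0, 1], repeat=2):
            count = sum(1 for x in samples if (x[i], x[j]) == a)
            assert count / len(samples) == 1 / 4

# But the three bits together are not independent: x3 = x1 XOR x2 always.
assert all(x[0] ^ x[1] == x[2] for x in samples)
```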

A distribution $$f$$ is $$k$$-wise independent if and only if all Fourier coefficients of degree between 1 and k vanish, that is, for all nonzero $$\alpha\in \{0,1\}^n$$ with $$|\alpha|\leq k$$, we have $$\widehat{f}(\alpha) = 0$$. (The degree-0 coefficient $$\widehat{f}(0) = \sum_x f(x) = 1$$ is always nonzero.)

[Proof.] The Fourier coefficient is $$\widehat{f}(\alpha) = \sum_{x} f(x) (-1)^{\langle \alpha, x\rangle}$$. Given $$\alpha$$ with $$|\alpha| = j$$, consider the marginal distribution of $$f$$ on the coordinates $$x_i$$ with $$\alpha_i = 1$$; then $$\widehat{f}(\alpha)$$ equals the Fourier coefficient of this $$j$$-bit marginal at the all-ones vector. If $$f$$ is $$k$$-wise independent, every such marginal with $$j\leq k$$ is uniform, so this coefficient is 0. Conversely, every Fourier coefficient of a size-$$j$$ marginal with $$j\leq k$$ equals some $$\widehat{f}(\alpha)$$ with $$|\alpha|\leq k$$; if these all vanish, every such marginal has only the trivial nonzero coefficient and is therefore uniform.
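The characterization above is easy to verify numerically. The sketch below (Python, standard library only) computes the Fourier coefficients of the pairwise-independent distribution given by two uniform bits and their XOR, and confirms that exactly the coefficients of degree 1 and 2 vanish while the degree-3 coefficient does not.

```python
from itertools import product

# Density f of the distribution (u1, u2, u1 XOR u2): each of its
# 4 sample points has probability 1/4, every other 3-bit string
# has probability 0.
support = {(u1, u2, u1 ^ u2) for u1, u2 in product([0, 1], repeat=2)}
f = {x: (0.25 if x in support else 0.0) for x in product([0, 1], repeat=3)}

def fourier(alpha):
    # \hat{f}(alpha) = sum_x f(x) * (-1)^{<alpha, x>}
    return sum(p * (-1) ** sum(a * b for a, b in zip(alpha, x))
               for x, p in f.items())

for alpha in product([0, 1], repeat=3):
    if 1 <= sum(alpha) <= 2:          # degrees 1..k with k = 2
        assert abs(fourier(alpha)) < 1e-12

assert fourier((0, 0, 0)) == 1.0      # degree-0 coefficient is always 1
assert fourier((1, 1, 1)) == 1.0      # degree 3 survives: not 3-wise independent
```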

We wish to construct k-wise independent distributions whose sample space is as small as possible and can be enumerated efficiently. Linear codes provide such a construction.

Consider an $$[n, l, d]$$ linear code $$C$$ with block length $$n$$, dimension $$l$$, and minimum distance $$d$$. Let $$G$$ be its generator matrix, of dimension $$l\times n$$; it maps a message $$m$$ of length $$l$$ to a codeword $$c$$ of length $$n$$. That is, $$c = m \cdot G$$. Let $$H$$ be the parity-check matrix, of dimension $$(n-l)\times n$$. For every codeword $$c$$, it holds that $$H \cdot c^T = 0$$.

The space of all codewords of $$C$$ is a linear subspace of $$F_2^n$$, with dimension $$l$$ and size $$2^l$$. The rows of the generator matrix form a basis of this space. The orthogonal complement of $$C$$, denoted by $$C^\bot$$, contains all vectors $$x$$ such that $$\langle x, c\rangle =0$$ for all $$c\in C$$. This forms an $$[n, n-l]$$ code, which is called the dual code. The following hold:

• The dual code of $$C^\bot$$ is $$C$$.
• The generator matrix of $$C$$ is the parity-check matrix of $$C^\bot$$, and the parity-check matrix of $$C$$ is the generator matrix of $$C^\bot$$.

We will argue that the uniform distribution over the dual code $$C^\bot$$ is a $$(d-1)$$-wise independent distribution. In other words, if we choose a message $$m$$ of length $$n-l$$ uniformly at random, then the codeword $$m \cdot H$$ has a $$(d-1)$$-wise independent distribution.

[Proof.]
That the code $$C$$ has minimum distance $$d$$ means that the minimum Hamming weight of its nonzero codewords is $$d$$; equivalently, any $$d-1$$ columns of the parity-check matrix $$H$$ are linearly independent.

Let $$f$$ be a uniform distribution over $$C^\bot$$. Let $$g$$ be a uniform distribution over $$\{0,1\}^{n-l}$$. Then
$\widehat{f}(\alpha) = \sum_{x} f(x)\cdot (-1)^{x \alpha^T} = \sum_{u} g(u)\cdot (-1)^{u H \alpha^T}$
Here $$u$$ ranges over the space $$\{0,1\}^{n-l}$$. For nonzero $$\alpha$$ with $$|\alpha| \leq d-1$$,
$2^{n-l}\cdot \widehat{f}(\alpha) = \sum_{u} (-1)^{u H \alpha^T} = \sum_{u} (-1)^{\langle u, \beta\rangle} = 0.$
Here $$\beta = H \alpha^T$$ is a non-zero vector: it is a sum of at most $$d-1$$ columns of $$H$$, and any $$d-1$$ columns of $$H$$ are linearly independent. For any non-zero vector $$\beta$$, the final summation is 0. Therefore, using an $$[n, l, d]$$ code, we can construct a $$(d-1)$$-wise independent distribution with sample space of size $$2^{n-l}$$.
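To make the construction concrete, the Python sketch below (standard library only) instantiates it with the $$[7, 4, 3]$$ Hamming code, whose parity-check matrix has all nonzero vectors of $$F_2^3$$ as columns. Since $$d = 3$$, the uniform distribution over its dual code gives 7 pairwise independent bits from a sample space of size $$2^{n-l} = 8$$.

```python
from itertools import product

# Parity-check matrix H of the [7, 4, 3] Hamming code: column j is the
# binary expansion of j+1, so the columns are all nonzero vectors of F_2^3.
H = [[0, 0, 0, 1, 1, 1, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [1, 0, 1, 0, 1, 0, 1]]
n, n_minus_l = 7, 3

# Enumerate the dual code: codewords m * H over F_2 for all 2^{n-l} messages m.
dual = []
for m in product([0, 1], repeat=n_minus_l):
    word = tuple(sum(m[i] * H[i][j] for i in range(n_minus_l)) % 2
                 for j in range(n))
    dual.append(word)
assert len(dual) == 2 ** n_minus_l  # sample space of size 8, not 2^7

# Since d = 3, the uniform distribution over the dual code is
# (d-1) = 2-wise independent: every pair of the 7 coordinates is
# uniform over {0,1}^2.
for i in range(n):
    for j in range(i + 1, n):
        for a in product([0, 1], repeat=2):
            count = sum(1 for w in dual if (w[i], w[j]) == a)
            assert count == len(dual) // 4  # Pr[(x_i, x_j) = a] = 1/4 exactly
```

The same loop with triples of coordinates would fail: with only 8 sample points, 3-wise independence of 7 bits is impossible, matching the $$(d-1)$$ bound.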