k-wise Independence and Linear Code

A collection of random variables is k-wise independent if any k variables are mutually independent. Consider a distribution $D$ on $n$ bits. It is k-wise independent and uniform if for any $S\subseteq [n]$ of size k, for any $a\in\{0,1\}^k$,
\[
Pr[ x_S = a] = 1 / 2^k.
\]
Here we will only consider k-wise independent and uniform distributions over boolean domains.

A distribution $f$ is $k$-wise independent if and only if all Fourier coefficients of degree at most k are 0, that is, for all $\alpha\in \{0,1\}^n$ with $|\alpha|\leq k$, we have $\widehat{f}(\alpha) = 0$.

[Proof.] The Fourier coefficient $\widehat{f}(\alpha) = \sum_{x} f(x) (-1)^{\alpha x}$. Given $\alpha$, we consider the marginal distribution of $f$ on all $x_i$ such that $\alpha_i =1$. We have $\widehat{f}(\alpha)$ is the same as the Fourier coefficient of the marginal distribution on vector 1. For the reverse direction, we can argue inductively from 2 to k.

We wish to construct a k-wise independent distribution with a space as small as possible and also efficiently enumerable. Linear code provides a way to construct k-wise independent distribution.

Consider an $[n, l, d]$ linear code $C$ with block length $n$, dimension $l$ and minimum distance $d$. Let $G$ be its generator matrix of dimension $l\times n$; it transforms a message $late m$ of length $l$ to a codeword $w$ of length $n$. That is, $c = m \cdot G$. Let $H$ be the parity-checking matrix of dimension $ n-l\times n$. For every codeword $c$, it holds that $c \cdot H’ = 0$.

The space of all codewords $C$ is a linear subspace of $F_2^n$, with dimension $l$ and size $2^l$. The rows of the generator matrix is a basis of the space. The orthogonal complement of $C$, denoted by $C^\bot$ contains all vectors $x$ such that $\langle x, c\rangle =0$ for all $c\in C$. This forms an $[n, n-l]$ code, which is called the dual code. The following hold:

The dual code of $C^\bot$ is $C$.
The generator matrix of $late C$ is the parity-check matrix of $C^\bot$.

We will argue that, a uniform distribution over the dual code $C^\bot$ is $(d-1)$-wise independent distribution. In other words, if we uniformly choose a message $m$ of length $n-l$, then the codeword $m \cdot H$ has a $(d-1)$-wise independent distribution.

[Proof.]
The code $C$ has minimum distance $d$ means that the minimum Hamming weight of its codeword is at least $d$, and any $d-1$ columns of the parity-check matrix are linearly independent.

Let $f$ be a uniform distribution over $C^\bot$. Let $g$ be a uniform distribution over $\{0,1\}^{n-l}$. Then
\[
\widehat{f}(\alpha) = \sum_{x} f(x)\cdot (-1)^{x \alpha’} = \sum_{u} g(u) (-1)^{u H \alpha’}
\]
Here $u$ is over the space $\{0,1\}^{n-l}$. For $\alpha$ such that |\alpha| < d[/latex], \[ 2^{n-l}\cdot \widehat{f}(\alpha) = \sum_{u} (-1)^{uH \alpha'} = \sum_{u} (-1)^{u\beta} = 0 \] Here [latex]H \alpha'[/latex] is a non-zero vector since any [latex]d-1[/latex] columns of [latex]H[/latex] are linearly independent. For any non-zero vector [latex]\beta[/latex], the final summation is 0. Therefore, using an [latex][n, l, d][/latex] code, we can construct a [latex](d-1)[/latex]-wise independent distribution with space size [latex]2^{n-l}[/latex].

Comments

comments