5.2 Kraft Inequality

发布时间 2023-10-15 08:24:40作者: 李斯赛特

首先引入一些基础的定义:

  • \(C: S_X \rightarrow \mathcal{D}^*\): the source code for a r.v. \(X\), where \(S_X\) is the range of \(X\), \(\mathcal{D}^*\) is the set of finite-length strings of symbols from a \(D\)-ary alphabet.

  • WLOG, assume that \(D\)-ary alphabet = \(\mathcal{D}=\{0,1,...,D-1\}\).

  • \(C(x)\): the codeword corresponding to \(x\)

  • \(l(x)\): the length of \(C(x)\)

  • \(L(C)=\sum_{x\in S_X} p(x)l(x)\): the expected length of a source code \(C\) for \(X\), \(p(x)\) is the pmf of \(X\).

进一步引入一些 codes 的类别定义:

  • A code \(C\) is said to be nonsingular if \(x\neq x' \Rightarrow C(x)\neq C(x')\).

  • the extension \(C^*\) of a code \(C\) is \(C(x_1 x_2 ... x_n)=c(x_1)C(x_2)...C(x_n)\).

  • A code is called uniquely decodable if its extension is nonsingular.

  • A code is called a prefix/ instantaneous code if no codeword is a prefix of any other codeword.

我们的目标:

We wish to design prefix codes with minimum \(L(C)\). If we assign the short codewords to all source symbols, \(C\) may not be prefix-free. So Kraft inequality gives the limitation of codeword lengths for prefix codes.

Thm. 5.2.1 (Kraft Inequality) For any prefix code over an alphabet of size \(D\), the codeword lengths \(l_1,l_2,...,l_m\) must satisfy the inequality

\[\sum_{i=1}^{m} D^{-l_i} \leq 1 \]

Conversely, given a set of codeword lengths that satisfy (1), there exists a prefix code with these word lengths.

证明:

  • Consider a D-ary tree, which means each node in this tree has D children.

  • Assign \(\{0,1,..., D-1\}\) to the branches arising from one node(including the root).

  • Then, each codeword is represented by a leaf on the tree, the specific symbols of each codeword are the path from root to this leaf.

  • The above figure is an example of D=2.

  • Let \(l_m\) be the length of the longest codeword, then if the code tree is complete, on \(l_m\)-th layer, there are \(D^{l_m}\) nodes/ leaves.

  • If there is a leaf at level \(l_i\) (\(l_i \leq l_m\)), which means its \(D^{l_m-l_i}\) desendants at level \(l_m\) don't exist in the tree, because \(C\) is prefix-free.

  • Then, it is very obvious that the number of these descendants are less than or equal to \(D^{l_m}\):

\[\sum_{i=1}^{m} D^{l_{m}-l_i} \leq D^{l_{m}} \Rightarrow \sum_{i=1}^{m} D^{-l_i} \leq 1 \]