3 huffman code generation, 1 huffman tree generation, 2 code length adjustment – Intel Extensible Firmware Interface User Manual

Page 753

Advertising
background image

Protocols

— Compression Algorithm Specification

Version 1.10

12/01/02

17-13

17.3.3 Huffman Code Generation

Another major component of the compressor design is generation of the Huffman Code.

Huffman Coding is applied to the Char&Len Set, the Position Set, and the Extra Set. The Huffman
Coding used here has the following features:

1. The Huffman tree is represented as an array of code lengths (“canonical” Huffman Coding);
2. The maximum code length is limited to 16 bits.

The Huffman code generation process can be divided into three steps. These are the generation of
Huffman tree, the adjustment of code lengths, and the code generation.

17.3.3.1 Huffman Tree Generation

This process generates a typical Huffman tree. First, the frequency of each symbol is counted, and
a list of nodes is generated with each node containing a symbol and the symbol’s frequency. The
two nodes with the lowest frequency values are merged into a single node. This new node becomes
the parent node of the two nodes that are merged. The frequency value of this new parent node is
the sum of the two child nodes’ frequency values. The node list is updated to include the new
parent node but exclude the two child nodes that are merged. This process is repeated until there is
a single node remaining that is the root of the generated tree.

17.3.3.2 Code Length Adjustment

The leaf nodes of the tree generated by the previous step represent all the symbols that were
generated. Traditionally the code for each symbol is found by traversing the tree from the root
node to the leaf node. Going down a left edge generates a “0,” and going down a right edge
generates a “1.” However, a different approach is used here. The number of codes of each code
length is counted. This generates a 16-element LengthCount array, with LengthCount[i] = Number
Of Codes whose Code Length is i. Since a code length may be longer than 16 bits, the sixteenth
entry of the LengthCount array is set to the Number Of Codes whose Code Length is greater than
or equal to 16.

The LengthCount array goes through further adjustment described by following code:

INT32 i, k;
UINT32 cum;

cum = 0;
for (i = 16; i > 0; i--) {
cum += LengthCount[i] << (16 - i);
}
while (cum != (1U << 16)) {
LengthCount[16]--;
for (i = 15; i > 0; i--) {
if (LengthCount[i] != 0) {
LengthCount[i]--;
LengthCount[i+1] += 2;
break;
}
}
cum--;
}

Advertising