12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758 |
- The compressor achieves an average compression rate of 60% of the
- original size which is on par with "gzip". It seems that you cannot do
- much better for compressing compiled binaries. This means that the
- break even point for using compressed images is reached, once the
- uncompressed size approaches 1.5kB. We can stuff more than 12kB into
- an 8kB EPROM and more than 25kB into an 16kB EPROM. As there is only
- 32kB of RAM for both the uncompressed image and its BSS area, this
- means that 32kB EPROMs will hardly ever be required.
-
- The compression algorithm uses a 4kB ring buffer for buffering the
- uncompressed data. Before compression starts, the ring buffer is
- filled with spaces (ASCII character 0x20). The algorithm tries to
- find repeated input sequences of a maximum length of 60 bytes. All
- 256 different input bytes plus the 58 (60 minus a threshold of 2)
- possible repeat lengths form a set of 314 symbols. These symbols are
- adaptively Huffman encoded. The algorithm starts out with a Huffmann
- tree that assigns equal code lengths to each of the 314 symbols
- (slightly favoring the repeat symbols over symbols for regular input
- characters), but it will be changed whenever the frequency of any of
- the symbols changes. Frequency counts are kept in 16bit words until
- the total number of compressed codes totals 2^15. Then, all frequency
- counts will be halfed (rounding to the bigger number). For unrepeated
- characters (symbols 0..255) the Huffman code is written to the output
- stream. For repeated characters the Huffmann code, which denotes the
- length of the repeated character sequence, is written out and then the
- index in the ring buffer is computed. From this index, the algorithm
- computes the offset relative to the current index into the ring
- buffer. Thus, for typical input data, one would expect that short to
- medium range offsets are more frequent than extremely short or medium
- range to long range offsets. Thus the 12bit (for a 4kB buffer) offset
- value is statically Huffman encoded using a precomputed Huffman tree
- that favors those offset values that are deemed to be more
- frequent. The Huffman encoded offset is written to the output data
- stream, directly following the code that determines the length of
- repeated characters.
-
- This algorithm, as implemented in the C example code, looks very good
- and its operating parameters are already well optimized. This also
- explains why it achieves compression ratios comparable with
- "gzip". Depending on the input data, it sometimes excells considerably
- beyond what "gzip -9" does, but this phenomenon does not appear to be
- typical. There are some flaws with the algorithm, such as the limited
- buffer sizes, the adaptive Huffman tree which takes very long to
- change, if the input characters experience a sudden change in
- distribution, and the static Huffman tree for encoding offsets into
- the buffer. The slow changes of the adaptive Huffman tree are
- partially counteracted by artifically keeping a 16bit precision for
- the frequency counts, but this does not come into play until 32kB of
- compressed data is output, so it does not have any impact on our use
- for "etherboot", because the BOOT Prom does not support uncompressed
- data of more then 32kB (c.f. doc/spec.doc).
-
- Nonetheless, these problems do not seem to affect compression of
- compiled programs very much. Mixing object code with English text,
- would not work too well though, and the algorithm should be reset in
- between. Actually, we might gain a little improvement, if text and
- data segments were compressed individually, but I have not
- experimented with this option, yet.
|