algorithm.doc

The compressor achieves an average compression rate of 60% of the
original size, which is on par with "gzip". It seems that you cannot do
much better for compressing compiled binaries. This means that the
break-even point for using compressed images is reached once the
uncompressed size approaches 1.5kB. We can stuff more than 12kB into
an 8kB EPROM and more than 25kB into a 16kB EPROM. As there is only
32kB of RAM for both the uncompressed image and its BSS area, this
means that 32kB EPROMs will hardly ever be required.
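The capacity figures follow directly from the 60% average ratio. A
minimal sketch of the arithmetic (the ratio and EPROM sizes are taken
from the text above; the function name is ours):

```c
/* Average ratio quoted above: compressed output is roughly 60% of the
 * original size.  All sizes in bytes. */
static unsigned long compressed_size(unsigned long uncompressed)
{
    return uncompressed * 60 / 100;
}

/* 12kB image: 12288 * 60 / 100 = 7372 bytes -- fits an 8kB (8192 byte) EPROM.
 * 25kB image: 25600 * 60 / 100 = 15360 bytes -- fits a 16kB (16384 byte) EPROM. */
```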
The compression algorithm uses a 4kB ring buffer for buffering the
uncompressed data. Before compression starts, the ring buffer is
filled with spaces (ASCII character 0x20). The algorithm tries to
find repeated input sequences with a maximum length of 60 bytes. All
256 different input bytes plus the 58 (60 minus a threshold of 2)
possible repeat lengths form a set of 314 symbols. These symbols are
adaptively Huffman encoded. The algorithm starts out with a Huffman
tree that assigns equal code lengths to each of the 314 symbols
(slightly favoring the repeat symbols over symbols for regular input
characters), but the tree is updated whenever the frequency of any of
the symbols changes. Frequency counts are kept in 16-bit words until
the total number of compressed codes reaches 2^15. Then, all frequency
counts are halved (rounding to the bigger number).

For unrepeated characters (symbols 0..255) the Huffman code is written
to the output stream. For repeated sequences, the Huffman code that
denotes the length of the repeated sequence is written out, and then
the index into the ring buffer is computed. From this index, the
algorithm computes the offset relative to the current index into the
ring buffer. For typical input data, one would expect short to medium
range offsets to be more frequent than extremely short or medium to
long range offsets. Thus the 12-bit (for a 4kB buffer) offset value is
statically Huffman encoded using a precomputed Huffman tree that
favors those offset values that are deemed to be more frequent. The
Huffman encoded offset is written to the output data stream, directly
following the code that denotes the length of the repeated sequence.
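The 314-symbol alphabet and the offset computation described above can
be sketched as simple mappings. This is a sketch under the stated
parameters (4kB ring, 60-byte maximum match, threshold of 2); the
constant and function names are ours, not those of the actual C
example code:

```c
#include <assert.h>

#define RING_SIZE   4096   /* 4kB ring buffer, pre-filled with 0x20      */
#define MAX_MATCH     60   /* longest repeated sequence that is encoded  */
#define THRESHOLD      2   /* matches this short are emitted as literals */
#define N_CHAR       256   /* one symbol per possible input byte         */
#define N_SYMBOLS   (N_CHAR + MAX_MATCH - THRESHOLD)   /* 256 + 58 = 314 */

/* Symbol for a literal (unrepeated) input byte: symbols 0..255. */
static int literal_symbol(unsigned char c)
{
    return c;
}

/* Symbol for a match of the given length: symbols 256..313
 * for lengths THRESHOLD+1 (3) up to MAX_MATCH (60). */
static int length_symbol(int match_len)
{
    assert(match_len > THRESHOLD && match_len <= MAX_MATCH);
    return N_CHAR + match_len - (THRESHOLD + 1);
}

/* 12-bit offset of a match, relative to the current ring position. */
static int match_offset(int cur, int match_pos)
{
    return (cur - match_pos) & (RING_SIZE - 1);
}
```

The length symbol is what gets adaptively Huffman coded together with
the literals; the offset is what the static, precomputed tree encodes.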
This algorithm, as implemented in the C example code, looks very good
and its operating parameters are already well optimized. This also
explains why it achieves compression ratios comparable with
"gzip". Depending on the input data, it sometimes excels considerably
beyond what "gzip -9" does, but this phenomenon does not appear to be
typical. There are some flaws in the algorithm, such as the limited
buffer sizes, the adaptive Huffman tree, which takes very long to
adapt if the input characters experience a sudden change in
distribution, and the static Huffman tree for encoding offsets into
the buffer. The slow adaptation of the Huffman tree is partially
counteracted by artificially limiting the frequency counts to 16-bit
precision, but this does not come into play until 32kB of compressed
data has been output, so it does not have any impact on our use for
"etherboot", because the BOOT Prom does not support uncompressed data
of more than 32kB (cf. doc/spec.doc).
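The frequency-halving rescale referred to here (16-bit counts, halved
with rounding to the bigger number once 2^15 codes have been counted)
might look like the following sketch. The flat array representation
and the names are ours; the actual C example code performs this inside
its adaptive Huffman tree update:

```c
#include <stdint.h>

#define N_SYMBOLS 314
#define MAX_FREQ  0x8000   /* 2^15: rescale once the count reaches this */

static uint16_t freq[N_SYMBOLS];
static uint32_t codes_total;

/* Halve all frequency counts, rounding to the bigger number, so that
 * older statistics fade and the tree can react to recent input. */
static void rescale(void)
{
    for (int i = 0; i < N_SYMBOLS; i++)
        freq[i] = (uint16_t)((freq[i] + 1) / 2);
}

/* Count one more emitted code for a symbol, rescaling when needed. */
static void update(int sym)
{
    if (++codes_total >= MAX_FREQ) {
        rescale();
        codes_total = 0;
        for (int i = 0; i < N_SYMBOLS; i++)
            codes_total += freq[i];
    }
    freq[sym]++;
}
```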
Nonetheless, these problems do not seem to affect compression of
compiled programs very much. Mixing object code with English text
would not work too well, though, and the algorithm should be reset in
between. Actually, we might gain a little improvement if text and
data segments were compressed individually, but I have not
experimented with this option yet.