Diffstat (limited to 'contrib/compressor/algorithm.doc')
-rw-r--r-- | contrib/compressor/algorithm.doc | 58 |
1 file changed, 0 insertions, 58 deletions
diff --git a/contrib/compressor/algorithm.doc b/contrib/compressor/algorithm.doc
deleted file mode 100644
index 74a7646c..00000000
--- a/contrib/compressor/algorithm.doc
+++ /dev/null
@@ -1,58 +0,0 @@
-The compressor achieves an average compression rate of 60% of the
-original size, which is on par with "gzip". It seems that you cannot do
-much better for compressing compiled binaries. This means that the
-break-even point for using compressed images is reached once the
-uncompressed size approaches 1.5kB. We can stuff more than 12kB into
-an 8kB EPROM and more than 25kB into a 16kB EPROM. As there is only
-32kB of RAM for both the uncompressed image and its BSS area, this
-means that 32kB EPROMs will hardly ever be required.
-
-The compression algorithm uses a 4kB ring buffer for buffering the
-uncompressed data. Before compression starts, the ring buffer is
-filled with spaces (ASCII character 0x20). The algorithm tries to
-find repeated input sequences of a maximum length of 60 bytes. All
-256 different input bytes plus the 58 (60 minus a threshold of 2)
-possible repeat lengths form a set of 314 symbols. These symbols are
-adaptively Huffman encoded. The algorithm starts out with a Huffman
-tree that assigns equal code lengths to each of the 314 symbols
-(slightly favoring the repeat symbols over symbols for regular input
-characters), but the tree is updated whenever the frequency of any of
-the symbols changes. Frequency counts are kept in 16-bit words until
-the total number of compressed codes reaches 2^15. Then, all frequency
-counts are halved (rounding up). For unrepeated characters (symbols
-0..255) the Huffman code is written to the output stream. For repeated
-characters the Huffman code, which denotes the length of the repeated
-character sequence, is written out, and then the index in the ring
-buffer is computed. From this index, the algorithm computes the offset
-relative to the current index into the ring buffer. For typical input
-data, one would expect short to medium range offsets to be more
-frequent than extremely short or medium to long range offsets. The
-12-bit (for a 4kB buffer) offset value is therefore statically Huffman
-encoded using a precomputed Huffman tree that favors those offset
-values that are deemed to be more frequent. The Huffman-encoded offset
-is written to the output data stream, directly following the code that
-determines the length of repeated characters.
-
-This algorithm, as implemented in the C example code, looks very good
-and its operating parameters are already well optimized. This also
-explains why it achieves compression ratios comparable with
-"gzip". Depending on the input data, it sometimes excels considerably
-beyond what "gzip -9" does, but this does not appear to be typical.
-There are some flaws in the algorithm, such as the limited buffer
-sizes, the adaptive Huffman tree, which adapts very slowly when the
-input characters experience a sudden change in distribution, and the
-static Huffman tree for encoding offsets into the buffer. The slow
-adaptation of the Huffman tree is partially counteracted by
-artificially keeping 16-bit precision for the frequency counts, but
-this does not come into play until 32kB of compressed data has been
-output, so it does not have any impact on our use for "etherboot",
-because the boot PROM does not support uncompressed data of more than
-32kB (cf. doc/spec.doc).
-
-Nonetheless, these problems do not seem to affect compression of
-compiled programs very much. Mixing object code with English text
-would not work too well, though, and the algorithm should be reset in
-between. Actually, we might gain a little improvement if text and
-data segments were compressed individually, but I have not
-experimented with this option yet.
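To make the numbers in the deleted algorithm.doc concrete, here is a minimal C
sketch of the bookkeeping it describes: the 4kB ring buffer pre-filled with
spaces, the 314-symbol alphabet (256 literals plus 58 repeat lengths above the
threshold of 2), the halving of the frequency counts once 2^15 codes have been
emitted, and the 12-bit backward offset into the ring buffer. This is not the
contrib/compressor source; the names (init_ring, symbol_of, rescale_freq,
match_offset) and the surrounding scaffolding are assumptions for illustration,
and the adaptive Huffman tree update itself is omitted.

    #include <stdio.h>

    /* Parameters as described in algorithm.doc (LZSS + adaptive Huffman).
     * The identifiers below are illustrative, not the original source. */
    #define N         4096                   /* ring buffer size (4kB)            */
    #define F         60                     /* longest match emitted             */
    #define THRESHOLD 2                      /* matches of <= 2 bytes are literals */
    #define N_CHAR    (256 - THRESHOLD + F)  /* 256 literals + 58 lengths = 314   */
    #define MAX_FREQ  0x8000                 /* rescale when code count hits 2^15 */

    static unsigned char  ring[N];
    static unsigned short freq[N_CHAR];

    /* Fill the ring buffer with spaces before compression starts. */
    static void init_ring(void)
    {
        int i;
        for (i = 0; i < N; i++)
            ring[i] = 0x20;
    }

    /* Map a literal byte or a match length to one of the 314 symbols:
     * literals are 0..255, lengths THRESHOLD+1..F become 256..313. */
    static int symbol_of(int literal, int match_len)
    {
        if (match_len <= THRESHOLD)
            return literal;
        return 256 + (match_len - (THRESHOLD + 1));
    }

    /* Halve all frequency counts, rounding up, once the total number of
     * coded symbols reaches MAX_FREQ, so the counts stay within 16 bits
     * and older statistics decay. */
    static void rescale_freq(void)
    {
        int i;
        for (i = 0; i < N_CHAR; i++)
            freq[i] = (freq[i] + 1) / 2;
    }

    /* Offset of a match, measured backwards from the current write position
     * in the ring buffer; 12 bits suffice to address the 4kB buffer. */
    static int match_offset(int current, int match_pos)
    {
        return (current - match_pos) & (N - 1);
    }

    int main(void)
    {
        init_ring();

        /* Pretend the literal 'A' has been coded many times, then rescale. */
        freq[symbol_of('A', 0)] = 1000;
        rescale_freq();

        printf("symbols: %d (256 literals + %d repeat lengths)\n",
               N_CHAR, F - THRESHOLD);
        printf("'A' -> symbol %d, 60-byte match -> symbol %d\n",
               symbol_of('A', 0), symbol_of(0, F));
        printf("freq['A'] after rescale: %u\n", (unsigned)freq['A']);
        printf("offset of a match 100 bytes back: %d\n",
               match_offset(500, 400));
        return 0;
    }

Per the document, the rescale step serves double duty: it keeps the counts
within 16-bit words, and the resulting decay of old statistics is what
partially compensates for the otherwise slow adaptation of the Huffman tree.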