11 Şubat 2017 Cumartesi

Sıkıştırma - LZW

Giriş
Lempel, Ziv, Welch Decompression Algorithm olarak bilinir. Açıklaması şöyle. 1984 yılında icat edilmiş.
LZW compression is named after its developers, A. Lempel and J. Ziv, with
later modifications by Terry A. Welch.
Tüm Modern Sıkıştırma Algoritmalarının Atasıdır
Açıklaması şöyle.
It is the foremost technique for general purpose data compression due to its simplicity and versatility. Typically, you can expect LZW to compress text, executable code, and similar
data files to about one-half their original size. LZW also performs well when presented with extremely redundant data files, such as tabulated numbers, computer source code, and acquired signals. Compression ratios of 5:1 are common for these cases. LZW is the basis of several personal computer utilities that claim to "double the capacity of your hard drive."

LZW compression is always used in GIF image files, and offered as an option in TIFF and PostScript.
Encoding Table
Açıklaması şöyle.
LZW compression uses a code table, .... A common choice is to provide 4096 entries in the table. In this case, the LZW encoded data consists entirely of 12 bit codes, each referring to one of the entries in the code table. Uncompression is achieved by taking each code from the compressed file, and translating it through the code table to find what character or characters it represents. Codes 0-255 in the code table are always assigned to represent single bytes from the input file. For example, if only these first 256 codes were used, each byte in the original file would be converted into 12 bits in the LZW encoded file, resulting in a 50% larger file size. During uncompression, each 12 bit code would be translated via the code table back into the single bytes. Of course, this wouldn't be a useful situation.

The LZW method achieves compression by using codes 256 through 4095 to represent sequences of bytes. For example, code 523 may represent the sequence of three bytes: 231 124 234. Each time the compression algorithm encounters this sequence in the input file, code 523 is placed in the encoded file. During uncompression, code 523 is translated via the code table to
recreate the true 3 byte sequence. The longer the sequence assigned to a single code, and the more often the sequence is repeated, the higher the compression achieved.
KwKwK Problemi
Açıklaması şöyle
The abnormal case occurs whenever an input character string contains the sequence KwKwK, where Kw already appears in the compressor string table. The compressor will parse out Kw, send CODE(Kw), and add KwK to its string table. Next it will parse out KwK and send the just-generated CODE(KwK). The decompressor, on receiving CODE(KwK), will not yet have added that code to its string table because it does not yet know an extension character for the prior string.



Hiç yorum yok:

Yorum Gönder