What is LZW Compression?
LZW stands for Lempel-Ziv-Welch, the names of the three creators of this data compression technique. Data compression is the process of reducing the size of a file by summarizing its data. Smaller files can be downloaded more quickly and require less disk space for storage. LZW is one of the most popular compression algorithms and is used in many programs and image formats.
-
History
-
In 1983 Sperry filed a patent for an algorithm developed by Terry Welch, an employee at the Sperry Research Center. This algorithm is Welch's variation on a data compression technique first proposed by Jakob Ziv and Abraham Lempel in 1978. Welch's technique is both simpler and faster. He published an article in the June 1984 issue of IEEE Computer Magazine describing the technique. The technique became very popular and was widely adopted.
-
Function
-
LZW compression is a form of substitutional compression. In this form of compression, a specific, unique string of characters is replaced with a reference to that phrase, which is maintained in a dictionary. The resulting data compresses because the reference to the repeated phrase is much smaller. Suppose you were compressing the phrase, "the mako shark is the fastest breed of shark." Because the word "the" is repeated, it can be replaced with a placeholder like "*".
Considerations
-
While LZW compression is very fast, it is best suited for files that contain repetitive data. Text files and monochrome graphic images are ideal for LZW compression. Compressed files that do not contain repetitive data will actually grow in size because of the LZW data dictionary. U.S. software manufacturers who wished to use the LZW algorithm were obligated to pay a licensing fee to Unisys before the patent expired in June of 2003.
Significance
-
Many software developers have adopted LZW compression. Spencer Thomas, the creator of the UNIX compress utility, coded LZW compression into version 1.2 of compress in July of 1984. In 1987, Bob Berry and a team at Compuserve created the GIF (Graphics Interchange Format) file format, which is still in use as of October, 2009. GIF is incredibly flexible, due in part to its use of LZW to compress graphics data.
Scandal
-
Terry Welch's 1984 article made no mention of the pending patent Sperry had filed in 1983. The patent was granted in 1985 to Sperry, which later merged with the Burroughs Corporation--which merged to become Unisys in 1986. For nine years, the GIF format grew in popularity and adoption. Then on December 24, 1994, Unisys and Compuserve announced that any developers writing software that created or read the GIF file format had to pay a licensing fee to Unisys. This was widely decried as the "Unisys GIF Tax", and considered to be unethical, if not illegal.
LZW Today
-
LZW compression is in the public domain, and freely available for use by anyone. The U.S. patent expired in 2003, and the European, Canadian and Japanese patents expired in 2004.
References
Resources
- Photo Credit Image by Flickr.com, courtesy of Procsilas Moscas