The gzip format is the de facto standard compression format in the UNIX and Linux worlds. In a previous Dev Shed article titled Zip Meets Java, Kulvir demonstrated how to use the java.util.zip package to programmatically manipulate files in the ZIP format. In this article, we’ll cover how to use the java.util.zip package to create and read files using the gzip format.
gzip (short for GNU zip) is a compression utility (mainly found on the *NIX platforms) that produces files with an extension of .gz. gzip was created by Jean-Loup Gailly and Mark Adler as a replacement for the compress utility; it offers better compression ratios and an open (non-patented) compression algorithm. The sister utility, gunzip, is used to decompress files that are in the gzip format. The intricacies of the gzip format are beyond the scope of this article. However, you can learn more about the compression and decompression algorithms used by gzip by following the link in our references section at the end of this article. You can learn more about the underlying formats by reading RFC 1951, which specifies the DEFLATE compressed data format, and RFC 1952, which specifies the gzip file format.
Some key points about the gzip format:
A lossless compressed data format
Data is compressed using the LZ77 algorithm and Huffman coding
Format is not covered by patents, thus making it publicly usable without fear of legal repercussion
Format includes a cyclic redundancy check value to detect data corruption and ensure data integrity
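To make the integrity check concrete, here is a minimal sketch that compresses a byte array in memory and compares the CRC-32 stored in the gzip trailer with one computed independently over the original data. It relies on the layout given in RFC 1952, where the last eight bytes of a gzip member hold the CRC-32 followed by the uncompressed size, both little-endian. The class GzipTrailerCheck and its method names are hypothetical, chosen only for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class for illustration; not part of java.util.zip.
final class GzipTrailerCheck {

    // Compresses data into a single gzip member held in memory.
    static byte[] gzip(byte[] data) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the stream finishes compression and writes the trailer.
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Wraps the last 8 bytes: CRC-32, then uncompressed size (RFC 1952).
    private static ByteBuffer trailer(byte[] gzipBytes) {
        return ByteBuffer.wrap(gzipBytes, gzipBytes.length - 8, 8)
                .order(ByteOrder.LITTLE_ENDIAN);
    }

    // CRC-32 recorded in the trailer, as an unsigned 32-bit value.
    static long storedCrc(byte[] gzipBytes) {
        return trailer(gzipBytes).getInt() & 0xFFFFFFFFL;
    }

    // Uncompressed size recorded in the trailer (modulo 2^32).
    static long storedSize(byte[] gzipBytes) {
        ByteBuffer t = trailer(gzipBytes);
        t.getInt(); // skip the CRC-32 field
        return t.getInt() & 0xFFFFFFFFL;
    }

    // CRC-32 computed directly over the uncompressed data.
    static long computedCrc(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[] data = "integrity check demo".getBytes(StandardCharsets.UTF_8);
        byte[] packed = gzip(data);
        System.out.println("CRC matches:  " + (storedCrc(packed) == computedCrc(data)));
        System.out.println("size matches: " + (storedSize(packed) == data.length));
    }
}
```

In ordinary use you would not read the trailer yourself. A decompressor performs this comparison while reading and reports a mismatch as an error.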
Since its inception, the gzip format has been widely adopted. For example, the gzip utility has been formally adopted by the GNU project, and freely downloadable utilities that compress and decompress gzip files exist for various platforms. In Java, the gzip functionality lives in the java.util.zip package, which has been available since JDK 1.1.
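As a first taste of that package, the following sketch round-trips a byte array through GZIPOutputStream and GZIPInputStream entirely in memory. The class name GzipRoundTrip and its helper methods are hypothetical, introduced only for this example. Real code would more likely wrap file streams than byte arrays.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class for illustration; not part of java.util.zip.
final class GzipRoundTrip {

    // Compresses a byte array into gzip format.
    static byte[] compress(byte[] data) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the GZIPOutputStream flushes the data and writes the trailer.
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Decompresses gzip-formatted bytes back into the original data.
    static byte[] decompress(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            // read() returns -1 once the end of the compressed stream is reached.
            while ((n = gzip.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] original = "gzip gzip gzip gzip gzip".getBytes(StandardCharsets.UTF_8);
        byte[] packed = compress(original);
        byte[] unpacked = decompress(packed);
        System.out.println(new String(unpacked, StandardCharsets.UTF_8));
    }
}
```

The sketch converts checked IOExceptions into UncheckedIOException only to keep the example short. When working with files, letting IOException propagate is usually the better choice.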