The tar utility included in most Linux distributions can extract .tar.gz files by passing the -z option, e.g., in combination with -x and -f, where -z instructs decompression, -x means extraction, and -f specifies the name of the compressed archive file to extract from. Optionally, -v (verbose) lists files as they are being extracted.

The
zlib library supports the gzip file format. The gzip format is used in HTTP compression, a technique used to speed up the transfer of HTML and other content on the World Wide Web. It is one of the three standard formats for HTTP compression as specified in RFC 2616. This RFC also specifies a zlib format (called "DEFLATE"), which is identical to the gzip format except that gzip adds eleven bytes of overhead in the form of headers and trailers. Still, the gzip format is sometimes recommended over zlib because Internet Explorer does not implement the standard correctly and cannot handle the zlib format as specified in RFC 1950.

Since the late 1990s,
bzip2, a file compression utility based on a block-sorting algorithm, has gained some popularity as a gzip replacement. It produces considerably smaller files (especially for source code and other structured text), but at the cost of more memory and processing time (up to a factor of 4). AdvanceCOMP, Zopfli, libdeflate and 7-Zip can produce gzip-compatible files, using an internal DEFLATE implementation with better compression ratios than gzip itself, at the cost of more processor time compared to the reference implementation.

Research published in 2023 showed that simple lossless compression techniques such as gzip could be combined with a
k-nearest-neighbor classifier to create an attractive alternative to deep neural networks for text classification in natural language processing. This approach has been shown to equal, and in some cases outperform, conventional approaches such as BERT while having low resource requirements, e.g. no requirement for GPU hardware.

==See also==