ZIP (file format)
The ZIP file format is a popular data compression and archival format. A ZIP file contains one or more files that are compressed or stored.
The format was designed by Phil Katz for PKZIP, and in the form now used (PKZIP 2 format) it uses his DEFLATE algorithm for compression. Many software utilities other than PKZIP itself are now available to create, modify or open zip files, notably WinZip, PicoZip, Info-ZIP, WinRAR and 7-Zip. Microsoft even included minimal zip support under the name "compressed folders" in later versions of Windows.
ZIP files generally use the file extensions ".zip" or ".ZIP" and have the MIME media type application/zip. Some software uses the ZIP file format as a wrapper to easily store a large number of small items. Generally, when this is done, a non-standard file extension is used. Examples of this usage are Java JAR files and the various OpenOffice.org document formats.
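The wrapper idea can be illustrated with a small sketch using Python's standard zipfile module. The container name and the item names here are hypothetical and only stand in for whatever an application would store; the point is that the contents, not the file extension, make something a ZIP archive.

```python
import io
import zipfile


def make_container(items: dict[str, bytes]) -> bytes:
    """Pack many small named items into one ZIP-based container (in memory)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, payload in items.items():
            archive.writestr(name, payload)
    return buffer.getvalue()


def list_items(container: bytes) -> list[str]:
    """Return the names stored in a ZIP-based container."""
    with zipfile.ZipFile(io.BytesIO(container)) as archive:
        return archive.namelist()


# Hypothetical items; an application might save this as "report.mydoc".
blob = make_container({
    "content.xml": b"<doc>hello</doc>",
    "meta.xml": b"<meta/>",
})
print(zipfile.is_zipfile(io.BytesIO(blob)), list_items(blob))
```

Any ordinary ZIP tool can open such a container once it is told to treat the file as a ZIP archive.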
Technical info
Zip is a fairly simple archive format that compresses every file separately. Files can be stored either uncompressed or using a variety of compression algorithms. However, in practice, zip is nearly always used with deflate. All popular modern zip tools use it by default and almost all zips seen online are made using deflate. Some zip tools (especially the more minimal ones) don't even support reading zips made with the older compression methods, let alone producing them.
Zip supports a simple password-based symmetric encryption system which is known to be seriously flawed (it is vulnerable to known-plaintext attacks). It also supports spreading archives across multiple removable disks (generally floppies, but it could also be used with other removable media).
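Because each file is compressed on its own, a single archive can mix stored and compressed entries. The following is a minimal sketch using Python's standard zipfile module; the entry names and sample data are made up for illustration.

```python
import io
import zipfile


def build_mixed_archive(payload: bytes) -> bytes:
    """Write the same payload twice: once stored, once deflated."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("stored.txt", payload, compress_type=zipfile.ZIP_STORED)
        archive.writestr("deflated.txt", payload, compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


def describe(archive_bytes: bytes) -> list[tuple[str, int, int, int]]:
    """Return (name, method number, original size, compressed size) per entry."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return [
            (info.filename, info.compress_type, info.file_size, info.compress_size)
            for info in archive.infolist()
        ]


for entry in describe(build_mixed_archive(b"hello " * 200)):
    print(entry)
```

The method number is recorded per entry, which is why a reader has to support every method that appears in the archive it is given.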
New features including new compression and encryption methods have been added to zip in more recent times, but these are not supported by many tools and are not in wide use.
Compression methods
Shrinking (method 1)
Shrinking is a variant of LZW with a few minor tweaks, and as such it was affected by the LZW patent issue. It was never clear if the patent covered unshrinking but some open source projects (for example Info-ZIP) decided to play it safe and not include unshrinking support in the default builds.
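For readers unfamiliar with LZW, the sketch below shows the textbook algorithm that Shrinking is based on. It is not the Shrinking method itself: it omits the tweaks mentioned above and does not pack codes into the bit layout a ZIP entry would use. The function names are illustrative.

```python
def lzw_compress(data: bytes) -> list[int]:
    """Textbook LZW: emit one code per longest already-known byte sequence."""
    table = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate
        else:
            codes.append(table[current])
            table[candidate] = next_code  # learn the new sequence
            next_code += 1
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: list[int]) -> bytes:
    """Rebuild the same table while decoding, so no table is transmitted."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    output = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # The code refers to the entry that is about to be defined.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        output.append(entry)
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return b"".join(output)


sample = b"TOBEORNOTTOBEORTOBEORNOT"
codes = lzw_compress(sample)
print(len(sample), "bytes ->", len(codes), "codes")
```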
Reducing (methods 2-5)
Reducing combines compression of repeated byte sequences with a probability-based encoding of the result.
Imploding (method 6)
Imploding involves compressing repeated byte sequences with a sliding window then compressing the result using multiple Shannon-Fano trees.
Tokenizing (method 7)
This method number is reserved. The PKWARE specification does not define an algorithm for it.
Deflate and enhanced deflate (methods 8 and 9)
These methods use the well-known deflate algorithm. Deflate allows a window of up to 32K; enhanced deflate allows a window of up to 64K.
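The 32K window can be seen directly in Python's standard zlib module, where a window-bits value of 15 means a window of 2^15 bytes. The sketch below produces a raw deflate stream (a negative window-bits value tells zlib to omit its own header and trailer), which is the form in which method 8 data is held inside a ZIP entry. The helper names are illustrative, and the sketch does not cover enhanced deflate, which zlib does not produce.

```python
import zlib

WINDOW_BITS = 15  # 2**15 bytes = 32K window, the maximum for deflate


def raw_deflate(data: bytes, level: int = 9) -> bytes:
    """Compress to a raw deflate stream (no zlib or gzip wrapper)."""
    compressor = zlib.compressobj(level, zlib.DEFLATED, -WINDOW_BITS)
    return compressor.compress(data) + compressor.flush()


def raw_inflate(stream: bytes) -> bytes:
    """Decompress a raw deflate stream."""
    return zlib.decompress(stream, -WINDOW_BITS)


sample = b"zip " * 1000
packed = raw_deflate(sample)
print(len(sample), "->", len(packed), "bytes")
```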
Bzip2 (method 12)
This method uses the well-known bzip2 algorithm. Bzip2 usually achieves better compression ratios than deflate, but it is not widely supported by zip tools.
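Python's standard zipfile module is one tool that can write bzip2 entries, provided the interpreter was built with the bz2 module. The following sketch writes one entry with a chosen method and reads it back; the entry name and sample data are made up, and how the two methods compare in size depends heavily on the input.

```python
import io
import zipfile


def zip_with_method(data: bytes, method: int) -> bytes:
    """Create an in-memory archive holding one entry compressed with `method`."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=method) as archive:
        archive.writestr("sample.bin", data)
    return buffer.getvalue()


def read_back(archive_bytes: bytes) -> tuple[int, bytes]:
    """Return the method number recorded for the entry and its contents."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        info = archive.getinfo("sample.bin")
        return info.compress_type, archive.read("sample.bin")


data = b"The quick brown fox jumps over the lazy dog. " * 500
for label, method in (("deflate", zipfile.ZIP_DEFLATED), ("bzip2", zipfile.ZIP_BZIP2)):
    blob = zip_with_method(data, method)
    print(label, read_back(blob)[0], len(blob))
```

An archive written this way may fail to open in tools that only implement deflate, which is the practical cost noted above.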
History
Early history
The ZIP file format was originally created by Phil Katz, founder of PKWARE. Katz publicly released technical documentation on the ZIP file format, along with the first version of his PKZIP archiver, in January 1989.
An earlier compression and archival program, ARC, from System Enhancement Associates (SEA), was distributed not only as executable software but also as C source code. Katz had copied ARC and converted the compression routines from C to optimised assembler code, which made it much faster. SEA initially tried to license Katz's archiver, called PKARC, but Katz refused. SEA then sued Katz for copyright infringement and won.
During settlement, Katz still refused to license PKARC to SEA, instead agreeing to pay SEA's legal fees and stop selling PKARC. He then went on to create his own file format, and the .ZIP format he designed was a much more efficient compression format than .ARC. Once the PKZIP software was released, many users abandoned .ARC because of its slower performance and because Katz had successfully convinced them that he was the "good guy" being unfairly treated by an evil corporation.
Moving beyond the command line
In the mid-1990s, as more new computers included graphical user interfaces, there were more users who were not comfortable with the command-line operation of PKZIP. Seeing an opportunity, shareware authors began pitching compression and archival programs with graphical user interfaces. Many of these used the ZIP format. WinZip was among the most popular. PKWARE (Katz's company) also offered a graphical version of PKZIP. These graphical compression programs were easier to learn than the older command-line equivalents, but they still required learning an additional program and an additional interface just for compression.
In the late 1990s, various file manager software started integrating support for the ZIP format into the file manager user interface. The KDE file manager (kfm) supported this very early, and support was also added to Windows Explorer (as of Windows Me and Windows XP), the Mac OS Finder (as of Mac OS X), the Nautilus file manager used with GNOME, the Konqueror file manager used with newer versions of KDE, and others. By 2002, all major desktop environments included zipfile support in their file managers. Typically, in any modern file manager, a ZIP file may be treated as a directory or folder, so that files are copied into and out of it in the same manner as any other folder; the compression is handled in a way that is largely transparent to the end user. This eliminates the need for the user to learn to use a program and an interface just for the purpose of compression and archival, since the same interface can be used as for regular file management.
External links
- http://www.bbsdocumentary.com/research/CONTROVERSY/LAWSUITS/SEA/baker.html Ben Baker remembers Phil Katz
- http://www.esva.net/~thom/philkatz.html Thom Henderson's opinion of Phil Katz
- http://www.info-zip.org/pub/infozip/doc/ Technical specifications of the PKZIP file formats from Info-ZIP
- http://www.pkware.com/company/standards/appnote/appnote.txt Current file format specification from PKWARE (including many recent features that are not widely supported)
- http://groups.google.com/groups?selm=14480%40cup.portal.com Original specification for the first version of the format