Compression Formats for Network Deployment
To increase server and network availability and conserve bandwidth, Java applications and applets can be deployed using two compression formats: gzip and Pack200. With both techniques, the compressed JAR files are transmitted over the network, and the receiving application uncompresses and restores them.
See Reducing the Download Time in the Java Tutorials to learn how to create and deploy a compressed JAR file for a rich Internet application.
This section describes the technical details of how a web server handles a compressed JAR file.
Hypertext Transfer Protocol -- HTTP/1.1 (RFC 2616) discusses HTTP compression. HTTP compression allows application JAR files to be deployed as compressed JAR files. The supported compression techniques are gzip, compress, and deflate.
As of SDK/JRE version 5.0, HTTP compression is implemented in Java Web Start and Java Plug-in in compliance with RFC 2616. The supported techniques are gzip and pack200-gzip.
The requesting application can send an HTTP request to the server indicating its ability to handle compressed versions of the file. The following is an example HTTP request created when the Dynamic Tree Demo applet, whose JAR file has been compressed with Pack200, is loaded:
GET http://www.example.com/DynamicTreeDemo.jar.pack.gz HTTP/1.1
accept-encoding: pack200-gzip,gzip
User-Agent: Mozilla/4.0 (Windows 7 6.1) Java/1.7.0
Host: example.com
Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
Connection: keep-alive
The following is the HTTP response from the server:
HTTP/1.1 200 OK
Date: Wed, 21 Mar 2012 20:13:22 GMT
Server: Apache/2.2.11 (Solaris, Linux, or Mac OS X) mod_ssl/2.2.11 OpenSSL/0.9.8k SVN/1.6.2 DAV/2
Last-Modified: Thu, 08 Mar 2012 03:48:34 GMT
ETag: "489ee5-112d-4bab326774e43"
Accept-Ranges: bytes
Content-Length: 4397
Keep-Alive: timeout=5, max=99
Connection: Keep-Alive
Content-Type: application/x-gzip
Content-Encoding: gzip
For more information about the Dynamic Tree Demo applet, see Deploying an Applet in the Java Tutorials.
The Accept-Encoding field, which is set by the client, specifies what the client can accept. The Content-Encoding field, which is set by the server, indicates what is being sent. The Content-Type field indicates what the client should expect when the transformation or decoding is done.
In this example, the Accept-Encoding field is set to pack200-gzip and gzip, indicating to the server that the application (in this case, Mozilla Firefox running in Windows 7 with the Java Plug-in that comes with JRE 7) can handle the pack200-gzip and gzip formats.
The server searches for the requested JAR file with a .pack.gz or .gz file extension and responds with the located file. The server sets the Content-Encoding field of the response header to pack200-gzip, gzip, or NULL, depending on the type of file that is being sent, and may optionally set the Content-Type field to application/x-java-archive. Therefore, by inspecting the Content-Encoding field, the requesting application can apply the corresponding transformation to restore the original JAR file.
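The client-side decision described above can be sketched as follows. This is a minimal illustration, not the code used by Java Web Start or the Java Plug-in: the class ContentEncodingDecoder and its methods are hypothetical names, the response body is assumed to be held in memory, and the Pack200 unpacking stage is deliberately left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Locale;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical client-side helper; not part of Java Web Start or Java Plug-in. */
class ContentEncodingDecoder {

    /** Restores the transmitted bytes according to the Content-Encoding field. */
    static byte[] decode(String contentEncoding, byte[] body) {
        // A missing or empty field means the server sent the plain JAR file.
        if (contentEncoding == null || contentEncoding.trim().isEmpty()) {
            return body;
        }
        String encoding = contentEncoding.trim().toLowerCase(Locale.ROOT);
        if (encoding.equals("gzip")) {
            return gunzip(body);
        }
        if (encoding.equals("pack200-gzip")) {
            // A real client would gunzip and then run a Pack200 unpacker;
            // that second stage is left out of this sketch.
            throw new UnsupportedOperationException("Pack200 unpacking not shown");
        }
        throw new IllegalArgumentException("Unexpected Content-Encoding: " + contentEncoding);
    }

    /** Uncompresses gzip data held in memory. */
    static byte[] gunzip(byte[] compressed) {
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int count;
            while ((count = in.read(buffer)) != -1) {
                out.write(buffer, 0, count);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Compresses data with gzip; used here only to simulate a server response. */
    static byte[] gzip(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

For instance, decode("gzip", gzip(bytes)) returns the original bytes, while decode(null, bytes) returns the body unchanged.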
Example 1: Application requesting packed or compressed JAR
In Example 1, the client requests the file foo.jar with the Accept-Encoding field pack200-gzip,gzip. The server searches for the file foo.jar.pack.gz. If the server finds the file, it sends the file to the client and sets the Content-Encoding field to pack200-gzip.
Example 2: Application requesting packed or compressed JAR
In Example 2, if the file foo.jar.pack.gz is not found, the server responds with the file foo.jar.gz, if it is found, and sets the Content-Encoding field to gzip.
Example 3: Application requesting packed or compressed JAR
In Example 3, if the files foo.jar.pack.gz and foo.jar.gz are not found, the server responds with the file foo.jar and either does not set the Content-Encoding field or sets it to NULL.
Example 4: Legacy application requesting JAR
In Example 4, a legacy application (one that supports neither HTTP compression nor Pack200) requests the file foo.jar and receives it unchanged, so the application continues to work seamlessly. Therefore, it is recommended that you host all three files: foo.jar, foo.jar.gz, and foo.jar.pack.gz.
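The lookup order in Examples 1 through 4 can be sketched as a small selection routine. This is only an illustration of the described behavior, not how any particular web server implements it: CompressedJarSelector, Selection, and the hostedFiles set are hypothetical, and quality values (q=...) in the Accept-Encoding field are ignored.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical server-side lookup; real web servers are configured differently. */
class CompressedJarSelector {

    /** The file to send and the Content-Encoding value (null means the field is not set). */
    static final class Selection {
        final String file;
        final String contentEncoding;

        Selection(String file, String contentEncoding) {
            this.file = file;
            this.contentEncoding = contentEncoding;
        }
    }

    /** Prefers foo.jar.pack.gz, then foo.jar.gz, then the plain foo.jar. */
    static Selection select(String requestedJar, String acceptEncoding, Set<String> hostedFiles) {
        Set<String> accepted = parseAcceptEncoding(acceptEncoding);
        String packed = requestedJar + ".pack.gz";
        String gzipped = requestedJar + ".gz";
        if (accepted.contains("pack200-gzip") && hostedFiles.contains(packed)) {
            return new Selection(packed, "pack200-gzip");
        }
        if (accepted.contains("gzip") && hostedFiles.contains(gzipped)) {
            return new Selection(gzipped, "gzip");
        }
        // Legacy clients, or no compressed variant hosted: send the JAR as-is.
        return new Selection(requestedJar, null);
    }

    /** Splits a header such as "pack200-gzip,gzip" into lower-case encoding names. */
    static Set<String> parseAcceptEncoding(String header) {
        Set<String> names = new LinkedHashSet<>();
        if (header == null) {
            return names;
        }
        for (String token : header.split(",")) {
            int semicolon = token.indexOf(';');
            String name = (semicolon >= 0 ? token.substring(0, semicolon) : token)
                    .trim().toLowerCase(Locale.ROOT);
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }
}
```

Hosting all three variants lets this routine serve the smallest file each client can handle, while a request with no Accept-Encoding field still receives foo.jar.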
gzip is a freely available compressor, provided within the JRE and the SDK as java.util.zip.GZIPInputStream and java.util.zip.GZIPOutputStream.
The command line versions are available with most Solaris, Linux, or Mac OS X operating systems, Windows UNIX toolkits (Cygwin and MKS Toolkit), or from http://www.gzip.org/.
You can get the highest degree of compression by using gzip to compress an uncompressed JAR file rather than a JAR file that is already compressed. The downside is that the JAR file may be stored uncompressed on target systems.
Here is an example:
Using gzip to compress a JAR file that contains individual deflated entries:
Notepad.jar: 46.25 KB
Notepad.jar.gz: 43.00 KB
Using gzip to compress a JAR file that contains stored entries (stored entries are entries that are not compressed):
Notepad.jar: 987.47 KB
Notepad.jar.gz: 32.47 KB
As you can see, compressing the JAR file with stored entries yields a 32.47 KB download, about 30% smaller than the 46.25 KB JAR file with deflated entries, whereas compressing the JAR file with deflated entries saves only about 7%.
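The effect can be reproduced with a small experiment. The following sketch builds two in-memory archives with the same made-up text entries, one with stored entries and one with deflated entries, and then gzips both. The class StoredVersusDeflated, the entry names, and the sample content are all hypothetical; because a JAR file is a ZIP archive, ZipOutputStream is sufficient for a size comparison. Actual sizes depend on the content, so no specific output is claimed here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.GZIPOutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

/** Hypothetical experiment comparing gzip on stored and deflated archive entries. */
class StoredVersusDeflated {

    /** Builds an in-memory archive whose entries are either stored or deflated. */
    static byte[] buildArchive(boolean stored) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            for (int i = 0; i < 20; i++) {
                byte[] content = sampleContent(i);
                ZipEntry entry = new ZipEntry("demo/Entry" + i + ".txt");
                if (stored) {
                    // Stored entries must declare their size and CRC up front.
                    CRC32 crc = new CRC32();
                    crc.update(content);
                    entry.setMethod(ZipEntry.STORED);
                    entry.setSize(content.length);
                    entry.setCompressedSize(content.length);
                    entry.setCrc(crc.getValue());
                }
                zip.putNextEntry(entry);
                zip.write(content);
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Made-up, repetitive text standing in for real resources. */
    static byte[] sampleContent(int index) {
        StringBuilder text = new StringBuilder();
        for (int line = 0; line < 40; line++) {
            text.append("entry ").append(index).append(" line ").append(line)
                .append(": sample resource text\n");
        }
        return text.toString().getBytes(StandardCharsets.UTF_8);
    }

    static byte[] gzip(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Percentage by which the compressed size is smaller than the original size. */
    static double reductionPercent(double originalSize, double compressedSize) {
        return 100.0 * (originalSize - compressedSize) / originalSize;
    }

    public static void main(String[] args) {
        byte[] stored = buildArchive(true);
        byte[] deflated = buildArchive(false);
        System.out.println("stored:   " + stored.length + " -> " + gzip(stored).length);
        System.out.println("deflated: " + deflated.length + " -> " + gzip(deflated).length);
    }
}
```

The reductionPercent helper applies the same arithmetic as the Notepad.jar figures: reductionPercent(46.25, 43.00) is roughly 7, and reductionPercent(46.25, 32.47) is roughly 30.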
Pack200 compresses large files very efficiently, depending on the density and size of the class files in the JAR file. You can expect compression to one-ninth the size of the JAR file if it contains only class files and is in the order of several megabytes.
Using the same JAR file as in the previous example:
Notepad.jar: 46.25 KB
Notepad.jar.pack.gz: 22.58 KB
In this case, the same JAR file can be reduced by about 50%.
Pack200 works most efficiently on Java class files, and it uses several techniques to reduce the size of JAR files.
Compress and uncompress JAR files with the command line interfaces pack200 and unpack200 in the bin directory of your SDK or JRE directory.
You can also programmatically invoke Pack200 interfaces; see java.util.jar.Pack200.
All these factors play into choosing a compression technique. The unpack200 tool is designed to be as efficient as possible, and it takes little time to restore the original file. If you have large JAR files (2 MB or more) composed mostly of class files, Pack200 is the preferred compression technique. If you have large JAR files that are composed mostly of resource files (JPEG, GIF, data, and so on), then gzip is the preferred compression technique.
Pack200 loads the entire compressed file into memory. However, when target systems are memory and resource constrained, setting Pack200.Packer.SEGMENT_LIMIT to a lower value reduces the memory requirements during compression and uncompression.
As a special case, a value of -1 produces a single large segment with all input files, while a value of 0 produces one segment for each class. Larger archive segments result in less fragmentation and better compression, but processing them requires more memory.
The default is -1, which means pack200 always creates a single segment output file. In cases where extremely large output files are generated, you are strongly encouraged to use segmenting or to break up the input file into smaller JARs.
For example, a 10 MB JAR packed without this limit will typically pack about 10% smaller, but pack200 may require a larger Java heap (about ten times the segment limit).
Pack200 rearranges the contents of the resulting JAR file. The jarsigner tool hashes the contents of the class file and stores the hash in an encrypted digest in the manifest. When unpack200 uncompresses a file, the contents of the classes will be rearranged and thus invalidate the signature. Therefore, the JAR file must be normalized first using pack200 and unpack200, and thereafter signed.
Here's why this works: Any reordering pack200 does on any class file structures is idempotent, so the second time it is compressed, it does not change the orderings produced by the first compression. Also, unpack200 is guaranteed by the JSR 200 specification to produce a specific bytewise image for any given transmission ordering of archive elements.
For example, suppose you want to use HelloWorld.jar:
1. Recompress, or repack, the file to normalize the JAR file; repacking uncompresses and compresses the JAR file in one step.
% pack200 --repack HelloWorld.jar
2. Sign the JAR file.
% jarsigner -keystore myKeystore HelloWorld.jar user_name
Note: You must sign the repacked file with the same key that was used when building the original JAR file. Alternatively, delete all signature files found in the META-INF directory before repacking, re-signing and verifying. The signature files are named MANIFEST.MF, *.DSA and *.SF.
3. Verify the newly signed JAR file to ensure that the signing worked.
% jarsigner -verify HelloWorld.jar
jar verified.
4. Ensure that the JAR file still works.
% java -jar HelloWorld.jar
HelloWorld
5. Compress the JAR file with pack200.
% pack200 HelloWorld.jar.pack.gz HelloWorld.jar
Note: You must compress the JAR file with the same options that you used to repack the file to normalize the JAR file, as demonstrated in step 1. Additionally, you must set the segment limit to -1 (unlimited) for all packing steps when using JDK 6 and earlier releases to prevent accidental variations of segment boundaries; class file sizes can change slightly under these circumstances, thus disrupting signatures. The default segment limit for JDK 7 and later is -1.
6. Uncompress the file with unpack200.
% unpack200 HelloWorld.jar.pack.gz HelloT1.jar
7. Verify the JAR file.
% jarsigner -verify HelloT1.jar
jar verified.
8. Test the JAR file.
% java -jar HelloT1.jar
HelloWorld
After verification, you can deploy the compressed pack file HelloWorld.jar.pack.gz.
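If these steps are run from a build script, the command sequence can be generated rather than typed by hand. The following sketch only assembles and optionally runs the same command lines shown in the HelloWorld.jar example; the class SignedPackWorkflow and its parameters are hypothetical, and it assumes a JDK whose bin directory still contains pack200 and unpack200 (the tools were removed in JDK 14). The manual "java -jar" smoke tests are not included.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical build helper that lists the repack, sign, pack, and verify commands. */
class SignedPackWorkflow {

    /** Returns the command lines in the order they should run. */
    static List<List<String>> commands(String jar, String keystore, String alias,
                                       String unpackedCopy) {
        String packed = jar + ".pack.gz";
        List<List<String>> steps = new ArrayList<>();
        // Normalize the JAR file so that later packing does not disturb the signature.
        steps.add(Arrays.asList("pack200", "--repack", jar));
        // Sign and verify the normalized JAR file.
        steps.add(Arrays.asList("jarsigner", "-keystore", keystore, jar, alias));
        steps.add(Arrays.asList("jarsigner", "-verify", jar));
        // Pack with the same options as the repack step, then check the round trip.
        steps.add(Arrays.asList("pack200", packed, jar));
        steps.add(Arrays.asList("unpack200", packed, unpackedCopy));
        steps.add(Arrays.asList("jarsigner", "-verify", unpackedCopy));
        return steps;
    }

    /** Runs each command and stops at the first one that exits with a non-zero code. */
    static void run(List<List<String>> steps) throws IOException, InterruptedException {
        for (List<String> step : steps) {
            Process process = new ProcessBuilder(step).inheritIO().start();
            if (process.waitFor() != 0) {
                throw new IllegalStateException("Command failed: " + String.join(" ", step));
            }
        }
    }
}
```

Calling commands("HelloWorld.jar", "myKeystore", "user_name", "HelloT1.jar") yields the six command lines from the example, with "pack200 --repack HelloWorld.jar" first.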
By default, Pack200 behaves in a High Fidelity (Hi-Fi) mode, meaning that all the original attributes present in the classes, as well as the attributes of each individual entry in a JAR file, are retained. These attributes tend to add to the packed file size; here are some techniques you can use to further reduce the size of the download:
Modification times: If the modification time of the individual entries in a JAR file is not a concern, you can specify the option Pack200.Packer.MODIFICATION_TIME="LATEST". This allows one modification time to be transmitted in the pack file for each segment. That time is the latest modification time of any entry within the segment.
Deflation hint: Similar to setting the modification time to "LATEST", if the compression state of the individual entries in the archive is not required, set Pack200.Packer.DEFLATE_HINT="false". This fractionally reduces the download size, because individual compression hints are not transmitted. However, the recomposed JAR file will contain "stored" entries and hence may consume more disk space on the target system.
For example:
pack200 --modification-time=latest --deflate-hint="true" tools-md.jar.pack.gz tools.jar
Note: the above optimizations will yield better results with a JAR file containing thousands of entries.
Attributes: Several class attributes are not required when deploying JAR files. These attributes can be stripped out of class files, significantly reducing download size. However, care must be taken to ensure that required runtime attributes are maintained.
Debugging attributes: If debugging information, such as line numbers and source file names, is not required (it is typically used in application stack traces), these attributes can be discarded by specifying the --strip-debug option. This typically reduces the packed file by about 10%.
Example:
pack200 --strip-debug tools-stripped.jar.pack.gz tools.jar
Other attributes: Advanced users may use some of the other strip-related properties to strip out additional attributes. However, extreme caution should be used when doing so; the resultant JAR file must be tested on all possible Java runtime systems to ensure that the runtime does not depend on the stripped attributes.
Pack200 deals with standard attributes defined by the Java Virtual Machine Specification; however compilers are free to introduce custom attributes. When such attributes are present, by default, Pack200 passes through the class, emitting a warning message. These "passed-through" class files may contribute to bloating of packed files. If the unknown attributes are prevalent in the classes of a JAR file, this may lead to a very large bloat in the compressed output. In such cases, consider the following strategies:
Strip the attribute if it is deemed to be redundant at runtime; this can be achieved by setting the property Pack200.Packer.UNKNOWN_ATTRIBUTE=STRIP:
pack200 --unknown-attribute=strip foo.pack.gz foo.jar
If the attributes are required at runtime, and they do contribute to inflation in the size of the compressed file, then identify the attribute from the warning message and apply a suitable layout for these as described in the Pack200 JSR 200 specification, and the Java API reference section for the interface Pack200.Packer.
It is possible that a compiler defines an attribute that is not implemented in the layout specification of Pack200, which may cause pack200 to malfunction. In such cases, an entire class file or set of class files can be "passed through", as if it were a resource, by specifying its name as follows:
pack200 --pass-file="com/acme/foo/bar/baz.class" foo.pack.gz foo.jar
The following passes through an entire directory and its contents:
pack200 --pass-file="com/acme/foo/bar/" foo.pack.gz foo.jar
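When several of these size-reduction options are combined in a build, assembling the command line in one place helps keep the repack and pack steps consistent. The following is a minimal sketch that uses only the command-line options shown in this section; the class Pack200Options and its method names are hypothetical, and the code builds the argument list without running pack200.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical builder for a pack200 command line; it does not invoke the tool. */
class Pack200Options {
    private final List<String> flags = new ArrayList<>();

    /** Transmit one modification time per segment instead of one per entry. */
    Pack200Options latestModificationTime() {
        flags.add("--modification-time=latest");
        return this;
    }

    /** Use a single deflation hint for the whole archive instead of per-entry hints. */
    Pack200Options deflateHint(boolean deflate) {
        flags.add("--deflate-hint=" + deflate);
        return this;
    }

    /** Discard debugging attributes such as line numbers and source file names. */
    Pack200Options stripDebug() {
        flags.add("--strip-debug");
        return this;
    }

    /** Strip attributes that Pack200 does not recognize. */
    Pack200Options stripUnknownAttributes() {
        flags.add("--unknown-attribute=strip");
        return this;
    }

    /** Pass a class file or a directory through without packing it. */
    Pack200Options passFile(String path) {
        flags.add("--pass-file=" + path);
        return this;
    }

    /** Returns the full argument list: tool name, flags, output file, input JAR. */
    List<String> command(String outputFile, String inputJar) {
        List<String> command = new ArrayList<>();
        command.add("pack200");
        command.addAll(flags);
        command.add(outputFile);
        command.add(inputJar);
        return command;
    }
}
```

For example, new Pack200Options().stripDebug().command("tools-stripped.jar.pack.gz", "tools.jar") produces the same arguments as the --strip-debug example above. The resulting list can be handed to java.lang.ProcessBuilder on a JDK that still includes the pack200 tool.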
Note: When signing large JAR files, this step may fail with a security error. A likely cause is bug 5078608. Use one of the following workarounds:
Use --segment-limit=-1 during repacking and packing.
Repack the JAR file twice before packing it. Each command below is followed by the file it produces:
pack200 --repack b.jar a.jar
b.jar
pack200 --repack c.jar b.jar
c.jar
pack200 out.jar.pack.gz c.jar
out.jar.pack.gz
You may wish to take advantage of the Pack200 technology in your installation program, whereby a product's JAR files may need to be compressed using Pack200 and uncompressed during installation. If the JRE or SDK is bundled in the installation, you are free to use the unpack200 (Solaris, Linux, or Mac OS X) or unpack200.exe (Windows) tool in the distribution bin directory. This implementation is a pure C++ application requiring no Java runtime to be present for it to run.
Windows: Installers may use a better algorithm than the one in gzip to compress entries. In such cases, you will get better compression from the installer's intrinsic compression by using the pack200 tool as follows:
pack200 --no-gzip foo.jar.pack foo.jar
This prevents the output file from being gzip compressed.
unpack200 is a Windows console application; that is, it displays an MS-DOS window during the installation. To suppress this window, use a launcher with a WinMain entry point, as shown below.
Sample Code:
#include "windows.h"
#include <stdio.h>

int APIENTRY WinMain(HINSTANCE hInstance,
                     HINSTANCE hPrevInstance,
                     LPSTR lpCmdLine,
                     int nCmdShow) {
    STARTUPINFO si;
    memset(&si, 0, sizeof(si));
    si.cb = sizeof(si);

    PROCESS_INFORMATION pi;
    memset(&pi, 0, sizeof(pi));

    //Test
    //lpCmdLine = "c:/build/windows-i586/bin/unpack200 -l c:/Temp/log c:/Temp/rt.pack c:/Temp/rt.jar";
    int ret = CreateProcess(NULL,        /* Exec. name */
                            lpCmdLine,   /* cmd line */
                            NULL,        /* proc. sec. attr. */
                            NULL,        /* thread sec. attr */
                            TRUE,        /* inherit file handle */
                            CREATE_NO_WINDOW | DETACHED_PROCESS, /* detach the process/suppress console */
                            NULL,        /* env block */
                            NULL,        /* inherit cwd */
                            &si,         /* startup info */
                            &pi);        /* process info */
    if (ret == 0) ExitProcess(255);

    // Wait until child process exits.
    WaitForSingleObject(pi.hProcess, INFINITE);

    DWORD exit_val;
    // Be conservative and return
    if (GetExitCodeProcess(pi.hProcess, &exit_val) == 0) ExitProcess(255);

    ExitProcess(exit_val); // Return the error code of the child process

    return -1;
}
All JAR files, compressed and uncompressed, must be tested for correctness with your application's test qualifiers. When you use the command line interface pack200, the output file is compressed using gzip with default values. Alternatively, you can create a simple pack file and compress it using gzip with user-specified options or using some other compressor.
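As one way of applying user-specified gzip options from Java, the compression level of java.util.zip.GZIPOutputStream can be raised through the protected Deflater field it inherits from DeflaterOutputStream. The following is a minimal sketch; the class CustomGzip is hypothetical, and the input is assumed to be the bytes of a pack file produced with --no-gzip.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.Deflater;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper that gzips data with a non-default compression level. */
class CustomGzip {

    /** Compresses data with gzip at the highest compression level. */
    static byte[] gzipBestCompression(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out) {
            {
                // 'def' is the protected Deflater inherited from DeflaterOutputStream.
                def.setLevel(Deflater.BEST_COMPRESSION);
            }
        }) {
            gz.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Uncompresses gzip data; used here to check the round trip. */
    static byte[] gunzip(byte[] compressed) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int count;
            while ((count = in.read(buffer)) != -1) {
                out.write(buffer, 0, count);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The output is an ordinary gzip stream, so any standard gzip decoder can read it.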
For more information, see pack200 and unpack200 in Java Deployment Tools.
In Java SE 6, the Java class file format was updated. For more information, see JSR 202: Java Class File Specification Update. Because of JSR 202, the Pack200 engine was updated accordingly.
To keep the changes minimal and seamless for users, pack200 generates appropriately versioned pack files based on the version of the input class files.
Also, to maintain backward compatibility, if the input JAR files are composed solely of JDK 5 or older class files, a JDK 5 compatible pack file is produced. Otherwise, a Java SE 6 compatible Pack200 file is produced. For more information, refer to the pack200 man page for Solaris, Linux, Mac OS X, or Windows.
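To check in advance which kind of pack file to expect, you can inspect the class file versions in a JAR file. The following sketch reads the major version from the class file header (major version 49 corresponds to JDK 5, and 50 to Java SE 6). The class ClassFileVersion is hypothetical and only illustrates the decision described above; it is not the logic used inside pack200.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical helper that inspects class file versions. */
class ClassFileVersion {

    /** Major class file version produced by JDK 5 compilers. */
    static final int JDK_5_MAJOR = 49;

    /** Reads the major version from the first eight bytes of a class file. */
    static int majorVersion(byte[] classFile) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classFile))) {
            if (in.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("Not a class file");
            }
            in.readUnsignedShort();        // minor version, not needed here
            return in.readUnsignedShort(); // major version
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns true if every class file is from JDK 5 or an older release. */
    static boolean allJdk5OrOlder(Iterable<byte[]> classFiles) {
        for (byte[] classFile : classFiles) {
            if (majorVersion(classFile) > JDK_5_MAJOR) {
                return false;
            }
        }
        return true;
    }
}
```

If allJdk5OrOlder returns true for all class entries of a JAR file, the description above suggests that a JDK 5 compatible pack file would be produced.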