Best Linux compression tool: 8 utilities tested

8 best Linux compression utilities
We have rounded up a selection of file compression utilities for Linux to see which one crunches the bytes best.

In the '80s and early '90s, compression was king. As you struggled to connect to a BBS (bulletin board system) hosting the latest Amiga utilities, you dreamed of a time when connections would be faster and decompressing files wouldn't take as long as downloading them.

Fast forward a few decades and the sheer size of the data files we juggle is pretty boggling. Many have built-in compression of some kind. Bandwidth isn't such an issue any more, and in some ways neither is disk space, but it would still be nice if there were a quick and convenient way of reclaiming a few GB here or there, or of not having to wait so long when uploading email attachments.

Verdict

RAR
Version: 4.00 beta
Web: www.rarlab.com
Price: 29 Euros

As with ARJ, only really useful for trading files with Windows users.

Rating: 5/10

Bzip2

Julian Seward released the original bzip2 in 1997 under a BSD licence. In case you are wondering, there was indeed a bzip before that, but it was withdrawn by the author after possible patent worries loomed menacingly (ah, software patents, don't we all love them?).

Not to worry though, because bzip2 is better than it anyway. Using a combination of different algorithms - such as run-length encoding (RLE), the Burrows-Wheeler transform, and other such cunning trickery - it immediately became noteworthy in Unix circles because of the impressive compression achieved compared to the standard utility of the day, gzip.
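
To give a feel for two of the techniques named above, here is a toy sketch of the Burrows-Wheeler transform and run-length encoding. It is a naive illustration, not bzip2's actual code: the real tool works on large blocks and adds further stages (move-to-front and Huffman coding), and the function names and the `\0` end marker here are our own assumptions.

```python
def bwt(text: str, end: str = "\0") -> str:
    """Naive Burrows-Wheeler transform: sort every rotation, keep the last column."""
    assert end not in text, "the end marker must not occur in the input"
    s = text + end
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(row[-1] for row in rotations)


def inverse_bwt(last: str, end: str = "\0") -> str:
    """Rebuild the original text by repeatedly prepending the last column and sorting."""
    table = [""] * len(last)
    for _ in range(len(last)):
        table = sorted(last[i] + table[i] for i in range(len(last)))
    # The row that ends with the marker is the original string.
    row = next(r for r in table if r.endswith(end))
    return row[:-1]


def rle(text: str) -> list[tuple[str, int]]:
    """Run-length encode a string into (character, run length) pairs."""
    runs: list[tuple[str, int]] = []
    for ch in text:
        if runs and runs[-1][0] == ch:
            runs[-1] = (ch, runs[-1][1] + 1)
        else:
            runs.append((ch, 1))
    return runs


if __name__ == "__main__":
    transformed = bwt("banana")
    print(repr(transformed))         # identical letters tend to end up adjacent
    print(rle(transformed))          # which gives RLE longer runs to collapse
    print(inverse_bwt(transformed))  # the transform is fully reversible
```

The transform itself doesn't shrink anything; it merely rearranges the data so that later stages such as RLE have more repetition to work with.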

Cunningly coded to be almost identical in terms of usage, bzip2 soon became a drop-in replacement for all types of archiving purposes. Most notably, much source code was shipped using a tar/bzip2 combination instead of the usual tar/gzip combination of the time.
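
As a small illustration of how interchangeable the two are in practice, this sketch builds the same tarball with either compressor using Python's standard tarfile module, where only the mode string changes. The helper name `make_archive` and the sample file are hypothetical, and this says nothing about how any particular project packages its source.

```python
import io
import tarfile


def make_archive(files: dict, mode: str) -> bytes:
    """Pack {name: bytes} into an in-memory tar archive; mode is 'w:gz' or 'w:bz2'."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode=mode) as archive:
        for name, content in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(content)
            archive.addfile(info, io.BytesIO(content))
    return buffer.getvalue()


if __name__ == "__main__":
    source = {"hello.c": b"int main(void) { return 0; }\n" * 200}
    for mode in ("w:gz", "w:bz2"):
        print(mode, len(make_archive(source, mode)), "bytes")
```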

It's somewhat disappointing that in the intervening 14 years or so bzip2 hasn't replaced gzip entirely - changing the habits of Unix users is obviously like trying to steer a particularly fat continental shelf or something.

However, for large volumes of archiving, it seems the trade-off between space savings and compute time isn't always worth it. The figures we generated for Test 3 show that bzip2 running on maximum compression does shave a few per cent off the file size, but at the expense of taking around four times as long.
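
If you want a rough feel for this trade-off on your own data, the following sketch times Python's standard gzip and bz2 modules at a couple of compression levels. These modules wrap the same underlying libraries as the command-line tools, but this is not the harness used for our tests, and the sample data and labels are our own assumptions, so expect different numbers.

```python
import bz2
import gzip
import time


def measure(name, compress, data):
    """Time one compression call and report the output size."""
    start = time.perf_counter()
    packed = compress(data)
    elapsed = time.perf_counter() - start
    return {"name": name, "bytes": len(packed), "seconds": elapsed}


def compare(data: bytes):
    """Run the same input through gzip and bzip2 at two levels each."""
    candidates = [
        ("gzip -6", lambda d: gzip.compress(d, compresslevel=6)),
        ("gzip -9", lambda d: gzip.compress(d, compresslevel=9)),
        ("bzip2 -1", lambda d: bz2.compress(d, compresslevel=1)),
        ("bzip2 -9", lambda d: bz2.compress(d, compresslevel=9)),
    ]
    return [measure(name, fn, data) for name, fn in candidates]


if __name__ == "__main__":
    # Made-up, repetitive sample data; substitute a real file for meaningful figures.
    sample = b"".join(b"line %d of some repetitive log text\n" % i for i in range(50_000))
    for row in compare(sample):
        print(f"{row['name']:<9} {row['bytes']:>10} bytes  {row['seconds']:.3f} s")
```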

So if speed is of paramount importance to you, gzip is still a better option… Hang on, before we say that, you should check out the review for lbzip2.

Verdict

bzip2
Version: 1.0.6
Web: http://bzip.org
Price: Free (BSD)

It's fast and widely used, but switch to lbzip2 for a speed boost.

Rating: 5/10

lbzip2

This is an intriguing contender for the modern age. Using POSIX threads, this tool parallelises the compression routines so they can run in more than one thread at once, with the results combined afterwards. We care about this because lots of machines now have a multi-core processor.

Standard bzip2, and indeed many of the other tools on test, can only run in a single thread. That means if you have a dual-core processor, such as the one we used for testing, only one core is doing the hard work of compressing while the other lies idle. Of course, the other core can take care of the system overhead, but it is a bit of a waste.

Parallelising the task does include a bit of overhead in terms of processor time, because there has to be a 'dispatcher' component that allocates tasks to the threads and combines their results at the end. Even so, on a dual-core machine you should see a reduction in the time taken by around 40%, depending on the actual task.

This is borne out by our results - with the same settings, lbzip2 takes between 35 and 45% less time than bzip2. The significant thing is that it is by and large the same process, and you should end up with pretty much exactly the same files. In our tests, however, the resultant file sizes were a few bytes off in either direction, which may simply be due to slightly different application of the algorithms.

Importantly, files created with lbzip2 are valid bzip2 archives - the format hasn't changed, so they can be distributed to and uncompressed by those using bzip2. Lbzip2 is available in some repos, and some quarters suggest that it should just be aliased to the standard bzip2 commands - there is no real disadvantage to it even on a single core.
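
The general idea of splitting the work, compressing the pieces in parallel and ending up with something a standard bzip2 decompressor still accepts can be sketched as follows. This is not lbzip2's actual implementation: it simply compresses fixed-size chunks in a thread pool and concatenates the resulting streams, relying on the documented ability of bzip2 decompressors (including Python's bz2 module) to read concatenated streams. The function name, chunk size and worker count are our own assumptions, and the speed-up depends on CPython's bz2 module releasing its interpreter lock while compressing, which we understand it does.

```python
import bz2
from concurrent.futures import ThreadPoolExecutor

# Hypothetical chunk size, roughly one bzip2 block at the highest level.
CHUNK_SIZE = 900_000


def parallel_bz2(data: bytes, workers: int = 2, chunk_size: int = CHUNK_SIZE) -> bytes:
    """Compress chunks independently in a thread pool, then join the streams in order."""
    # The 'dispatcher' part: cut the input into pieces (at least one, even if empty).
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so the streams line up correctly.
        streams = list(pool.map(bz2.compress, chunks))
    return b"".join(streams)


if __name__ == "__main__":
    payload = bytes(range(256)) * 8000  # about 2MB of made-up sample data
    packed = parallel_bz2(payload)
    # A plain, single-threaded decompressor reads the result without special handling.
    assert bz2.decompress(packed) == payload
    print(len(payload), "->", len(packed), "bytes")
```

Because each chunk is compressed on its own, the output can differ slightly in size from what a single-stream run would produce.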

Verdict

lbzip2
Version: 0.23
Web: http://lacos.hu
Price: Free (GPL)

This is a faster version of the old Unix favourite.

Rating: 7/10

7zip

Released in 1999, 7zip (aka 7z or 7za) is a relative newcomer to compression. It was written by Igor Pavlov, who also designed the LZMA algorithm that forms the default compression mode.

The 7zip code also includes other compression methods, such as bzip2, so it can support formats other than the default .7z.

Although it's open source, the main development focus is on the Windows platform, where 7z enjoys a great deal of popularity, and the code comes with a natty front-end. The basic source code has been tweaked by some, while other projects have made use of the LZMA SDK to produce very similar variants. One of these is xz, and others include p7zip. For this test we compiled from the original source code.

Looking at the test results, it's easy to think that 7z isn't making use of the multiple cores on offer. In fact, it is a threaded application, but even so takes slightly longer than the single-threaded bzip2 archiver, and twice as long as lbzip2. We could make some allowances for this code, since it's compiled from the generic source rather than being geared to work on Linux, but it fares better than pxz, the parallelised version of the derivative xz compressor.

One area in which this algorithm does perform well is decompression, as this and the xz utilities consistently perform better than the rest of the pack (apart from gzip, whose files aren't as tightly compressed to begin with).
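
To check decompression speed on your own files, a sketch along these lines may help. It uses Python's standard lzma module, which reads and writes the xz format rather than .7z, so it is only a stand-in for the LZMA family and not a measurement of 7zip itself. The codec labels and function name are our own.

```python
import bz2
import gzip
import lzma
import time

# Each entry maps a label to a (compress, decompress) pair from the standard library.
CODECS = {
    "gzip": (gzip.compress, gzip.decompress),
    "bzip2": (bz2.compress, bz2.decompress),
    "xz (LZMA2)": (lzma.compress, lzma.decompress),
}


def time_decompression(data: bytes):
    """Return {label: (compressed size, seconds taken to decompress)}."""
    results = {}
    for name, (compress, decompress) in CODECS.items():
        packed = compress(data)
        start = time.perf_counter()
        restored = decompress(packed)
        elapsed = time.perf_counter() - start
        assert restored == data  # every codec must round-trip exactly
        results[name] = (len(packed), elapsed)
    return results


if __name__ == "__main__":
    # Made-up sample data; substitute a real file for meaningful figures.
    sample = b"".join(b"record %d: some sample payload\n" % i for i in range(50_000))
    for name, (size, seconds) in time_decompression(sample).items():
        print(f"{name:<11} {size:>9} bytes  unpacked in {seconds:.4f} s")
```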

7z is certainly a useful tool, and one which may become more worthwhile on faster machines, or in cases where you want the compression to be good, but the decompression to be speedy (such as distributing apps and data).

Verdict

7zip
Version: 9.13 beta
Web: www.7-zip.org
Price: Free (GPL)

Pure LZMA action fares better than some of the derivatives.

Rating: 7/10
