All you need to know about JPEG compression

How does JPEG work?

JPEG is a lossy compression format designed specifically to make photo files smaller, and it does so by exploiting the limits of human visual perception. JPEG files are more correctly described as being in JFIF (JPEG File Interchange Format), a file format that uses a restricted subset of the full JPEG standard.

The human eye discriminates the brightness of an image much more finely than its colour information (indeed, in low light we actually see in black and white, since the illumination is too dim to stimulate the cone cells in the retina). JPEG therefore works in the YCbCr colour space, which separates each pixel into a brightness value (luma, Y) and two colour values (chroma, Cb and Cr). The luma needs much higher fidelity than the two chroma components do, and the JPEG format exploits this imbalance by downsampling the chroma values, typically keeping one chroma sample for every two or four pixels.
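As a rough illustration of this step, here is a minimal Python sketch. It assumes the full-range RGB to YCbCr conversion used by JFIF and a simple 2 x 2 averaging filter for the chroma; real encoders may use other subsampling ratios and filters, and the function names are made up for this example.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr (JFIF convention)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def downsample_2x2(plane):
    """Halve a chroma plane in both directions by averaging 2 x 2 groups.

    Assumes (for simplicity) that the plane has an even width and height.
    """
    out = []
    for row in range(0, len(plane), 2):
        out_row = []
        for col in range(0, len(plane[0]), 2):
            total = (plane[row][col] + plane[row][col + 1]
                     + plane[row + 1][col] + plane[row + 1][col + 1])
            out_row.append(total / 4)
        out.append(out_row)
    return out
```

Only the Cb and Cr planes would be passed through downsample_2x2; the luma plane is kept at full resolution.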

The next step is to split each component of the image into 8 x 8 blocks of samples. Each colour component is split up separately, and every block goes through the same process in what follows. Note that the width and height of an image are often not exact multiples of eight pixels. The encoder then has to pad out the incomplete blocks, which can create some pixel artefacts along the right and bottom sides of a JPEG picture.
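The splitting step can be sketched as follows. This is only one way to do it: the code assumes that incomplete blocks are padded by repeating the last column and row of samples, which is a common choice but not the only one, and both function names are invented for the example.

```python
BLOCK = 8


def pad_to_multiple(plane, block=BLOCK):
    """Pad a plane on the right and bottom by repeating the edge samples."""
    height, width = len(plane), len(plane[0])
    padded_w = -(-width // block) * block    # round up to a multiple of block
    padded_h = -(-height // block) * block
    rows = [row + [row[-1]] * (padded_w - width) for row in plane]
    rows += [list(rows[-1]) for _ in range(padded_h - height)]
    return rows


def split_into_blocks(plane, block=BLOCK):
    """Return the block x block tiles of a plane in row-major order."""
    padded = pad_to_multiple(plane, block)
    tiles = []
    for top in range(0, len(padded), block):
        for left in range(0, len(padded[0]), block):
            tiles.append([row[left:left + block]
                          for row in padded[top:top + block]])
    return tiles
```

A 10 x 10 plane, for instance, is padded to 16 x 16 and comes back as four tiles, three of which consist partly or wholly of repeated edge samples.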

The next step is fun, but puzzling. Each 8 x 8 block is converted into another 8 x 8 matrix using a Discrete Cosine Transform (DCT). This transform, which is similar to a Fourier transform, expresses the original values along each row and column as a weighted sum of cosine waves oscillating at different frequencies. The reason for doing this is that the higher frequencies can be stored less precisely or zeroed out altogether, since we do not perceive their loss as acutely as that of the more energetic lower frequencies.
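To make the transform less mysterious, here is a deliberately naive sketch of the forward 8 x 8 DCT (the DCT-II variant with the usual 1/4 scaling). It favours readability over speed; real codecs use much faster factorised versions, and the function name is just for illustration.

```python
import math

N = 8


def dct_2d(block):
    """Naive forward 2-D DCT-II of an 8 x 8 block (a list of 8 rows of 8 numbers)."""
    def scale(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    coeffs = [[0.0] * N for _ in range(N)]
    for v in range(N):          # vertical frequency
        for u in range(N):      # horizontal frequency
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[v][u] = 0.25 * scale(u) * scale(v) * total
    return coeffs
```

Feeding in a completely flat block in which every sample is 100 gives a top-left value of 800 and (to within rounding error) zero everywhere else, because a flat block contains no oscillation at all.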

The interesting thing about this transform is that, for typical photographic content, the value with the largest magnitude is usually found in the top-left cell (known as the DC coefficient) and the values tend to get smaller the further they are from that point (the 63 other values are known as the AC coefficients). Generally, we'll need more bits to represent the values in this transformed matrix than can be held in a byte (which is what we've been using up to now).
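A quick calculation shows why a byte is no longer enough. The sketch below assumes the usual DCT scaling, under which the DC coefficient is the sum of the 64 samples divided by 8, and it assumes the samples have been shifted from the range 0 to 255 down to -128 to 127 before the transform (standard JPEG practice, although not covered above).

```python
def dc_coefficient(block):
    """DC term of the 8 x 8 DCT: the sum of the samples divided by 8."""
    return sum(sum(row) for row in block) / 8


brightest = [[127] * 8 for _ in range(8)]    # every sample 255 before the shift
darkest = [[-128] * 8 for _ in range(8)]     # every sample 0 before the shift

dc_max = dc_coefficient(brightest)   # 1016.0
dc_min = dc_coefficient(darkest)     # -1024.0
```

A range of -1024 to 1016 needs 11 bits as a signed integer, compared with the 8 bits of the original samples.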

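Before it is encoded, the transformed matrix is quantised: each coefficient is divided by the corresponding entry in a quantisation matrix and rounded to the nearest integer. This is where most of the information is thrown away, and where many of the high-frequency values become zero.

Finally, the resulting quantised matrix is encoded using Huffman compression. To make the most of the way the values in the matrix seem to radiate out from the top-left corner, the values are read not row by row but in a zig-zag pattern. This means that the zero cells tend to appear at the end of the zig-zag chain and can therefore be ruthlessly compressed (in fact, there's a special code that indicates that all remaining cell values in the 8 x 8 block are zero).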
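The zig-zag scan can be sketched like this. The code only produces the reordered sequence and trims the trailing zeros; the "EOB" string is a placeholder for the end-of-block code, and the run-length and Huffman coding that a real encoder applies afterwards are left out. Both function names are hypothetical.

```python
def zigzag_order(n=8):
    """Return (row, col) pairs in zig-zag order, starting at the top-left cell."""
    def key(cell):
        row, col = cell
        diagonal = row + col
        # Odd diagonals are walked downwards (row increasing),
        # even diagonals upwards (col increasing).
        return (diagonal, row if diagonal % 2 else col)

    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def zigzag_with_eob(matrix):
    """Flatten a quantised block in zig-zag order and cut off the trailing zeros."""
    values = [matrix[r][c] for r, c in zigzag_order(len(matrix))]
    full_length = len(values)
    while values and values[-1] == 0:
        values.pop()
    if len(values) < full_length:
        values.append("EOB")    # stand-in for the "rest of the block is zero" code
    return values
```

A block whose only non-zero values are 50, 3 and -2 in the top-left corner (at positions (0, 0), (0, 1) and (1, 0)) collapses from 64 numbers to the four-item sequence 50, 3, -2, EOB.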

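After all that, how do you decompress a JPEG image to a raster bitmap to display on a screen? You perform all of these steps in reverse order. First of all, you decode the Huffman-compressed data for each 8 x 8 block. This gives you the quantised matrix. Now you can multiply each value in the quantised matrix by the corresponding entry in the relevant quantisation matrix (an element-by-element product rather than a true matrix multiplication) to give the matrix of DCT coefficients, minus whatever precision was lost to rounding. This is then transformed by the inverse DCT to give a close approximation of the original component matrix in the YCbCr colour space. The chroma components are then scaled back up to full size and the pixels converted to RGB for display.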
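The last two decoding steps might look like the sketch below, assuming the Huffman decoding has already produced an 8 x 8 quantised matrix. As with the forward transform, this is a naive, readable version rather than what a production decoder would use, and the function names are made up for the example.

```python
import math

N = 8


def dequantise(quantised, table):
    """Multiply each quantised value by the matching quantisation-matrix entry."""
    return [[quantised[v][u] * table[v][u] for u in range(N)] for v in range(N)]


def idct_2d(coeffs):
    """Naive inverse 2-D DCT of an 8 x 8 coefficient matrix."""
    def scale(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    block = [[0.0] * N for _ in range(N)]
    for y in range(N):
        for x in range(N):
            total = 0.0
            for v in range(N):      # vertical frequency
                for u in range(N):  # horizontal frequency
                    total += (scale(u) * scale(v) * coeffs[v][u]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            block[y][x] = 0.25 * total
    return block
```

For example, a quantised matrix holding 50 in the top-left cell and zeros elsewhere, with every quantisation entry set to 16, dequantises to a DC coefficient of 800, and the inverse DCT turns that into a flat block in which every sample is 100.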
