How MP3 compression works

How does MP3 shrink CD files to such a small size?

Ask us to name a universally known file format, and it would probably be a toss-up between MP3 and JPG.

Simply put, if you're a fully paid-up member of the digital multimedia revolution, you will have thousands of these files on your computer's hard drive – music you listen to and photos you look at – all of them compressed to cram as much information as possible into the minimum of space.

Figure 1

If we sample at too low a rate, we may miss some peaks and troughs in the original audio and so the resulting waveform may sound completely different and muddy.

Figure 2

Figure 2 shows this scenario, where the resulting waveform in red looks quite different from the original. We therefore need to sample much more often. Given that the human ear (in general) only hears tones up to about 20kHz in frequency, we should sample at a rate of at least twice that frequency in order to properly capture the highs and lows of an audio wave at that pitch. With a fudge factor added just in case, the rate settled on was 44,100Hz.
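To see why the "at least twice" rule matters, here is a minimal Python sketch of what happens to a pure tone when the sample rate is too low. The function names `alias_frequency` and `sample_cosine` are our own, invented for illustration, and the 15kHz tone and 20,000Hz sample rate are made-up test values rather than anything from a real audio format.

```python
import math

def alias_frequency(tone_hz, sample_rate_hz):
    """Apparent frequency of a pure tone after sampling (hypothetical helper).

    A sampled tone cannot be told apart from one shifted by any whole
    multiple of the sample rate, so we fold it back to the nearest multiple.
    """
    nearest_multiple = round(tone_hz / sample_rate_hz) * sample_rate_hz
    return abs(tone_hz - nearest_multiple)

def sample_cosine(tone_hz, sample_rate_hz, count):
    """Take `count` samples of a cosine wave at the given sample rate."""
    return [math.cos(2 * math.pi * tone_hz * n / sample_rate_hz)
            for n in range(count)]

# A 15kHz tone sampled at only 20,000Hz (less than twice the tone's frequency)
too_slow = sample_cosine(15_000, 20_000, 8)
# ...produces exactly the same samples as a 5kHz tone would.
impostor = sample_cosine(5_000, 20_000, 8)

print(alias_frequency(15_000, 20_000))   # 5000
print(alias_frequency(20_000, 44_100))   # 20000: the tone survives intact
print(all(abs(a - b) < 1e-9 for a, b in zip(too_slow, impostor)))
```

Once the samples are taken, nothing in them distinguishes the 15kHz tone from the 5kHz one, which is why the sample rate has to be chosen high enough up front.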

Figure 3

Figure 3 shows a different problem: the number of possible values for the amplitude is fairly small. From the original measured amplitude, the processor must choose the closest value it can record. Here we've got a fairly high sample rate, but the measurements of the amplitude are pretty coarse.

Again, the resulting waveform looks different from the original. The difference is perhaps a little more subtle, but it could still alter the sound pretty badly: highs might be higher than in the original, for example, making the result more shrill, and subtle nuances in the music are lost.
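The "choose the closest value" step can be sketched in a few lines of Python. This is only an illustration under our own assumptions: amplitudes are taken to lie between -1.0 and 1.0, the allowed values are evenly spaced, and the names `quantize` and `max_error` are hypothetical.

```python
def quantize(amplitude, levels):
    """Snap an amplitude in [-1.0, 1.0] to the nearest of `levels`
    evenly spaced values (assumed layout, for illustration only)."""
    step = 2.0 / (levels - 1)               # gap between neighbouring values
    index = round((amplitude + 1.0) / step) # nearest slot, counted from -1.0
    return -1.0 + index * step

def max_error(levels):
    """Worst-case rounding error: half the gap between neighbouring values."""
    return 1.0 / (levels - 1)

# With only five allowed values, a measured 0.3 is recorded as 0.5.
print(quantize(0.3, 5))   # 0.5
print(max_error(5))       # 0.25

# With many more values the recorded amplitude hugs the original.
print(abs(quantize(0.3, 65_536) - 0.3) <= max_error(65_536))
```

The coarser the grid of allowed values, the further each recorded sample can drift from what was actually measured, which is the distortion Figure 3 illustrates.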

Here, a different criterion comes into play: making the sample values fit into a whole number of bytes to help make the output DAC's job easier. One byte would be far too small for this (with only 256 different values for the amplitude), so the original designers decided on two bytes per sample, which gives 65,536 possible values.
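The arithmetic behind that choice is easy to check. The sketch below uses hypothetical helper names of our own; the 44,100Hz rate and the two bytes per sample come from the text above, while the two-channel (stereo) figure is an extra assumption based on how audio CDs store sound.

```python
SAMPLE_RATE_HZ = 44_100  # sample rate quoted in the article

def amplitude_levels(bytes_per_sample):
    """Number of distinct amplitude values that fit in the given byte count."""
    return 256 ** bytes_per_sample

def bytes_per_second(sample_rate_hz, bytes_per_sample, channels=1):
    """Raw, uncompressed data rate for the given sampling parameters."""
    return sample_rate_hz * bytes_per_sample * channels

print(amplitude_levels(1))                   # 256
print(amplitude_levels(2))                   # 65536
print(bytes_per_second(SAMPLE_RATE_HZ, 2))   # 88200 bytes per second, one channel

# Assumption: two channels (stereo). One minute of raw audio then needs:
print(bytes_per_second(SAMPLE_RATE_HZ, 2, channels=2) * 60)  # 10584000 bytes
```

At roughly 10MB per minute of uncompressed stereo sound, it is clear why a format that shrinks the data is so attractive.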