The game-changing MP3 began as a project at Fraunhofer IIS, a German research institution, as a continuation of initial research from the University of Erlangen-Nuremberg. Serious work began in 1987. The goal was to create a method to compress audio files to make it easier to transfer them over networks. This was necessary because networks back then didn’t have the capability to transfer large files quickly.
An uncompressed audio file can be relatively large. For example, a three-minute audio clip sampled at 44.1 kilohertz with two channels and 16 bits per sample will take up about 31 megabytes of space. That’s not much by today’s standards, but back in 1987 that was a hefty file.
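The arithmetic behind that figure can be sketched in a few lines of Python. This is only an illustration: it assumes 16-bit samples (the CD-audio standard) and ignores any file header, and the function name is made up for the example.

```python
def pcm_size_bytes(seconds, sample_rate_hz=44_100, channels=2, bits_per_sample=16):
    """Size of raw (uncompressed) PCM audio in bytes, ignoring any file header."""
    bytes_per_sample = bits_per_sample // 8
    return seconds * sample_rate_hz * channels * bytes_per_sample


size = pcm_size_bytes(3 * 60)  # a three-minute clip
print(size)                    # 31752000 bytes, a little under 32 million
```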
The MP3 format allowed you to convert a raw audio file into a compressed MP3 file, reducing the file size to something more manageable. It did this by analyzing the audio and discarding sounds that listeners can’t perceive, including frequencies that fall outside the range of human hearing. The logic was that if the audience couldn’t hear the sound, there was no need to keep it. By cutting out the imperceptible parts of the signal, the MP3 converter could reduce the overall file size.
If we took that 31-megabyte raw audio file and ran it through an MP3 converter at a bit rate of 128 kilobits per second, we’d end up with a file that’s just about 2.9 megabytes in size. That’s much easier to transfer across a network.
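The compressed size follows directly from the bit rate and the clip length. Here is a minimal sketch, assuming a constant bit rate, 1 kilobit = 1,000 bits, and no tags or other metadata in the file; the function name is illustrative, not part of any MP3 tool.

```python
def cbr_size_bytes(seconds, kilobits_per_second):
    """Approximate size of a constant-bit-rate stream, in bytes."""
    bits = seconds * kilobits_per_second * 1000
    return bits // 8


compressed = cbr_size_bytes(180, 128)  # three minutes at 128 kilobits per second
raw = 180 * 44_100 * 2 * 2             # same clip as 16-bit stereo PCM
print(compressed)                      # 2880000
print(round(raw / compressed, 1))      # 11.0, roughly an 11-to-1 reduction
```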
One way MP3 converters compress files is to look for frequency masking. This strategy takes advantage of a peculiarity in human hearing. If we perceive two frequencies that are similar but not quite identical and one is louder than the other, we’ll only really hear the louder one. So an MP3 converter can analyze an audio track and drop any similar frequencies of lower volume since we wouldn’t perceive them anyway.
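A deliberately simplified sketch can make the idea concrete. Real MP3 encoders rely on a far more detailed psychoacoustic model; the 10 percent frequency window and the 12-decibel loudness gap used here are made-up thresholds chosen purely for illustration, and the function name is hypothetical.

```python
def drop_masked(tones, rel_window=0.10, min_gap_db=12.0):
    """Return the tones that are not masked by a louder neighbor.

    tones: list of (frequency_hz, level_db) pairs.
    A tone counts as masked when another tone lies within rel_window
    (a fraction of that other tone's frequency) and is at least
    min_gap_db louder. Both thresholds are illustrative assumptions.
    """
    kept = []
    for freq, level in tones:
        masked = any(
            abs(freq - other_freq) <= rel_window * other_freq
            and other_level - level >= min_gap_db
            for other_freq, other_level in tones
        )
        if not masked:
            kept.append((freq, level))
    return kept


tones = [(1000, 70), (1050, 50), (4000, 55)]
print(drop_masked(tones))  # [(1000, 70), (4000, 55)]
```

In this toy input, the quiet 1,050-hertz tone sits right next to a much louder 1,000-hertz tone and is dropped, while the 4,000-hertz tone is far enough away to survive.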
Another type of masking is temporal masking, in which a short, loud sound temporarily masks our ability to perceive quieter, more subtle sounds. Again, the MP3 converter can drop some sounds that we wouldn’t hear due to temporal masking. In the original recording, the sounds would be there but we wouldn’t hear them anyway. Out they go!
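The same kind of toy model works for temporal masking. This sketch only considers quiet sounds that arrive shortly after a loud one, and the 100-millisecond window and 20-decibel gap are assumed values for illustration, not figures from the MP3 specification; the function name is hypothetical.

```python
def drop_temporally_masked(events, window_ms=100, min_gap_db=20.0):
    """Return the sound events that are not masked by an earlier, louder one.

    events: list of (time_ms, level_db) pairs.
    An event counts as masked when a sound at least min_gap_db louder
    started no more than window_ms before it. Both thresholds are
    illustrative assumptions.
    """
    kept = []
    for time_ms, level in events:
        masked = any(
            0 < time_ms - other_time <= window_ms
            and other_level - level >= min_gap_db
            for other_time, other_level in events
        )
        if not masked:
            kept.append((time_ms, level))
    return kept


events = [(0, 90), (40, 60), (300, 60)]
print(drop_temporally_masked(events))  # [(0, 90), (300, 60)]
```

The quiet sound 40 milliseconds after the loud one is dropped, while an equally quiet sound at 300 milliseconds is kept because the loud sound no longer covers it.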
The same is true for any frequencies that fall outside the typical range of human hearing, which spans from about 20 hertz to 20 kilohertz. Sounds below or above that range are outside our perception, though they may be found on a recording. Compression algorithms typically ditch these sounds to conserve space.
It took some time to perfect the algorithm. According to the researchers, they would try out their compression formulas using Suzanne Vega’s song “Tom’s Diner” as a benchmark test. The earlier formulas made the song sound chunky and unpleasant, giving the researchers the feedback they needed to make changes and create a working compression algorithm.
Eventually, they succeeded. The MP3 format and the introduction of the Winamp program helped create a new industry: the online distribution of digital music. The format also gave rise to the MP3 player phenomenon, contributed to the birth of the iPod and, much to the chagrin of music labels, made widespread music piracy possible.
Today, there are other compression algorithms that are just as effective as MP3 (or more so) at compressing audio while preserving quality. And while the MP3 is taking a bow, that doesn’t mean all the files will suddenly become useless. The files will still work on any devices or programs capable of playing them. But the MP3 format, which forced a major transformation in the music industry, has earned itself a rest.