Digital Audio:  A Brief Introduction

    Upon entering the world of MP3, one often hears about the differences between analog and digital recording.  MP3 is a digital technology, as is its predecessor and companion, the compact disc (CD).  Predating both are the phonograph and the audio-cassette recorder, devices that record and play back sound as analog signals.

    The phonograph was the first device capable of both recording sound and playing it back, and its basic principles are critical to understanding the differences between analog and digital recording.  The phonograph worked by capturing the vibrations caused by sound waves.  Sound waves striking a diaphragm made it vibrate, and those vibrations caused an attached needle to scratch a rotating tin foil cylinder.  The resulting impressions left in the tin foil represented the original analog signal received by the diaphragm.

    To play back the sounds, the needle was run back over the impressions as the cylinder rotated, causing the needle and the attached diaphragm to vibrate.  The vibrating diaphragm, in turn, produced the sound.  The modern phonograph works in a substantially similar manner, except that the signals read by the needle are amplified electronically rather than mechanically by the diaphragm.

    The introduction of the compact disc and its digital recording system marked significant improvements over analog recording technologies.  These improvements were most notably related to fidelity and reproduction.  "Fidelity" refers to the similarity between the original signal and the reproduced signal.  Reproduction, or in this case "perfect reproduction," means that the sound stays the same no matter how many times it is reproduced.  The compact disc aimed to eliminate the scratchy noise and the signal distortion that accompanied analog recording devices.  It also sought to create a method of storage and playback that would not suffer the wear and tear caused by the repeated pressure of a needle in the groove of a record.

    Compact discs accomplish these goals through digital recording technology: the conversion of the analog wave into a "stream of numbers."  In digital recording, it is the stream of numbers that is stored rather than the wave itself.  The conversion is performed by a device called an analog-to-digital converter (ADC).  To play back the music, the stream of numbers must be converted back to wave form by a digital-to-analog converter (DAC).  The resulting signal is then amplified and sent to speakers, which turn it back into sound.

    The analog-to-digital process is key to understanding the fidelity of CD sound.  In this process, the analog wave is converted to numerical form by rapidly sampling the wave and recording, at each sample, a number that corresponds to a gradation in the overall sound.  Two variables can therefore be adjusted to improve sound quality:  the number of possible gradations (quantization levels), called the sampling precision, and the frequency with which samples are taken, called the sampling rate.  The more gradations that are possible, the closer the digital representation will come to replicating the original analog signal, and the higher the fidelity of the sound the DAC produces.  The same holds true for the sampling rate.  The difference between the original wave and the digital encoding of the wave is called sampling error.  Sampling error is reduced as the sampling precision and the sampling rate are increased.
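
    The effect of sampling precision can be illustrated with a short Python sketch.  It is not how a real ADC is built; it assumes a 440 Hz sine wave as a stand-in for the analog signal, and the function names and the choice of 441 samples are illustrative only.  Each sample is rounded to the nearest of a fixed number of gradations, converted back, and compared with the original value.

```python
import math


def quantize_sine(levels, freq_hz=440.0, sample_rate=44_100, num_samples=441):
    """Sample a sine wave in [-1, 1] and snap each sample to one of `levels` gradations.

    Returns the original sample values and the values recovered from the stored numbers.
    The sine wave is an assumed stand-in for a real analog input.
    """
    step = 2.0 / (levels - 1)               # spacing between adjacent gradations
    originals, restored = [], []
    for i in range(num_samples):
        t = i / sample_rate                 # time of the i-th sample, in seconds
        x = math.sin(2 * math.pi * freq_hz * t)
        code = round((x + 1.0) / step)      # the stored number, an integer in 0..levels-1
        originals.append(x)
        restored.append(code * step - 1.0)  # value recovered from the stored number
    return originals, restored


def max_sampling_error(levels):
    """Largest gap between an original sample and its recovered value."""
    originals, restored = quantize_sine(levels)
    return max(abs(a - b) for a, b in zip(originals, restored))


if __name__ == "__main__":
    for levels in (16, 256, 65_536):
        print(f"{levels:>6} gradations: max error {max_sampling_error(levels):.6f}")
```

    The largest error never exceeds half the spacing between gradations, so it shrinks as gradations are added.  The sketch varies only the precision; the sampling rate is held fixed.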

    In CD sound recording, the sampling rate is 44,100 samples per second and the number of gradations is 65,536.  The fidelity of CD sound is such that the difference between the sound produced and the original wave is almost imperceptible to human ears.  However, such a high sampling rate and precision require a lot of data.  On a CD, the numerical representations of the sound wave are stored as bytes.  As two bytes are necessary to represent the 65,536 gradations used to encode CD sound, each sample requires two bytes of data.  In addition, two sound streams are recorded (one for each speaker) for the full capacity of the CD, which holds about seventy-four minutes of music.  At 44,100 samples per channel per second, that is 44,100 x 2 bytes x 2 channels x 4,440 seconds, so a single CD must hold 783,216,000 bytes.
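
    The arithmetic behind that total can be checked with a few lines of Python.  This is only a worked calculation using the figures given above; the constant and function names are made up for the example.

```python
SAMPLE_RATE = 44_100       # samples per second, per channel
GRADATIONS = 65_536        # possible values for each sample
CHANNELS = 2               # one sound stream per speaker
PLAYING_TIME_MIN = 74      # approximate capacity of a CD, in minutes


def bytes_per_sample(gradations):
    """Whole bytes needed to store one of `gradations` distinct values."""
    bits = (gradations - 1).bit_length()   # 65,536 values need 16 bits
    return (bits + 7) // 8                 # round up to whole bytes


def cd_audio_bytes(sample_rate=SAMPLE_RATE, gradations=GRADATIONS,
                   channels=CHANNELS, minutes=PLAYING_TIME_MIN):
    """Total bytes of audio data for the given recording parameters."""
    seconds = minutes * 60
    return sample_rate * bytes_per_sample(gradations) * channels * seconds


if __name__ == "__main__":
    print(f"{bytes_per_sample(GRADATIONS)} bytes per sample")
    print(f"{cd_audio_bytes():,} bytes on a full CD")
```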

    MP3 technology is important because it can reduce the number of bytes that CD-quality audio requires by a factor of about 12.  Without audio compression, transferring music over the Internet would be highly impractical.  This becomes apparent when one considers the amount of data required to encode digital music.  A song three minutes long takes up about 32 million bytes on a CD.  Over a 56 kbit/s modem, downloading one song in CD format would take well over an hour: about 76 minutes at the modem's full nominal rate, and longer in practice.  Compression technology aims to reduce the total number of bytes without noticeably hurting sound quality.  MP3, for example, allows a 32-megabyte song to be compressed to about 3 megabytes.  Thus, with MP3, transferring music becomes feasible where it was not before.
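
    These figures can be reproduced with a rough calculation.  The sketch below assumes that a 56 kbit modem moves exactly 56,000 bits per second with no protocol overhead, which is optimistic (real transfers were slower), and it treats the factor of 12 as exact.  All names are illustrative.

```python
CD_BYTES_PER_SECOND = 44_100 * 2 * 2   # samples/s x bytes per sample x channels
MODEM_BITS_PER_SECOND = 56_000         # assumed ideal rate of a 56 kbit modem
MP3_FACTOR = 12                        # compression factor quoted in the text


def song_bytes(seconds):
    """Bytes needed to store `seconds` of uncompressed CD-format audio."""
    return CD_BYTES_PER_SECOND * seconds


def download_minutes(num_bytes, bits_per_second=MODEM_BITS_PER_SECOND):
    """Ideal transfer time in minutes, ignoring overhead and line noise."""
    return num_bytes * 8 / bits_per_second / 60


if __name__ == "__main__":
    cd_size = song_bytes(3 * 60)
    mp3_size = cd_size / MP3_FACTOR
    print(f"CD format: {cd_size:,} bytes, {download_minutes(cd_size):.1f} min")
    print(f"MP3:       {mp3_size:,.0f} bytes, {download_minutes(mp3_size):.1f} min")
```

    Under these assumptions the three-minute song shrinks from roughly 32 million bytes to under 3 million, and the ideal download time drops from about 76 minutes to about 6.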

The facts presented here are largely derived from "How Compact Discs (CDs) Work" by Marshall Brain.