Representing sound
Sound is an analogue wave (continuous pressure variations in air). Computers can only store discrete numbers, so we approximate the wave by sampling it at regular intervals and storing each sample's amplitude as a binary number.
Sampling
A sample is a measurement of the wave's amplitude at one instant. The sampling rate (or sampling frequency) is the number of samples taken per second, measured in hertz (Hz) or kilohertz (kHz):
- 8 kHz — telephone quality.
- 22.05 kHz — radio quality.
- 44.1 kHz — CD quality (the standard for music).
- 48 kHz — DVD/professional video.
- 96 kHz — studio recording.
Higher sample rate → more accurate reconstruction of high frequencies, but a larger file size.
The Nyquist theorem (extension): to capture a frequency, you need to sample at a rate of at least twice that frequency. Human hearing tops out at ~20 kHz, so at least 40 kHz is needed, and 44.1 kHz covers it with a small margin.
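As an illustration of sampling, the sketch below measures a pure sine tone at regular intervals and works out the minimum sample rate from the Nyquist rule. The function names `sample_sine` and `min_sample_rate` are made up for this example, and a real recording would come from a microphone rather than a formula.

```python
import math


def sample_sine(frequency_hz, sample_rate_hz, duration_s):
    """Return a list of amplitudes (-1.0 to 1.0) of a sine tone,
    measured sample_rate_hz times per second."""
    n_samples = round(sample_rate_hz * duration_s)
    return [
        math.sin(2 * math.pi * frequency_hz * n / sample_rate_hz)
        for n in range(n_samples)
    ]


def min_sample_rate(highest_frequency_hz):
    """Nyquist rule: sample at (at least) twice the highest frequency."""
    return 2 * highest_frequency_hz


# One second of a 440 Hz tone at telephone quality gives 8,000 samples.
samples = sample_sine(440, 8000, 1)
print(len(samples))

# The limit of human hearing (~20 kHz) needs at least 40 kHz.
print(min_sample_rate(20_000))
```

Doubling `sample_rate_hz` doubles the length of the returned list, which is why the sample rate feeds directly into file size.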
Bit depth
The bit depth is the number of bits used to store each sample's amplitude. Like colour depth in images, more bits → finer amplitude resolution.
- 8-bit: 256 levels — noticeable noise.
- 16-bit: 65,536 levels — CD quality.
- 24-bit: ~16.7 million levels — studio quality.
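The level counts above are powers of two (2 raised to the bit depth). The sketch below checks them and shows one simple way to round an amplitude to the nearest available level. The names `levels` and `quantise` are hypothetical, and the mapping assumes amplitudes between -1.0 and 1.0 stored as unsigned whole numbers; real audio formats differ in the details.

```python
def levels(bit_depth):
    """Number of distinct amplitude values a sample can take."""
    return 2 ** bit_depth


def quantise(amplitude, bit_depth):
    """Map an amplitude in the range -1.0..1.0 to the nearest
    whole-number level in the range 0..levels-1 (an assumed encoding)."""
    max_level = levels(bit_depth) - 1
    clamped = max(-1.0, min(1.0, amplitude))  # keep within range
    return round((clamped + 1.0) / 2.0 * max_level)


for bits in (8, 16, 24):
    print(bits, "bits:", levels(bits), "levels")

# The quietest and loudest values map to the first and last level.
print(quantise(-1.0, 8), quantise(1.0, 8))
```

With more bits, neighbouring levels are closer together, so the rounding error (heard as noise) is smaller.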
File size formula
For a raw audio file (mono, no compression, no metadata):
File size in bits = sample rate × duration × bit depth
File size in bytes = (sample rate × duration × bit depth) ÷ 8
For stereo (2 channels), multiply by 2.
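The formula can be written as a small function. This is a minimal sketch for raw audio only (no compression, no metadata); the function and parameter names are chosen for this example.

```python
def raw_audio_size_bits(sample_rate_hz, duration_s, bit_depth, channels=1):
    """File size in bits = sample rate x duration x bit depth x channels."""
    return sample_rate_hz * duration_s * bit_depth * channels


def raw_audio_size_bytes(sample_rate_hz, duration_s, bit_depth, channels=1):
    """Same calculation, converted to bytes (8 bits per byte)."""
    bits = raw_audio_size_bits(sample_rate_hz, duration_s, bit_depth, channels)
    return bits / 8


# 10 seconds of mono audio at 8 kHz and 8 bits per sample.
print(raw_audio_size_bits(8000, 10, 8))
print(raw_audio_size_bytes(8000, 10, 8))

# The same clip in stereo: pass channels=2.
print(raw_audio_size_bytes(8000, 10, 8, channels=2))
```

Treating the number of channels as a parameter (1 for mono, 2 for stereo) avoids forgetting the stereo multiplier.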
Worked example: CD-quality stereo
A 3-minute song at 44,100 Hz, 16 bits, stereo.
- Duration: 3 × 60 = 180 seconds.
- Bits: 44,100 × 180 × 16 × 2 = 254,016,000.
- Bytes: 254,016,000 ÷ 8 = 31,752,000 B.
- Megabytes: 31,752,000 ÷ 1024 ÷ 1024 ≈ 30.3 MB.
Worked example: short voice clip
A 5-second voicemail at 8 kHz, 8-bit, mono.
Bits: 8000 × 5 × 8 = 320,000. Bytes: 320,000 ÷ 8 = 40,000 B. Kilobytes: 40,000 ÷ 1024 ≈ 39 KB.
Effect of changing parameters
- Doubling sample rate → 2× file size.
- Doubling bit depth → 2× file size.
- Mono → stereo → 2× file size.
- Doubling duration → 2× file size.
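Because the formula is a straight multiplication, doubling any one input doubles the result. The sketch below checks this against an arbitrary baseline (one minute of mono audio at 44.1 kHz and 16 bits); the helper name `size_bytes` is made up for this example.

```python
def size_bytes(sample_rate_hz, duration_s, bit_depth, channels):
    """Raw (uncompressed) audio size in bytes."""
    return sample_rate_hz * duration_s * bit_depth * channels / 8


# Baseline: 44.1 kHz, 60 seconds, 16-bit, mono.
base = size_bytes(44_100, 60, 16, 1)

# Double one parameter at a time and compare with the baseline.
ratios = {
    "sample rate": size_bytes(88_200, 60, 16, 1) / base,
    "bit depth": size_bytes(44_100, 60, 32, 1) / base,
    "channels (mono to stereo)": size_bytes(44_100, 60, 16, 2) / base,
    "duration": size_bytes(44_100, 120, 16, 1) / base,
}

for name, ratio in ratios.items():
    print(f"Doubling {name}: {ratio}x file size")
```

Doubling two parameters at once multiplies the size by 4, since the effects combine.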
Trade-offs
| Parameter | Higher means | Cost |
|---|---|---|
| Sample rate | Captures higher pitches | Larger file |
| Bit depth | More dynamic range, less noise | Larger file |
| Channels | Stereo / surround sound | Larger file |
| Duration | Longer audio | Larger file |
Comparing sound to images
| Audio | Image |
|---|---|
| Sample (one number per moment) | Pixel (one colour per location) |
| Sample rate (Hz) | Resolution (pixels) |
| Bit depth | Colour depth |
| Mono/stereo channels | Colour channels (R/G/B) |
Compression contexts
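Raw, uncompressed audio (such as WAV) produces very large files. Most real audio files use compression:
- Lossless (FLAC) — keeps all the data; typically reduces the file size by about 50%.
- Lossy (MP3, AAC, Opus) — permanently discards data the listener is unlikely to notice; the file is typically about 10% of the original size.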
You don't need to know the algorithms for GCSE, but you should mention compression if a question asks "why is a CD-quality file 30 MB but an MP3 only 3 MB?".
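To see where the "30 MB versus 3 MB" comparison comes from, the sketch below applies the rough ratios quoted above to the 3-minute CD-quality song. The fractions 0.5 and 0.1 are ballpark assumptions taken from this section, not exact figures; real results depend on the music and the encoder settings.

```python
# Raw size of a 3-minute, 44.1 kHz, 16-bit stereo song in bytes.
RAW_BYTES = 44_100 * 180 * 16 * 2 // 8

# Assumed typical fraction of the raw size left after compression.
TYPICAL_FRACTION = {
    "raw (WAV)": 1.0,
    "lossless (FLAC)": 0.5,
    "lossy (MP3, AAC, Opus)": 0.1,
}


def estimated_size_mb(raw_bytes, fraction):
    """Estimate the compressed size in MB (using 1024 x 1024 bytes per MB)."""
    return raw_bytes * fraction / 1024 / 1024


for name, fraction in TYPICAL_FRACTION.items():
    print(f"{name}: about {estimated_size_mb(RAW_BYTES, fraction):.1f} MB")
```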
Common mistakes
- Confusing sample rate units. 44.1 kHz means 44,100 samples per second, not 44.1.
- Forgetting the channels multiplier. Stereo doubles the file size compared to mono.
- Forgetting to convert duration. "3 minutes" = 180 seconds.
- Bits vs bytes. The formula gives an answer in bits; divide by 8 to convert to bytes before converting to KB or MB.
- Mistaking sample rate for bit depth. Sample rate is "how often"; bit depth is "how precisely".
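Most of these mistakes are unit slips, so one way to avoid them is to convert every input to base units first. The sketch below takes the sample rate in kHz and the duration in minutes, as exam questions often state them, and converts before multiplying. The function name `raw_size_from_question` is hypothetical, and it uses 1024 bytes per KB to match the worked examples here.

```python
def raw_size_from_question(sample_rate_khz, minutes, bit_depth, channels):
    """Return the raw audio size as a dict of bits, bytes, KB and MB."""
    sample_rate_hz = round(sample_rate_khz * 1000)  # kHz -> samples per second
    duration_s = minutes * 60                       # minutes -> seconds
    bits = sample_rate_hz * duration_s * bit_depth * channels
    size_in_bytes = bits / 8                        # bits -> bytes
    return {
        "bits": bits,
        "bytes": size_in_bytes,
        "KB": size_in_bytes / 1024,
        "MB": size_in_bytes / 1024 / 1024,
    }


# A 3-minute song at 44.1 kHz, 16-bit, stereo.
result = raw_size_from_question(44.1, 3, 16, 2)
print(f"{result['bits']:,} bits")
print(f"{result['MB']:.1f} MB")
```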
Worked example: telephone audio
A 2-minute phone call at 8 kHz, 8-bit, mono. File size? 8000 × 120 × 8 × 1 = 7,680,000 bits = 960,000 B ≈ 938 KB.
Try this: quick check
A piece of music is 4 minutes long, recorded at 44.1 kHz with 16-bit depth in stereo. Calculate the raw file size in MB.
Answer. Bits: 44,100 × (4 × 60) × 16 × 2 = 44,100 × 240 × 32 = 338,688,000. Bytes: 338,688,000 ÷ 8 = 42,336,000. KB: 42,336,000 ÷ 1024 ≈ 41,344. MB: 41,344 ÷ 1024 ≈ 40.4 MB.