Show understanding of how sound is represented and encoded

Published by Patrick Mutisya · 14 days ago

Cambridge A-Level Computer Science 9618 – Multimedia: Sound Representation and Encoding

1.2 Multimedia – Sound Representation and Encoding

What is Sound?

Sound is a mechanical wave that propagates through a medium (air, water, solids) as variations in pressure. In digital systems we must convert this continuous analogue signal into a discrete digital form that a computer can store, process and transmit.

Key Concepts in Digital Audio

  • Sampling – measuring the amplitude of the analogue waveform at regular time intervals.
  • Quantisation – assigning each sampled amplitude a numeric value from a finite set of levels.
  • Bit depth – the number of bits used for each quantised sample.
  • Sample rate (sampling frequency) – the number of samples taken per second, measured in hertz (Hz).
  • Dynamic range – the ratio between the loudest undistorted signal and the quietest detectable signal.

Sampling Theory

The Nyquist–Shannon sampling theorem states that a continuous signal can be perfectly reconstructed from its samples if the sampling frequency \$f_s\$ is greater than twice the highest frequency component \$f_{max}\$ of the signal, i.e. \$f_s > 2 f_{max}\$. Equivalently, a system sampling at \$f_s\$ can only represent frequencies below

\$f_N = \frac{f_s}{2}\$

where \$f_N\$ is the Nyquist frequency. Human hearing extends to about 20 kHz, so the sample rate must exceed 40 kHz; the 44.1 kHz used in CD audio satisfies this, giving a Nyquist frequency of 22.05 kHz.
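The sampling condition can be checked numerically. The sketch below is illustrative only: the function names are hypothetical (not part of the syllabus), and `alias_frequency` assumes an ideal pure tone sampled with no anti-aliasing filter.

```python
def nyquist_frequency(sample_rate_hz: float) -> float:
    """Highest frequency a given sample rate can represent (f_N = f_s / 2)."""
    return sample_rate_hz / 2.0


def is_adequately_sampled(max_signal_hz: float, sample_rate_hz: float) -> bool:
    """True when the sample rate is more than twice the highest signal frequency."""
    return sample_rate_hz > 2 * max_signal_hz


def alias_frequency(signal_hz: float, sample_rate_hz: float) -> float:
    """Apparent frequency of a pure tone after sampling.

    Tones above the Nyquist frequency fold back into the range 0..f_N.
    """
    folded = signal_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


if __name__ == "__main__":
    print(nyquist_frequency(44_100))               # Nyquist frequency for CD audio
    print(is_adequately_sampled(20_000, 44_100))   # 20 kHz content at 44.1 kHz
    print(alias_frequency(30_000, 44_100))         # a 30 kHz tone is not preserved
```

A tone above the Nyquist frequency is not simply lost: it reappears at a lower, incorrect frequency (aliasing), which is why an ADC normally low-pass filters the signal before sampling.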

Quantisation and Bit Depth

Quantisation maps each sampled amplitude to one of \$2^n\$ discrete levels, where \$n\$ is the bit depth.

The theoretical dynamic range of a PCM (Pulse‑Code Modulation) system is:

\$DR = 6.02 \times n + 1.76\ \text{dB}\$

Examples:

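  • 8‑bit audio → \$DR \approx 50\$ dB
  • 16‑bit audio → \$DR \approx 98\$ dB (CD quality)
  • 24‑bit audio → \$DR \approx 146\$ dB (professional recording)

The often‑quoted figures of 96 dB and 144 dB for 16‑bit and 24‑bit audio come from the simpler approximation \$6.02 \times n\$, which omits the 1.76 dB term.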
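As a quick check of the arithmetic, the following sketch evaluates the number of quantisation levels and the dynamic-range formula for several bit depths. It also evaluates the plain level ratio \$20 \log_{10}(2^n)\$, which is where the 6.02 dB-per-bit term comes from. The function names are illustrative assumptions.

```python
import math


def quantisation_levels(bit_depth: int) -> int:
    """Number of discrete amplitude levels available with n bits (2^n)."""
    return 2 ** bit_depth


def level_ratio_db(bit_depth: int) -> float:
    """Ratio of the full scale to one quantisation step, in dB (about 6.02 * n)."""
    return 20 * math.log10(quantisation_levels(bit_depth))


def dynamic_range_db(bit_depth: int) -> float:
    """Theoretical dynamic range from the formula DR = 6.02 * n + 1.76 dB."""
    return 6.02 * bit_depth + 1.76


if __name__ == "__main__":
    for n in (8, 16, 24):
        print(
            f"{n:2d}-bit: {quantisation_levels(n):>10,} levels, "
            f"level ratio {level_ratio_db(n):.2f} dB, "
            f"formula {dynamic_range_db(n):.2f} dB"
        )
```

Each extra bit doubles the number of levels and so adds roughly 6 dB of dynamic range.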

PCM Encoding Process

  1. Analogue signal enters an analogue‑to‑digital converter (ADC).
  2. ADC samples the signal at the chosen sample rate.
  3. Each sample is quantised to the nearest level defined by the bit depth.
  4. The quantised values are stored as binary numbers (e.g., 16‑bit signed integers).
  5. Optionally, the PCM data may be compressed (lossless or lossy) before storage or transmission.
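Steps 2 to 4 above can be sketched in software. This is a simplified model, not how a hardware ADC works internally: it assumes the "analogue" input is a mathematical sine wave with amplitude between -1 and 1, and the function name `encode_pcm16` and its parameters are hypothetical.

```python
import math
import struct


def encode_pcm16(frequency_hz: float, duration_s: float,
                 sample_rate_hz: int = 44_100, amplitude: float = 0.8) -> bytes:
    """Sample a sine wave and store it as 16-bit signed little-endian PCM."""
    n_samples = int(round(duration_s * sample_rate_hz))
    max_code = 2 ** 15 - 1  # largest positive 16-bit signed value (32767)
    codes = []
    for i in range(n_samples):
        t = i / sample_rate_hz                       # step 2: sampling instant
        analogue = amplitude * math.sin(2 * math.pi * frequency_hz * t)
        code = round(analogue * max_code)            # step 3: nearest level
        code = max(-max_code - 1, min(max_code, code))  # clamp to 16-bit range
        codes.append(code)
    # step 4: pack each code as a 2-byte signed integer ("h"), little-endian ("<")
    return struct.pack(f"<{len(codes)}h", *codes)


if __name__ == "__main__":
    pcm = encode_pcm16(frequency_hz=440, duration_s=0.01)
    print(len(pcm), "bytes for", len(pcm) // 2, "samples")
    print(struct.unpack("<5h", pcm[:10]))  # first five quantised samples
```

Rounding to the nearest integer code is the quantisation step; the difference between the true amplitude and the stored code is the quantisation error.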

Common Audio File Formats

| Format | Compression Type | Typical Sample Rates | Typical Bit Depths (or Bit Rates) | Typical Use |
|--------|------------------|----------------------|-----------------------------------|-------------|
| WAV | Uncompressed (PCM) | 8 kHz – 192 kHz | 8, 16, 24, 32‑bit | Professional audio, editing |
| AIFF | Uncompressed (PCM) | 8 kHz – 192 kHz | 8, 16, 24, 32‑bit | Apple ecosystem, studio work |
| MP3 | Lossy (psychoacoustic) | 16 kHz – 48 kHz | Variable (bit rate 64–320 kbps) | Streaming, portable devices |
| AAC | Lossy (advanced psychoacoustic) | 16 kHz – 48 kHz | Variable (bit rate 64–256 kbps) | Modern streaming services |
| FLAC | Lossless | 8 kHz – 192 kHz | 16, 24‑bit | High‑fidelity archiving |
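To make the WAV row concrete, the sketch below uses Python's standard `wave` module to wrap raw PCM bytes in a WAV container and read the parameters back. It writes to an in-memory buffer rather than a file, and the helper names `write_wav` and `read_wav_params` are assumptions made for illustration.

```python
import io
import wave


def write_wav(pcm_bytes: bytes, sample_rate_hz: int = 44_100,
              channels: int = 1, sample_width_bytes: int = 2) -> bytes:
    """Wrap raw PCM data in a WAV container and return the file contents."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width_bytes)  # 2 bytes = 16-bit samples
        wav_file.setframerate(sample_rate_hz)
        wav_file.writeframes(pcm_bytes)
    return buffer.getvalue()


def read_wav_params(wav_bytes: bytes) -> tuple:
    """Return (channels, sample width in bytes, sample rate, frame count)."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav_file:
        return (wav_file.getnchannels(), wav_file.getsampwidth(),
                wav_file.getframerate(), wav_file.getnframes())


if __name__ == "__main__":
    silence = b"\x00\x00" * 100          # 100 samples of 16-bit silence
    wav_data = write_wav(silence)
    print(read_wav_params(wav_data))
    print(len(wav_data) - len(silence), "bytes of header overhead")
```

The sample rate, bit depth and channel count are stored in the file header, which is how a player knows how to interpret the PCM data that follows.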

Lossless vs. Lossy Compression

  • Lossless algorithms (e.g., FLAC, ALAC) reduce file size without discarding any audio information. The original PCM data can be perfectly reconstructed.
  • Lossy algorithms (e.g., MP3, AAC) remove audio components that are less audible to the human ear, achieving much higher compression ratios at the cost of some quality loss.
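The difference between the two approaches can be demonstrated with stand-ins. In this sketch, `zlib` (a general-purpose lossless compressor, not FLAC) plays the role of a lossless codec, and discarding the lower 8 bits of each 16-bit sample plays the role of a lossy codec. Real lossy formats such as MP3 use psychoacoustic models instead; the crude truncation here only illustrates that discarded information cannot be recovered.

```python
import struct
import zlib


def lossless_round_trip(pcm: bytes) -> bytes:
    """Compress and decompress with zlib; the result is byte-for-byte identical."""
    return zlib.decompress(zlib.compress(pcm, 9))


def crude_lossy_round_trip(samples: list) -> list:
    """Keep only the top 8 bits of each 16-bit sample (information is discarded)."""
    return [(s >> 8) << 8 for s in samples]


if __name__ == "__main__":
    samples = [0, 1000, -1000, 12345, -12345, 32767, -32768] * 200
    pcm = struct.pack(f"<{len(samples)}h", *samples)

    restored = lossless_round_trip(pcm)
    print("lossless identical:", restored == pcm)
    print("compressed size:", len(zlib.compress(pcm, 9)), "of", len(pcm), "bytes")

    degraded = crude_lossy_round_trip(samples)
    print("lossy identical:", degraded == samples)
    print("largest error:", max(abs(a - b) for a, b in zip(samples, degraded)))
```

The repetitive test data compresses far better than real audio would, so the compressed size printed here should not be read as a typical ratio.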

Example Calculation: File Size of Uncompressed Audio

Given:

  • Sample rate \$f_s = 44{,}100\ \text{Hz}\$
  • Bit depth \$n = 16\ \text{bits}\$
  • Channels = 2 (stereo)
  • Duration \$t = 3\ \text{minutes} = 180\ \text{s}\$

Data rate:

\$\text{Data rate} = f_s \times n \times \text{channels} = 44{,}100 \times 16 \times 2 = 1{,}411{,}200\ \text{bits/s}\$

File size:

\$\text{Size} = \frac{\text{Data rate} \times t}{8} = \frac{1{,}411{,}200 \times 180}{8} = 31{,}752{,}000\ \text{bytes} \approx 30.3\ \text{MiB}\$
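The same calculation can be written as a small reusable function. This is a sketch with hypothetical names; it ignores any file header, so a real WAV file would be slightly larger.

```python
def data_rate_bits_per_second(sample_rate_hz: int, bit_depth: int, channels: int) -> int:
    """Uncompressed PCM data rate: samples per second x bits per sample x channels."""
    return sample_rate_hz * bit_depth * channels


def uncompressed_size_bytes(sample_rate_hz: int, bit_depth: int,
                            channels: int, duration_s: int) -> int:
    """Size of the raw PCM data in bytes (8 bits per byte, rounded up)."""
    total_bits = data_rate_bits_per_second(sample_rate_hz, bit_depth, channels) * duration_s
    return (total_bits + 7) // 8


if __name__ == "__main__":
    rate = data_rate_bits_per_second(44_100, 16, 2)
    size = uncompressed_size_bytes(44_100, 16, 2, 3 * 60)
    print(f"Data rate: {rate:,} bits/s")
    print(f"File size: {size:,} bytes = {size / 2**20:.2f} MiB")
```

Changing one argument at a time (for example mono instead of stereo, or 8-bit instead of 16-bit) shows how each parameter scales the file size linearly.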

Practical Considerations for A‑Level Projects

  1. Choose a sample rate that matches the intended playback device (e.g., 44.1 kHz for CD‑quality audio).
  2. Use 16‑bit depth for most applications; higher bit depths are only needed for professional recording.
  3. When bandwidth is limited, prefer lossy formats with an appropriate bit‑rate (e.g., 128 kbps MP3 for speech).
  4. Always test the audible quality after compression; subjective listening tests are essential.

Suggested diagram: Flowchart of the audio encoding process – from analogue signal, through sampling, quantisation, PCM storage, optional compression, to final file format.