Cambridge A-Level Computer Science 9618 – 1.3 Compression

1.3 Compression

Learning Objective

Show understanding of how a text file, bitmap image, vector graphic and sound file can be compressed.

Key Concepts

  • Compression reduces the amount of data required to store or transmit information.
  • Two broad categories:

    • Lossless compression – original data can be perfectly reconstructed.
    • Lossy compression – some information is permanently discarded to achieve higher reduction.

  • Typical measures:

    • Compression ratio \$R = \frac{\text{original size}}{\text{compressed size}}\$.
    • Average bits per symbol \$b = \frac{\text{compressed size (bits)}}{\text{number of symbols}}\$.
    • Entropy \$H = -\sum_{i} p_i \log_2 p_i\$ gives a theoretical lower bound, in bits per symbol, for lossless coding; a worked sketch follows this list.

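As a quick, hypothetical illustration of these measures, the Python sketch below computes the entropy of a short made-up message and the best compression ratio an ideal lossless coder could approach; the message and the 8-bit original size are example values only.

```python
import math
from collections import Counter

def entropy_bits_per_symbol(text: str) -> float:
    """Shannon entropy H = -sum(p_i * log2(p_i)) over the characters in text."""
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

text = "ABAAABCDAABA"                       # toy message
H = entropy_bits_per_symbol(text)           # lower bound in bits per character

original_bits = len(text) * 8               # 8-bit ASCII storage
best_possible_bits = H * len(text)          # what an ideal lossless coder could approach
ratio = original_bits / best_possible_bits  # compression ratio R = original / compressed

print(f"H = {H:.3f} bits/char, best ratio ≈ {ratio:.2f}:1")
```
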
1. Text Files

Plain text consists of a sequence of characters, each typically represented by a fixed number of bits (e.g., 8‑bit ASCII). Compression exploits redundancy in character frequencies.

Common Lossless Techniques

  • Run‑Length Encoding (RLE) – replaces each run of consecutive identical characters with a count followed by the character, e.g. AAAAA → 5A (a short RLE sketch follows this list).
  • Huffman Coding – builds a variable‑length prefix code based on character frequencies. Frequently occurring characters get shorter codes.
  • Lempel‑Ziv‑Welch (LZW) – builds a dictionary of repeated substrings during encoding; used in GIF and the Unix compress utility, while ZIP archives use the related DEFLATE method (LZ77 combined with Huffman coding).

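As a rough sketch of the first technique, here is a minimal run-length encoder and decoder; the function names and the test string are illustrative, not from any particular library, and the decoder assumes the original text contains no digits.

```python
import re
from itertools import groupby

def rle_encode(text: str) -> str:
    """Replace each run of identical characters with <count><character>."""
    return "".join(f"{len(list(group))}{char}" for char, group in groupby(text))

def rle_decode(encoded: str) -> str:
    """Inverse of rle_encode (assumes the original text contains no digits)."""
    return "".join(char * int(count) for count, char in re.findall(r"(\d+)(\D)", encoded))

message = "AAAAABBBCCDAA"
packed = rle_encode(message)        # '5A3B2C1D2A'
assert rle_decode(packed) == message
print(message, "->", packed)
```
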
Example: Huffman Coding

Given character probabilities \$p(A)=0.5\$, \$p(B)=0.25\$, \$p(C)=0.125\$, \$p(D)=0.125\$, the optimal Huffman tree yields codes:

  • A → 0
  • B → 10
  • C → 110
  • D → 111

The average bits per character \$b = \sum_{i} p_i \cdot \text{code length}_i = 0.5(1)+0.25(2)+0.125(3)+0.125(3)=1.75\$ bits, compared with the original 8 bits per character. This matches the entropy \$H\$ of the distribution, so the code is optimal.

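The codes above can be reproduced with a short Huffman construction. The sketch below uses Python's heapq module; when probabilities tie, different implementations may swap some 0s and 1s, but the average code length is still 1.75 bits.

```python
import heapq
from itertools import count

def huffman_codes(probabilities: dict[str, float]) -> dict[str, str]:
    """Repeatedly merge the two least probable nodes, prefixing '0'/'1' to their codes."""
    tie = count()  # tie-breaker so the heap never has to compare dicts
    heap = [(p, next(tie), {symbol: ""}) for symbol, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, codes0 = heapq.heappop(heap)
        p1, _, codes1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (p0 + p1, next(tie), merged))
    return heap[0][2]

probs = {"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}
codes = huffman_codes(probs)                  # {'A': '0', 'B': '10', 'C': '110', 'D': '111'}
avg = sum(probs[s] * len(codes[s]) for s in probs)
print(codes, f"average = {avg} bits/char")    # average = 1.75
```
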
2. Bitmap Images

A bitmap (raster) image stores colour information for each pixel. For a \$W \times H\$ image with \$c\$ colour bits per pixel, the uncompressed size is \$W \times H \times c\$ bits.

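For example, a hypothetical 1920 × 1080 image stored with 24 bits per pixel occupies:

```python
width, height, bits_per_pixel = 1920, 1080, 24        # hypothetical Full HD, 24-bit colour
size_bits = width * height * bits_per_pixel
print(f"{size_bits} bits = {size_bits / 8 / 1024**2:.1f} MiB")   # 49766400 bits ≈ 5.9 MiB
```
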
Lossless Methods

  • PNG (Portable Network Graphics) – uses a combination of filtering, LZ77 dictionary compression and Huffman coding.
  • GIF (Graphics Interchange Format) – limited to 256 colours; uses LZW compression.

Lossy Methods

  • JPEG (Joint Photographic Experts Group) – transforms 8×8 pixel blocks using the Discrete Cosine Transform (DCT), quantises the coefficients, then entropy‑codes the result.

JPEG Compression Steps (simplified)

  1. Convert RGB to YCbCr colour space (separates luminance from chrominance).
  2. Divide each channel into 8×8 blocks.
  3. Apply DCT to each block:

    \$F(u,v)=\frac{1}{4}\alpha(u)\alpha(v)\sum_{x=0}^{7}\sum_{y=0}^{7}f(x,y)\cos\frac{(2x+1)u\pi}{16}\cos\frac{(2y+1)v\pi}{16}\$ where \$\alpha(0)=\tfrac{1}{\sqrt{2}}\$ and \$\alpha(k)=1\$ otherwise.

  4. Quantise the DCT coefficients using a quantisation matrix (higher frequencies are rounded more aggressively).
  5. Encode the quantised coefficients with RLE and Huffman coding (a simplified sketch of steps 3–4 follows this list).

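The sketch below illustrates steps 3 and 4 only: it implements the 8×8 DCT directly from the formula above and divides by a made-up quantisation matrix. Real JPEG encoders use standard quantisation tables and zig-zag reorder the coefficients before entropy coding.

```python
import numpy as np

def dct_8x8(block: np.ndarray) -> np.ndarray:
    """2-D DCT of an 8x8 block, coded directly from the formula above."""
    alpha = lambda k: 1 / np.sqrt(2) if k == 0 else 1.0
    F = np.zeros((8, 8))
    for u in range(8):
        for v in range(8):
            total = 0.0
            for x in range(8):
                for y in range(8):
                    total += (block[x, y]
                              * np.cos((2 * x + 1) * u * np.pi / 16)
                              * np.cos((2 * y + 1) * v * np.pi / 16))
            F[u, v] = 0.25 * alpha(u) * alpha(v) * total
    return F

# A smooth 8x8 block (a gradient), typical of natural image content, centred on 0.
x, y = np.meshgrid(np.arange(8), np.arange(8), indexing="ij")
block = (10 * x + 6 * y - 56).astype(float)

# Hypothetical quantisation matrix: step size grows with frequency.
Q = 10 + 5 * (np.arange(8)[:, None] + np.arange(8)[None, :])

coeffs = dct_8x8(block)
quantised = np.round(coeffs / Q).astype(int)   # the rounding here is where information is lost
print(quantised)                               # energy sits in the top-left; most entries are 0
```
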
3. Vector Graphics

Vector graphics describe images using geometric primitives (lines, curves, shapes) and attributes (colour, stroke). The data is resolution‑independent.

Typical Formats

  • SVG (Scalable Vector Graphics) – an XML‑based text format. Compression is usually achieved by applying a generic lossless compressor (e.g., GZIP) to the XML file, yielding .svgz.
  • PDF (Portable Document Format) – can embed vector objects; uses a combination of LZW, Flate (deflate) and other filters.

Why Vector Graphics Compress Well

Because they store mathematical descriptions rather than per‑pixel data, the amount of information grows roughly with the number of objects, not with image resolution. Applying a lossless compressor removes whitespace, repeated attribute strings, and exploits the regular structure of XML.

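A quick demonstration using Python's standard gzip module on a small, hand-written SVG string (the markup here is illustrative only): the repeated element and attribute names compress very well, which is essentially what an .svgz file exploits.

```python
import gzip

svg = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="200" height="200">'
    + "".join(f'<circle cx="{10 * i}" cy="100" r="8" fill="steelblue"/>' for i in range(20))
    + "</svg>"
)

raw = svg.encode("utf-8")
compressed = gzip.compress(raw)        # roughly what an .svgz file contains
print(len(raw), "bytes ->", len(compressed), "bytes",
      f"(ratio {len(raw) / len(compressed):.1f}:1)")
```
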
4. Sound Files

Digital audio records amplitude samples at a fixed sampling rate (e.g., 44.1 kHz) and bit depth (e.g., 16‑bit). Uncompressed PCM size for mono audio is:

\$\text{Size (bits)} = \text{sampling rate} \times \text{bit depth} \times \text{duration (seconds)}\$

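Substituting typical CD-quality values (44.1 kHz sampling rate, 16-bit samples, one minute of mono audio):

```python
sampling_rate = 44_100      # samples per second (CD quality)
bit_depth = 16              # bits per sample
duration = 60               # seconds of mono audio

size_bits = sampling_rate * bit_depth * duration
print(f"{size_bits} bits = {size_bits / 8 / 1e6:.2f} MB")   # 42336000 bits ≈ 5.29 MB
```
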
Lossless Audio Compression

  • FLAC (Free Lossless Audio Codec) – uses linear prediction, residual coding, and entropy coding (Rice coding); a toy prediction sketch follows this list.
  • ALAC (Apple Lossless Audio Codec) – similar predictive approach.

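The predictive idea behind FLAC can be illustrated with a toy first-order predictor (this is not the actual codec, which fits higher-order predictors and Rice-codes the residuals): the residuals are much smaller than the samples, so an entropy coder can pack them into fewer bits.

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(44_100)                                        # one second at 44.1 kHz
signal = (2000 * np.sin(2 * np.pi * 440 * t / 44_100)        # a 440 Hz tone...
          + rng.integers(-20, 21, t.size)).astype(np.int32)  # ...plus a little noise

residual = np.diff(signal, prepend=signal[:1])               # first-order predictor: s[n] - s[n-1]

print("max |sample|   =", int(np.abs(signal).max()))         # ~2000
print("max |residual| =", int(np.abs(residual).max()))       # far smaller -> fewer bits needed
```
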
Lossy Audio Compression

  • MP3 (MPEG‑1 Audio Layer III) – splits audio into sub‑bands, applies a modified DCT, psychoacoustic masking, quantisation, and Huffman coding.
  • AAC (Advanced Audio Coding) – improves on MP3 with more efficient filter banks and better psychoacoustic models.

MP3 Compression Overview

  1. Divide the signal into 1152‑sample frames.
  2. Apply a hybrid filter bank (polyphase filter + MDCT).
  3. Analyse psychoacoustic model to determine masking thresholds.
  4. Quantise coefficients according to the thresholds (less audible components receive coarser quantisation).
  5. Encode quantised data with Huffman tables (a heavily simplified sketch follows this list).

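The sketch below is a heavily simplified, hypothetical illustration of steps 1, 3 and 4: it frames the signal, uses an FFT magnitude spectrum as a stand-in for the polyphase/MDCT filter bank, and zeroes or coarsely quantises components below a crude threshold derived from the loudest component. A real encoder's psychoacoustic model, bit allocation and Huffman stage are far more sophisticated.

```python
import numpy as np

FRAME = 1152                                   # MP3 frame length in samples
rng = np.random.default_rng(0)
t = np.arange(4 * FRAME)
audio = np.sin(2 * np.pi * 440 * t / 44_100) + 0.01 * rng.standard_normal(t.size)

kept = total = 0
for start in range(0, audio.size, FRAME):
    frame = audio[start:start + FRAME]
    spectrum = np.fft.rfft(frame)              # stand-in for the polyphase/MDCT filter bank
    magnitude = np.abs(spectrum)
    threshold = magnitude.max() * 0.01         # crude "masking" threshold: 1% of the peak
    quantised = np.where(magnitude >= threshold, np.round(spectrum / threshold), 0)
    kept += np.count_nonzero(quantised)
    total += quantised.size

print(f"kept {kept}/{total} spectral values")  # most components are discarded or zeroed
```
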
Comparison Summary

| File Type | Typical Formats | Common Compression Methods | Lossless / Lossy | Key Idea |
| --- | --- | --- | --- | --- |
| Text | .txt, .csv | RLE, Huffman, LZW; DEFLATE (ZIP) | Lossless | Exploit character frequency and repeated substrings. |
| Bitmap image | .bmp, .png, .gif, .jpg | PNG (filter + LZ77 + Huffman), GIF (LZW), JPEG (DCT + quantisation + Huffman) | PNG/GIF – lossless; JPEG – lossy | Transform spatial redundancy (DCT) or encode repeated patterns. |
| Vector graphic | .svg, .pdf | XML text + GZIP/Deflate (SVGZ), PDF filters (Flate, LZW) | Lossless | Compress descriptive text; no pixel data to lose. |
| Sound | .wav, .flac, .mp3, .aac | FLAC (prediction + Rice coding), MP3/AAC (sub‑band transform + psychoacoustic masking + Huffman) | FLAC – lossless; MP3/AAC – lossy | Predictive coding for redundancy; discard inaudible frequencies for lossy. |

Suggested diagram: Flowcharts illustrating lossless (e.g., Huffman) and lossy (e.g., JPEG, MP3) compression pipelines.