Published by Patrick Mutisya

Cambridge A-Level Computer Science 9618 – 1.3 Compression

1.3 Compression

Objective

Show understanding of the need for and examples of the use of compression.

Why is Compression Needed?

  • Storage limitations – data must fit on disks, SSDs, or removable media.
  • Transmission efficiency – reduces bandwidth usage and speeds up network communication.
  • Cost reduction – lower storage and data-transmission costs.
  • Performance – smaller files can be loaded and processed more quickly.

Types of Compression

  • Lossless compression: original data can be perfectly reconstructed.
  • Lossy compression: some data is permanently discarded to achieve higher compression ratios.
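
The lossless property is easy to demonstrate in code. The sketch below is a minimal illustration using Python's standard-library zlib module (an implementation of the DEFLATE scheme); the sample text and compression level are arbitrary choices.

```python
import zlib

# Repetitive text compresses well under a lossless scheme such as DEFLATE.
original = b"ABRACADABRA " * 100

compressed = zlib.compress(original, 9)       # 9 = maximum compression effort
restored = zlib.decompress(compressed)

print(len(original), "bytes ->", len(compressed), "bytes")
print("Exact reconstruction:", restored == original)   # True: nothing is lost
```

Repetitive input compresses well; random or already-compressed data may not shrink at all.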

Common Lossless Techniques

  1. Run‑Length Encoding (RLE)
  2. Huffman Coding
  3. Lempel‑Ziv‑Welch (LZW)
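
RLE and Huffman coding are worked through later in this section. For LZW, the sketch below is a minimal dictionary-based encoder in Python (the function name and test string are illustrative only); real implementations add details such as variable code widths and dictionary resets.

```python
def lzw_encode(data: bytes) -> list[int]:
    """Minimal LZW encoder: returns a list of dictionary codes."""
    dictionary = {bytes([i]): i for i in range(256)}   # codes 0-255 = single bytes
    next_code = 256

    current = b""
    output = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in dictionary:
            current = candidate                        # keep extending the match
        else:
            output.append(dictionary[current])         # emit code for the longest match
            dictionary[candidate] = next_code          # learn the new sequence
            next_code += 1
            current = bytes([byte])
    if current:
        output.append(dictionary[current])             # flush the final match
    return output

print(lzw_encode(b"TOBEORNOTTOBEORTOBEORNOT"))
```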

Common Lossy Techniques

  1. Transform coding (e.g., JPEG for images)
  2. Perceptual coding (e.g., MP3 for audio)
  3. Predictive coding (e.g., MPEG for video)
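
What these techniques have in common is a quantisation step at which information is deliberately discarded. The sketch below isolates that idea (it is not a real codec): sample values are mapped onto a coarser scale, so the decoder can only approximate the original. The sample values and step size are arbitrary.

```python
# Quantisation: map values onto a coarser scale, then reconstruct approximately.
samples = [23, 87, 154, 200, 201, 96, 45, 12]
step = 16                                    # larger step = smaller output, more loss

quantised = [round(s / step) for s in samples]    # what would be stored or transmitted
reconstructed = [q * step for q in quantised]     # what the decoder can recover

print("original:     ", samples)
print("quantised:    ", quantised)
print("reconstructed:", reconstructed)            # close to, but not equal to, the original
```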

Examples of Compression in Everyday Use

  • Text files – ZIP, GZIP, BZIP2
  • Images – PNG (lossless), JPEG (lossy)
  • Audio – FLAC (lossless), MP3/AAC (lossy)
  • Video – H.264/AVC, H.265/HEVC (lossy)
  • Web pages – Brotli, GZIP for HTTP transfer

Illustrative Comparison of Lossless vs. Lossy Compression

  • Data integrity – Lossless: the exact original data can be recovered. Lossy: the original is approximated; some information is lost.
  • Typical compression ratio – Lossless: 2:1 – 3:1 (occasionally up to 5:1). Lossy: 10:1 – 100:1, depending on content and quality settings.
  • Common applications – Lossless: source code, text documents, executable files, archival storage. Lossy: photographs, music, streaming video, web graphics.
  • Algorithm examples – Lossless: Huffman, LZW, DEFLATE. Lossy: JPEG, MP3, H.264.

Run‑Length Encoding (RLE) – Simple Example

Consider the binary string 1011111100. Using RLE we store it as:

(1, 1), (0, 1), (1, 6), (0, 2)

Each pair represents the bit value and the number of consecutive occurrences.
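
A minimal Python sketch of this encoding follows; the function name and the (value, run length) pair format are illustrative, matching the example above.

```python
def rle_encode(bits: str) -> list[tuple[str, int]]:
    """Encode a string of '0'/'1' characters as (value, run length) pairs."""
    if not bits:
        return []
    pairs = []
    run_value, run_length = bits[0], 1
    for bit in bits[1:]:
        if bit == run_value:
            run_length += 1                    # extend the current run
        else:
            pairs.append((run_value, run_length))
            run_value, run_length = bit, 1     # start a new run
    pairs.append((run_value, run_length))      # flush the final run
    return pairs

print(rle_encode("1011111100"))   # [('1', 1), ('0', 1), ('1', 6), ('0', 2)]
```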

Huffman Coding – Principle

Huffman coding assigns shorter codes to more frequent symbols. For the text “ABRACADABRA” with symbol frequencies:

  • A – 5
  • B – 2
  • R – 2
  • C – 1
  • D – 1

Resulting Huffman codes might be:

  • A → 0
  • B → 110
  • R → 111
  • C → 100
  • D → 101
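
With this code table, “ABRACADABRA” needs 5×1 + (2+2+1+1)×3 = 23 bits, compared with 33 bits at a fixed 3 bits per symbol. The sketch below builds such a table from the frequency list using Python's heapq module; it is one possible implementation, and because ties between equal frequencies can be broken in different ways, other equally optimal code tables exist.

```python
import heapq

def huffman_codes(frequencies: dict[str, int]) -> dict[str, str]:
    """Build one valid prefix-free Huffman code table from symbol frequencies."""
    # Heap entries: (total frequency, tie-breaker, {symbol: code built so far}).
    heap = [(freq, i, {sym: ""}) for i, (sym, freq) in enumerate(frequencies.items())]
    heapq.heapify(heap)

    while len(heap) > 1:
        f1, _, codes1 = heapq.heappop(heap)    # two least-frequent subtrees
        f2, i, codes2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes1.items()}        # left branch: prefix 0
        merged.update({s: "1" + c for s, c in codes2.items()})  # right branch: prefix 1
        heapq.heappush(heap, (f1 + f2, i, merged))

    return heap[0][2]

codes = huffman_codes({"A": 5, "B": 2, "R": 2, "C": 1, "D": 1})
print(codes)   # e.g. {'A': '0', 'C': '100', 'D': '101', 'B': '110', 'R': '111'}
```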

Impact of Compression on System Design

  • CPU overhead – encoding/decoding requires processing time.
  • Memory usage – buffers may be needed for both compressed and decompressed data.
  • Latency – real‑time applications must balance compression ratio against delay.
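
One rough way to see the trade-off between CPU time and compression ratio is to time the same input at different compression levels. The sketch below does this with zlib; absolute figures depend entirely on the machine and the data, so treat the output as illustrative only.

```python
import time
import zlib

data = b"the quick brown fox jumps over the lazy dog " * 5000

for level in (1, 6, 9):                       # low, default, maximum effort
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    ratio = len(data) / len(compressed)
    print(f"level {level}: ratio {ratio:.1f}:1, {elapsed * 1000:.2f} ms")
```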

Suggested diagram: Flowchart showing the stages of compression and decompression, including input data, encoder, compressed output, transmission/storage, decoder, and reconstructed data.

Summary

Compression is essential for efficient use of storage and bandwidth. Understanding when to use lossless versus lossy techniques, and being familiar with common algorithms such as RLE, Huffman, and LZW, enables developers to make informed decisions that balance data fidelity, performance, and resource constraints.