Show understanding of lossy and lossless compression and justify the choice of a method for a given situation.
Quantitative example: a 100 MB uncompressed video (≈ 800 Mbit) sent over a 2 Mbps link takes 800 ÷ 2 = 400 s, about 7 minutes. If H.264 encoding reduces it to roughly 40 MB (≈ 320 Mbit), the same link needs only 160 s ≈ 2.7 minutes, saving both time and data.
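The transfer-time arithmetic in this example can be checked with a short sketch. The function name `transfer_seconds` is hypothetical, and the calculation assumes 1 MB = 8 Mbit and ignores protocol overhead.

```python
def transfer_seconds(size_megabytes: float, link_mbps: float) -> float:
    """Time to send a file, assuming 1 MB = 8 Mbit and no protocol overhead."""
    size_megabits = size_megabytes * 8
    return size_megabits / link_mbps


seconds = transfer_seconds(100, 2)  # the 100 MB video on a 2 Mbps link
print(f"{seconds:.0f} s = {seconds / 60:.1f} min")  # 400 s = 6.7 min
```

Calling the same function with a smaller file size shows directly how much time a given amount of compression saves on the same link.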
Compression reduces the number of bits required to represent information by exploiting redundancy or the limits of human perception. The result is a smaller compressed representation that can be stored or transmitted more efficiently.
Lossless compression is used when exact fidelity is essential (e.g., source code, legal documents, medical images, archival audio): the decompressed data is bit‑for‑bit identical to the original.
Run‑length encoding (RLE) example: AAAAABBBCCDAA → 5A3B2C1D2A (13 characters become 10).
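A minimal sketch of run-length encoding that reproduces the example above. The function names are illustrative, and the decoder assumes the original text contains no digit characters.

```python
from itertools import groupby


def rle_encode(text: str) -> str:
    """Encode each run of identical characters as <count><character>."""
    return "".join(f"{len(list(run))}{char}" for char, run in groupby(text))


def rle_decode(encoded: str) -> str:
    """Invert rle_encode; assumes the original text contained no digits."""
    result = []
    count = ""
    for ch in encoded:
        if ch.isdigit():
            count += ch  # counts may span several digits, e.g. "12A"
        else:
            result.append(ch * int(count))
            count = ""
    return "".join(result)


print(rle_encode("AAAAABBBCCDAA"))  # 5A3B2C1D2A
```

Decoding the output gives back the original string exactly, which is what makes the method lossless. On data with few repeated runs the encoded form can be longer than the input.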
Entropy formula: \(H = -\sum_{i=1}^{n} p_i \log_2 p_i\).
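The entropy formula can be evaluated directly. The sketch below assumes each probability is estimated as the relative frequency of a symbol in the message; the function name is hypothetical.

```python
from collections import Counter
from math import log2


def entropy_bits_per_symbol(text: str) -> float:
    """Shannon entropy H, using symbol frequencies in `text` as probabilities."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * log2(n / total) for n in counts.values())


print(entropy_bits_per_symbol("AABB"))  # 1.0
```

Two equally likely symbols need 1 bit each on average. A message consisting of a single repeated symbol has zero entropy, which is why it compresses so well.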
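A Huffman tree gives the shortest possible average code length, for the given probabilities, among prefix codes that assign a whole number of bits to each symbol; frequent symbols receive short codes and rare symbols receive long ones.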
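As a rough illustration of how a Huffman tree assigns code lengths, the sketch below repeatedly merges the two least frequent subtrees and tracks only each symbol's depth (its code length in bits). The function name and the sample text are illustrative choices, not part of any standard.

```python
import heapq
from collections import Counter


def huffman_code_lengths(text: str) -> dict[str, int]:
    """Return the Huffman code length (in bits) for each symbol in `text`."""
    counts = Counter(text)
    if len(counts) == 1:  # a lone symbol still needs one bit
        return {symbol: 1 for symbol in counts}
    # Heap entries: (weight, unique tie-breaker, {symbol: depth so far}).
    heap = [(n, i, {sym: 0}) for i, (sym, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


text = "AAAAABBBCCDAA"
lengths = huffman_code_lengths(text)
total_bits = sum(lengths[ch] for ch in text)
print(lengths, total_bits)
```

For this 13-character message the most frequent symbol, A, gets a 1-bit code and the whole message needs 22 bits, compared with 26 bits for a fixed 2-bit code per symbol.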
| Technique | Typical media / use‑case |
|---|---|
| RLE | Simple bitmap images, fax transmission, monochrome icons |
| Huffman | DEFLATE streams (and therefore PNG), entropy‑coding stage of baseline JPEG |
| LZW | GIF images, early Unix compress, PDF internal streams |
| DEFLATE (ZIP/GZIP) | Text files, source code archives, generic file bundles (ZIP, GZIP) |
Text files are commonly archived with ZIP or GZIP, both of which use DEFLATE. Repeated strings (words, runs of spaces) are replaced by short back‑references to an earlier occurrence, and the result is Huffman‑coded, giving typical ratios of 2 : 1 to 3 : 1.
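DEFLATE is available in Python through the standard `zlib` module, so the round trip can be tried directly. The sample text below is an assumption chosen for illustration; because it is far more repetitive than ordinary prose, it compresses much better than the typical 2 : 1 to 3 : 1.

```python
import zlib

# Hypothetical sample: one sentence repeated 50 times (highly redundant).
text = ("the quick brown fox jumps over the lazy dog. " * 50).encode("utf-8")

compressed = zlib.compress(text, level=9)  # DEFLATE inside a zlib wrapper
restored = zlib.decompress(compressed)

assert restored == text  # lossless: every byte comes back unchanged
print(len(text), "bytes ->", len(compressed), "bytes")
print(f"ratio {len(text) / len(compressed):.1f} : 1")
```

Running the same code on a real document gives a more realistic ratio.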
Vector graphics (e.g., SVG) already describe images mathematically, so they are inherently lossless. Compression focuses on reducing file‑size overhead:
- Shortening names (e.g., stroke-width → sw) where possible.
- Applying gzip compression (the .svgz format).

Concrete example:

- example.svg (plain XML) = 45 KB.
- After running svgo to strip whitespace and unused definitions → 38 KB.
- Gzipped as example.svgz = 12 KB (≈ 73 % reduction from the original 45 KB).

Lossy techniques are rarely used because the visual fidelity of a vector image is defined by its mathematical description, not by pixel data.
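The gzip step behind the .svgz format can be sketched with Python's standard `gzip` module. The SVG content below is a made-up placeholder, so the printed reduction will not match the 45 KB → 12 KB figures quoted above.

```python
import gzip

# Hypothetical SVG document: 20 identical circles, so the XML is very repetitive.
svg = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">\n'
    + '  <circle cx="50" cy="50" r="40" stroke="black" stroke-width="2" fill="none"/>\n' * 20
    + "</svg>\n"
).encode("utf-8")

svgz = gzip.compress(svg)  # the bytes that would be saved as example.svgz
assert gzip.decompress(svgz) == svg  # the drawing is recovered exactly

reduction = 100 * (1 - len(svgz) / len(svg))
print(f"{len(svg)} bytes -> {len(svgz)} bytes ({reduction:.0f} % reduction)")
```

The reduction is calculated the same way as in the example: 1 − 12/45 ≈ 0.73, i.e. about 73 %.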
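Lossy compression is used when a perfect replica is not required and higher compression ratios are desirable (e.g., photographs, audio, video streaming). Detail that people are unlikely to notice is discarded and cannot be recovered.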
| Technique | Lossy or lossless? | Typical media / use‑case |
|---|---|---|
| JPEG | Lossy | Photographs, web images |
| PNG | Lossless (listed for comparison) | Line art, screenshots, images requiring transparency |
| MP3 / AAC | Lossy | Music streaming, podcasts |
| FLAC | Lossless (listed for comparison) | Archival audio, high‑resolution music collections |
| H.264 / H.265 | Lossy | Online video, Blu‑ray, video conferencing |
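To see why lossy compression cannot be reversed exactly, consider quantisation, the step where precision is deliberately thrown away. The sketch below is a simplified illustration with made-up sample values and a hypothetical step size of 8; real codecs such as JPEG apply the same idea to transformed data rather than to raw samples.

```python
def quantize(samples: list[int], step: int) -> list[int]:
    """Lossy step: keep only the index of the nearest multiple of `step`."""
    return [round(s / step) for s in samples]


def dequantize(indices: list[int], step: int) -> list[int]:
    """Rebuild approximate samples from the stored indices."""
    return [i * step for i in indices]


samples = [14, 15, 17, 200, 203, 198]  # hypothetical brightness values
indices = quantize(samples, step=8)
restored = dequantize(indices, step=8)

print(indices)   # [2, 2, 2, 25, 25, 25]
print(restored)  # [16, 16, 16, 200, 200, 200]
```

The stored indices are smaller numbers with long repeated runs, so a lossless stage such as RLE or Huffman coding can then shrink them further, but the original values 14, 15 and 17 can no longer be told apart.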
| Aspect | Lossless | Lossy |
|---|---|---|
| Data integrity | Exact reconstruction | Approximate reconstruction |
| Typical applications | Text, source code, archives, medical imaging, archival audio (FLAC) | Photographs, streaming audio/video, web images (JPEG, MP3, H.264) |
| Common algorithms and formats | RLE, Huffman, LZW, DEFLATE, PNG, JPEG‑2000 (lossless mode) | JPEG, MP3, AAC, H.264, H.265 (HEVC) |
| Typical compression ratio | 2 : 1 – 3 : 1 (up to ~5 : 1 for highly redundant data) | 10 : 1 – 100 : 1 or higher |
| Impact on quality | No loss of quality | Quality degrades as compression increases; artefacts may become visible. |
When asked to justify a compression method, tick the criteria that apply and then choose the algorithm that best satisfies them.
| Criterion | Lossless needed? | Lossy acceptable? |
|---|---|---|
| Exact data fidelity required (e.g., legal text, medical diagnosis) | ✓ | ✗ |
| Very limited storage or bandwidth | ✗ (only moderate ratios) | ✓ (much higher ratios) |
| Processing power limited (e.g., embedded device) | ✓ (simple RLE, Huffman) | ✗ (complex transforms may be too heavy) |
| Human perception can hide artefacts (photos, audio, video) | ✗ | ✓ |
“Explain why lossless compression is required for archiving legal documents and suggest a suitable algorithm. Justify your choice with reference to data integrity, typical compression ratio, and processing requirements.”