Understand how files are compressed using lossy and lossless compression methods

Computer Systems – Cambridge IGCSE (0478) – Module 1

Learning Objectives (All AO levels)

  • AO1 – Knowledge & Understanding: Explain key concepts, terminology and the purpose of each computer‑systems component.
  • AO2 – Application: Perform calculations (binary conversions, file‑size, compression ratios) and apply algorithms such as Huffman coding.
  • AO3 – Analysis & Evaluation: Choose appropriate hardware, software or compression method for a given task and justify the decision.

1. Number Systems & Data Representation (Syllabus 1.1)

1.1 Binary, Decimal and Hexadecimal

Base | Digits used | Place values
Binary (base‑2) | 0, 1 | … 2³ 2² 2¹ 2⁰
Decimal (base‑10) | 0–9 | … 10³ 10² 10¹ 10⁰
Hexadecimal (base‑16) | 0–9, A–F | … 16³ 16² 16¹ 16⁰

Conversion shortcuts (AO2)

  • Binary → Hex: group bits in sets of four (starting from the right) and replace each group with its hex equivalent.
  • Hex → Binary: replace each hex digit with its 4‑bit binary pattern.
  • Binary → Decimal: multiply each bit by its weight (2ⁿ) and add.
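
The three shortcuts above can be sketched in Python. This is a minimal illustration rather than an exam method, and the function names are made up for these notes.

```python
# Illustrative helpers for the conversion shortcuts (names are hypothetical).
HEX_DIGITS = "0123456789ABCDEF"

def binary_to_hex(bits: str) -> str:
    """Group bits in fours from the right; map each group to one hex digit."""
    # Pad on the left so the length is a multiple of four.
    padded = bits.zfill((len(bits) + 3) // 4 * 4)
    groups = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    return "".join(HEX_DIGITS[int(group, 2)] for group in groups)

def hex_to_binary(hex_digits: str) -> str:
    """Replace each hex digit with its 4-bit binary pattern."""
    return "".join(format(int(digit, 16), "04b") for digit in hex_digits)

def binary_to_decimal(bits: str) -> int:
    """Multiply each bit by its weight (2^n) and add the results."""
    total = 0
    for position, bit in enumerate(reversed(bits)):
        total += int(bit) * 2 ** position
    return total

print(binary_to_hex("11110000"))      # F0
print(hex_to_binary("A5"))            # 10100101
print(binary_to_decimal("00001010"))  # 10
```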

Practice

  1. Convert 10110110₂ to hexadecimal.
  2. Convert 3F₁₆ to binary.
  3. Convert 11010101₂ to decimal.

2. Data Representation – Text, Images, Audio & Video (Syllabus 1.2)

2.1 Text (ASCII)

  • Standard ASCII defines 7‑bit codes, but each character is normally stored in 8 bits (1 byte).
  • Example: the word IGCSE = 5 × 8 = 40 bits = 5 B.
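
The same count can be checked in Python. A minimal sketch, assuming one byte per character; `ascii_size_bits` is a name invented for this note.

```python
def ascii_size_bits(text: str) -> int:
    """Bits needed when every character is stored in one 8-bit byte."""
    # encode() raises UnicodeEncodeError if a character is outside ASCII.
    return len(text.encode("ascii")) * 8

bits = ascii_size_bits("IGCSE")
print(bits, "bits =", bits // 8, "B")  # 40 bits = 5 B
```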

2.2 Images

  • Resolution: number of pixels (width × height).
  • Colour depth: bits per pixel (bpp). Common depths: 1 bpp (black‑and‑white), 8 bpp (256 colours), 24 bpp (true colour – 8 bits each for R, G and B).
  • Uncompressed size = width × height × bpp bits.
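
As a quick check of the formula, here is a minimal Python sketch. The function name and the 800 × 600 example image are assumptions chosen for illustration.

```python
def image_size_bits(width: int, height: int, bpp: int) -> int:
    """Uncompressed image size in bits: width x height x bits per pixel."""
    return width * height * bpp

bits = image_size_bits(800, 600, 24)
print(bits)       # 11520000 bits
print(bits // 8)  # 1440000 bytes
```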

2.3 Audio

  • Sampling rate (samples s⁻¹) – e.g., 44.1 kHz.
  • Bit depth – bits per sample (e.g., 16 bits).
  • Uncompressed size (bits) = sampling rate (Hz) × duration (s) × bit depth × channels.
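
The audio formula can be sketched the same way. The function name and the 10‑second mono clip are assumptions for illustration only.

```python
def audio_size_bits(sample_rate_hz: int, duration_s: int,
                    bit_depth: int, channels: int) -> int:
    """Uncompressed audio size in bits."""
    return sample_rate_hz * duration_s * bit_depth * channels

# 10 s of mono audio at 8 kHz with 8-bit samples
bits = audio_size_bits(8_000, 10, 8, 1)
print(bits)       # 640000 bits
print(bits // 8)  # 80000 bytes
```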

2.4 Video (basic idea)

  • Frames per second (fps) × resolution × colour depth gives data per second.
  • Even short clips generate huge amounts of data → compression is essential.
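
To see how quickly video data grows, the sketch below multiplies the factors for one second of uncompressed video. The function name and the 1920 × 1080, 24 bpp, 30 fps figures are assumed example values, and audio is ignored.

```python
def video_bits_per_second(width: int, height: int, bpp: int, fps: int) -> int:
    """Uncompressed video data rate: pixels per frame x bpp x frames per second."""
    return width * height * bpp * fps

rate = video_bits_per_second(1920, 1080, 24, 30)
print(rate)       # 1492992000 bits every second
print(rate // 8)  # 186624000 bytes every second
```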

3. Data Storage & Compression (Syllabus 1.3)

3.1 Why compress?

  • Reduce the number of bits required → more files fit on a storage medium and less bandwidth is needed for transmission.
  • Two main strategies:
    • Lossless compression – original data can be reconstructed exactly.
    • Lossy compression – some information is permanently discarded to achieve higher reduction.

3.2 Binary Prefixes – Syllabus terminology (AO1)

Term (syllabus) | Symbol | Value (binary)
bit | b | 1 binary digit
nibble | – (no standard symbol) | 4 bits
byte | B | 8 bits
kibibyte | KiB | 1 KiB = 1024 B
mebibyte | MiB | 1 MiB = 1024 KiB = 1 048 576 B
gibibyte | GiB | 1 GiB = 1024 MiB = 1 073 741 824 B
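
The unit hierarchy can be expressed as constants, which makes conversions a matter of multiplying or dividing by 1024. A minimal sketch; the names `KIB`, `MIB`, `GIB` and `describe_bytes` are made up for this note.

```python
# Binary units expressed in bytes (each step is a factor of 1024).
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

def describe_bytes(size_bytes: int) -> dict:
    """Express one size in every byte-based unit of the hierarchy."""
    return {
        "B": size_bytes,
        "KiB": size_bytes / KIB,
        "MiB": size_bytes / MIB,
        "GiB": size_bytes / GIB,
    }

sizes = describe_bytes(5 * MIB)
print(sizes["B"])    # 5242880
print(sizes["KiB"])  # 5120.0
print(sizes["MiB"])  # 5.0
```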

3.3 File‑size calculations (AO2)

Example 1 – Colour photograph

Resolution: 1024 × 768 px
Colour depth: 24 bits pixel⁻¹

  1. Total pixels = 1024 × 768 = 786 432.
  2. Bits required = 786 432 × 24 = 18 874 368 bits.
  3. Bytes = 18 874 368 ÷ 8 = 2 359 296 B.
  4. KiB = 2 359 296 ÷ 1024 = 2 304 KiB.
  5. MiB = 2 304 ÷ 1024 = 2.25 MiB.

Uncompressed size ≈ 2.25 MiB.

Example 2 – Stereo audio (44.1 kHz, 16‑bit, 3 min)

  1. Samples s⁻¹ per channel = 44 100.
  2. Total samples = 44 100 × 180 s × 2 channels = 15 876 000.
  3. Bits = 15 876 000 × 16 = 254 016 000 bits.
  4. Bytes = 254 016 000 ÷ 8 = 31 752 000 B.
  5. MiB = 31 752 000 ÷ 1 048 576 ≈ 30.3 MiB.

Uncompressed size ≈ 30 MiB.

Additional practice (cover every unit in the hierarchy)

  1. Convert 3 MiB to KiB and to bytes.
  2. Calculate the uncompressed size of a 640 × 480 px, 16‑bit colour image.
  3. A 5‑minute mono audio clip is sampled at 22.05 kHz with 8‑bit depth. Find its size in KiB.
  4. Express 2 GiB as MiB, KiB and bytes.

3.4 Lossless compression methods (AO1 & AO2)

Method | How it works | Typical uses
Run‑Length Encoding (RLE) | Replace a run of identical symbols with a count and the symbol (e.g., AAAAA → 5A). | Simple graphics, bitmap fonts.
Huffman Coding | Assign shorter binary codes to more frequent symbols; build a binary tree based on symbol frequencies. | Text files, PNG images, DEFLATE (ZIP, GZIP).
Lempel‑Ziv‑Welch (LZW) | Build a dictionary of repeated patterns during compression; output dictionary indexes. | GIF images, early ZIP utilities.
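
RLE and LZW are short enough to sketch in Python. Both functions below are simplified illustrations with invented names. The RLE version assumes the symbols are not digits. The LZW version assumes a starting dictionary of the 256 single‑byte characters and returns a list of integer indexes, where real formats pack these into bits and limit the dictionary size.

```python
def rle_encode(text: str) -> str:
    """Replace each run of identical symbols with count + symbol."""
    if not text:
        return ""
    parts = []
    run_char, run_len = text[0], 1
    for ch in text[1:]:
        if ch == run_char:
            run_len += 1
        else:
            parts.append(f"{run_len}{run_char}")
            run_char, run_len = ch, 1
    parts.append(f"{run_len}{run_char}")  # flush the final run
    return "".join(parts)

def rle_decode(encoded: str) -> str:
    """Rebuild the original text (assumes symbols are never digits)."""
    out, count = [], ""
    for ch in encoded:
        if ch.isdigit():
            count += ch
        else:
            out.append(ch * int(count))
            count = ""
    return "".join(out)

def lzw_compress(text: str) -> list:
    """Output dictionary indexes, adding each new pattern as it is seen."""
    dictionary = {chr(i): i for i in range(256)}
    next_code = 256
    current, output = "", []
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            current = candidate            # keep extending the match
        else:
            output.append(dictionary[current])
            dictionary[candidate] = next_code
            next_code += 1
            current = ch
    if current:
        output.append(dictionary[current])
    return output

print(rle_encode("AAAAA"))      # 5A
print(rle_decode("3A2B1C"))     # AAABBC
print(lzw_compress("ABABABA"))  # [65, 66, 256, 258]
```

Note that RLE only helps when runs are long: a string with no repeats, such as ABC, becomes 1A1B1C, which is larger than the original.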

Worked example – Huffman coding (AO2)

Encode the text ABRACADABRA.

Symbol | Frequency
A | 5
B | 2
R | 2
C | 1
D | 1

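One possible Huffman tree (merge C + D = 2, then B + R = 4, then those two subtrees = 6, and finally A = 5 with that subtree = 11) gives the codes:

  • A = 0
  • B = 110
  • R = 111
  • C = 100
  • D = 101

Encoded bit‑stream: 0 110 111 0 100 0 101 0 110 111 0 (23 bits).
Original size = 11 characters × 8 bits = 88 bits.
Compression ratio = 88 ÷ 23 ≈ 3.8 : 1 (≈ 74 % saving), ignoring the space needed to store the code table.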
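
The tree‑building procedure can be sketched with a priority queue. This is a minimal sketch, assuming ties between equal frequencies are broken by insertion order (an arbitrary choice). A different tie‑break gives a different but equally short code table, so the printed codes need not match a hand‑drawn tree. The name `huffman_codes` is invented for this note.

```python
import heapq
from collections import Counter

def huffman_codes(text: str) -> dict:
    """Return a {symbol: bit-string} table built by Huffman's algorithm."""
    if not text:
        return {}
    freq = Counter(text)
    # Heap entries: (frequency, tie_breaker, {symbol: code_so_far}).
    # The unique tie_breaker stops Python from ever comparing the dicts.
    heap = [(f, i, {sym: ""}) for i, (sym, f) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    if len(heap) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {sym: "0" for sym in heap[0][2]}
    tie_breaker = len(heap)
    while len(heap) > 1:
        # Merge the two lowest-frequency subtrees.
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        merged = {sym: "0" + code for sym, code in left.items()}
        merged.update({sym: "1" + code for sym, code in right.items()})
        heapq.heappush(heap, (f1 + f2, tie_breaker, merged))
        tie_breaker += 1
    return heap[0][2]

message = "ABRACADABRA"
codes = huffman_codes(message)
encoded = "".join(codes[ch] for ch in message)
print(codes)
print(len(encoded), "bits, versus", len(message) * 8, "bits uncompressed")
```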

Advantages & Disadvantages (AO3)

Aspect | Advantage | Disadvantage
Data integrity | Exact original can be recovered. | Typical ratios only 2 : 1 – 5 : 1.
Typical file types | Text, spreadsheets, source code, archival images (PNG, GIF). | Not suitable when very small size is required for media.

3.5 Lossy compression methods (AO1 & AO2)

Method | Key idea | Typical uses
Transform coding – Discrete Cosine Transform (DCT) | Convert spatial data to frequency components; discard high‑frequency components that are less perceptible. | Underlying technique for JPEG.
JPEG (image) | DCT → quantisation → entropy coding. | Photographs, web graphics.
MP3 (audio) | Psycho‑acoustic model removes masked frequencies; then Huffman coding. | Music, podcasts.
H.264 (video, usually stored in an MP4 container) | Combines motion‑compensation, DCT, quantisation and entropy coding. | Streaming video, Blu‑ray discs.
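
Quantisation is the step where information is actually thrown away. The sketch below is a deliberately simplified illustration, not the JPEG or MP3 algorithm: it maps 8‑bit sample values onto coarser levels and shows that the originals cannot be recovered exactly. The function names and sample values are assumptions.

```python
def quantise(samples, step):
    """Map each value to a coarser level (fewer distinct values to store)."""
    return [s // step for s in samples]

def dequantise(levels, step):
    """Best-guess reconstruction: the middle of each level's range."""
    return [q * step + step // 2 for q in levels]

original = [12, 13, 200, 201]
levels = quantise(original, 16)    # 16 levels need 4 bits instead of 8
restored = dequantise(levels, 16)

print(levels)    # [0, 0, 12, 12]
print(restored)  # [8, 8, 200, 200]
```

The restored values are close to the originals but not identical, and no later step can recover the difference. That is what makes the method lossy.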

Advantages & Disadvantages (AO3)

Aspect | Advantage | Disadvantage
Compression ratio | Often 10 : 1 – 100 : 1 (or higher for video). | Irreversible loss; quality degrades on re‑compression.
Suitable data | Photographs, video, music, web graphics where human perception tolerates loss. | Unsuitable for text, spreadsheets, or any data requiring exactness.

3.6 Decision‑making flowchart (AO3)

[Figure: decision flowchart – Data type → Is lossless required? → Choose algorithm & file extension.]

How to use the flowchart (AO3)

  1. Identify the data type (text, image, audio, video).
  2. Ask: “Is any loss of information acceptable?”
    • Yes → consider lossy methods (JPEG, MP3, H.264).
    • No → use lossless methods (ZIP, PNG, FLAC).
  3. Select a specific algorithm and note the typical file extension.
  4. Record original and compressed sizes, then calculate the compression ratio (see 3.7).
  5. Justify the choice in terms of quality, storage, and purpose.
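
Steps 1 to 3 can be sketched as a small lookup. The mapping below uses only the formats named in these notes; the function name, the file extensions and the choice of ZIP for lossless video are assumptions made for illustration.

```python
# Formats taken from the notes; the extensions are typical, not mandatory.
LOSSY = {
    "image": ("JPEG", ".jpg"),
    "audio": ("MP3", ".mp3"),
    "video": ("H.264", ".mp4"),
}
LOSSLESS = {
    "text": ("ZIP", ".zip"),
    "image": ("PNG", ".png"),
    "audio": ("FLAC", ".flac"),
    "video": ("ZIP", ".zip"),  # assumption: archive the file unchanged
}

def choose_compression(data_type: str, loss_acceptable: bool):
    """Return (kind, method, extension) following the flowchart steps."""
    if data_type not in LOSSLESS:
        raise ValueError(f"unknown data type: {data_type}")
    # Text has no lossy entry, so it always falls through to lossless.
    if loss_acceptable and data_type in LOSSY:
        method, extension = LOSSY[data_type]
        return ("lossy", method, extension)
    method, extension = LOSSLESS[data_type]
    return ("lossless", method, extension)

print(choose_compression("image", True))   # ('lossy', 'JPEG', '.jpg')
print(choose_compression("audio", False))  # ('lossless', 'FLAC', '.flac')
print(choose_compression("text", True))    # ('lossless', 'ZIP', '.zip')
```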

3.7 Calculating Compression Ratio (AO2)

Two formats are accepted in the exam:

  • Percentage saved $$\text{Saving}\,(\%) = \frac{U_{\text{original}}-U_{\text{compressed}}}{U_{\text{original}}}\times 100$$
  • Ratio form (original : compressed) $$\text{Ratio} = \frac{U_{\text{original}}}{U_{\text{compressed}}} : 1$$

Worked example

Original size = 2 250 KiB, compressed size = 750 KiB.

  • Saving % = (2250 − 750) ÷ 2250 × 100 ≈ 66.7 %
  • Ratio = 2250 ÷ 750 = 3 : 1
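
Both formulas translate directly into code. A minimal sketch with invented function names; both sizes must be in the same unit.

```python
def saving_percent(original: float, compressed: float) -> float:
    """Percentage of the original size that compression removed."""
    return (original - compressed) / original * 100

def compression_ratio(original: float, compressed: float) -> float:
    """The 'x' in 'x : 1' (original : compressed)."""
    return original / compressed

print(round(saving_percent(2250, 750), 1))      # 66.7
print(f"{compression_ratio(2250, 750):g} : 1")  # 3 : 1
```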

Practice questions

  1. A 5 MiB PNG image is compressed to 3 MiB. Find the percentage saved and the ratio.
  2. A 12‑minute MP3 (bitrate 128 kbps) occupies 11 MiB. If the same track were stored as uncompressed CD‑quality audio (44.1 kHz, 16‑bit stereo), what would be the compression ratio?

4. Computer Hardware (Syllabus 1.4)

  • CPU – processes instructions; clock speed measured in GHz.
  • Memory – RAM (volatile) and ROM (non‑volatile).
  • Storage – HDD, SSD, optical media, flash drives; capacity expressed in KiB, MiB, GiB.
  • Input/Output devices – keyboard, mouse, scanner, printer, monitor.
  • Motherboard & buses – data pathways, address bus, data bus.

Key AO3 evaluation point

When selecting storage for large media libraries, SSDs give faster access but are more expensive per GiB than HDDs; a hybrid solution may be the most cost‑effective.


5. Software (Syllabus 1.5)

  • System software – operating systems, utility programs (e.g., file‑compression utilities).
  • Application software – word processors, spreadsheets, image editors, media players.
  • Software that compresses files – utilities such as 7‑Zip and WinRAR (lossless ZIP, 7z and RAR archives), and applications such as Photoshop (lossy JPEG export) and Audacity (lossy MP3 export).

AO3 example

Choosing between a cloud‑based document editor (collaboration, automatic backup) and a locally installed word processor (offline use, no subscription) depends on the school’s connectivity and data‑security policy.


6. The Internet, Cyber‑security & Emerging Technologies (Syllabus 1.6)

6.1 The Internet

  • Client‑server model, IP addressing, DNS.
  • Data transmission rates (bandwidth) are often the limiting factor – hence the importance of compression for web images and streaming media.
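
The effect of compression on transmission time can be estimated as size ÷ bandwidth. The sketch below is a rough model only: the 8 Mbit/s link speed and the function name are assumptions, and protocol overhead and latency are ignored.

```python
def transfer_time_s(size_bytes: int, bandwidth_bits_per_s: int) -> float:
    """Ideal time to send a file: total bits divided by bits per second."""
    return size_bytes * 8 / bandwidth_bits_per_s

KIB = 1024
LINK_SPEED = 8_000_000  # assumed 8 Mbit/s connection

print(transfer_time_s(2250 * KIB, LINK_SPEED))  # 2.304 s uncompressed
print(transfer_time_s(750 * KIB, LINK_SPEED))   # 0.768 s after 3 : 1 compression
```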

6.2 Cyber‑security basics

  • Confidentiality, integrity, availability (CIA triad).
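  • Encryption vs. compression – encryption scrambles data to protect confidentiality and does not reduce its size; compression reduces size but gives no protection. Decryption, like lossless decompression, recovers the original data exactly.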

6.3 Emerging technologies (relevant to compression)

  • 4K/8K video – requires advanced lossy codecs (HEVC, AV1) to be feasible.
  • Artificial Intelligence – neural‑network based compression (e.g., learned image codecs, AI‑enhanced audio codecs).

7. Summary Box – Syllabus Terminology (AO1)

Syllabus term | Definition (exact wording)
Binary prefix | A prefix that denotes a power of 2 (e.g., KiB = 2¹⁰ bytes).
Lossless compression | A method in which the original data can be reconstructed exactly after decompression.
Lossy compression | A method in which some original information is permanently discarded to achieve a higher reduction.
Compression ratio | The ratio of original file size to compressed file size, expressed as “original : compressed” or as a percentage saved.
Data integrity | The property that data remains accurate and unchanged during storage or transmission.
Entropy coding | A stage in many compression algorithms (e.g., Huffman, arithmetic) that assigns shorter codes to more frequent symbols.

8. Practical Checklist for IGCSE Exams (AO3)

  1. Identify the data type (text, image, audio, video).
  2. Decide whether lossless or lossy compression is required (refer to the decision‑flowchart).
  3. Select an appropriate algorithm and note its typical file extension.
  4. Record the original size using the correct binary units (bit → nibble → byte → KiB → MiB → GiB).
  5. Record the size after compression (as given in the question or measured).
  6. Calculate the compression ratio – either as a percentage saved or as “original : compressed”.
  7. Explain the impact on:
    • Quality (visual/audio fidelity, artefacts).
    • Storage space and transmission time.
    • Suitability for the intended purpose (e.g., archival vs. web publishing).
  8. For AO3, evaluate alternative methods (e.g., could a higher JPEG quality be used? Would PNG be better for a line‑drawing?) and justify the most appropriate choice.

9. Further Study – Algorithms, Programming & Databases (Syllabus 7‑10)

These topics are covered in the second half of the Cambridge IGCSE Computer Science syllabus and are not included in this module. They will be addressed in a separate set of notes covering:

  • Algorithm design, flowcharts and pseudocode.
  • Core programming concepts (variables, selection, iteration, arrays, file handling).
  • Database fundamentals (SQL, relational tables, primary keys).
  • Boolean logic and logic gates.

Students should ensure they master the concepts in Sections 1‑8 before moving on to the programming and algorithmic content.
