Data Storage, Transmission & Hardware – Cambridge IGCSE 0478
1. Data Representation
1.1 Binary, Decimal & Hexadecimal
Binary (base‑2) : digits 0 and 1. Used internally by computers.
Decimal (base‑10) : digits 0‑9. The number system we use in everyday life.
Hexadecimal (base‑16) : digits 0‑9 and A‑F. One hex digit = 4 bits.
Example conversion (8‑bit number):
Binary | Decimal | Hexadecimal
1101 0110 | 214 | D6
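The conversion above can be checked with Python's built-in functions. This is only a quick sketch; the variable names are illustrative and not part of the syllabus.

```python
# Check the 8-bit example: 1101 0110 (binary) = 214 (decimal) = D6 (hex).
value = int("11010110", 2)           # parse a binary string as base 2

binary_text = format(value, "08b")   # 8 binary digits, zero-padded
hex_text = format(value, "02X")      # 2 upper-case hex digits

# Each hex digit stands for one 4-bit group (a nibble).
high_nibble = binary_text[:4]        # "1101" is hex D
low_nibble = binary_text[4:]         # "0110" is hex 6

print(value, binary_text, hex_text)
```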
1.2 Two’s‑Complement (negative integers)
To represent a negative integer in an n‑bit word:
Step 1: Write the absolute value in binary, padded with leading zeros to n bits.
Step 2: Invert every bit (0→1, 1→0).
Step 3: Add 1 to the result, discarding any carry out of the MSB.
Example – –18 in 8‑bit two’s‑complement
+18 = 0001 0010
Invert → 1110 1101
Add 1 → 1110 1110
Result: 1110 1110₂ (‑18). The most‑significant bit (MSB) indicates sign (0 = positive, 1 = negative).
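The invert-and-add-1 procedure can be sketched in Python as follows. The function names twos_complement and from_twos_complement are illustrative choices, and the 8-bit default simply matches the example above.

```python
def twos_complement(value: int, bits: int = 8) -> str:
    """Return the two's-complement bit pattern of value as a string."""
    lowest = -(1 << (bits - 1))             # e.g. -128 for 8 bits
    highest = (1 << (bits - 1)) - 1         # e.g. +127 for 8 bits
    if not lowest <= value <= highest:
        raise ValueError("value does not fit in the given number of bits")
    if value >= 0:
        return format(value, f"0{bits}b")
    mask = (1 << bits) - 1                  # e.g. 1111 1111 for 8 bits
    inverted = abs(value) ^ mask            # invert every bit
    result = (inverted + 1) & mask          # add 1, keep only n bits
    return format(result, f"0{bits}b")


def from_twos_complement(pattern: str) -> int:
    """Decode a two's-complement bit string back to a signed integer."""
    raw = int(pattern, 2)
    if pattern[0] == "1":                   # MSB set means negative
        raw -= 1 << len(pattern)
    return raw


print(twos_complement(-18))                 # bit pattern for -18
print(from_twos_complement("11101110"))     # back to the signed value
```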
1.3 Logical Shifts
Logical left shift (<<) : moves all bits left, inserts 0 on the right. Equivalent to multiplication by 2 (ignoring overflow).
Logical right shift (>>) : moves all bits right, inserts 0 on the left. Equivalent to integer division by 2 (ignoring remainder).
Example (8‑bit unsigned)
Original | Binary | Shift left (×2) | Result
25 | 0001 1001 | 0011 0010 | 50
If a left shift moves a 1 out of the MSB, that bit is lost and the result no longer equals the original value × 2; this is overflow.
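Python integers are not limited to 8 bits, so a sketch of an 8-bit register has to apply a mask after shifting. The names BITS, MASK, shift_left and shift_right are assumptions made for this illustration.

```python
BITS = 8
MASK = (1 << BITS) - 1          # 1111 1111: keeps only the low 8 bits


def shift_left(value: int, places: int = 1) -> tuple[int, bool]:
    """Logical left shift in an 8-bit register; also report overflow."""
    shifted = value << places
    overflow = shifted > MASK   # a 1 was pushed out past the MSB
    return shifted & MASK, overflow


def shift_right(value: int, places: int = 1) -> int:
    """Logical right shift: zeros enter on the left, low bits are lost."""
    return (value & MASK) >> places


print(shift_left(25))           # 0001 1001 shifted left once
print(shift_left(0b11000000))   # the leading 1 is lost here
print(shift_right(25))          # integer division by 2
```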
1.4 Text Representation
Encoding | Size per character | Typical range
ASCII | 7 bits (stored as 8 bits) | Codes 0‑127 (English letters, digits, basic symbols)
Unicode (UTF‑8) | 1‑4 bytes | All world scripts; ASCII characters use 1 byte, other characters use 2‑4 bytes
Example – “IGCSE” in ASCII
5 characters × 1 byte = 5 bytes.
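The byte counts can be confirmed by encoding the text in Python. The sample characters in the second half are arbitrary picks to show 1, 2, 3 and 4-byte UTF-8 sequences.

```python
text = "IGCSE"

ascii_bytes = text.encode("ascii")   # one byte per character
codes = list(ascii_bytes)            # the ASCII code of each letter
print(codes, len(ascii_bytes))

# UTF-8 needs more bytes for characters outside the ASCII range.
sizes = {ch: len(ch.encode("utf-8")) for ch in ["A", "é", "€", "😀"]}
print(sizes)
```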
1.5 Image Representation (Bitmap)
Resolution : width × height (pixels).
Colour depth : bits per pixel (bpp). Common depths: 1, 8, 24, 32.
Uncompressed size: size (bits) = width × height × colour‑depth
Convert bits to bytes by dividing by 8, then to KiB and MiB by dividing by 1 024 at each step.
1.6 Sound Representation (PCM audio)
Sample‑rate (samples s⁻¹, e.g., 44 100 Hz).
Bit‑depth (bits per sample, e.g., 16‑bit).
Channels : 1 = mono, 2 = stereo.
Uncompressed size: size (bits) = sample‑rate × bit‑depth × channels × duration (s)
1.7 Compression
Lossless : original data can be recovered exactly (e.g., ZIP, RLE).
Lossy : some information is permanently discarded for higher reduction (e.g., JPEG, MP3).
Compression ratio R = S_original ÷ S_compressed.
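Run-length encoding, one of the lossless methods named above, can be sketched in a few lines. The function names and the (symbol, count) list are illustrative; real file formats store the runs in a compact binary layout.

```python
def rle_encode(data: str) -> list[tuple[str, int]]:
    """Collapse each run of repeated symbols into a (symbol, count) pair."""
    runs: list[tuple[str, int]] = []
    for symbol in data:
        if runs and runs[-1][0] == symbol:
            runs[-1] = (symbol, runs[-1][1] + 1)   # extend the current run
        else:
            runs.append((symbol, 1))               # start a new run
    return runs


def rle_decode(runs: list[tuple[str, int]]) -> str:
    """Rebuild the original data exactly, which is what makes RLE lossless."""
    return "".join(symbol * count for symbol, count in runs)


row = "WWWWBBBW"                 # e.g. one row of a black-and-white image
encoded = rle_encode(row)
print(encoded)
print(rle_decode(encoded) == row)
```

RLE only helps when the data contains long runs; on data with few repeats the encoded form can be larger than the original.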
2. Measuring Data Storage
2.1 Units
Unit (symbol) | Decimal (base 10) | Binary (IEC, base 2)
kilobyte (kB) | 1 000 bytes | ‑
kibibyte (KiB) | ‑ | 1 024 bytes
megabyte (MB) | 1 000 000 bytes | ‑
mebibyte (MiB) | ‑ | 1 048 576 bytes
gigabyte (GB) | 1 000 000 000 bytes | ‑
gibibyte (GiB) | ‑ | 1 073 741 824 bytes
terabyte (TB) | 1 000 000 000 000 bytes | ‑
tebibyte (TiB) | ‑ | 1 099 511 627 776 bytes
2.2 Converting between units
To convert upwards (bytes → KiB → MiB …) divide by 1 024 repeatedly.
To convert downwards (MiB → KiB → bytes) multiply by 1 024.
Example : 4 000 000 bytes ÷ 1 024 = 3 906.25 KiB; 3 906.25 KiB ÷ 1 024 ≈ 3.81 MiB.
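Repeated division by 1 024 is easy to automate. The helper below is a sketch; the name to_binary_units and the choice to stop at TiB are assumptions for this example.

```python
def to_binary_units(size_bytes: float) -> tuple[float, str]:
    """Divide by 1 024 until the value drops below 1 024 (or TiB is reached)."""
    units = ["bytes", "KiB", "MiB", "GiB", "TiB"]
    index = 0
    while size_bytes >= 1024 and index < len(units) - 1:
        size_bytes /= 1024
        index += 1
    return size_bytes, units[index]


value, unit = to_binary_units(4_000_000)
print(f"{value:.2f} {unit}")
```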
3. Calculating File Sizes
3.1 Integer data
32‑bit integer = 4 bytes.
For 1 000 000 integers:
size = 4 bytes × 1 000 000 = 4 000 000 bytes ≈ 3.81 MiB
3.2 Bitmap image (uncompressed)
Formula: size (bits) = width × height × colour‑depth
Example : 800 × 600 px, 24‑bit colour
Bits = 800 × 600 × 24 = 11 520 000 bits
Bytes = 11 520 000 ÷ 8 = 1 440 000 bytes
MiB = 1 440 000 ÷ 1 024 ÷ 1 024 ≈ 1.37 MiB
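The bitmap formula translates directly into code. The function name bitmap_size_bytes is illustrative, and the sketch ignores any file header or metadata that a real image format would add.

```python
def bitmap_size_bytes(width: int, height: int, colour_depth: int) -> float:
    """Uncompressed bitmap size: width x height x colour depth, in bytes."""
    bits = width * height * colour_depth
    return bits / 8


size = bitmap_size_bytes(800, 600, 24)
print(size, "bytes")
print(round(size / 1024 / 1024, 2), "MiB")
```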
3.3 PCM audio (uncompressed)
Formula: size (bits) = sample‑rate × bit‑depth × channels × duration (s)
Example : 3 min (180 s) mono, 44.1 kHz, 16‑bit
Bits = 44 100 × 16 × 1 × 180 = 127 008 000 bits
Bytes = 127 008 000 ÷ 8 = 15 876 000 bytes
MiB = 15 876 000 ÷ 1 024 ÷ 1 024 ≈ 15.14 MiB
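The same calculation as a short function. The name pcm_size_bytes is an assumption for this sketch, and the result covers the raw samples only, not any file header.

```python
def pcm_size_bytes(sample_rate: int, bit_depth: int,
                   channels: int, seconds: float) -> float:
    """Uncompressed PCM size: rate x depth x channels x duration, in bytes."""
    bits = sample_rate * bit_depth * channels * seconds
    return bits / 8


mono = pcm_size_bytes(44_100, 16, 1, 180)
stereo = pcm_size_bytes(44_100, 16, 2, 180)   # two channels double the size
print(mono, "bytes =", round(mono / 1024 / 1024, 2), "MiB")
```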
3.4 Text file (ASCII)
Each character = 1 byte.
Example : “Hello, world!” (13 characters)
Size = 13 bytes ≈ 0.013 kB
4. Compression
4.1 Why compress?
Save storage space – more files on a disk, USB stick, or cloud.
Reduce transmission time and bandwidth usage.
Lower cost of media and network resources.
Enable real‑time services (video calls, streaming).
4.2 Types of compression
Method | Lossless? | Typical use | Typical ratio
Run‑Length Encoding (RLE) | Yes | Simple graphics, fax | 5 : 1 – 10 : 1
ZIP / GZIP | Yes | Text, source code, spreadsheets | 2 : 1 – 5 : 1
JPEG | No | Photographic images | ≈ 10 : 1
MP3 / AAC | No | Music & speech | ≈ 12 : 1 (music)
4.3 Compression ratio
R = S_original ÷ S_compressed
Lossless example : 1 440 000 bytes bitmap → 180 000 bytes after RLE
R = 1 440 000 ÷ 180 000 = 8 (file is 8 times smaller).
Lossy example : 50 MB PCM song → 4 MB MP3
R = 50 ÷ 4 = 12.5 (file is 12.5 times smaller, with some quality loss).
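Both worked examples can be reproduced with one small function. The names compression_ratio and space_saved_percent are illustrative; the percentage saved is an extra way of stating the same result, not a separate syllabus formula.

```python
def compression_ratio(original: float, compressed: float) -> float:
    """R = S_original / S_compressed (both sizes in the same unit)."""
    return original / compressed


def space_saved_percent(original: float, compressed: float) -> float:
    """Share of the original size that compression removed."""
    return (1 - compressed / original) * 100


print(compression_ratio(1_440_000, 180_000))    # lossless RLE example
print(compression_ratio(50, 4))                 # lossy MP3 example
print(space_saved_percent(1_440_000, 180_000))  # percentage saved
```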
5. Data Transmission
5.1 Packet structure
Header : source/destination address, control bits.
Payload : the actual data.
Trailer : error‑checking information (e.g., CRC).
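The three parts of a packet can be sketched as a byte string. The layout below (1-byte source address, 1-byte destination address, 1-byte payload length, CRC-32 trailer) is a made-up toy format for illustration, not a real network protocol; zlib.crc32 is the standard-library CRC-32 function.

```python
import zlib

CRC_BYTES = 4  # a CRC-32 value fits in 4 bytes


def build_packet(source: int, destination: int, payload: bytes) -> bytes:
    """Assemble header + payload + trailer (hypothetical toy layout)."""
    if len(payload) > 255:
        raise ValueError("payload too long for the 1-byte length field")
    # Header: source address, destination address, payload length.
    header = bytes([source, destination, len(payload)])
    # Trailer: CRC-32 of everything that precedes it.
    trailer = zlib.crc32(header + payload).to_bytes(CRC_BYTES, "big")
    return header + payload + trailer


def packet_is_valid(packet: bytes) -> bool:
    """Recompute the CRC at the receiver and compare it with the trailer."""
    body, trailer = packet[:-CRC_BYTES], packet[-CRC_BYTES:]
    return zlib.crc32(body).to_bytes(CRC_BYTES, "big") == trailer


packet = build_packet(1, 2, b"hello")
print(packet_is_valid(packet))

damaged = bytearray(packet)
damaged[4] ^= 0b00000001            # flip one bit in the payload
print(packet_is_valid(bytes(damaged)))
```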
5.2 Switching methods
Circuit switching : dedicated path for the whole communication (e.g., traditional telephone).
Packet switching : data broken into packets that travel independently (used on the Internet).
5.3 USB (Universal Serial Bus)
Common wired transmission standard for peripherals.
Uses serial transmission. USB 2.0 is half‑duplex; USB 3.0 and later support full‑duplex transfer.
Typical speeds: USB 2.0 = 480 Mbit s⁻¹, USB 3.0 = 5 Gbit s⁻¹.
5.4 Error‑detection & correction
Technique | How it works | Typical use
Even / odd parity bit | Add one bit so the total number of 1s is even (or odd). | Simple serial links, early memory modules.
Checksum | Sum of data bytes; receiver recomputes and compares. | Internet Protocol (IP) headers.
CRC (Cyclic Redundancy Check) | Polynomial division; detects burst errors. | Ethernet, USB, storage media.
ARQ (Automatic Repeat reQuest) | Receiver acknowledges each correct packet; the sender retransmits if it gets a negative acknowledgement or no acknowledgement before a timeout. | TCP protocol.
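Two of the techniques in the table, even parity and a simple checksum, are sketched below. The function names, the choice to place the parity bit in the MSB of an 8-bit byte, and the modulo-256 checksum are assumptions for this illustration; real protocols define their own layouts.

```python
def add_even_parity(data7: int) -> int:
    """Attach an even-parity bit as the MSB of a byte holding 7 data bits."""
    parity = bin(data7).count("1") % 2      # 1 if the count of 1s is odd
    return (parity << 7) | data7


def parity_ok(byte: int) -> bool:
    """Receiver check: the total number of 1s must be even."""
    return bin(byte).count("1") % 2 == 0


def simple_checksum(data: bytes) -> int:
    """Sum of all bytes, kept to one byte by taking it modulo 256."""
    return sum(data) % 256


sent = add_even_parity(ord("C"))            # 'C' is 100 0011, three 1s
print(format(sent, "08b"), parity_ok(sent))

corrupted = sent ^ 0b00000100               # one bit flipped in transit
print(parity_ok(corrupted))

print(simple_checksum(b"IGCSE"))
```

A single parity bit misses any error that flips an even number of bits, which is one reason stronger checks such as CRC are used.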
5.5 Simple encryption (conceptual)
Data is transformed using an algorithm and a key.
Only someone with the correct key can reverse the process.
In the syllabus you need only know that encryption protects data during transmission.
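The idea of an algorithm plus a key can be illustrated with a toy XOR cipher. This is a teaching sketch only: it is not secure, it is not a method named in the syllabus, and the function name xor_cipher is an assumption.

```python
def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the key, repeating the key as needed.

    Applying the same function with the same key reverses the process.
    """
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


plaintext = b"IGCSE"
key = b"k"                                  # toy key for the example

ciphertext = xor_cipher(plaintext, key)     # unreadable without the key
recovered = xor_cipher(ciphertext, key)     # same key reverses it
print(ciphertext, recovered)
```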
6. Hardware Overview
6.1 CPU architecture & the fetch‑decode‑execute (FDE) cycle
Fetch : read the next instruction from memory (address held in the Program Counter).
Decode : interpret the opcode; identify required operands.
Execute : perform the operation (ALU, register transfer, memory access).
Diagram placeholder: labelled CPU block showing ALU, Control Unit, Registers, and the three FDE stages.
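The three stages can be simulated with a toy processor. Everything here is a simplified assumption for illustration: a single accumulator register, instructions stored as (opcode, operand) pairs, and a HALT instruction to stop the loop.

```python
def run(program: list[tuple[str, int]], memory: dict[int, int]) -> int:
    """Run a toy program and return the final accumulator value."""
    pc = 0      # Program Counter: address of the next instruction
    acc = 0     # accumulator register
    while True:
        instruction = program[pc]       # FETCH the instruction at PC
        pc += 1                         # PC now points to the next one
        opcode, operand = instruction   # DECODE into opcode and operand
        if opcode == "LOAD":            # EXECUTE the operation
            acc = memory[operand]
        elif opcode == "ADD":
            acc += memory[operand]
        elif opcode == "STORE":
            memory[operand] = acc
        elif opcode == "HALT":
            return acc
        else:
            raise ValueError(f"unknown opcode: {opcode}")


memory = {0: 5, 1: 7, 2: 0}
program = [("LOAD", 0), ("ADD", 1), ("STORE", 2), ("HALT", 0)]
print(run(program, memory), memory)
```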
6.2 Cores, cache & clock speed
Core : an independent processing unit; modern CPUs have multiple cores.
Cache : small, fast memory (L1, L2, L3) that stores frequently used data/instructions.
Clock speed (MHz or GHz): how many cycles per second; higher speed → more instructions per second, all else equal.
6.3 Instruction set
Defines the operations a CPU can perform (e.g., ADD, SUB, LOAD, STORE, JUMP).
6.4 Embedded systems & I/O devices
Embedded system : a computer built into another device (e.g., microwave, car engine controller).
I/O devices : input (keyboard, mouse, sensor) and output (monitor, speaker, actuator).
6.5 Storage hierarchy
Level | Typical device | Typical capacity | Typical speed
Primary | RAM (DRAM) | 4 GB – 32 GB | nanoseconds
Secondary | HDD / SSD / Optical disc | 500 GB – 4 TB | micro‑ to milliseconds
Tertiary | Magnetic tape, Cloud storage | TB‑PB | seconds‑minutes (network latency)
6.6 Virtual memory
Technique that uses part of secondary storage (often called a “page file” or “swap space”) to extend the apparent size of RAM.
Pages are moved between RAM and disk as needed.
6.7 Cloud storage
Data kept on remote servers accessed via the Internet.
Advantages: accessibility, redundancy, scalability.
Disadvantages: dependence on a network connection and on trust in the provider.
6.8 Network hardware
NIC (Network Interface Card) : provides a physical connection to a network; has a unique MAC address (48 bits, usually written as 12 hexadecimal digits).
IP addressing : IPv4 (32‑bit) vs. IPv6 (128‑bit); used for logical routing.
Router : forwards packets between networks, uses IP addresses to determine the best path.
7. Software Overview
7.1 System software vs. application software
System software : operating system (OS) and utility programs that manage hardware resources.
Application software : programs that perform specific tasks for the user (e.g., word processor, web browser, games).
7.2 Core OS functions (relevant to the syllabus)
File management : create, store, retrieve, delete files; maintain directories.
Memory management : allocate RAM to processes, implement virtual memory.
Process management : start, schedule, and terminate programs.
Device control : drivers translate OS commands to hardware actions.
Security : user authentication, permissions, basic encryption support.
8. Summary
Data is represented in binary; hexadecimal provides a convenient shorthand.
Two’s‑complement and logical shifts are essential for integer arithmetic.
Text, images, and sound each have specific storage formulas; use the binary (IEC) units required by the syllabus.
Compression (lossless vs. lossy) reduces storage and transmission requirements; the compression ratio quantifies the reduction.
Data transmission relies on packets, switching, USB, and error‑detection methods such as parity, checksum, CRC, and ARQ.
Understanding CPU operation, cores, cache, instruction sets, and the storage hierarchy links hardware to the data concepts above.
System software manages hardware resources, while application software provides user‑oriented functionality.
All these ideas together explain how modern computers store, process, and move information efficiently.
Suggested diagram: (a) a file before and after compression showing size reduction; (b) the fetch‑decode‑execute cycle inside a CPU; (c) a simple packet with header, payload, trailer.