1. Data Representation – Overview (AS 1.1)
Learning Objectives (AO1 & AO2)
- AO1 – Knowledge: Define bits, bytes and the main binary magnitudes; distinguish decimal and binary prefixes; recall the basic number systems (binary, decimal, hexadecimal, BCD) and common character encodings (ASCII, extended ASCII, Unicode/UTF‑8).
- AO2 – Analysis: Convert between the above number systems; perform binary addition and two’s‑complement subtraction, identifying overflow; explain why storage‑device manufacturers use decimal prefixes while operating systems use binary prefixes.
2. Binary Magnitudes & Prefixes
The smallest unit of information is the bit. Eight bits form a byte. Larger quantities are expressed with either:
- Decimal prefixes – based on powers of 10 (kilo, mega, giga, tera …).
- Binary prefixes – based on powers of 2 (kibi, mebi, gibi, tebi …). The International Electrotechnical Commission (IEC) recommends these to avoid confusion.
Quick‑Reference Box – Bit‑level values
| Prefix (type) | Symbol | Bits per unit (kbit, Kibit, …) | Bytes per unit (kB, KiB, …) |
|---|---|---|---|
| kilo (decimal) | k | 10³ = 1 000 | 10³ = 1 000 |
| kibi (binary) | Ki | 2¹⁰ = 1 024 | 2¹⁰ = 1 024 |
| mega (decimal) | M | 10⁶ = 1 000 000 | 10⁶ = 1 000 000 |
| mebi (binary) | Mi | 2²⁰ = 1 048 576 | 2²⁰ = 1 048 576 |
| giga (decimal) | G | 10⁹ = 1 000 000 000 | 10⁹ = 1 000 000 000 |
| gibi (binary) | Gi | 2³⁰ = 1 073 741 824 | 2³⁰ = 1 073 741 824 |
| tera (decimal) | T | 10¹² = 1 000 000 000 000 | 10¹² = 1 000 000 000 000 |
| tebi (binary) | Ti | 2⁴⁰ = 1 099 511 627 776 | 2⁴⁰ = 1 099 511 627 776 |
Why the Distinction Matters (AO2)
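- Manufacturers quote storage capacities using decimal prefixes (kB, MB, GB, …). These follow the SI definitions and also give a larger headline figure for the same number of bytes.
- Many operating systems and software tools calculate sizes in binary multiples (KiB, MiB, GiB, …) because memory is addressed in powers of two. Some of them, Windows for example, still label the result KB, MB or GB, which adds to the confusion.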
- This mismatch creates the impression of “missing” space. Example:
\[
\begin{aligned}
500\ \text{GB (decimal)} &= 500\times10^{9}\ \text{bytes}\\
&= \frac{500\times10^{9}}{2^{30}}\ \text{GiB}\\
&\approx 465\ \text{GiB (binary)}.
\end{aligned}
\]
Conversion Example – 2 MiB
- Binary value: \(2\ \text{MiB}=2\times2^{20}\ \text{bytes}=2\,097\,152\ \text{bytes}\).
- To kibibytes (binary): \(\displaystyle\frac{2\,097\,152}{2^{10}}=2\,048\ \text{KiB}\).
- To kilobytes (decimal): \(\displaystyle\frac{2\,097\,152}{10^{3}}=2\,097.152\ \text{kB}\).
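The two worked examples above can be checked with a short Python sketch. The dictionary names and the `convert` helper are illustrative choices, not part of the syllabus; the multipliers come from the quick‑reference table.

```python
# Multipliers taken from the quick-reference table (values are bytes per unit).
DECIMAL = {"kB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}
BINARY = {"KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}
UNITS = {**DECIMAL, **BINARY}


def convert(value, from_unit, to_unit):
    """Convert a size between any two units by going through bytes."""
    size_in_bytes = value * UNITS[from_unit]
    return size_in_bytes / UNITS[to_unit]


print(round(convert(500, "GB", "GiB"), 2))  # 465.66
print(convert(2, "MiB", "KiB"))             # 2048.0
print(convert(2, "MiB", "kB"))              # 2097.152
```

Going through bytes as a common unit means only one multiplier per prefix has to be remembered.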
3. Number‑System Conversions
3.1 Binary ⇄ Decimal
To convert binary → decimal, multiply each bit by \(2^{n}\) where \(n\) counts from 0 at the right‑most position, then sum.
Example: \(101101_2\)
\[
1\cdot2^{5}+0\cdot2^{4}+1\cdot2^{3}+1\cdot2^{2}+0\cdot2^{1}+1\cdot2^{0}=45_{10}.
\]
Decimal → binary is performed by successive division by 2, recording the remainders (or by using known powers of 2).
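Both directions can be sketched in Python as follows. The function names are illustrative, and the sketch assumes non‑negative integers only.

```python
def binary_to_decimal(bits: str) -> int:
    """Sum bit * 2**n, with n = 0 at the right-most position."""
    total = 0
    for n, bit in enumerate(reversed(bits)):
        total += int(bit) * 2**n
    return total


def decimal_to_binary(value: int) -> str:
    """Divide by 2 repeatedly; the remainders, read last to first, are the bits."""
    if value == 0:
        return "0"
    remainders = []
    while value > 0:
        value, remainder = divmod(value, 2)
        remainders.append(str(remainder))
    return "".join(reversed(remainders))


print(binary_to_decimal("101101"))  # 45
print(decimal_to_binary(45))        # 101101
```

The first remainder produced is the least significant bit, which is why the list is reversed before joining.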
3.2 Binary ⇄ Hexadecimal
Hexadecimal (base 16) uses digits 0‑9 and letters A‑F. Group binary digits in sets of four (starting from the right) and replace each quartet with its hex equivalent.
Example: Convert \(1101101_2\) to hexadecimal.
- Pad to a multiple of four bits: 0110 1101.
- Separate: 0110 = 6, 1101 = D.
- Result: \(6D_{16}\) (written 0x6D in many programming languages).
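The grouping method can be sketched in Python. The helper names and the `HEX_DIGITS` lookup string are illustrative assumptions; Python's built‑in `hex()` would give the same digits, but the sketch follows the nibble‑by‑nibble steps described above.

```python
HEX_DIGITS = "0123456789ABCDEF"


def binary_to_hex(bits: str) -> str:
    """Pad on the left to a multiple of four bits, then map each nibble to one hex digit."""
    padded_length = -(-len(bits) // 4) * 4  # round up to a multiple of 4
    padded = bits.zfill(padded_length)
    nibbles = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    return "".join(HEX_DIGITS[int(nibble, 2)] for nibble in nibbles)


def hex_to_binary(digits: str) -> str:
    """Expand each hex digit into its 4-bit pattern."""
    return " ".join(format(int(d, 16), "04b") for d in digits)


print(binary_to_hex("1101101"))  # 6D
print(hex_to_binary("6D"))       # 0110 1101
```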
3.3 Binary‑Coded Decimal (BCD)
BCD stores each decimal digit as a separate 4‑bit binary nibble.
- Encode 45₁₀: 4 → 0100, 5 → 0101 → 0100 0101₂.
- Encode 73₁₀: 7 → 0111, 3 → 0011 → 0111 0011₂.
- Decoding is the reverse: split the binary string into nibbles and translate each back to its decimal digit.
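A minimal Python sketch of BCD encoding and decoding, assuming non‑negative integers and one digit per nibble (the function names are illustrative):

```python
def to_bcd(value: int) -> str:
    """Encode each decimal digit of a non-negative integer as its own 4-bit nibble."""
    return " ".join(format(int(digit), "04b") for digit in str(value))


def from_bcd(bcd: str) -> int:
    """Split into nibbles and translate each back to a decimal digit."""
    bits = bcd.replace(" ", "")
    digits = []
    for i in range(0, len(bits), 4):
        digit = int(bits[i:i + 4], 2)
        if digit > 9:  # 1010 to 1111 never appear in valid BCD
            raise ValueError(f"{bits[i:i + 4]} is not a valid BCD digit")
        digits.append(str(digit))
    return int("".join(digits))


print(to_bcd(45))             # 0100 0101
print(from_bcd("0100 0101"))  # 45
```

Note that 0100 0101 read as plain binary would be 69, not 45, so a bit pattern must always be labelled as BCD or pure binary.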
Practice Questions (AO1)
- Convert \(1101101_2\) to hexadecimal.
- Express \(73_{10}\) in BCD (show both encoding and decoding).
- Convert \(2F_{16}\) to binary and then to decimal.
4. Binary Arithmetic
4.1 Addition (8‑bit example)
| | Binary | Decimal |
|---|---|---|
| | 10110101₂ | 181 |
| + | 01101011₂ | 107 |
| = | 1 0010 0000₂ (9 bits) | 288 |
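The left‑most “1” is a carry‑out. In an 8‑bit unsigned operation this carry is discarded, leaving 0010 0000₂ = 32₁₀ instead of the true sum 288₁₀. Because a carry‑out occurred, the hardware carry flag would be set, signalling unsigned overflow. Read as signed two’s‑complement values the same addition is −75 + 107 = 32, which fits in 8 bits, so there is no signed overflow.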
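The 8‑bit addition can be reproduced with a small Python sketch. The function name and the `width` parameter are illustrative; Python integers are unbounded, so the sketch masks the sum by hand to imitate a fixed‑width register.

```python
def add_unsigned(a: int, b: int, width: int = 8):
    """Add two unsigned values and split the sum into (kept bits, carry-out)."""
    full_sum = a + b
    mask = (1 << width) - 1        # 0b11111111 for width 8
    result = full_sum & mask       # keep only the low `width` bits
    carry_out = full_sum >> width  # 1 if an extra bit was produced, else 0
    return result, carry_out


result, carry = add_unsigned(0b10110101, 0b01101011)
print(format(result, "08b"), result, carry)  # 00100000 32 1
```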
4.2 Two’s‑Complement Subtraction
To compute \(A - B\) (both 8‑bit signed numbers):
- Write \(B\) in binary.
- Invert every bit (one’s complement).
- Add 1 → two’s complement of \(B\).
- Add this value to \(A\). Discard any final carry‑out; the result is the signed difference.
Example: \(A = 01001100_2\) (76₁₀) and \(B = 00110101_2\) (53₁₀).
- One’s complement of \(B\): 11001010₂.
- Two’s complement of \(B\): 11001011₂ (add 1).
- Add to \(A\):

| | Binary |
|---|---|
| | 01001100₂ |
| + | 11001011₂ |
| = | 1 00010111₂ |

- Discard the carry‑out (the leading ‘1’). The 8‑bit result is 00010111₂ = 23₁₀, which matches \(76-53\).
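The same steps can be sketched in Python. The helper names are illustrative, and the functions return raw 8‑bit patterns, so a result with the top bit set would still need to be read as a negative two’s‑complement value.

```python
def twos_complement(value: int, width: int = 8) -> int:
    """Invert every bit, then add 1, keeping only `width` bits."""
    mask = (1 << width) - 1
    ones_complement = value ^ mask
    return (ones_complement + 1) & mask


def subtract(a: int, b: int, width: int = 8) -> int:
    """Compute a - b by adding the two's complement of b and discarding the carry-out."""
    mask = (1 << width) - 1
    return (a + twos_complement(b, width)) & mask


negated_b = twos_complement(0b00110101)
difference = subtract(0b01001100, 0b00110101)
print(format(negated_b, "08b"))               # 11001011
print(format(difference, "08b"), difference)  # 00010111 23
```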
4.3 Carry‑Out vs. Overflow (AO2)
| Operation | Carry‑out | Overflow (signed) | Interpretation |
|---|---|---|---|
| Unsigned addition | Yes → result wraps modulo 2ⁿ | N/A | Carry‑out indicates loss of the most‑significant bit. |
| Signed addition (two’s complement) | May occur without error | Yes if the carry into the sign‑bit ≠ carry out of the sign‑bit | Overflow means the mathematical result cannot be represented in the given bit‑width. |
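The rule in the table (signed overflow when the carry into the sign bit differs from the carry out of it) can be sketched in Python. The function name is an illustrative choice, and the sketch assumes both inputs are already valid `width`‑bit patterns.

```python
def add_with_flags(a: int, b: int, width: int = 8):
    """Return (result, carry_out, signed_overflow) for a width-bit addition."""
    mask = (1 << width) - 1
    low_mask = mask >> 1                    # every bit below the sign bit
    result = (a + b) & mask
    carry_out = (a + b) >> width            # carry out of the sign bit
    carry_in = ((a & low_mask) + (b & low_mask)) >> (width - 1)  # carry into the sign bit
    overflow = carry_in != carry_out
    return result, carry_out, overflow


print(add_with_flags(0b01111111, 0b00000001))  # (128, 0, True)
print(add_with_flags(0b11111111, 0b00000001))  # (0, 1, False)
```

The first call is 127 + 1 as signed values: no carry‑out, yet the result pattern 1000 0000 reads as −128, so signed overflow occurs. The second is −1 + 1: a carry‑out is produced, but the signed result 0 is correct.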
Practice Question (AO2)
Add 10110101₂ and 01101011₂. State whether overflow occurs for unsigned and for signed 8‑bit numbers.
5. Character Encoding
5.1 ASCII (7‑bit)
ASCII defines 128 characters (codes 0‑127), of which 0‑31 and 127 are control codes. A few representative characters are shown below.
| Char | Decimal | Hex | Binary |
|---|---|---|---|
| A | 65 | 0x41 | 0100 0001₂ |
| a | 97 | 0x61 | 0110 0001₂ |
| 0 | 48 | 0x30 | 0011 0000₂ |
| Space | 32 | 0x20 | 0010 0000₂ |
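The rows of this table can be regenerated with Python's built‑in `ord` and `chr`. The `ascii_row` helper is an illustrative name, not a standard function.

```python
def ascii_row(char: str):
    """Return (decimal, hex, binary) for a single 7-bit ASCII character."""
    code = ord(char)  # character -> code number
    if code > 127:
        raise ValueError("not a 7-bit ASCII character")
    return code, format(code, "#04x"), format(code, "08b")


for char in "Aa0 ":
    print(repr(char), *ascii_row(char))

print(chr(65))              # code number -> character: A
print(ord("a") - ord("A"))  # 32: upper and lower case differ only in bit 5
```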
5.2 Extended ASCII (8‑bit)
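Uses the high‑order bit to add another 128 characters (e.g., accented letters, box‑drawing symbols). The exact set depends on the code page (ISO‑8859‑1, Windows‑1252, etc.). Example (Windows‑1252): € = 0x80 = 1000 0000₂. In ISO‑8859‑1 the same byte is a control code rather than a printable character, so the code page must be known before the byte can be interpreted.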
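How the code page changes the meaning of a byte can be seen with Python's built‑in codecs. This is a small sketch: `cp1252` and `latin-1` are Python's codec names for Windows‑1252 and ISO‑8859‑1.

```python
byte = bytes([0x80])

print(byte.decode("cp1252"))         # € (Windows-1252 places the euro sign at 0x80)
print(repr(byte.decode("latin-1")))  # '\x80' (ISO-8859-1 treats 0x80 as a control code)

# For codes below 0x80 both code pages agree with ASCII.
print(bytes([0x41]).decode("cp1252"), bytes([0x41]).decode("latin-1"))  # A A
```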
5.3 Unicode & UTF‑8
Unicode assigns a unique code point (U+0000 … U+10FFFF) to every character. UTF‑8 is a variable‑length encoding that uses 1–4 bytes:
- 1‑byte for U+0000 – U+007F (identical to ASCII).
- 2‑byte for U+0080 – U+07FF: 110xxxxx 10xxxxxx.
- 3‑byte for U+0800 – U+FFFF: 1110xxxx 10xxxxxx 10xxxxxx.
- 4‑byte for U+10000 – U+10FFFF: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx.
All continuation bytes start with 10, which makes UTF‑8 self‑synchronising.
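The bit patterns above can be turned into a hand‑written encoder. This is a teaching sketch rather than a replacement for Python's own `str.encode`: the function name is illustrative, it does not reject surrogate code points (U+D800 – U+DFFF), and `bytes.hex(" ")` assumes Python 3.8 or later.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one Unicode code point by filling the x positions of the bit patterns."""
    if code_point <= 0x7F:
        return bytes([code_point])
    if code_point <= 0x7FF:
        return bytes([
            0b11000000 | (code_point >> 6),
            0b10000000 | (code_point & 0b111111),
        ])
    if code_point <= 0xFFFF:
        return bytes([
            0b11100000 | (code_point >> 12),
            0b10000000 | ((code_point >> 6) & 0b111111),
            0b10000000 | (code_point & 0b111111),
        ])
    if code_point <= 0x10FFFF:
        return bytes([
            0b11110000 | (code_point >> 18),
            0b10000000 | ((code_point >> 12) & 0b111111),
            0b10000000 | ((code_point >> 6) & 0b111111),
            0b10000000 | (code_point & 0b111111),
        ])
    raise ValueError("code point outside U+0000 to U+10FFFF")


for char in ["m", "ñ", "€", "😀"]:
    encoded = utf8_encode(ord(char))
    assert encoded == char.encode("utf-8")  # cross-check against the built-in encoder
    print(char, encoded.hex(" ").upper(), " ".join(format(b, "08b") for b in encoded))
```

The four sample characters were chosen to exercise the 1‑, 2‑, 3‑ and 4‑byte forms in turn.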
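Encoding Example – “Cam”
| Char | Code point | UTF‑8 hex | UTF‑8 binary |
|---|---|---|---|
| C | U+0043 | 0x43 | 0100 0011₂ |
| a | U+0061 | 0x61 | 0110 0001₂ |
| m | U+006D | 0x6D | 0110 1101₂ |

All three code points lie in U+0000 – U+007F, so each occupies one byte and the UTF‑8 bytes are identical to the ASCII codes: 43 61 6D.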
Practice Task (AO1)
Encode the following characters:
- ‘£’ (U+00A3) – give binary UTF‑8 and hexadecimal.
- ‘Ω’ (U+03A9) – give binary UTF‑8 and hexadecimal.
6. Integrated Practice (AO1 + AO2)
- Prefix conversion: Convert (a) 3 GiB (binary prefix) and (b) 3 GB (decimal prefix) to bits.
  - Binary: 3 GiB = \(3\times2^{30}\) bytes = \(3\times2^{33}\) bits = 25 769 803 776 bits.
  - Decimal: 3 GB = \(3\times10^{9}\) bytes = \(24\times10^{9}\) bits = 24 000 000 000 bits.
- Number‑system chain: Convert 0x2F → binary → decimal → BCD.
- Arithmetic check: Subtract 53 from 76 using two’s complement (show each step) and verify the result.
- Encoding challenge: Write the word “Café” in UTF‑8 hexadecimal, indicating which bytes belong to each character.
7. Summary Checklist (AO1)
- Recall bit‑level values for decimal (k, M, G, T) and binary (Ki, Mi, Gi, Ti) prefixes.
- Convert between binary, decimal, hexadecimal and BCD (both directions).
- Perform binary addition and two’s‑complement subtraction; recognise when overflow occurs.
- Identify ASCII, extended ASCII and Unicode code points; encode characters in binary and UTF‑8.
- State explicitly whether a size or data‑rate figure uses a decimal or binary prefix.
8. Suggested Diagram (for teacher’s use)
Bar chart (not displayed) comparing the magnitude of each prefix up to tera‑level. Use blue bars for decimal prefixes (k, M, G, T) and green bars for binary prefixes (Ki, Mi, Gi, Ti). Include a small inset showing the “missing” space when a 500 GB drive is displayed as ~465 GiB.