AO2 – Analysis: evaluate alternative solutions, justify design choices.
AO3 – Design & Development: produce algorithms, write/pseudocode, test and debug.
13 Floating‑Point Numbers – Representation and Manipulation
Learning Objectives (AO1‑AO3)
Explain why integer representation is insufficient for many real‑world values.
Convert between binary floating‑point (IEEE 754) and decimal (denary) forms.
Analyse the impact of rounding, precision loss and special patterns on program correctness (AO2).
Design and implement a routine (pseudocode or a chosen language) that performs the conversion in both directions, handling normalised, denormalised, zero, infinity and NaN cases (AO3).
A very large dynamic range (≈10⁻³⁸ … 10³⁸ for single precision).
Predictable storage size (fixed number of bits).
Floating‑point notation satisfies these needs by separating a significand (the “mantissa”) from an exponent, analogous to scientific notation in decimal.
Convert the absolute value to binary scientific notation:
Find 1.f × 2ᴱ where 1 ≤ 1.f < 2.
Biased exponent – e = E + 127; write e as an 8‑bit binary number.
Fraction field – take the bits after the binary point of 1.f and fill the 23‑bit mantissa (pad with zeros if fewer bits).
Combine – s | e | f gives the 32‑bit pattern.
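The s | e | f layout can be checked against what a machine actually stores. The sketch below is only an illustration: it assumes Python's standard struct module, and the helper name float32_fields is made up for these notes. It packs a value as single precision and splits the 32 bits into the three fields.

```python
import struct

def float32_fields(x):
    """Return the (sign, exponent, fraction) bit strings of x stored as float32."""
    # Pack as big-endian single precision, then reread the same 4 bytes as an unsigned int.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    pattern = format(bits, "032b")
    # 1 sign bit, 8 exponent bits, 23 fraction bits
    return pattern[0], pattern[1:9], pattern[9:]

s, e, f = float32_fields(6.5)   # 6.5 = 110.1 in binary = 1.101 x 2^2, so e = 2 + 127 = 129
print(s, e, f)                  # 0 10000001 10100000000000000000000
```

Working a value by hand first and then comparing it with this output is a quick way to catch a wrong bias or a misplaced leading 1.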
Rounding
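If more than 23 fraction bits are required, round to the nearest representable value; when the value lies exactly halfway between two candidates, choose the one whose last fraction bit is 0 (round‑to‑nearest, ties‑to‑even, the default IEEE 754 rounding mode). Other modes (toward 0, toward +∞, toward –∞) may be required for specialised applications.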
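A small experiment makes the tie rule visible. This is a sketch that assumes Python's struct module performs the double to single conversion in the default rounding mode; the helper name to_float32 is hypothetical.

```python
import struct

def to_float32(x):
    """Round a Python float (a double) to the nearest float32 and return it as a float."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

# Between 2^24 and 2^25 consecutive float32 values are 2 apart,
# so every odd integer lies exactly halfway between two representable values.
print(to_float32(16777217.0))  # 16777216.0 (tie: lower neighbour has the even fraction)
print(to_float32(16777219.0))  # 16777220.0 (tie: upper neighbour has the even fraction)
```

Both inputs are exact ties, yet one rounds down and the other rounds up. Always rounding ties the same way would introduce a systematic drift over many operations, which is what ties‑to‑even avoids.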
Example 2 – Decimal to Binary
Convert -0.15625 to IEEE 754 single‑precision.
Sign = 1.
Binary fraction of 0.15625:
0.15625 × 2 = 0.3125 → 0
0.3125 × 2 = 0.625 → 0
0.625 × 2 = 1.25 → 1 (subtract 1 → 0.25)
0.25 × 2 = 0.5 → 0
0.5 × 2 = 1.0 → 1 (terminates)
Result: 0.00101₂.
Normalise: 0.00101₂ = 1.01₂ × 2⁻³ → E = –3.
Biased exponent: e = –3 + 127 = 124 = 01111100₂.
Fraction bits after the leading 1: 01 → pad to 23 bits:
01000000000000000000000.
Combined pattern:
1 01111100 01000000000000000000000
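The result can be checked by decoding the pattern back to decimal with value = (–1)^s × (1 + f / 2²³) × 2^(e – 127). The sketch below covers normalised patterns only, and the function name decode_float32 is an assumption made for illustration.

```python
def decode_float32(pattern):
    """Decode a normalised single-precision pattern written as 's e f' text."""
    bits = pattern.replace(" ", "")
    sign = int(bits[0])
    e = int(bits[1:9], 2)          # biased exponent
    fraction = int(bits[9:], 2)    # 23-bit fraction field as an integer
    if e == 0 or e == 255:
        raise ValueError("zero, denormalised, infinity and NaN are not handled here")
    significand = 1 + fraction / 2**23   # restore the hidden leading 1
    return (-1) ** sign * significand * 2.0 ** (e - 127)

print(decode_float32("1 01111100 01000000000000000000000"))  # -0.15625
```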
Pseudocode for Decimal → IEEE 754 (Single‑Precision)
FUNCTION floatToIEEE754(x):
    // Handles zero and normalised values only; infinity, NaN and
    // denormalised values (|x| < 2^-126) need separate cases.
    IF x = 0:
        RETURN "0 00000000 00000000000000000000000"
    ENDIF
    sign ← 0 IF x >= 0 ELSE 1
    a ← ABS(x)
    // 1. Convert integer part to binary
    intPart ← FLOOR(a)
    intBin ← binaryString(intPart)   // e.g. 13 → "1101"
    // 2. Convert fractional part to binary
    //    150 bits reach the smallest normalised value (2^-126) and still
    //    leave 24 significant bits; 30 bits would fail for small values.
    frac ← a - intPart
    fracBin ← ""
    REPEAT 150 TIMES:
        frac ← frac * 2
        IF frac >= 1:
            fracBin ← fracBin + "1"
            frac ← frac - 1
        ELSE
            fracBin ← fracBin + "0"
        ENDIF
    END REPEAT
    // 3. Normalise
    //    SUBSTRING(s, k) returns s from 0‑based index k onwards;
    //    POSITION returns a 1‑based index.
    IF intPart >= 1:                 // number ≥ 1
        E ← LENGTH(intBin) - 1
        mantissaBits ← SUBSTRING(intBin, 1) + fracBin   // drop leading 1
    ELSE                             // 0 < a < 1
        firstOne ← POSITION("1", fracBin)
        E ← -firstOne
        mantissaBits ← SUBSTRING(fracBin, firstOne)     // bits after first 1
    ENDIF
    // 4. Biased exponent
    e ← E + 127
    expBits ← toBinary(e, 8)         // pad to 8 bits
    // 5. Fraction field (23 bits, round to nearest, ties to even)
    //    If rounding carries out of the field (1.11…1 → 10.00…0),
    //    the fraction becomes all zeros and e must be increased by 1.
    fraction ← ROUNDTOEVEN(mantissaBits, 23)
    RETURN sign + " " + expBits + " " + fraction
END FUNCTION
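For comparison, here is one possible runnable version in Python. It is a sketch rather than a line-by-line translation of the pseudocode above: instead of building bit strings it uses exact rational arithmetic (the standard fractions module), which makes ties‑to‑even rounding straightforward. It also adds the denormalised, infinity and NaN cases named in the learning objectives. The function name encode_float32 and the particular NaN pattern returned are assumptions.

```python
from fractions import Fraction
import math

def encode_float32(x):
    """Return 's eeeeeeee fff...' for x, rounding to nearest with ties to even."""
    sign = "1" if math.copysign(1.0, x) < 0 else "0"
    if math.isnan(x):
        return "0 11111111 1" + "0" * 22      # one of many valid NaN patterns
    if math.isinf(x):
        return sign + " 11111111 " + "0" * 23
    a = abs(Fraction(x))                      # exact value of the input
    if a == 0:
        return sign + " 00000000 " + "0" * 23
    # Find E with 2^E <= a < 2^(E+1).
    E = a.numerator.bit_length() - a.denominator.bit_length()
    if a < Fraction(2) ** E:
        E -= 1
    E = max(E, -126)                          # clamp: denormalised values share E = -126
    # Scale so that one unit equals one step of the 23-bit fraction field.
    m = round(a / Fraction(2) ** (E - 23))    # round() on a Fraction sends ties to even
    if m >= 2 ** 24:                          # rounding carried into a new leading bit
        m //= 2
        E += 1
    if m < 2 ** 23:                           # no hidden 1: denormalised (or rounds to zero)
        e = 0
    else:
        e = E + 127
        m -= 2 ** 23                          # remove the hidden leading 1
    if e >= 255:                              # too large for single precision
        return sign + " 11111111 " + "0" * 23
    return f"{sign} {e:08b} {m:023b}"

print(encode_float32(-0.15625))  # 1 01111100 01000000000000000000000
```

Working with exact fractions avoids the question of how many fractional bits to generate, at the cost of hiding the bit-by-bit process that the pseudocode shows.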
Precision, Rounding Errors and Algorithmic Implications (AO2)
Relative error – \(\displaystyle \frac{|V_{\text{exact}}-V_{\text{float}}|}{|V_{\text{exact}}|}\). Errors accumulate when many operations are chained.
Catastrophic cancellation – subtracting two nearly equal numbers can erase significant digits, producing a large relative error.
Best practice – avoid unnecessary subtraction of close values; use scaling, higher precision (double‑precision) or arbitrary‑precision libraries when required.
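Both effects are easy to reproduce. The sketch below uses Python floats, which are double precision, so the same behaviour appears sooner in single precision. The helper names accumulate and relative_error are hypothetical.

```python
import math

def accumulate(step, n):
    """Add step to a running total n times, rounding after every addition."""
    total = 0.0
    for _ in range(n):
        total += step
    return total

def relative_error(exact, approx):
    return abs(exact - approx) / abs(exact)

# Accumulated error: 0.1 has no exact binary representation.
print(accumulate(0.1, 10) == 1.0)       # False
print(math.fsum([0.1] * 10) == 1.0)     # True: fsum compensates for the lost low-order bits

# Cancellation: 1e-16 is absorbed when added to 1.0, and the subtraction
# of two nearly equal values then exposes the loss.
computed = (1.0 + 1e-16) - 1.0
print(computed, relative_error(1e-16, computed))   # 0.0 1.0
```

A relative error of 1.0 means every significant digit of the true answer has been lost, even though each individual operation was correctly rounded.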
Extension: IEEE 754 Double‑Precision (64 bits)
For A‑Level work that demands greater accuracy, the double‑precision format uses:
Field      Bits   Bias
Sign       1      —
Exponent   11     1023
Fraction   52     —
The conversion steps are identical; only the exponent width, bias and mantissa length change.
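As a quick check, the value from Example 2 can be split into its double‑precision fields. This sketch assumes Python's struct module; the helper name float64_fields is made up for illustration.

```python
import struct

def float64_fields(x):
    """Return the (sign, exponent, fraction) bit strings of x stored as a 64-bit double."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    pattern = format(bits, "064b")
    # 1 sign bit, 11 exponent bits, 52 fraction bits
    return pattern[0], pattern[1:12], pattern[12:]

s, e, f = float64_fields(-0.15625)
print(s, e, int(e, 2) - 1023)   # 1 01111111100 -3
print(f[:8], len(f))            # 01000000 52
```

The true exponent (–3) and the leading fraction bits (01) match the single‑precision result; only the bias and field widths differ.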
Practice Questions (AO1‑AO3)
Convert the binary pattern 0 10000010 01000000000000000000000 to decimal. Show each step and comment on any rounding.
Express the decimal number 12.75 as a 32‑bit IEEE 754 binary pattern. Write pseudocode that performs the conversion and test it with this value.
What decimal value does the pattern 1 11111111 00000000000000000000000 represent? Explain why this pattern is special.
Design a simple calculator routine that adds two single‑precision numbers entered by a user. Discuss how rounding might affect the result and propose a method to detect overflow.
Suggested Diagram
Figure: 32‑bit IEEE 754 single‑precision word.
Further Reading & Resources
Cambridge International AS & A Level Computer Science (9618) – Specification, Sections 13.1‑13.4.
IEEE 754‑2008 Standard – summary of rounding modes and special values.
“Numerical Analysis” (chapter on floating‑point arithmetic) – for deeper insight into error propagation.
Quick‑Look Checklist (use while comparing lecture notes to the syllabus)
Syllabus section: 1 Information representation (binary, BCD, hexadecimal, ASCII/Unicode, two’s‑/one’s‑complement, overflow)
  What to verify in the notes:
    Binary‑base conversions shown step‑by‑step.
    Clear distinction between kibi/kilo, mebi/mega.
    Example of BCD ↔ decimal (e.g. digital clock).
    Unicode code‑point illustration.
    Overflow example (e.g. 8‑bit addition 200 + 100).
  Typical gaps & how to fix them:
    Students often see only decimal↔binary – add a short “BCD ↔ decimal” table.
    Include a diagram of overflow with carry‑out discarded.
Syllabus section: 1.2 Multimedia graphics (bitmap vs. vector, colour depth, resolution)
  What to verify in the notes:
    Bitmap‑size formula with a worked example (e.g. 1920 × 1080 × 24‑bit).
    Vector‑vs‑bitmap decision matrix.
    Simple SVG‑like example to show scaling without quality loss.
  Typical gaps & how to fix them:
    Many notes omit vector graphics – insert a tiny SVG illustration and explain why it scales.
Syllabus section: 1.3 Compression (lossy vs. loss‑less, RLE, JPEG, MP3)
  What to verify in the notes:
    One loss‑less (RLE) and one lossy (JPEG) case study.
    Table of typical compression ratios.
  Typical gaps & how to fix them:
    Add a short activity: compress a 10 KB text file with RLE and compare sizes.
Syllabus section: 2 Communication (LAN/WAN, topologies, client‑server vs. peer‑to‑peer, thin/thick client, Ethernet/CSMA‑CD, IP addressing, DNS, IPv4/IPv6, subnetting, wireless vs. wired, cloud basics)
  What to verify in the notes:
    Diagram of each topology with pros/cons.
    Example IP address breakdown (e.g. 192.168.1.10/24) and a subnet‑mask exercise.
    Definition of cloud computing with a real‑world example (e.g. Google Drive).
  Typical gaps & how to fix them:
    Subnetting often omitted – add a “calculate the number of hosts” worksheet.
    Clarify difference between IP address and URL with a simple DNS lookup illustration.
Examples of converting a Boolean expression to a circuit and vice‑versa.
Provide a step‑by‑step simplification example using De Morgan’s laws.