Describe the different stages of the assembly process for a two‑pass assembler and relate assembly language to the machine code that the CPU executes.
Each assembly instruction is mapped to a fixed‑length instruction word consisting of:
| Field | Typical Size (bits) | Content |
|---|---|---|
| Opcode | 8 | Identifies the operation (e.g. LDA, ADD, JMP) |
| Address / Register | 16 (or two 8‑bit bytes) | Operand address, immediate constant, or register identifier, depending on the addressing mode |
| Unused / Padding | 0–8 | Ensures all instruction words have the same length (commonly 24 bits = 3 bytes) |
The assembler replaces each mnemonic with its 8‑bit opcode and each symbolic operand with the numeric value (address, constant or register code) required by the operand field.
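The field layout above can be sketched in a few lines of Python. This is only an illustration: the opcode values are the hypothetical ones used in the worked example later in these notes, the function name `encode` is made up for this sketch, and big-endian byte order (high byte of the operand first) is an assumption that matches the example's `01 00 09`.

```python
# Hypothetical opcode values, taken from the worked example in these notes.
OPCODES = {"LDA": 0x01, "STA": 0x02, "JMP": 0x03, "HLT": 0xFF}

def encode(mnemonic: str, operand: int = 0) -> bytes:
    """Pack an 8-bit opcode and a 16-bit operand into a 3-byte instruction word."""
    if not 0 <= operand <= 0xFFFF:
        raise ValueError("operand does not fit in 16 bits")
    opcode = OPCODES[mnemonic]
    # Assumed big-endian layout: opcode, operand high byte, operand low byte.
    return bytes([opcode]) + operand.to_bytes(2, "big")

print(encode("LDA", 0x0009).hex(" "))  # prints "01 00 09"
```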
A two‑pass assembler separates the tasks of collecting symbols and generating code. This avoids the need for complex back‑patching when forward references are used.
label → LC in the symbol table. Duplicate labels are flagged as errors.

| Directive | Purpose | LC Effect (example) |
|---|---|---|
| .ORG 0x1000 | Set the starting address of the program. | LC ← 0x1000 |
| .ALIGN 4 | Round LC up to the next multiple of 4. | LC 0x1003 → 0x1004 |
| .BYTE 0xFF | Reserve one byte of storage. | LC ← LC + 1 |
| .WORD 0x1234 | Reserve two bytes (a word). | LC ← LC + 2 |
| .RESW 5 | Reserve 5 words (10 bytes). | LC ← LC + 10 |
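The LC effects in the directive table can be expressed as one small function. This is a sketch only: `align_up` and `apply_directive` are hypothetical names, and a 2-byte word is assumed, as in the table.

```python
def align_up(lc: int, boundary: int) -> int:
    """Round lc up to the next multiple of boundary (unchanged if already aligned)."""
    return (lc + boundary - 1) // boundary * boundary

def apply_directive(lc: int, name: str, arg: int) -> int:
    """Return the new location counter after processing one directive."""
    if name == ".ORG":
        return arg                   # LC <- arg
    if name == ".ALIGN":
        return align_up(lc, arg)     # LC rounded up to a multiple of arg
    if name == ".BYTE":
        return lc + 1                # one byte
    if name == ".WORD":
        return lc + 2                # one 2-byte word
    if name == ".RESW":
        return lc + 2 * arg          # arg words of 2 bytes each
    raise ValueError(f"unknown directive {name}")

print(hex(apply_directive(0x1003, ".ALIGN", 4)))  # prints 0x1004
```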
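.BYTE, .WORD, etc.) and write the corresponding bytes to the object file.

| Pass | Primary Activities | Outputs Produced |
|---|---|---|
| First Pass | Scan the source, maintain the LC, process directives, and record each label → LC in the symbol table | Symbol table, final LC value, duplicate‑label diagnostics |
| Second Pass | Replace mnemonics with opcodes, resolve symbolic operands from the symbol table, and write instruction and data bytes to the object file | Object code, listing file, error report |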
Source program (hypothetical 24‑bit instruction format, 3 bytes per instruction):
.ORG 0x0000
START: LDA VALUE
STA RESULT
JMP END
VALUE: .WORD 5
RESULT: .WORD 0
END: HLT
| Line (source) | LC before line | Action / Symbol Table update |
|---|---|---|
| .ORG 0x0000 | — | LC ← 0x0000 |
| START: LDA VALUE | 0x0000 | add START → 0x0000; LC ← 0x0003 |
| STA RESULT | 0x0003 | LC ← 0x0006 |
| JMP END | 0x0006 | LC ← 0x0009 |
| VALUE: .WORD 5 | 0x0009 | add VALUE → 0x0009; LC ← 0x000B |
| RESULT: .WORD 0 | 0x000B | add RESULT → 0x000B; LC ← 0x000D |
| END: HLT | 0x000D | add END → 0x000D; LC ← 0x0010 |
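The first-pass trace above can be reproduced with a short Python sketch. It is a simplified illustration, not a full assembler: it assumes whitespace-separated tokens, labels ending in `:`, 3-byte instructions, 2-byte `.WORD` data, and only the two directives used in the example. The name `first_pass` is hypothetical.

```python
SOURCE = """\
.ORG 0x0000
START: LDA VALUE
STA RESULT
JMP END
VALUE: .WORD 5
RESULT: .WORD 0
END: HLT
"""

INSTRUCTION_SIZE = 3  # bytes per instruction word (assumed format)
WORD_SIZE = 2         # bytes per .WORD

def first_pass(source: str) -> dict[str, int]:
    """Build the symbol table by tracking the location counter (LC)."""
    symbols: dict[str, int] = {}
    lc = 0
    for line in source.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0].endswith(":"):          # label definition
            label = tokens.pop(0)[:-1]
            if label in symbols:
                raise ValueError(f"duplicate label {label}")
            symbols[label] = lc
        if not tokens:
            continue
        if tokens[0] == ".ORG":
            lc = int(tokens[1], 0)           # base 0 accepts the 0x prefix
        elif tokens[0] == ".WORD":
            lc += WORD_SIZE
        else:                                # any other token is an instruction
            lc += INSTRUCTION_SIZE
    return symbols

for name, address in first_pass(SOURCE).items():
    print(f"{name:7} 0x{address:04X}")
```

Running it lists START, VALUE, RESULT and END at 0x0000, 0x0009, 0x000B and 0x000D, matching the table.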
| Address | Source | Machine Word (hex) |
|---|---|---|
| 0x0000 | LDA VALUE | 01 00 09 |
| 0x0003 | STA RESULT | 02 00 0B |
| 0x0006 | JMP END | 03 00 0D |
| 0x0009 | .WORD 5 | 00 05 |
| 0x000B | .WORD 0 | 00 00 |
| 0x000D | HLT | FF 00 00 |
Notice how the operand addresses (0x0009, 0x000B, 0x000D) were obtained from the symbol table built during the first pass.
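A second pass over the same program might look like the sketch below. It assumes the symbol table from the first pass is already available, uses the hypothetical opcode values from the table above, and emits each `.WORD` as two big-endian bytes, which is what the directive table and the LC values imply. The program is given as pre-tokenised pairs to keep the example short.

```python
OPCODES = {"LDA": 0x01, "STA": 0x02, "JMP": 0x03, "HLT": 0xFF}  # hypothetical values
SYMBOLS = {"START": 0x0000, "VALUE": 0x0009, "RESULT": 0x000B, "END": 0x000D}

# (mnemonic or directive, operand) pairs with labels already stripped.
PROGRAM = [
    ("LDA", "VALUE"), ("STA", "RESULT"), ("JMP", "END"),
    (".WORD", "5"), (".WORD", "0"), ("HLT", None),
]

def second_pass(program, symbols) -> bytes:
    """Translate each line to bytes, resolving symbols via the symbol table."""
    out = bytearray()
    for op, operand in program:
        if op == ".WORD":
            out += int(operand, 0).to_bytes(2, "big")   # 2-byte data word
        else:
            if operand is not None and operand not in symbols:
                raise KeyError(f"undefined symbol {operand}")
            value = symbols[operand] if operand is not None else 0
            out += bytes([OPCODES[op]]) + value.to_bytes(2, "big")
    return bytes(out)

obj = second_pass(PROGRAM, SYMBOLS)
print(obj.hex(" "))
```

The object code is 16 bytes long, which agrees with the final LC value of 0x0010 from the first pass.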
For the simplified processor used in the Cambridge syllabus the instruction word is 24 bits (3 bytes):
| Bits | Field | Explanation |
|---|---|---|
| 23‑16 | Opcode (8 bits) | Identifies the operation (e.g. 01 = LDA) |
| 15‑0 | Operand (16 bits) | Address, immediate constant, or register code, depending on addressing mode |
Example: 01 00 09 → opcode = 01 (LDA), operand = 0x0009 (address of VALUE).
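Going the other way, the two fields can be recovered from a 3-byte word with a shift and a mask. This is a minimal sketch assuming the bit layout in the table and big-endian byte order; `decode` is a hypothetical helper name.

```python
def decode(word: bytes) -> tuple[int, int]:
    """Split a 24-bit instruction word into (opcode, operand)."""
    if len(word) != 3:
        raise ValueError("instruction word must be exactly 3 bytes")
    value = int.from_bytes(word, "big")
    opcode = value >> 16        # bits 23-16
    operand = value & 0xFFFF    # bits 15-0
    return opcode, operand

opcode, operand = decode(bytes.fromhex("010009"))
print(f"opcode = {opcode:02X}, operand = 0x{operand:04X}")  # opcode = 01, operand = 0x0009
```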
| Group | Typical Mnemonics | Purpose |
|---|---|---|
| Data‑movement | LDA, STA, LD, ST | Transfer data between registers and memory |
| Arithmetic | ADD, SUB, MUL, DIV | Perform integer arithmetic |
| Logical / Bit‑manipulation | AND, OR, XOR, NOT, SHL, SHR | Boolean operations and shifts |
| Control‑flow | JMP, JZ, JNZ, CALL, RET | Alter the sequential execution order |
| Compare / Test | CMP, TEST, TST | Set condition codes for subsequent branches |
The assembler uses the symbol table to resolve the operand required by each mode.
| Mode | Syntax (example) | What the CPU sees |
|---|---|---|
| Immediate | LDA #5 | Operand field contains the constant value 5 |
| Direct | LDA VALUE | Operand field contains the absolute address of VALUE |
| Indirect | LDA @PTR | CPU fetches the address stored at PTR, then loads the value at that address |
| Indexed | LDA ARRAY,X | Effective address = address(ARRAY) + contents of register X |
| Relative (branch) | JMP LABEL | Operand is a signed offset from the current PC, computed by the assembler from the symbol table (the worked example above instead encodes JMP END with a direct address) |
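The difference between the modes is easiest to see by simulating the operand fetch. The sketch below models memory as a Python dictionary mapping addresses to values, which ignores byte widths. The function names are hypothetical, and the relative offset is assumed to be measured from the address of the next instruction, a convention that varies between processors.

```python
def fetch_operand(mode: str, operand: int, memory: dict[int, int], x: int = 0) -> int:
    """Return the value an LDA would load under the given addressing mode."""
    if mode == "immediate":
        return operand                      # the operand field is the value
    if mode == "direct":
        return memory[operand]              # the operand field is the address
    if mode == "indirect":
        return memory[memory[operand]]      # the operand field points to the address
    if mode == "indexed":
        return memory[operand + x]          # base address plus index register X
    raise ValueError(f"unknown mode {mode}")

def relative_offset(target: int, next_pc: int) -> int:
    """Signed branch offset, assuming it is relative to the next instruction."""
    return target - next_pc

memory = {0x10: 0x20, 0x20: 99, 0x22: 7}
print(fetch_operand("immediate", 5, memory))        # 5
print(fetch_operand("direct", 0x10, memory))        # 32 (the value stored at 0x10)
print(fetch_operand("indirect", 0x10, memory))      # 99 (the value at address 0x20)
print(fetch_operand("indexed", 0x20, memory, x=2))  # 7  (the value at 0x20 + 2)
```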
Macros are not required for the Cambridge exam but many assemblers support them. If used, macro expansion occurs during the first pass, before the symbol table is finalised.
MACRO INCR REG
ADD #1, REG
ENDM
START: INCR A ; expands to: ADD #1, A
HLT
During expansion the macro body is inserted into the source stream, and any labels defined inside the macro are made unique (often by appending a numeric suffix) to avoid collisions.
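Textual macro expansion can be sketched as follows. This is a deliberately small illustration: it assumes the `MACRO name params` / `ENDM` syntax shown above, comments introduced by `;`, and comma-separated arguments. It does not rename labels defined inside a macro body, which a real assembler would need to do. `expand_macros` is a hypothetical name.

```python
import re

def expand_macros(lines: list[str]) -> list[str]:
    """Collect macro definitions and replace each macro call with its body."""
    macros = {}      # name -> (parameter names, body lines)
    output = []
    current = None   # name of the macro being defined, if any
    for raw in lines:
        line = raw.split(";")[0].strip()    # drop comments and padding
        if not line:
            continue
        tokens = line.split()
        if tokens[0] == "MACRO":
            current = tokens[1]
            macros[current] = (tokens[2:], [])
        elif tokens[0] == "ENDM":
            current = None
        elif current is not None:
            macros[current][1].append(line)  # inside a definition: store the body
        else:
            label = ""
            if tokens[0].endswith(":"):
                label = tokens.pop(0) + " "
            if tokens and tokens[0] in macros:
                params, body = macros[tokens[0]]
                args = [a.strip() for a in " ".join(tokens[1:]).split(",")]
                for i, text in enumerate(body):
                    for param, arg in zip(params, args):
                        # Replace whole-word occurrences of the parameter.
                        text = re.sub(rf"\b{re.escape(param)}\b", arg, text)
                    output.append((label if i == 0 else "") + text)
            else:
                output.append(label + " ".join(tokens))
    return output

source = [
    "MACRO INCR REG",
    "ADD #1, REG",
    "ENDM",
    "START: INCR A ; expands to: ADD #1, A",
    "HLT",
]
print(expand_macros(source))  # ['START: ADD #1, A', 'HLT']
```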
.ORG to a non‑word boundary, mis‑aligned .ALIGN) – flagged in the pass where they appear.

Later in the syllabus (Topic 5) you will study how to implement bit‑wise operations and shifts directly in assembly. The op‑codes shown in the instruction‑set table (e.g., SHL, SHR, AND, OR) are the building blocks for tasks such as masking, setting, clearing, and rotating bits.
