Show understanding of the relationship between assembly language and machine code

Assembly Language, Machine Code and the Wider Computer‑Science Syllabus – Cambridge International AS & A Level (9618)

Learning Objectives

  • Explain the relationship between assembly language and machine code.
  • Describe the two‑pass assembly process and how symbols are resolved.
  • Identify and use the five basic addressing modes.
  • Translate a representative instruction set between mnemonic, binary and hexadecimal forms.
  • Apply bit‑manipulation techniques (shifts, masks, overflow handling).
  • Connect these low‑level concepts to the CPU’s fetch‑decode‑execute cycle and the Von Neumann architecture.
  • Summarise the additional AS and A‑Level topics required by the syllabus (hardware, networking, system software, security, ethics, databases, algorithms, data structures, programming paradigms, SDLC, testing, virtual machines, encryption, AI, etc.).

1. Relationship Between Assembly Language and Machine Code

  • Machine code – the binary words (usually 16‑ or 32‑bit) that the processor fetches, decodes and executes directly.
  • Assembly language – a symbolic, human‑readable notation that uses mnemonics for the opcode and symbolic names for registers, constants and labels.
  • In the simplified processor models used in exam questions there is normally a one‑to‑one correspondence: one assembly instruction ↔ one fixed‑length machine instruction.
  • An assembler converts the symbolic form into the binary (or hexadecimal) representation required by the CPU.

1.1 Typical 16‑bit Instruction Format

Field     | Size (bits) | Purpose
Opcode    | 4           | Identifies the operation (ADD, LDM, MOV …).
Operand 1 | 4           | Register number or high‑order address bits.
Operand 2 | 4           | Register number, low‑order address bits or immediate value.
Operand 3 | 4           | Register, immediate value, or unused (depends on instruction).

1.2 Example Translation

Assume the following opcode and register encodings (binary):

  • ADD = 0111
  • R0–R15 = 0000–1111 (the register number written as a 4‑bit binary value)

Assembly: ADD R1, R2, R3

  1. Opcode = 0111
  2. R1 = 0001, R2 = 0010, R3 = 0011
  3. Binary word = 0111 0001 0010 0011
  4. Hexadecimal = 0x7123
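
The translation steps above can be sketched in Python. This is a minimal illustration, assuming the 4‑bit field layout from Section 1.1; the function name encode_three_register is hypothetical and not part of any syllabus tool.

```python
def encode_three_register(opcode, rd, rs, rt):
    """Pack four 4-bit fields into one 16-bit instruction word."""
    for field in (opcode, rd, rs, rt):
        if not 0 <= field <= 0xF:
            raise ValueError("each field must fit in 4 bits")
    # Opcode occupies bits 15-12, then Rd, Rs and Rt in descending order.
    return (opcode << 12) | (rd << 8) | (rs << 4) | rt


ADD = 0b0111  # opcode assumed in the example above

word = encode_three_register(ADD, 1, 2, 3)  # ADD R1, R2, R3
print(f"{word:016b}")   # 0111000100100011
print(f"0x{word:04X}")  # 0x7123
```

Shifting each field into position and combining with OR is the same operation the assembler performs when it writes the final word.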

2. Two‑Pass Assembly Process

The assembler must resolve forward references (labels that appear later in the program). It therefore works in two passes.

Work done in each pass:

First Pass

  • Read source line‑by‑line.
  • When a label is encountered, store its address in a symbol table.
  • Calculate the length of each instruction to determine the address of the next line.

Second Pass

  • Read the source again.
  • Replace each symbolic operand (label, constant) with the numeric value from the symbol table.
  • Generate the final machine‑code word and write it to the object file.

2.1 Worked Example – Forward Reference

START:  LDM R0, VALUE   ; load the word stored at VALUE (forward reference)
        JMP END         ; skip over the data word
VALUE:  .WORD 0x55
END:    HLT

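  • Assuming the program is assembled from address 0x0000 and every instruction or .WORD occupies one 16‑bit word (2 bytes), the first pass records VALUE at address 0x0004 and END at 0x0006.
  • Second pass substitutes those values, producing:

    • LDM R0, 0x04 → 0x1004
    • JMP END → 0xE002 (relative offset +2 from the updated PC, 0x0004, to END at 0x0006)
    • .WORD 0x55 → 0x0055
    • HLT → 0xF000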
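
The two passes can be sketched as a small Python program. This is only an illustration under stated assumptions, not a real assembler: every source line occupies 2 bytes, the address field is 8 bits wide, the JMP offset is measured from the address of the following instruction, and the load address (origin) is a parameter because the label addresses depend on it. The helper names parse, first_pass and second_pass are hypothetical.

```python
SOURCE = """
START: LDM R0, VALUE ; forward reference
       JMP END
VALUE: .WORD 0x55
END:   HLT
"""

WORD_BYTES = 2  # assumption: every line assembles to one 16-bit word


def parse(source):
    """Split each non-empty line into (label, mnemonic, operands)."""
    lines = []
    for raw in source.splitlines():
        text = raw.split(";")[0].strip()  # drop comments
        if not text:
            continue
        label = None
        if ":" in text:
            label, text = (part.strip() for part in text.split(":", 1))
        mnemonic, _, rest = text.partition(" ")
        operands = [op.strip() for op in rest.split(",") if op.strip()]
        lines.append((label, mnemonic.upper(), operands))
    return lines


def first_pass(lines, origin=0):
    """Build the symbol table: label -> address."""
    symbols, address = {}, origin
    for label, _mnemonic, _operands in lines:
        if label:
            symbols[label] = address
        address += WORD_BYTES
    return symbols


def second_pass(lines, symbols, origin=0):
    """Replace labels with numbers and emit one word per line."""
    words, address = [], origin
    for _label, mnemonic, operands in lines:
        next_address = address + WORD_BYTES
        if mnemonic == "LDM":      # opcode 0001, Rd, 8-bit address
            rd = int(operands[0][1:])
            word = (0x1 << 12) | (rd << 8) | (symbols[operands[1]] & 0xFF)
        elif mnemonic == "JMP":    # opcode 1110, signed 8-bit offset
            offset = symbols[operands[0]] - next_address
            word = (0xE << 12) | (offset & 0xFF)
        elif mnemonic == ".WORD":  # raw data word
            word = int(operands[0], 16) & 0xFFFF
        elif mnemonic == "HLT":
            word = 0xF000
        else:
            raise ValueError(f"unknown mnemonic: {mnemonic}")
        words.append(word)
        address = next_address
    return words


lines = parse(SOURCE)
symbols = first_pass(lines, origin=0)
print({name: hex(addr) for name, addr in symbols.items()})
print([f"0x{w:04X}" for w in second_pass(lines, symbols, origin=0)])
```

Calling first_pass with a different origin shifts every label by the same amount, while the relative JMP offset stays unchanged.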

3. Addressing Modes

How an operand’s location is specified.

Mode              | Syntax (example)  | Meaning                                                  | Typical encoding (illustrative)
Immediate         | LDI R1, #0x3A     | Constant value is embedded in the instruction.           | Opcode = 0011, Rd = 0001, 8‑bit constant = 00111010
Direct            | LDD R2, 0x30      | Address field gives the exact memory location.           | Opcode = 0010, Rd = 0010, address = 00110000
Indirect          | LDR R3, (R4)      | Register contains the address of the operand.            | Opcode = 0101, Rd = 0011, Rb = 0100, unused = 0000
Indexed           | LDX R5, 0x04(R6)  | Effective address = contents of R6 + offset 0x04.        | Opcode = 0100, Rd = 0101, Rb = 0110, 4‑bit offset = 0100
Relative (branch) | JMP LOOP          | Signed offset from the current PC to the target label.   | Opcode = 1110, unused = 0000, 8‑bit signed offset (calculated in Pass 2)
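
A short Python sketch may help compare how each mode finds its operand. The register and memory contents are hypothetical values chosen for illustration, and the function load is not part of any real tool; it simply mirrors the Meaning column above.

```python
def to_signed8(value):
    """Interpret an 8-bit field as a two's-complement number."""
    return value - 0x100 if value & 0x80 else value


def load(mode, operand, registers, memory, pc=0):
    """Return the value (or branch target) selected by each addressing mode."""
    if mode == "immediate":   # operand is the constant itself
        return operand
    if mode == "direct":      # operand is a memory address
        return memory[operand]
    if mode == "indirect":    # operand names a register holding the address
        return memory[registers[operand]]
    if mode == "indexed":     # operand is (offset, base register)
        offset, base = operand
        return memory[registers[base] + offset]
    if mode == "relative":    # operand is a signed offset added to the PC
        return pc + to_signed8(operand)
    raise ValueError(f"unknown mode: {mode}")


registers = {4: 0x30, 6: 0x2C}  # hypothetical register contents
memory = {0x30: 0x99}           # hypothetical memory contents

print(hex(load("immediate", 0x3A, registers, memory)))           # 0x3a
print(hex(load("direct", 0x30, registers, memory)))              # 0x99
print(hex(load("indirect", 4, registers, memory)))               # 0x99
print(hex(load("indexed", (0x04, 6), registers, memory)))        # 0x99
print(hex(load("relative", 0xFE, registers, memory, pc=0x10)))   # 0xe
```

Direct, indirect and indexed all reach the same memory cell here, which shows that the modes differ only in how the address is formed.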

4. Representative Instruction Set (Cambridge 9618)

Mnemonic | Opcode (binary) | Operand format              | Example assembly     | Machine code (hex)
LDM      | 0001            | Rd, address (direct)        | LDM R0, 0x20         | 0x1020
LDD      | 0010            | Rd, address (direct)        | LDD R1, 0x30         | 0x2130
LDI      | 0011            | Rd, #constant (immediate)   | LDI R2, #0x0F        | 0x320F
LDX      | 0100            | Rd, offset(Rb) (indexed)    | LDX R3, 0x04(R4)     | 0x4344
LDR      | 0101            | Rd, (Rb) (indirect)         | LDR R5, (R6)         | 0x5560
MOV      | 0110            | Rd, Rs                      | MOV R7, R8           | 0x6780
ADD      | 0111            | Rd, Rs, Rt                  | ADD R9, R10, R11     | 0x79AB
SUB      | 1000            | Rd, Rs, Rt                  | SUB R12, R13, R14    | 0x8CDE
AND      | 1001            | Rd, Rs, Rt                  | AND R0, R1, R2       | 0x9012
OR       | 1010            | Rd, Rs, Rt                  | OR R3, R4, R5        | 0xA345
JMP      | 1110            | label (relative)            | JMP LOOP             | 0xE0??
HLT      | 1111            | none                        | HLT                  | 0xF000

Unused operand fields are encoded as 0000.

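In the JMP entry the low‑order byte (shown as ??) is the signed offset calculated during Pass 2. This register‑based set is a teaching model: the official 9618 instruction table uses a single accumulator (ACC) and an index register (IX), and there LDM denotes immediate addressing and LDI indirect addressing, so always check mnemonics against the table supplied in the exam.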
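
Going the other way, from hexadecimal back to a mnemonic, can be sketched in Python for the three‑register rows of the table (ADD, SUB, AND, OR). The names decode and disassemble are hypothetical helpers that assume the 4‑bit field layout from Section 1.1.

```python
# Opcodes for the three-register instructions in the table above.
MNEMONICS = {0x7: "ADD", 0x8: "SUB", 0x9: "AND", 0xA: "OR"}


def decode(word):
    """Split a 16-bit word into its four 4-bit fields."""
    return (word >> 12) & 0xF, (word >> 8) & 0xF, (word >> 4) & 0xF, word & 0xF


def disassemble(word):
    """Turn a three-register machine word back into assembly text."""
    opcode, rd, rs, rt = decode(word)
    if opcode not in MNEMONICS:
        raise ValueError(f"opcode {opcode:04b} is not a three-register instruction")
    return f"{MNEMONICS[opcode]} R{rd}, R{rs}, R{rt}"


for word in (0x79AB, 0x8CDE, 0x9012, 0xA345):
    print(f"0x{word:04X} -> {disassemble(word)}")
```

Each hexadecimal digit corresponds to exactly one 4‑bit field, which is why the hex form can be read almost directly as opcode and register numbers.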

5. Bit‑Manipulation Techniques (Syllabus 4.3)

  • Logical shift left (LSL) – inserts 0s on the right; equivalent to unsigned multiplication by 2, provided no 1 is shifted out of the most significant bit.
  • Logical shift right (LSR) – inserts 0s on the left; equivalent to unsigned division by 2, discarding the remainder.
  • Arithmetic shift right (ASR) – copies the sign bit into the vacated high‑order bit, preserving two’s‑complement sign.
  • Rotate (cyclic) shift – bits that fall off one end re‑appear on the opposite end.
  • Masking – AND with a constant to keep selected bits and clear the rest.
  • Overflow detection – after an addition, compare the carry into the sign bit with the carry out of it; signed overflow has occurred when the two differ. For subtraction, add the two’s complement of the subtrahend and apply the same test.
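
The shift, rotate and mask operations listed above can be tried out in Python. This is a minimal sketch that assumes an 8‑bit register width; the helper names lsl, lsr, asr, rol and ror are illustrative and simply mirror the usual mnemonics.

```python
MASK8 = 0xFF  # assumption: 8-bit registers


def lsl(value):
    """Logical shift left by one: a 0 enters on the right."""
    return (value << 1) & MASK8


def lsr(value):
    """Logical shift right by one: a 0 enters on the left."""
    return (value & MASK8) >> 1


def asr(value):
    """Arithmetic shift right by one: the sign bit is copied."""
    return ((value & MASK8) >> 1) | (value & 0x80)


def rol(value):
    """Rotate left by one: bit 7 re-enters as bit 0."""
    return ((value << 1) & MASK8) | ((value & MASK8) >> 7)


def ror(value):
    """Rotate right by one: bit 0 re-enters as bit 7."""
    return ((value & MASK8) >> 1) | ((value & 1) << 7)


print(f"{lsl(0b00000101):08b}")   # 00001010 (5 * 2 = 10)
print(f"{lsr(0b00001010):08b}")   # 00000101 (10 // 2 = 5)
print(f"{asr(0b11111000):08b}")   # 11111100 (-8 -> -4)
print(f"{rol(0b10000001):08b}")   # 00000011
print(f"{ror(0b10000001):08b}")   # 11000000
print(f"{0xAB & 0x0F:08b}")       # 00001011 (mask keeps the low nibble)
```

Python integers are unbounded, so the explicit AND with MASK8 is what models the fixed register width.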

5.1 Example – Extract the low‑order nibble

LDI R0, #0xAB      ; R0 = 1010 1011₂
AND R0, #0x0F      ; mask = 0000 1111₂ → R0 = 0000 1011₂ (= 0x0B)

5.2 Example – Set a flag bit (bit 3) in a control register

LDI R1, #0x08      ; 0000 1000₂ – the flag we want to set
OR RCTRL, R1       ; RCTRL = RCTRL | 0x08

5.3 Example – Arithmetic right shift preserving sign

; R2 contains –8 = 1111 1000₂ (two’s complement)
ASR R2, #1         ; result = 1111 1100₂ = –4

5.4 Example – Detect overflow after addition

LDI R0, #0x7F      ; 0111 1111₂ (127)
LDI R1, #0x01      ; 0000 0001₂ (1)
ADD R2, R0, R1     ; R2 = 1000 0000₂ (–128 in two’s complement)
; Carry into sign bit = 1, carry out = 0 → the two carries differ, so the overflow flag is set
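
The carry comparison can be checked with a small Python model of an 8‑bit adder. This is a sketch assuming 8‑bit two's‑complement registers; the function add8 is hypothetical and returns both carries so they can be inspected for the 127 + 1 case and others.

```python
def add8(a, b):
    """Add two 8-bit values; return (result, carry_in, carry_out, overflow).

    carry_in  is the carry INTO the sign bit (bit 7).
    carry_out is the carry OUT OF the sign bit.
    Signed overflow has occurred exactly when the two differ.
    """
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    carry_out = (total >> 8) & 1
    # Adding only the low 7 bits shows what is carried into bit 7.
    carry_in = (((a & 0x7F) + (b & 0x7F)) >> 7) & 1
    return result, carry_in, carry_out, carry_in != carry_out


for a, b in ((0x7F, 0x01), (0xFF, 0x01), (0x80, 0x80), (0x10, 0x20)):
    result, cin, cout, overflow = add8(a, b)
    print(f"0x{a:02X} + 0x{b:02X} = 0x{result:02X}  "
          f"carry in={cin} carry out={cout} overflow={overflow}")
```

The second test pair (0xFF + 0x01, that is –1 + 1) produces a carry out but no overflow, which shows why the carry flag alone is not a signed‑overflow test.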

6. Integration with CPU Architecture (Syllabus 4.1 & 4.2)

  • The CPU follows the Von Neumann model: a single memory holds both instructions and data.
  • Fetch – PC supplies the address, instruction word is placed in the Instruction Register (IR).
  • Decode – opcode and operand fields are extracted; control logic activates the appropriate datapath components (ALU, registers, memory interface).
  • Execute – arithmetic/logic operation, memory read/write, or PC modification for a branch.
  • The binary layout of an instruction directly determines which control signals are asserted, so understanding the encoding explains the hardware behaviour.
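
The cycle can be sketched as a loop in Python. This is a deliberately tiny model under stated assumptions: memory is word‑addressed, only LDI (0011), ADD (0111) and HLT (1111) from the representative set are implemented, and the function name run is hypothetical.

```python
def run(memory, max_steps=100):
    """Execute a program held in memory and return the register file."""
    registers = [0] * 16
    pc = 0
    for _ in range(max_steps):
        # Fetch: the PC supplies the address; the word goes into the IR.
        ir = memory[pc]
        pc += 1
        # Decode: split the word into opcode and operand fields.
        opcode = (ir >> 12) & 0xF
        rd, rs, rt = (ir >> 8) & 0xF, (ir >> 4) & 0xF, ir & 0xF
        # Execute: act on the decoded fields.
        if opcode == 0x3:        # LDI Rd, #constant (8-bit immediate)
            registers[rd] = ir & 0xFF
        elif opcode == 0x7:      # ADD Rd, Rs, Rt
            registers[rd] = (registers[rs] + registers[rt]) & 0xFFFF
        elif opcode == 0xF:      # HLT
            break
        else:
            raise ValueError(f"opcode {opcode:04b} not implemented in this sketch")
    return registers


# The same memory holds instructions and could equally hold data (Von Neumann).
program = [
    0x3105,  # LDI R1, #5
    0x3207,  # LDI R2, #7
    0x7312,  # ADD R3, R1, R2
    0xF000,  # HLT
]
print(run(program)[3])  # 12
```

The decode step is nothing more than shifting and masking the instruction word, which is the software equivalent of the hardware routing bit fields to control logic.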

7. Overview of the Remaining AS Topics (Syllabus 1‑12)

Topic | Key Points to Cover (AO1‑AO3)
Data Representation | Binary, hexadecimal, signed/unsigned integers, two’s complement, floating‑point (IEEE‑754 single precision), character encodings (ASCII, Unicode).
Algorithms & Problem Solving | Flowcharts, pseudocode, basic constructs (sequence, selection, iteration), algorithm efficiency (Big‑O), recursion basics.
Data Structures | Arrays, records/structures, linked lists (conceptual), stacks and queues, simple tree concepts.
Computer Architecture | CPU components (ALU, registers, control unit), bus systems, cache hierarchy, RAM vs ROM, interrupts.
System Software | Operating system functions (process management, memory management, I/O control), language translators (assembler, compiler, interpreter), IDE features.
Communication & Networks | LAN/WAN concepts, topologies, Ethernet/CSMA‑CD, Wi‑Fi basics, TCP/IP stack, IPv4/IPv6 addressing, DNS, client‑server vs peer‑to‑peer, cloud computing.
Security & Privacy | Authentication, firewalls, malware types, encryption basics (symmetric vs asymmetric), SSL/TLS, digital certificates, hashing, data integrity checks.
Ethics, Legal & Environmental Issues | Intellectual property, data protection laws, ethical use of data, e‑waste, sustainable computing.
Databases | Relational model, tables, primary/foreign keys, normalization (1NF‑3NF), SQL DDL/DML statements, basic queries.
Software Development Life‑Cycle (SDLC) | Planning, analysis, design, implementation, testing, maintenance; agile vs waterfall; version control basics.
Testing & Evaluation | Unit testing, integration testing, black‑box vs white‑box, debugging techniques, performance evaluation.

8. A‑Level Extensions (Syllabus 13‑20)

Extension | Essential Content for Exams
Advanced Computer Architecture | RISC vs CISC, pipelining, superscalar execution, hazards, virtual memory, address translation (MMU, page tables).
Virtual Machines & Interpreters | Bytecode execution, stack‑based vs register‑based VMs, just‑in‑time (JIT) compilation.
Floating‑Point Representation | IEEE‑754 single and double precision layout, rounding modes, overflow/underflow handling.
Boolean Algebra & Karnaugh Maps | Simplification of logic expressions, design of combinational circuits, minimisation using K‑maps.
Encryption & Cryptography | Symmetric algorithms (DES, AES basics), public‑key concepts (RSA), hash functions, digital signatures.
Artificial Intelligence Foundations | Search algorithms (BFS, DFS, A*), basic machine‑learning concepts, neural‑network structure, ethical considerations.
Advanced Programming Paradigms | Object‑oriented concepts (classes, inheritance, polymorphism), functional programming ideas (higher‑order functions, recursion), exception handling.
Recursion & Stack Management | Recursive algorithm design, call stack behaviour, tail recursion optimisation.
Concurrency & Parallelism | Threads, processes, synchronization primitives (locks, semaphores), race conditions, deadlock.

9. Worked Programme – From High‑Level Idea to Machine Code

Problem: Increment each element of a 4‑byte array stored at address 0x1000 by the constant #5.

  1. Pseudocode (AO2)

    for i = 0 to 3
        array[i] = array[i] + 5

  2. Assembly (AO2)

          LDI R0, #0x1000   ; base address of array
          LDI R1, #4        ; loop counter
    LOOP: LDR R2, (R0)      ; load array[i]
          ADD R2, R2, #5    ; add constant 5 (using immediate mode)
          STR R2, (R0)      ; store back
          ADD R0, R0, #1    ; next byte address
          SUB R1, R1, #1    ; decrement loop counter
          BNE LOOP          ; branch if R1 ≠ 0
          HLT

  3. Machine Code (AO3 – evaluation of design)

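    These encodings are illustrative. STR, BNE and the immediate forms of ADD and SUB are not in the Section 4 table; the digits 6 and E used for STR and BNE below are placeholders that reuse the MOV and JMP opcodes, and a real instruction set would give each its own opcode. Immediate forms are shown as opcode, Rd, 8‑bit constant.

    • LDI R0, #0x1000 → 0x3000 (illustrative: 0x1000 does not fit the 8‑bit immediate field, so only the low byte is shown)
    • LDI R1, #4 → 0x3104
    • LDR R2, (R0) → 0x5200
    • ADD R2, R2, #5 → 0x7205
    • STR R2, (R0) → 0x6200
    • ADD R0, R0, #1 → 0x7001
    • SUB R1, R1, #1 → 0x8101
    • BNE LOOP → 0xE0F4 (offset –12 bytes: six 2‑byte instructions back from the updated PC)
    • HLT → 0xF000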

  4. Evaluation (AO3)

    • Uses a small set of simple instructions (the Section 4 set plus STR and BNE), which keeps it suitable for an exam.
    • Immediate mode avoids an extra load for the constant.
    • Loop counter stored in a register reduces memory traffic.
    • For larger arrays a block‑copy or DMA would be more efficient – an example of design critique.
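
The behaviour of the assembly loop can be traced in Python, one statement per instruction. This is a sketch rather than an emulator: memory is modelled as a dictionary of byte cells, the array contents are hypothetical test data, and the 8‑bit wrap‑around on the addition is an assumption about how a byte store would behave.

```python
def increment_array(memory, base=0x1000, count=4, constant=5):
    """Mirror the assembly loop: add `constant` to `count` bytes from `base`."""
    r0 = base    # LDI R0, #0x1000  (base address)
    r1 = count   # LDI R1, #4       (loop counter)
    while True:
        r2 = memory[r0]                 # LOOP: LDR R2, (R0)
        r2 = (r2 + constant) & 0xFF     # ADD R2, R2, #5 (assumed to wrap at 8 bits)
        memory[r0] = r2                 # STR R2, (R0)
        r0 += 1                         # ADD R0, R0, #1
        r1 -= 1                         # SUB R1, R1, #1
        if r1 == 0:                     # BNE LOOP falls through when R1 = 0
            break
    return memory                       # HLT


# Hypothetical array contents for the trace.
memory = {0x1000: 10, 0x1001: 20, 0x1002: 250, 0x1003: 255}
increment_array(memory)
print([memory[address] for address in range(0x1000, 0x1004)])  # [15, 25, 255, 4]
```

The last element shows the wrap‑around case (255 + 5 = 260, stored as 4), a detail worth mentioning when evaluating the design for byte data.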

10. Summary

  • Assembly language provides a mnemonic layer over machine code; each line maps to a fixed‑length binary instruction.
  • The two‑pass assembler resolves symbols and produces the final object code.
  • Five addressing modes (immediate, direct, indirect, indexed, relative) determine how operand addresses are formed.
  • A representative instruction set illustrates the direct link between opcode, operand format and binary encoding.
  • Bit‑manipulation (shifts, masks, overflow detection) is essential for low‑level control and appears frequently in exam questions.
  • All of these concepts fit into the CPU’s fetch‑decode‑execute cycle and the Von Neumann architecture.
  • The notes also outline the broader AS topics (hardware, networking, security, databases, etc.) and the A‑Level extensions (RISC/CISC, virtual machines, encryption, AI, concurrency) required for full syllabus coverage.