Show understanding of the relationship between assembly language and machine code

Published by Patrick Mutisya · 14 days ago

Cambridge A-Level Computer Science 9618 – Assembly Language

Assembly Language and Machine Code

Learning Objective

Show understanding of the relationship between assembly language and machine code.

Key Concepts

  • Machine code is the binary representation that the CPU executes directly.
  • Assembly language is a human‑readable mnemonic representation of machine code.
  • Each assembly instruction generally corresponds to a single machine instruction (one‑to‑one mapping); assembler directives and macros are the exceptions.
  • An assembler translates assembly language into machine code.
  • Each machine code instruction consists of an opcode field and, usually, one or more operand fields.

Typical Instruction Format

Many simple CPUs use a fixed‑length instruction format, in which every instruction occupies the same number of bits. The layout below describes a generic 16‑bit instruction.

Suggested diagram: 16‑bit instruction split into opcode (bits 15‑12) and three 4‑bit operand fields (bits 11‑0).

Field | Size (bits) | Purpose
Opcode | 4 | Identifies the operation (e.g., ADD, SUB, LOAD).
Operand 1 | 4 | Usually a register identifier or part of an address.
Operand 2 | 4 | Second register or address component.
Operand 3 | 4 | Third register, immediate value, or unused.
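
The field layout can be illustrated with a short Python sketch that packs four 4‑bit values into one 16‑bit word and splits them apart again. The function names `pack_fields` and `unpack_fields` are hypothetical, and the bit positions (opcode in bits 15‑12, operands in bits 11‑0) are taken from the suggested diagram rather than from any real CPU.

```python
FIELD_MASK = 0xF  # each field is 4 bits wide, so valid values are 0..15


def pack_fields(opcode, op1, op2, op3):
    """Pack four 4-bit values into one 16-bit instruction word."""
    for value in (opcode, op1, op2, op3):
        if not 0 <= value <= FIELD_MASK:
            raise ValueError(f"field value {value} does not fit in 4 bits")
    # Opcode occupies bits 15-12, then operands 1..3 in bits 11-8, 7-4, 3-0.
    return (opcode << 12) | (op1 << 8) | (op2 << 4) | op3


def unpack_fields(word):
    """Split a 16-bit instruction word back into (opcode, op1, op2, op3)."""
    return (
        (word >> 12) & FIELD_MASK,
        (word >> 8) & FIELD_MASK,
        (word >> 4) & FIELD_MASK,
        word & FIELD_MASK,
    )
```

Shifting and OR‑ing is simply the bitwise equivalent of writing the four fields side by side; masking with `0xF` recovers each field without disturbing its neighbours.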

Example: Translating an Assembly Instruction

Consider a simple hypothetical CPU with the following opcode assignments:

  • ADD = \$0010\$ (binary)
  • Registers R0–R15 are encoded as 4‑bit binary numbers (R0 = \$0000\$, R1 = \$0001\$, …).

Assembly instruction:

ADD R1, R2, R3

Step‑by‑step translation:

  1. Opcode for ADD is \$0010\$.
  2. Encode the three registers:

    • R1 → \$0001\$
    • R2 → \$0010\$
    • R3 → \$0011\$

  3. Concatenate the fields in order (opcode first, then the three operands): \$0010000100100011\$.
  4. Group into 4‑bit nibbles for readability: \$0010\;0001\;0010\;0011\$.
  5. Convert each nibble to hexadecimal:

    • \$0010_2 = 2_{16}\$
    • \$0001_2 = 1_{16}\$
    • \$0010_2 = 2_{16}\$
    • \$0011_2 = 3_{16}\$

  6. Resulting machine code: 0x2123 (binary \$0010\,0001\,0010\,0011\$).

Thus the single assembly line ADD R1, R2, R3 becomes the 16‑bit machine instruction 0010 0001 0010 0011 (hexadecimal 2123).
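
The hand translation above can be checked with a few lines of Python. This is a minimal sketch for the hypothetical CPU only: the `encode` function and the `OPCODES` table are illustrative names, and ADD is the only opcode defined because it is the only one the example specifies.

```python
OPCODES = {"ADD": 0b0010}  # only ADD is given in the example


def encode(line):
    """Encode a line such as 'ADD R1, R2, R3' as a 16-bit integer."""
    mnemonic, operand_text = line.split(maxsplit=1)
    # 'R1' -> 1, 'R2' -> 2, ... (register number follows the letter R)
    registers = [int(name.strip()[1:]) for name in operand_text.split(",")]
    word = OPCODES[mnemonic]
    for reg in registers:
        word = (word << 4) | reg  # append the next 4-bit field on the right
    return word


word = encode("ADD R1, R2, R3")
binary = format(word, "016b")  # 16 binary digits, zero-padded
nibbles = " ".join(binary[i:i + 4] for i in range(0, 16, 4))
print(nibbles)              # 0010 0001 0010 0011
print(format(word, "04X"))  # 2123
```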

Translation Process Overview

  1. Read the assembly source line by line.
  2. Identify the mnemonic (operation) and look up its opcode.
  3. Encode each operand according to the instruction format (register number, address, immediate value).
  4. Assemble the fields into a binary word of the required length.
  5. Optionally convert the binary word to hexadecimal for easier viewing or storage.
  6. Write the resulting machine code to the output object file.
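
The six steps above can be sketched as a tiny assembler in Python. Several details here are assumptions made for illustration and do not come from the text: the SUB opcode value, the `;` comment syntax, the three‑register operand form for every instruction, and big‑endian byte order for the output.

```python
import re

# Only ADD = 0010 comes from the worked example; SUB is an invented value.
OPCODES = {"ADD": 0b0010, "SUB": 0b0011}


def assemble_line(line):
    """Translate one 'MNEMONIC Rx, Ry, Rz' line into a 16-bit word."""
    mnemonic, *operands = re.split(r"[,\s]+", line.strip())
    mnemonic = mnemonic.upper()
    if mnemonic not in OPCODES:
        raise ValueError(f"unknown mnemonic: {mnemonic}")
    if len(operands) != 3:
        raise ValueError(f"expected 3 operands in: {line!r}")
    word = OPCODES[mnemonic]                  # step 2: look up the opcode
    for operand in operands:                  # step 3: encode each operand
        match = re.fullmatch(r"[Rr](\d+)", operand)
        if match is None or int(match.group(1)) > 15:
            raise ValueError(f"bad register: {operand}")
        word = (word << 4) | int(match.group(1))  # step 4: build the word
    return word


def assemble(source):
    """Assemble a multi-line program into a list of 16-bit words."""
    words = []
    for line in source.splitlines():          # step 1: read line by line
        line = line.split(";")[0].strip()     # drop comments (assumed syntax)
        if line:
            words.append(assemble_line(line))
    return words


program = """
ADD R1, R2, R3
SUB R4, R1, R2   ; hypothetical instruction
"""
words = assemble(program)
print([format(w, "04X") for w in words])      # step 5: ['2123', '3412']
# step 6: bytes that could be written to an object file (big-endian assumed)
machine_code = b"".join(w.to_bytes(2, "big") for w in words)
```

A real assembler also handles labels, addresses and directives, usually in more than one pass; the sketch covers only the direct mnemonic‑to‑opcode translation described in the steps.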

Why Use Assembly Language?

  • Provides a clear, readable representation of machine instructions.
  • Allows programmers to control hardware resources precisely (e.g., registers, flags).
  • Facilitates debugging at the instruction level.
  • Essential for writing low‑level routines such as interrupt handlers, boot loaders, and performance‑critical code.

Summary

Assembly language serves as a symbolic bridge between human understanding and the binary machine code that a CPU executes. In the fixed‑length format used here, each assembly instruction maps directly to one machine instruction consisting of an opcode and operand fields. The assembler performs a systematic translation, converting mnemonics and symbolic operands into binary values (often displayed in hexadecimal) that the processor can interpret. Mastery of this relationship is fundamental for low‑level programming, debugging, and understanding how high‑level code is ultimately executed by hardware.