Describe the different stages of the assembly process for a two-pass assembler

Published by Patrick Mutisya · 14 days ago

Cambridge A-Level Computer Science 9618 – Assembly Language

4.2 Assembly Language

Objective

Describe the different stages of the assembly process for a two‑pass assembler.

Why a Two‑Pass Assembler?

A two-pass assembler is used because a single scan of the source program cannot always supply the address of a symbol that is defined later in the code (a forward reference). By separating symbol collection (first pass) from code generation (second pass), the assembler resolves forward references without the back-patching that a one-pass assembler would need.

Stages of the Assembly Process

  1. First Pass – Symbol Table Construction

    • Read the source file line by line.
    • Identify each label (symbol) and note the value of the location counter at the point where it is defined.
    • Handle assembler directives that affect the location counter (e.g., .ORG, .ALIGN).
    • Expand macros, if the assembler supports macro processing.
    • Store each label in a symbol table with its provisional address.
    • Detect duplicate label definitions and report errors.
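
The first-pass steps above can be sketched in Python. This is a minimal illustration, not a real assembler: it assumes every instruction and every .WORD occupies exactly one word, that `;` starts a comment, and that .ORG takes a numeric argument. The function name `first_pass` is hypothetical, and macro expansion is left out.

```python
def first_pass(lines, origin=0):
    """Build a symbol table (label -> address) and collect duplicate-label errors.

    Assumption for this sketch: each instruction or .WORD takes one word.
    """
    symbols = {}
    errors = []
    lc = origin  # location counter
    for text in lines:
        text = text.split(";")[0].strip()  # drop comments (assumed ';' syntax)
        if not text:
            continue
        if ":" in text:
            label, text = text.split(":", 1)
            label = label.strip()
            if label in symbols:
                errors.append(f"duplicate label: {label}")
            else:
                symbols[label] = lc  # record the label at the current address
            text = text.strip()
        if not text:
            continue  # a line holding only a label does not advance the counter
        parts = text.split()
        if parts[0] == ".ORG":
            lc = int(parts[1], 0)  # directive that resets the location counter
        else:
            lc += 1  # one word per instruction or data item
    return symbols, errors


source = [
    "START: LDA VALUE",
    "STA RESULT",
    "JMP END",
    "VALUE: .WORD 5",
    "RESULT: .WORD 0",
    "END: HLT",
]
symbols, errors = first_pass(source)
print(symbols)  # {'START': 0, 'VALUE': 3, 'RESULT': 4, 'END': 5}
print(errors)   # []
```

Note that no operand is looked up here: the first pass only needs to know how much space each line occupies, which is why forward references cause no difficulty at this stage.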

  2. Second Pass – Code Generation

    • Re‑read the source file using the completed symbol table.
    • Translate each instruction mnemonic into its binary opcode.
    • Replace operand symbols with the actual addresses or offsets obtained from the symbol table.
    • Evaluate expressions and constants, and record relocation information for any addresses that depend on where the program is loaded.
    • Generate the final object code (machine code) and, optionally, a listing file.
    • Report any undefined symbols, i.e. operands that have no entry in the symbol table built during the first pass.
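
The second-pass steps above can be sketched as follows. The opcode values and the instruction format (a 16-bit word with the opcode in the high byte and an 8-bit address in the low byte) are assumptions made for this illustration only; they do not belong to any real processor. The name `second_pass` is also hypothetical, and the symbol table is passed in as if the first pass had already produced it.

```python
# Hypothetical opcode values, chosen only for this sketch.
OPCODES = {"HLT": 0x00, "LDA": 0x01, "STA": 0x02, "JMP": 0x03}


def second_pass(lines, symbols):
    """Translate source lines to 16-bit words using a completed symbol table."""
    words, errors = [], []
    for text in lines:
        if ":" in text:
            text = text.split(":", 1)[1]  # labels were handled in the first pass
        parts = text.split()
        if not parts:
            continue
        mnemonic, operands = parts[0], parts[1:]
        if mnemonic == ".WORD":
            words.append(int(operands[0], 0))  # data word: emit the constant
            continue
        opcode = OPCODES.get(mnemonic)
        if opcode is None:
            errors.append(f"unknown mnemonic: {mnemonic}")
            opcode = 0  # emit a placeholder so later addresses stay aligned
        address = 0
        if operands:
            name = operands[0]
            if name in symbols:
                address = symbols[name]  # replace the symbol with its address
            else:
                errors.append(f"undefined symbol: {name}")
        words.append((opcode << 8) | (address & 0xFF))
    return words, errors


source = [
    "START: LDA VALUE",
    "STA RESULT",
    "JMP END",
    "VALUE: .WORD 5",
    "RESULT: .WORD 0",
    "END: HLT",
]
symbols = {"START": 0, "VALUE": 3, "RESULT": 4, "END": 5}
words, errors = second_pass(source, symbols)
print([f"{w:04X}" for w in words])  # ['0103', '0204', '0305', '0005', '0000', '0000']
```

Because the symbol table is complete before translation starts, the lookup for VALUE on the first line succeeds even though VALUE is defined three lines later.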

Summary Table

Pass | Primary Activities | Outputs Produced
First Pass | Scan source line by line; build symbol table (label → address); handle directives affecting location counter; macro expansion (if applicable) | Symbol table, updated location counters, error list for duplicate labels
Second Pass | Translate mnemonics to op-codes; resolve operand addresses using symbol table; evaluate expressions and generate relocation information; produce final object code | Object code, listing file, error list for undefined symbols

Key Concepts

  • Location Counter (LC): Keeps track of the address that will be assigned to the next instruction or data item.
  • Symbol Table: A data structure (often a hash table) mapping each label to its address.
  • Forward Reference: Use of a label before its definition; resolved in the second pass.
  • Relocation: Adjusting addresses when the program is loaded at a different base address.
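
Relocation can be illustrated with a short sketch. It assumes a hypothetical 16-bit word whose low 8 bits hold an address, and a relocation list naming the words that contain addresses. The names `relocate` and `reloc_indices` are illustrative and not taken from any real loader.

```python
def relocate(words, reloc_indices, base):
    """Return a copy of words with base added to the address field of each listed word.

    Assumed format: high byte = opcode, low byte = address.
    """
    out = list(words)
    for i in reloc_indices:
        opcode = out[i] & 0xFF00
        address = (out[i] & 0x00FF) + base
        out[i] = opcode | (address & 0xFF)
    return out


# Object code assembled as if loaded at address 0; words 0 to 2 hold addresses.
object_code = [0x0103, 0x0204, 0x0305, 5, 0, 0]
loaded = relocate(object_code, [0, 1, 2], base=0x20)
print([f"{w:04X}" for w in loaded])  # ['0123', '0224', '0325', '0005', '0000', '0000']
```

Only the words named in the relocation list change; the data words 5 and 0 are constants, not addresses, so they are left alone.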

Suggested diagram: Flowchart showing the two passes – first pass builds the symbol table, second pass generates machine code using that table.

Example Workflow

Consider the following simple assembly fragment:

    START:  LDA VALUE
            STA RESULT
            JMP END
    VALUE:  .WORD 5
    RESULT: .WORD 0
    END:    HLT

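During the first pass, the assembler records the addresses of START, VALUE, RESULT, and END. In the second pass, it replaces the operands VALUE, RESULT, and END with their numeric addresses and produces the final machine code. All three operands are forward references, because each label is used before the line that defines it.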
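
To see which operands in a fragment like this are forward references, the following sketch may help. The function name `forward_references` and the simplified line format (an optional `LABEL:` followed by a mnemonic and operands) are assumptions made for illustration.

```python
def forward_references(lines):
    """Return (line_index, symbol) pairs where a symbol is used before its definition."""
    # First scan: note the line index at which each label is defined.
    defined_at = {}
    for n, text in enumerate(lines):
        if ":" in text:
            defined_at[text.split(":", 1)[0].strip()] = n
    # Second scan: flag operands whose label is defined on a later line.
    found = []
    for n, text in enumerate(lines):
        body = text.split(":", 1)[1] if ":" in text else text
        for operand in body.split()[1:]:
            if operand in defined_at and defined_at[operand] > n:
                found.append((n, operand))
    return found


fragment = [
    "START: LDA VALUE",
    "STA RESULT",
    "JMP END",
    "VALUE: .WORD 5",
    "RESULT: .WORD 0",
    "END: HLT",
]
print(forward_references(fragment))  # [(0, 'VALUE'), (1, 'RESULT'), (2, 'END')]
```

The function itself needs two scans of the source for the same reason the assembler does: a definition on a later line cannot be known while an earlier line is being read.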

Common Errors in Two‑Pass Assembly

  • Duplicate label definitions – caught in the first pass.
  • Undefined symbols – reported in the second pass, when an operand has no entry in the symbol table.
  • Incorrect directive usage (e.g., mismatched .ORG values).
  • Macro expansion errors if macros reference undefined symbols.