Cambridge A-Level Computer Science 9618 – Assembly Language
4.2 Assembly Language
Objective
Describe the different stages of the assembly process for a two‑pass assembler.
Why a Two‑Pass Assembler?
A two‑pass assembler is used because, during a single scan of the source program, the assembler cannot yet know the addresses of labels that are defined later in the code. By separating symbol collection (first pass) from code generation (second pass), it resolves forward references without the back‑patching that a one‑pass assembler needs.
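The forward-reference problem can be illustrated with a minimal Python sketch. The function name `unresolved_in_single_scan` and the simplified `LABEL: MNEMONIC OPERAND` line format are assumptions made for illustration only; they are not part of the syllabus or of any real assembler.

```python
def unresolved_in_single_scan(lines):
    """List operands that name labels not yet defined when they are read."""
    seen = set()        # labels defined so far in this single scan
    unresolved = []
    for line in lines:
        label, sep, rest = line.partition(":")
        if sep:
            seen.add(label.strip())
        else:
            rest = line  # no label on this line
        parts = rest.split()
        # Assumption: an operand is either a decimal number or a label name.
        if len(parts) > 1 and not parts[1].isdigit() and parts[1] not in seen:
            unresolved.append(parts[1])
    return unresolved


source = [
    "START:  LDA VALUE",
    "        STA RESULT",
    "        JMP END",
    "VALUE:  .WORD 5",
    "RESULT: .WORD 0",
    "END:    HLT",
]
print(unresolved_in_single_scan(source))
```

Every symbolic operand in this fragment is used before its label is defined, so a single scan cannot fill in any of the addresses at the moment it reads the instruction.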
Stages of the Assembly Process
First Pass – Symbol Table Construction
Read the source file line by line.
Identify each label (symbol) definition and note the value of the location counter at that point.
Handle assembler directives that affect the location counter (e.g., .ORG, .ALIGN).
Expand macros, if the assembler supports macro processing.
Store each label in a symbol table with its provisional address.
Detect duplicate label definitions and report errors.
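The first-pass steps above can be sketched in Python. This is a simplified illustration, not a real assembler: it assumes every instruction and every .WORD occupies exactly one memory location, that `;` starts a comment, that .ORG takes a decimal operand, and it omits .ALIGN and macro expansion. The function name `first_pass` is hypothetical.

```python
def first_pass(lines, origin=0):
    """Build a symbol table mapping each label to its address."""
    symbols = {}
    errors = []
    lc = origin  # location counter
    for number, line in enumerate(lines, start=1):
        text = line.split(";", 1)[0].strip()  # assumption: ';' starts a comment
        if not text:
            continue
        if ":" in text:
            label, text = text.split(":", 1)
            label = label.strip()
            text = text.strip()
            if label in symbols:
                errors.append(f"line {number}: duplicate label {label}")
            else:
                symbols[label] = lc
        if not text:
            continue  # a label on a line by itself takes no space
        parts = text.split()
        if parts[0] == ".ORG":
            lc = int(parts[1])  # directive that sets the location counter
        else:
            lc += 1  # assumption: one location per instruction or .WORD
    return symbols, errors


source = [
    "START:  LDA VALUE",
    "        STA RESULT",
    "        JMP END",
    "VALUE:  .WORD 5",
    "RESULT: .WORD 0",
    "END:    HLT",
]
symbols, errors = first_pass(source)
print(symbols)  # {'START': 0, 'VALUE': 3, 'RESULT': 4, 'END': 5}
print(errors)   # []
```

No machine code is produced here; the only outputs are the symbol table and the list of duplicate-label errors.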
Second Pass – Code Generation
Re‑read the source file using the completed symbol table.
Translate each instruction mnemonic into its binary opcode.
Replace operand symbols with the actual addresses or offsets obtained from the symbol table.
Evaluate expressions and constants, and record relocation information for any addresses that may need adjusting at load time.
Generate the final object code (machine code) and, optionally, a listing file.
Report any undefined symbols, that is, operands naming labels that the first pass never recorded.
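The second-pass steps above can be sketched in the same way. The opcode values and the 16-bit word layout (opcode in the high byte, operand address in the low byte) are invented for illustration and do not correspond to any real instruction set; `second_pass` is a hypothetical name, and the symbol table is assumed to come from a completed first pass.

```python
# Hypothetical opcodes; a real instruction set defines its own values.
OPCODES = {"LDA": 0x01, "STA": 0x02, "JMP": 0x03, "HLT": 0x0F}


def second_pass(lines, symbols):
    """Translate source lines to 16-bit words using a completed symbol table."""
    code = []
    errors = []
    for number, line in enumerate(lines, start=1):
        # Labels were recorded in the first pass, so skip past them here.
        text = line.split(":", 1)[-1].strip()
        if not text:
            continue
        parts = text.split()
        mnemonic = parts[0]
        operand = parts[1] if len(parts) > 1 else None
        if mnemonic == ".WORD":
            code.append(int(operand))  # data word: store the constant itself
            continue
        address = 0
        if operand is not None:
            if operand in symbols:
                address = symbols[operand]
            else:
                errors.append(f"line {number}: undefined symbol {operand}")
        # Assumption: every mnemonic is in OPCODES and addresses fit in 8 bits.
        code.append((OPCODES[mnemonic] << 8) | address)
    return code, errors


symbols = {"START": 0, "VALUE": 3, "RESULT": 4, "END": 5}
source = [
    "START:  LDA VALUE",
    "        STA RESULT",
    "        JMP END",
    "VALUE:  .WORD 5",
    "RESULT: .WORD 0",
    "END:    HLT",
]
code, errors = second_pass(source, symbols)
print([f"{word:04X}" for word in code])
# ['0103', '0204', '0305', '0005', '0000', '0F00']
```

Because the symbol table is already complete, the forward references to VALUE, RESULT and END are resolved with a simple lookup.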
Summary Table
| Pass | Primary Activities | Outputs Produced |
|---|---|---|
| First Pass | Scan source line by line; build symbol table (label → address); handle directives affecting location counter; macro expansion (if applicable) | Symbol table, updated location counters, error list for duplicate labels |
| Second Pass | Translate mnemonics to opcodes; resolve operand addresses using symbol table; evaluate expressions and generate relocation information; produce final object code | Object code, listing file, error list for undefined symbols |
Key Concepts
Location Counter (LC): Keeps track of the address that will be assigned to the next instruction or data item.
Symbol Table: A data structure (often a hash table) mapping each label to its address.
Forward Reference: Use of a label before its definition; resolved in the second pass.
Relocation: Adjusting addresses when the program is loaded at a different base address.
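As a rough illustration of relocation, the sketch below adds a load base to the address field of selected words. The word layout (address in the low byte), the list of relocation entries and the function name `relocate` are all assumptions for this example; real object-file formats record this information in their own ways.

```python
def relocate(code, relocation_entries, base):
    """Return a copy of code with base added to each flagged address field."""
    loaded = list(code)
    for index in relocation_entries:
        # Assumption: the address sits in the low byte and still fits after adding base.
        loaded[index] += base
    return loaded


# Object code assembled as if loaded at address 0 (hypothetical encoding).
code = [0x0103, 0x0204, 0x0305, 5, 0, 0x0F00]
relocation_entries = [0, 1, 2]  # words whose low byte holds an address
loaded = relocate(code, relocation_entries, base=0x40)
print([f"{word:04X}" for word in loaded])
# ['0143', '0244', '0345', '0005', '0000', '0F00']
```

Only the three instructions that refer to memory addresses change; the data words and HLT are copied unmodified.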
Suggested diagram: Flowchart showing the two passes – first pass builds the symbol table, second pass generates machine code using that table.
Example Workflow
Consider the following simple assembly fragment:
START:  LDA VALUE
        STA RESULT
        JMP END
VALUE:  .WORD 5
RESULT: .WORD 0
END:    HLT
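During the first pass, the assembler records the addresses of START, VALUE, RESULT, and END. For example, if the program starts at address 0 and every instruction and .WORD occupies one location, the symbol table holds START = 0, VALUE = 3, RESULT = 4 and END = 5. In the second pass, it replaces the operands VALUE, RESULT, and END with their numeric addresses and produces the final machine code.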
Common Errors in Two‑Pass Assembly
Duplicate label definitions – caught in the first pass.
Undefined symbols – reported during the second pass, when an operand cannot be found in the symbol table.