Show understanding of the various stages in the compilation of a program

Published by Patrick Mutisya · 14 days ago

Cambridge A-Level Computer Science 9618 – 16.2 Translation Software

16.2 Translation Software

Objective

Show understanding of the various stages in the compilation of a program.

What is Translation Software?

Translation software converts source code written in a high‑level programming language into a form that can be executed by a computer. The most common type is a compiler, which performs a series of systematic transformations.

Stages of Compilation

The compilation process can be divided into distinct stages, each with a specific purpose and output. The figure below shows the typical flow.

Suggested diagram: Flow of compilation stages from source code to executable.

  1. Lexical Analysis (Scanning)

    • Input: Raw source code characters.
    • Output: Stream of tokens (identifiers, keywords, literals, operators).
    • Key data structure: Symbol table (stores identifiers).

  2. Syntax Analysis (Parsing)

    • Input: Token stream.
    • Output: Parse tree or abstract syntax tree (AST) representing the grammatical structure.
    • Typical technique: Context‑free grammar; parsing methods include LL(k) and LR(k).

  3. Semantic Analysis

    • Input: AST.
    • Output: Annotated AST with type information and scope resolution.
    • Checks performed: Type compatibility, declaration before use, array bounds, etc.

  4. Intermediate Code Generation

    • Input: Annotated AST.
    • Output: Platform‑independent intermediate representation (IR) such as three‑address code.
    • Example of a three‑address instruction: \$t_1 = a + b\$

  5. Optimization

    • Input: IR.
    • Output: Optimised IR.
    • Goals: Reduce execution time, memory usage, or power consumption.
    • Common techniques: Constant folding, dead‑code elimination, loop invariant code motion.

  6. Code Generation

    • Input: Optimised IR.
    • Output: Target machine code or assembly language.
    • Tasks: Register allocation, instruction selection, instruction scheduling.

  7. Linking and Loading

    • Input: Object modules produced by the code generator.
    • Output: Executable program ready to run.
    • Linker resolves external references; loader places the program in memory.

Summary Table of Compilation Stages

StageInputOutputPrimary Purpose
Lexical AnalysisSource code charactersToken streamIdentify lexical units and build symbol table
Syntax AnalysisToken streamParse tree / ASTCheck grammatical structure
Semantic AnalysisASTAnnotated ASTEnforce language semantics (types, scopes)
Intermediate Code GenerationAnnotated ASTIntermediate Representation (IR)Provide a machine‑independent code form
OptimizationIROptimised IRImprove performance and resource usage
Code GenerationOptimised IRTarget machine code / assemblyTranslate to executable instructions
Linking & LoadingObject modulesExecutable programResolve external references and place in memory

Key Concepts to Remember

  • Each stage produces an intermediate form that serves as the input for the next stage.
  • Errors detected early (lexical, syntax) are easier to locate than those found later (semantic, runtime).
  • Optimization can be performed at several levels: source‑code, intermediate, and machine code.
  • The separation of compilation stages allows different components (e.g., front‑end vs. back‑end) to be reused for multiple target architectures.

Typical Exam Question

Explain the role of the symbol table during compilation and indicate at which stages it is accessed.

Answer outline:

  1. Created during lexical analysis when identifiers are first encountered.
  2. Used in syntax analysis for checking declarations.
  3. Accessed in semantic analysis for type checking and scope resolution.
  4. May be consulted during code generation for allocating storage locations.