16.2 Translation Software – Interpreters and Compilers
Learning objective
Explain how an interpreter can execute a program without producing a separate translated version, and compare this with the compilation process used by a compiler.
1. Interpreters – Executing Directly from Source
1.1 What an interpreter does
Reads the source program, translates each statement (or a small group of statements) on the fly, and executes it immediately.
The original source text is never turned into an object file or a standalone executable.
Typical environments: scripting languages, REPLs (Read‑Eval‑Print Loops), educational tools, and many web‑based languages.
1.2 Execution model (statement‑by‑statement)
Read next line (or token stream) from the source file.
Lexical analysis – characters → tokens (identifiers, literals, operators …).
Syntax analysis (parsing) – build a tiny abstract syntax tree (AST) for that statement.
Semantic checks – type rules, scope rules, constant folding, etc.
Immediate execution – evaluate the AST node and perform the required action (assignment, I/O, function call …).
Repeat until the end of the file or a run‑time termination condition is met.
1.3 Pseudocode illustration
while not EOF:
    line = read_line()       # raw source
    tokens = lexer(line)     # lexical analysis
    ast = parser(tokens)     # syntax analysis
    execute(ast)             # direct execution
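To make the loop above concrete, here is a minimal runnable sketch for a hypothetical mini-language of `name = expression` assignments and `print expression` statements. Python's built-in `eval` stands in for the lexer/parser/execute phases, so this illustrates only the statement-by-statement control flow, not a real interpreter's internals:

```python
# Statement-by-statement interpreter sketch for a hypothetical mini-language:
# "x = <expr>" assignments and "print <expr>" statements.
# eval() is a stand-in for the lexical/syntax/semantic/execute phases.

def interpret(source, output):
    env = {}                                  # run-time variable store
    for line in source.splitlines():          # "read next line"
        line = line.strip()
        if not line:
            continue
        if line.startswith("print "):         # statement form 1: print
            output.append(eval(line[6:], {}, env))
        else:                                 # statement form 2: assignment
            name, expr = line.split("=", 1)
            env[name.strip()] = eval(expr, {}, env)

results = []
interpret("x = 2 + 3\ny = x * 4\nprint y", results)
```

Note how each statement is analysed and executed before the next line is even read; no translated version of the program is ever stored.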
1.4 Key characteristics
Instant feedback – ideal for rapid prototyping and debugging.
Source line numbers are preserved, so run‑time error messages refer to the original code.
Supports dynamic features such as eval(), run‑time code generation, and interactive input.
Highly portable: the same source runs on any machine that provides the interpreter.
1.5 Typical examples (relevant to A‑Level)
Python, Ruby, PHP, JavaScript (in a browser)
Interactive consoles – e.g. python REPL, sqlite3 shell
Educational tools that allow a single statement to be tested without a compile‑link cycle.
1.6 Diagram – interpreter flow
Typical interpreter processing pipeline.
2. Compilers – Translating the Whole Program Once
2.1 Compilation stages (Cambridge syllabus)
Lexical analysis – source characters → tokens.
Syntactic analysis (parsing) – tokens → parse tree / abstract syntax tree.
Semantic analysis – type checking, scope resolution, constant folding.
Intermediate code generation – usually three‑address code or byte‑code.
Optimisation – dead‑code elimination, loop optimisation, register allocation, etc.
Target code generation – machine code or assembly for the chosen architecture.
Linking (optional) – combine object modules into a single executable.
2.2 One‑time work vs. repeated work
| Phase | Interpreter | Compiler |
|---|---|---|
| Lexical analysis | Repeated for every statement at run‑time | Performed once for the whole program |
| Parsing | Repeated for each statement | Done once for the complete source file |
| Code generation | Direct execution – no separate code file | Produces object code / executable file |
| Optimisation | Usually none (or very limited JIT) | Extensive optimisation possible |
2.3 Performance model
For an interpreter the total run‑time is the sum of the costs of analysing and executing each statement:
\(T_{\text{total}} = \displaystyle\sum_{i=1}^{n}\bigl(T_{\text{lex}}(i) + T_{\text{parse}}(i) + T_{\text{exec}}(i)\bigr)\)
For a compiled program the lexical, parsing and code‑generation costs are incurred once, after which the execution time \(T_{\text{exec}}\) is usually much smaller.
2.4 Diagram – compiler pipeline
Standard compilation flow.
3. Describing Language Syntax – BNF and Railroad Diagrams
3.1 Backus‑Naur Form (BNF)
Formal notation for a context‑free grammar.
Each rule: <non‑terminal> ::= <definition>.
Terminals are written in quotes, non‑terminals in angle brackets.
3.2 Simple BNF example – arithmetic expressions (the braces { } denote zero‑or‑more repetition, an EBNF extension)
<expr> ::= <term> { ("+" | "-") <term> }
<term> ::= <factor> { ("*" | "/") <factor> }
<factor> ::= <number> | "(" <expr> ")"
<number> ::= <digit> { <digit> }
<digit> ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
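One way to see this grammar "in action" is a recursive-descent evaluator with one function per non-terminal, which is how many hand-written parsers are structured. The sketch below (function names `tokenise`, `evaluate`, etc. are illustrative, not from the source) parses and evaluates an expression in a single pass:

```python
# Recursive-descent evaluator for the grammar above: one function per
# non-terminal. Illustrative sketch only, with no error handling.

def tokenise(text):
    tokens, i = [], 0
    while i < len(text):
        if text[i].isspace():
            i += 1
        elif text[i].isdigit():               # <number> ::= <digit> { <digit> }
            j = i
            while j < len(text) and text[j].isdigit():
                j += 1
            tokens.append(text[i:j])
            i = j
        else:                                 # operator or parenthesis
            tokens.append(text[i])
            i += 1
    return tokens

def evaluate(text):
    tokens = tokenise(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expr():                               # <expr> ::= <term> { ("+"|"-") <term> }
        nonlocal pos
        value = term()
        while peek() in ("+", "-"):
            op = tokens[pos]; pos += 1
            value = value + term() if op == "+" else value - term()
        return value

    def term():                               # <term> ::= <factor> { ("*"|"/") <factor> }
        nonlocal pos
        value = factor()
        while peek() in ("*", "/"):
            op = tokens[pos]; pos += 1
            value = value * factor() if op == "*" else value / factor()
        return value

    def factor():                             # <factor> ::= <number> | "(" <expr> ")"
        nonlocal pos
        if peek() == "(":
            pos += 1                          # consume "("
            value = expr()
            pos += 1                          # consume ")"
            return value
        value = int(tokens[pos]); pos += 1
        return value

    return expr()
```

Because `<term>` is defined in terms of `<factor>`, multiplication and division bind more tightly than addition and subtraction, so `evaluate("2+3*4")` respects the usual precedence without any explicit precedence table.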
3.3 Railroad (syntax‑diagram) representation
Railroad diagram for the rule <term>. Start → <factor> → loop ( “*” or “/” → <factor> ) → End.
3.4 Why the grammar matters
Both interpreters and compilers use the same grammar to recognise legal programs.
BNF is the input notation for parser generators such as yacc/bison and ANTLR (with lex/flex generating the matching lexers), and for modern IDE syntax checkers.
Understanding the grammar helps students predict where syntax errors will be reported during interpretation or compilation.
4. Evaluating Expressions – Reverse Polish Notation (RPN)
4.1 What is RPN?
Also called postfix notation; operators appear after their operands.
No parentheses are required – the order of evaluation is encoded by the position of the operators.
Many virtual machines (e.g., the Java Virtual Machine) use a stack‑based model that mirrors RPN.
4.2 Infix → RPN conversion (Shunting‑Yard algorithm – syllabus outline)
Read the infix token stream left‑to‑right.
Output operands (numbers, identifiers) immediately.
Push operators onto a stack, respecting precedence and associativity.
Before pushing an operator, pop to the output any stacked operators of higher precedence (or equal precedence, if the incoming operator is left‑associative).
When a right parenthesis is read, pop operators until the matching left parenthesis is removed.
After the input is exhausted, pop any remaining operators to the output.
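The outline above can be sketched as a short Python function. The precedence table and the choice of which operators are right-associative are assumptions for the five operators used in this section (`^` right-associative, the rest left-associative):

```python
# Shunting-yard sketch: converts a list of infix tokens to RPN.
# Assumes the five operators of this section; no error handling.

PREC = {"+": 1, "-": 1, "*": 2, "/": 2, "^": 3}
RIGHT_ASSOC = {"^"}

def to_rpn(tokens):
    output, stack = [], []
    for tok in tokens:
        if tok in PREC:                       # operator
            while (stack and stack[-1] in PREC and
                   (PREC[stack[-1]] > PREC[tok] or
                    (PREC[stack[-1]] == PREC[tok] and tok not in RIGHT_ASSOC))):
                output.append(stack.pop())    # pop higher-precedence operators
            stack.append(tok)
        elif tok == "(":
            stack.append(tok)
        elif tok == ")":
            while stack[-1] != "(":           # pop until the matching "("
                output.append(stack.pop())
            stack.pop()                       # discard the "("
        else:                                 # operand
            output.append(tok)
    while stack:                              # flush remaining operators
        output.append(stack.pop())
    return output
```

Running `to_rpn` on the tokens of the worked example in the next subsection reproduces the RPN string shown there.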
4.3 Worked example
Infix expression:
3 + 4 * 2 / ( 1 - 5 ) ^ 2 ^ 3
Step‑by‑step conversion yields the RPN string:
3 4 2 * 1 5 - 2 3 ^ ^ / +
4.4 RPN evaluation using a stack
stack = []
for token in RPN:
    if token is an operand:
        stack.push(token)
    else:                        # token is an operator
        b = stack.pop()          # second operand popped first –
        a = stack.pop()          # order matters for -, / and ^
        result = apply(token, a, b)
        stack.push(result)
final_result = stack.pop()
Applying the algorithm to the example above produces 3 + 8 ÷ (−4)⁸ = 3 + 8/65536 = 3.0001220703125.
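A runnable version of the pseudocode above, applied to the worked example. The operator table is an assumption for the five operators of this section (with `^` meaning exponentiation):

```python
import operator

# Stack evaluation of the worked RPN example. Each symbol maps to a
# Python function; b is popped before a, so "a op b" is computed.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul,
       "/": operator.truediv, "^": operator.pow}

def eval_rpn(tokens):
    stack = []
    for token in tokens:
        if token in OPS:                      # operator
            b = stack.pop()
            a = stack.pop()
            stack.append(OPS[token](a, b))
        else:                                 # operand
            stack.append(float(token))
    return stack.pop()

result = eval_rpn("3 4 2 * 1 5 - 2 3 ^ ^ / +".split())
```

Note that a single pass over the tokens, with no tree-building at all, is enough to evaluate the expression; this is exactly the appeal of RPN for stack-based virtual machines.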
4.5 Diagram – stack evaluation of RPN
Stack‑based evaluation of the RPN expression.
4.6 Syllabus relevance
Shows how an interpreter can evaluate expressions without building a full program‑wide AST.
Provides the conceptual link to stack‑based byte‑code execution in virtual machines.
5. Comparative Summary – Interpreter vs. Compiler
| Aspect | Interpreter | Compiler |
|---|---|---|
| Translation strategy | Analyse & execute each statement at run‑time | Analyse the whole program once, then generate executable code |
| Output produced | No separate executable; runs inside the interpreter | Object code / executable (may need linking) |
| Execution speed | Generally slower – lexical and parsing work repeated for each statement | Usually faster – code already in machine language, heavy optimisation possible |
| Development cycle | Immediate feedback; ideal for debugging and exploratory coding | Longer compile‑link cycle; allows optimisation and static error checking |
| Portability | Source runs wherever the interpreter is available | Executable must be rebuilt for each target CPU/OS |
| Typical languages | Python, Ruby, PHP, JavaScript (browser), Bash | C, C++, Pascal, Java (compiled to byte‑code), Fortran |
| Memory usage | Interpreter itself occupies memory; source remains in memory | Generated code occupies memory; source can be discarded after compilation |
6. Suggested Classroom Diagrams (for teacher use)
Interpreter flowchart – source line → lexical analysis → parsing → semantic checks → immediate execution → next line.
Compiler pipeline – whole source → lexical → parsing → semantic → optimisation → code generation → object file → linking → executable.
BNF / Railroad diagram for a simple arithmetic grammar (as shown above).
RPN evaluation stack diagram – illustrate tokens being pushed and popped during evaluation.
7. Key take‑aways
An interpreter translates and executes each statement on the fly; no separate translated version is ever produced.
A compiler performs the same translation steps once for the whole program, producing object code that can be run repeatedly without re‑analysis.
Both rely on the same underlying grammar (BNF or railroad diagrams) to recognise valid statements.
Reverse Polish Notation demonstrates a simple, stack‑based way of evaluating expressions – a technique used by many interpreters and virtual machines.
Trade‑offs:
Interpreters – rapid feedback and high portability, but slower execution.
Compilers – faster run‑time performance and optimisation, but require a separate compile step and are less portable without recompilation.