Show understanding of how an interpreter can execute programs without producing a translated version

16.2 Translation Software – Interpreters and Compilers

Learning objective

Explain how an interpreter can execute a program without producing a separate translated version, and compare this with the compilation process used by a compiler.


1. Interpreters – Executing Directly from Source

1.1 What an interpreter does

  • Reads the source program, translates each statement (or a small group of statements) on the fly, and executes it immediately.
  • The original source text is never turned into an object file or a standalone executable.
  • Typical environments: scripting languages, REPLs (Read‑Eval‑Print Loops), educational tools, and many web‑based languages.

1.2 Execution model (statement‑by‑statement)

  1. Read next line (or token stream) from the source file.
  2. Lexical analysis – characters → tokens (identifiers, literals, operators …).
  3. Syntax analysis (parsing) – build a tiny abstract syntax tree (AST) for that statement.
  4. Semantic checks – type rules, scope rules, etc.
  5. Immediate execution – evaluate the AST node and perform the required action (assignment, I/O, function call …).
  6. Repeat until the end of the file or a run‑time termination condition is met.

1.3 Pseudocode illustration

while not EOF:
    line = read_line()       # raw source
    tokens = lexer(line)     # lexical analysis
    ast = parser(tokens)     # syntax analysis
    execute(ast)             # direct execution
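
The same loop can be sketched as a small runnable Python program. This is only an illustration of the execute-as-you-go model: Python's built-in compile() and exec() stand in for the lexer, parser and execution stages, the file name program.src is hypothetical, and it assumes each statement fits on one line.

env = {}                                       # run-time environment shared across statements
with open("program.src") as source:            # hypothetical source file
    for line_number, line in enumerate(source, start=1):
        if not line.strip():
            continue                           # skip blank lines
        try:
            code = compile(line, "program.src", "single")   # lexical + syntax analysis
            exec(code, env)                                  # immediate execution
        except SyntaxError as error:
            print(f"Syntax error on line {line_number}: {error.msg}")
            break

Note how the error message can quote the original line number – the interpreter still has the source text in front of it (see 1.4).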

1.4 Key characteristics

  • Instant feedback – ideal for rapid prototyping and debugging.
  • Source line numbers are preserved, so run‑time error messages refer to the original code.
  • Supports dynamic features such as eval(), run‑time code generation, and interactive input.
  • Highly portable: the same source runs on any machine that provides the interpreter.

1.5 Typical examples (relevant to A‑Level)

  • Python, Ruby, PHP, JavaScript (in a browser)
  • Interactive consoles – e.g. python REPL, sqlite3 shell
  • Educational tools that allow a single statement to be tested without a compile‑link cycle.

1.6 Diagram – interpreter flow

Flowchart: source line → lexical analysis → parsing → semantic checks → immediate execution → next line

Typical interpreter processing pipeline.


2. Compilers – Translating the Whole Program Once

2.1 Compilation stages (Cambridge syllabus)

  1. Lexical analysis – source characters → tokens.
  2. Syntactic analysis (parsing) – tokens → parse tree / abstract syntax tree.
  3. Semantic analysis – type checking and scope resolution.
  4. Intermediate code generation – usually three‑address code or byte‑code.
  5. Optimisation – constant folding, dead‑code elimination, loop optimisation, register allocation, etc.
  6. Target code generation – machine code or assembly for the chosen architecture.
  7. Linking (optional) – combine object modules into a single executable.
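
As a contrast to the interpreter loop in 1.3, the sketch below analyses the whole source once and then re-runs the resulting code object without any further translation. It is only an analogy: Python's compile() produces byte-code rather than native machine code, there is no linking step, and the file name program.src is again hypothetical.

with open("program.src") as source:                          # hypothetical source file
    code = compile(source.read(), "program.src", "exec")     # whole program analysed once

for run in range(3):                                         # executed repeatedly,
    exec(code, {})                                           # with no re-analysis of the source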

2.2 One‑time work vs. repeated work

Phase             | Interpreter                                  | Compiler
Lexical analysis  | Repeated for every statement at run‑time     | Performed once for the whole program
Parsing           | Repeated for each statement                  | Done once for the complete source file
Code generation   | Direct execution – no separate code file     | Produces object code / executable file
Optimisation      | Usually none (or very limited JIT)           | Extensive optimisation possible

2.3 Performance model

For an interpreter the total run‑time is the sum of the costs of analysing and executing each statement:

\(T_{\text{total}} = \displaystyle\sum_{i=1}^{n}\bigl(T_{\text{lex}}(i) + T_{\text{parse}}(i) + T_{\text{exec}}(i)\bigr)\)

For a compiled program the lexical, parsing and code‑generation costs are incurred once, after which the execution time \(T_{\text{exec}}\) is usually much smaller.
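
Using the same notation, and writing \(T_{\text{compile}}\) for the one‑off cost of all the compilation stages, the compiled program's total cost can be expressed with the translation work taken outside the sum:

\(T_{\text{total}} = T_{\text{compile}} + \displaystyle\sum_{i=1}^{n} T_{\text{exec}}(i)\)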

2.4 Diagram – compiler pipeline

Pipeline: whole source → lexical → parsing → semantic → optimisation → code generation → object file → linking → executable

Standard compilation flow.


3. Describing Language Syntax – BNF and Railroad Diagrams

3.1 Backus‑Naur Form (BNF)

  • Formal notation for a context‑free grammar.
  • Each rule: <non‑terminal> ::= <definition>.
  • Terminals are written in quotes, non‑terminals in angle brackets.

3.2 Simple BNF example – arithmetic expressions

<expr> ::= <term> { ("+" | "-") <term> }

<term> ::= <factor> { ("*" | "/") <factor> }

<factor> ::= <number> | "(" <expr> ")"

<number> ::= <digit> { <digit> }

<digit> ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
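
In this grammar the curly braces { } mean "repeat zero or more times" (an EBNF‑style extension of plain BNF). One way to see how a translator uses such a grammar is a recursive‑descent evaluator with one function per rule. The sketch below is illustrative only – the class and function names are not part of the syllabus, and error handling (e.g. a missing ")") is omitted.

import re

def tokenize(text):                                   # lexical analysis
    return re.findall(r"\d+|[()+\-*/]", text)         # <number> or a terminal symbol

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):                                   # <expr> ::= <term> { ("+" | "-") <term> }
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.take()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):                                   # <term> ::= <factor> { ("*" | "/") <factor> }
        value = self.factor()
        while self.peek() in ("*", "/"):
            op = self.take()
            rhs = self.factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    def factor(self):                                 # <factor> ::= <number> | "(" <expr> ")"
        if self.peek() == "(":
            self.take()                               # consume "("
            value = self.expr()
            self.take()                               # consume ")"
            return value
        return int(self.take())                       # <number> ::= <digit> { <digit> }

print(Parser(tokenize("3 + 4 * (2 - 1)")).expr())     # prints 7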

3.3 Railroad (syntax‑diagram) representation

Railroad diagram for <term>

Railroad diagram for the rule <term>. Start → <factor> → loop ( “*” or “/” → <factor> ) → End.

3.4 Why the grammar matters

  • Both interpreters and compilers use the same grammar to recognise legal programs.
  • BNF is the basis for parser generators such as lex/yacc, ANTLR, or modern IDE syntax checkers.
  • Understanding the grammar helps students predict where syntax errors will be reported during interpretation or compilation.


4. Evaluating Expressions – Reverse Polish Notation (RPN)

4.1 What is RPN?

  • Also called post‑fix notation; operators appear after their operands.
  • No parentheses are required – the order of evaluation is encoded by the position of the operators.
  • Many virtual machines (e.g., the Java Virtual Machine) use a stack‑based model that mirrors RPN.

4.2 Infix → RPN conversion (Shunting‑Yard algorithm – syllabus outline)

  1. Read the infix token stream left‑to‑right.
  2. Output operands (numbers, identifiers) immediately.
  3. Push operators onto a stack, respecting precedence and associativity.
  4. Before pushing an operator, pop to the output any stacked operators of higher precedence (or of equal precedence when the incoming operator is left‑associative).
  5. When a right parenthesis is read, pop operators until the matching left parenthesis is removed.
  6. After the input is exhausted, pop any remaining operators to the output.
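
These steps can be written as a short Python sketch. It is illustrative only – it assumes space‑separated tokens, only binary operators, and the precedence and associativity rules used in the worked example that follows.

PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2, "^": 3}
RIGHT_ASSOCIATIVE = {"^"}                              # ^ groups right-to-left

def infix_to_rpn(tokens):
    output, stack = [], []
    for token in tokens:
        if token in PRECEDENCE:                        # steps 3-4: operator
            while (stack and stack[-1] != "(" and
                   (PRECEDENCE[stack[-1]] > PRECEDENCE[token]
                    or (PRECEDENCE[stack[-1]] == PRECEDENCE[token]
                        and token not in RIGHT_ASSOCIATIVE))):
                output.append(stack.pop())
            stack.append(token)
        elif token == "(":
            stack.append(token)
        elif token == ")":                             # step 5
            while stack[-1] != "(":
                output.append(stack.pop())
            stack.pop()                                # discard the "("
        else:                                          # step 2: operand
            output.append(token)
    while stack:                                       # step 6
        output.append(stack.pop())
    return " ".join(output)

print(infix_to_rpn("3 + 4 * 2 / ( 1 - 5 ) ^ 2 ^ 3".split()))
# prints: 3 4 2 * 1 5 - 2 3 ^ ^ / +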

4.3 Worked example

Infix expression:

3 + 4 * 2 / ( 1 - 5 ) ^ 2 ^ 3

Step‑by‑step conversion yields the RPN string:

3 4 2 * 1 5 - 2 3 ^ ^ / +

4.4 RPN evaluation using a stack

import operator

OPERATORS = {"+": operator.add, "-": operator.sub,
             "*": operator.mul, "/": operator.truediv, "^": operator.pow}

def evaluate_rpn(tokens):
    stack = []
    for token in tokens:
        if token not in OPERATORS:              # token is an operand
            stack.append(float(token))
        else:                                   # token is an operator
            b = stack.pop()                     # right operand is popped first
            a = stack.pop()
            result = OPERATORS[token](a, b)
            stack.append(result)
    return stack.pop()                          # final result

Applying the algorithm to the example above – evaluate_rpn("3 4 2 * 1 5 - 2 3 ^ ^ / +".split()) – produces 3.0001220703125, since 4 × 2 = 8, (1 − 5) ^ (2 ^ 3) = (−4)^8 = 65536, and 3 + 8 / 65536 = 3.0001220703125.

4.5 Diagram – stack evaluation of RPN

Stack diagram showing push/pop operations for the RPN expression

Stack‑based evaluation of the RPN expression.

4.6 Syllabus relevance

  • Shows how an interpreter can evaluate expressions without building a full program‑wide AST.
  • Provides the conceptual link to stack‑based byte‑code execution in virtual machines.


5. Comparative Summary – Interpreter vs. Compiler

Aspect               | Interpreter                                                              | Compiler
Translation strategy | Analyse & execute each statement at run‑time                             | Analyse the whole program once, then generate executable code
Output produced      | No separate executable; runs inside the interpreter                      | Object code / executable (may need linking)
Execution speed      | Generally slower – lexical and parsing work repeated for each statement  | Usually faster – code already in machine language, heavy optimisation possible
Development cycle    | Immediate feedback; ideal for debugging and exploratory coding           | Longer compile‑link cycle; allows optimisation and static error checking
Portability          | Source runs wherever the interpreter is available                        | Executable must be rebuilt for each target CPU/OS
Typical languages    | Python, Ruby, PHP, JavaScript (browser), Bash                            | C, C++, Pascal, Java (compiled to byte‑code), Fortran
Memory usage         | Interpreter itself occupies memory; source remains in memory             | Generated code occupies memory; source can be discarded after compilation


6. Suggested Classroom Diagrams (for teacher use)

  • Interpreter flowchart – source line → lexical analysis → parsing → semantic checks → immediate execution → next line.
  • Compiler pipeline – whole source → lexical → parsing → semantic → optimisation → code generation → object file → linking → executable.
  • BNF / Railroad diagram for a simple arithmetic grammar (as shown above).
  • RPN evaluation stack diagram – illustrate tokens being pushed and popped during evaluation.


7. Key take‑aways

  • An interpreter translates and executes each statement on the fly; no separate translated version is ever produced.
  • A compiler performs the same translation steps once for the whole program, producing object code that can be run repeatedly without re‑analysis.
  • Both rely on the same underlying grammar (BNF or railroad diagrams) to recognise valid statements.
  • Reverse Polish Notation demonstrates a simple, stack‑based way of evaluating expressions – a technique used by many interpreters and virtual machines.
  • Trade‑offs:
    • Interpreters – rapid feedback and high portability, but slower execution.
    • Compilers – faster run‑time performance and optimisation, but require a separate compile step and are less portable without recompilation.