Describe compilers and interpreters and how they operate

Compilers and Interpreters

1. Why Translation Is Needed

  • Computers can only execute machine language – binary instructions specific to a CPU.
  • High‑level languages (e.g. Python, Java, C++) use a syntax that is easy for people to read and write.
  • A translation process converts the high‑level source code into a form the CPU can execute.

2. High‑Level, Low‑Level and Assembly Language

In the Cambridge IGCSE syllabus a distinction is made between:

  • High‑level languages – portable, abstracted from hardware.
  • Low‑level languages – closer to the hardware; the most common example is assembly language.

An assembler is a translator (not, strictly, a compiler) that converts assembly language into machine code. The translation is largely one‑to‑one: each assembly instruction becomes a single machine‑code instruction. Assembly language must be assembled before it can run on the processor, just as high‑level code must be compiled or interpreted.
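
The instruction-by-instruction mapping an assembler performs can be sketched in Python. The four-instruction set, its opcode values and the function name assemble are invented for illustration and do not correspond to any real CPU.

```python
# Toy assembler: the mnemonics and opcode numbers are hypothetical.
OPCODES = {"LDA": 0x01, "ADD": 0x02, "STA": 0x03, "HLT": 0xFF}

def assemble(source):
    """Translate assembly text into a list of byte values."""
    machine_code = []
    for line in source.splitlines():
        line = line.split(";")[0].strip()   # drop comments and surrounding spaces
        if not line:
            continue
        parts = line.split()
        mnemonic = parts[0].upper()
        if mnemonic not in OPCODES:
            raise ValueError(f"unknown instruction: {mnemonic}")
        machine_code.append(OPCODES[mnemonic])
        # Each operand (a memory address here) becomes one extra byte.
        for operand in parts[1:]:
            machine_code.append(int(operand))
    return machine_code

program = """
LDA 10   ; load the value at address 10
ADD 11   ; add the value at address 11
STA 12   ; store the result at address 12
HLT
"""
print([f"{byte:02X}" for byte in assemble(program)])
```

Each mnemonic is looked up in a table and replaced by its number, which is why assembling is far simpler than compiling a high-level language.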

3. Where Translation Fits in the Development Life‑Cycle

During programme development the typical stages are:

  1. Analysis & design (flowcharts, pseudocode)
  2. Writing source code (high‑level language)
  3. Translation – either compilation or interpretation
  4. Testing & debugging
  5. Deployment (distribution of an executable or source + interpreter)

The translation stage (stage 3) is the point where the high‑level source code is turned into a form the computer can execute.

4. Compilation – Translating Before Execution

A compiler reads the whole source programme, translates it, and stores the result (object code, machine code or byte‑code) for later execution.

4.1 Main Compiler Phases (Cambridge‑level)

  1. Lexical analysis (scanning) – breaks the source text into tokens (identifiers, keywords, literals, operators).
    Example: int total = a + b; → tokens: int, total, =, a, +, b, ;.
  2. Syntactic analysis (parsing) – checks tokens against the language grammar and builds a parse tree / abstract syntax tree (AST).
  3. Semantic analysis – ensures meaning is correct (type checking, scope rules, declaration before use). Errors such as assigning a string to an integer are caught here.
  4. Optimisation (optional) – may improve speed or reduce memory use, but the IGCSE does not test specific optimisation techniques.
  5. Code generation – produces the final output:
    • Machine code – binary instructions for a specific CPU (e.g. x86).
    • Object code – relocatable machine code that is later linked into an executable.
    • Byte‑code – an intermediate, platform‑independent form (e.g. Java .class files).
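
The later phases can be sketched for a tiny language of arithmetic assignments. As a shortcut, this sketch borrows Python's own ast module for scanning and parsing, so only a semantic check and code generation are written by hand. The stack-machine instruction names (PUSH, LOAD, STORE, ADD and so on) and the function names are invented for illustration.

```python
import ast

# Hypothetical stack-machine instruction for each arithmetic operator.
BINARY_OPS = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL", ast.Div: "DIV"}

def compile_expr(node, declared, code):
    """Code generation for one expression (post-order walk of the AST)."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        code.append(("PUSH", node.value))
    elif isinstance(node, ast.Name):
        # Semantic analysis: a variable must be assigned before it is used.
        if node.id not in declared:
            raise NameError(f"'{node.id}' used before assignment")
        code.append(("LOAD", node.id))
    elif isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
        compile_expr(node.left, declared, code)
        compile_expr(node.right, declared, code)
        code.append((BINARY_OPS[type(node.op)], None))
    else:
        raise SyntaxError("construct not supported by this sketch")

def compile_program(source):
    """Translate the whole source before anything is executed."""
    tree = ast.parse(source)      # lexical and syntactic analysis
    declared, code = set(), []
    for stmt in tree.body:
        if not (isinstance(stmt, ast.Assign) and len(stmt.targets) == 1
                and isinstance(stmt.targets[0], ast.Name)):
            raise SyntaxError("only simple assignments are supported")
        compile_expr(stmt.value, declared, code)
        name = stmt.targets[0].id
        declared.add(name)
        code.append(("STORE", name))
    return code

for instruction in compile_program("a = 2\nb = 3\ntotal = a + b * 4"):
    print(instruction)
```

Nothing is executed here: the output is a list of instructions for a later run, which is the defining feature of compilation. The optimisation phase is omitted.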

4.2 What Happens After Compilation?

  • The executable can be run many times without recompiling (unless the source changes).
  • Execution is fast because the CPU runs native machine instructions directly.
  • Compiled programmes are usually platform‑specific; a Windows executable will not run on a Mac unless the source code is re‑compiled for that platform.

5. Interpretation – Translating While Running

An interpreter reads the source programme and executes it directly, translating each statement (or small block) on the fly.

5.1 Typical Interpreter Steps

  1. Read next statement – fetches a line or complete statement from the source file.
  2. Parse and evaluate – tokenises and parses the statement, much as a compiler’s early phases do, then performs the required actions immediately (e.g., arithmetic, I/O).
  3. Repeat – continues until the programme ends or an error stops execution.
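
The read, evaluate, repeat cycle can be sketched as a small line-by-line interpreter. The two-statement language (LET name = expression and PRINT expression) and the function name interpret are invented for this example, and the arithmetic is delegated to Python's eval purely to keep the sketch short.

```python
def interpret(source):
    """Execute a toy language one statement at a time; return the printed output."""
    variables = {}
    output = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        statement = line.strip()            # 1. read the next statement
        if not statement:
            continue
        keyword, _, rest = statement.partition(" ")
        try:                                # 2. parse and evaluate immediately
            if keyword == "LET":
                name, _, expression = rest.partition("=")
                value = eval(expression.strip(), {"__builtins__": {}}, variables)
                variables[name.strip()] = value
            elif keyword == "PRINT":
                output.append(str(eval(rest.strip(), {"__builtins__": {}}, variables)))
            else:
                raise SyntaxError(f"unknown statement '{keyword}'")
        except Exception as error:
            # Execution stops at the first faulty line; earlier lines already ran.
            output.append(f"Error on line {line_number}: {error}")
            break
    return output                           # 3. the loop repeats until the end or an error

program = "\n".join([
    "LET x = 4",
    "PRINT x * 2",
    "PRINT x / 0",   # run-time error, found only when this line is reached
    "PRINT x + 1",   # never executed
])
print(interpret(program))
```

The first PRINT produces its result before the faulty third line is discovered, which is the behaviour that distinguishes an interpreter from a compiler.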

5.2 Characteristics of Interpreted Languages

  • Source code must be present each time the programme runs (no separate executable).
  • Execution is slower because translation occurs at run‑time.
  • Highly portable – the same source runs on any machine that has a compatible interpreter.
  • Excellent for rapid prototyping, scripting, educational environments, and interactive use.

6. Error Detection – Compile‑time vs Run‑time

  • Compile‑time errors (found by the compiler):
    • Syntax errors – e.g., missing semicolon, mismatched parentheses.
    • Semantic errors – e.g., type mismatches, use of undeclared variables.
    The compiler lists all such errors before any part of the programme is executed.
  • Run‑time errors (found by an interpreter or during execution of compiled code):
    • Division by zero, array‑index out of bounds, file‑not‑found, etc.
    These are reported only when the offending statement is reached.
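
The difference in timing can be observed in Python, whose built-in compile() function checks the syntax of a whole piece of source before any of it runs. The two sample programmes and the log list are made up for this demonstration. Python does not check types at this stage, so only a syntax error is shown.

```python
log = []

# A syntax error is found when the source is translated, before anything runs.
bad_syntax = "log.append('first line ran')\nlog.append('missing bracket'"
try:
    compile(bad_syntax, "<example>", "exec")
except SyntaxError as error:
    print("Translation-time error:", error.msg)

print(log)   # still empty: the valid first line was never executed

# A run-time error appears only when the faulty statement is reached.
bad_runtime = "log.append('first line ran')\nresult = 1 / 0"
code = compile(bad_runtime, "<example>", "exec")   # translates without complaint
try:
    exec(code, {"log": log})
except ZeroDivisionError as error:
    print("Run-time error:", error)

print(log)   # the first line ran before the failure
```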

7. Impact on Portability

Translation method | Portability of the resulting programme
------------------ | --------------------------------------
Compiled to native machine code | Platform‑specific – must be re‑compiled for each target CPU/OS.
Compiled to byte‑code (e.g., Java) | Portable across any system with the appropriate virtual machine.
Interpreted | Portable as long as an interpreter for the language exists on the target machine.

8. Hybrid Approaches (Optional Enrichment)

Beyond the core syllabus
  • Byte‑code interpreters – Java source is compiled to byte‑code, which the JVM either interprets or JIT‑compiles.
  • Just‑In‑Time (JIT) compilation – some language implementations (e.g., PyPy for Python) compile frequently executed code paths to machine code at run‑time for speed.
These concepts are useful for extending knowledge but are not required for the IGCSE exam.
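
Python's reference implementation, CPython, also follows the byte-code model: source is first compiled to byte-code, which a virtual machine then interprets. A minimal sketch using the standard dis module makes that byte-code visible. The function add is just a sample, and the exact instructions listed vary between Python versions.

```python
import dis

def add(a, b):
    return a + b

# Each function carries its compiled byte-code in a code object.
print(type(add.__code__.co_code))   # the raw byte-code is a bytes object

# dis lists the byte-code instructions in readable form.
for instruction in dis.get_instructions(add):
    print(instruction.opname, instruction.argrepr)
```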

9. Comparison of Compiler and Interpreter

Aspect | Compiler | Interpreter
------ | -------- | -----------
When translation occurs | Before execution – the whole programme is translated in one step. | During execution – each statement is translated just before it runs.
Output produced | Executable file (machine code, object code, or byte‑code). | No separate file; the interpreter executes the source directly.
Execution speed | Fast after compilation (native instructions). | Slower; translation overhead on each execution.
Portability | Platform‑specific unless compiled to an intermediate form (e.g., Java byte‑code). | Highly portable – same source runs on any system with the interpreter.
Typical use‑cases | System software, performance‑critical applications, large commercial programmes. | Scripting, education, rapid prototyping, small utilities.
Error detection | All syntax and many semantic errors are reported before the programme runs. | Errors are reported only when the offending line is reached at run‑time.

10. Examples Aligned to the Cambridge Syllabus

  • C (compiled) – gcc example.c -o example produces an executable file named example (example.exe on Windows). The executable can be run repeatedly without the source code.
  • Java (hybrid) – javac Example.java creates Example.class (byte‑code). The JVM interprets or JIT‑compiles this on any platform that has a JVM.
  • Python (interpreted) – python script.py runs script.py directly from its source code; no separate executable file is produced.

11. Suggested Classroom Diagram

Compilation flow: Source code → Compiler → Object/Executable code → Run (CPU executes).

Interpretation flow: Source code → Interpreter → Immediate execution (no separate file).

12. Pseudocode Example – Tokenising a Simple Statement


// Pseudocode for the lexical‑analysis step of a compiler
procedure Tokenise(line)
    tokens ← empty list
    i ← 0
    while i < length(line) do
        if line[i] is a letter then
            // an identifier: a letter followed by any letters or digits
            start ← i
            while i < length(line) and (line[i] is a letter or a digit) do
                i ← i + 1
            end while
            tokens.add(IDENTIFIER, line[start..i-1])
        else if line[i] is a digit then
            // a number: a run of consecutive digits
            start ← i
            while i < length(line) and (line[i] is a digit) do
                i ← i + 1
            end while
            tokens.add(NUMBER, line[start..i-1])
        else if line[i] is one of + - * / = ; then
            tokens.add(OPERATOR, line[i])
            i ← i + 1
        else
            i ← i + 1          // ignore whitespace or unknown characters
        end if
    end while
    return tokens
end procedure

This simple routine shows how a compiler breaks a line of source code into the tokens that the later phases will use.
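
A runnable Python version of the same routine is sketched here, assuming tokens are stored as (kind, text) pairs. Like the pseudocode, it labels keywords such as int as identifiers; a fuller lexer would check each word against a keyword list.

```python
def tokenise(line):
    """Split a line of source text into (kind, text) tokens."""
    tokens = []
    i = 0
    while i < len(line):
        ch = line[i]
        if ch.isalpha():
            # Identifier: a letter followed by any letters or digits.
            start = i
            while i < len(line) and line[i].isalnum():
                i += 1
            tokens.append(("IDENTIFIER", line[start:i]))
        elif ch.isdigit():
            # Number: a run of consecutive digits.
            start = i
            while i < len(line) and line[i].isdigit():
                i += 1
            tokens.append(("NUMBER", line[start:i]))
        elif ch in "+-*/=;":
            tokens.append(("OPERATOR", ch))
            i += 1
        else:
            i += 1   # skip whitespace and unknown characters
    return tokens

print(tokenise("total = a + 42;"))
```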

13. Quick Revision Checklist

  • Explain the difference between high‑level, low‑level and assembly languages.
  • List and describe the five main compiler phases.
  • State why compiled programmes generally run faster than interpreted ones.
  • Identify the two kinds of errors a compiler can detect (syntax and semantic) and when they are reported.
  • Compare portability of compiled native code, compiled byte‑code and interpreted programmes.
  • Give an example of a compiled language (C), an interpreted language (Python) and a hybrid language (Java).
  • Use the comparison table to answer typical exam questions on advantages and disadvantages.
