Show awareness of what a compiler has to do to translate recursive programming code

19.2 Recursion – How a Compiler Translates Recursive Code

This section explains, in a way that meets the Cambridge AS/A‑Level assessment objective 2 (show awareness of what a compiler has to do), the steps a compiler follows when it encounters a recursive function. The material is deliberately scoped to the depth required for the syllabus, with an optional “Enrichment” box for students who wish to explore the topic further.

1 Compilation Phases that Affect Recursion

  1. Lexical analysis & parsing
    • Tokenises the source program.
    • Builds an abstract syntax tree (AST) that records the structure of the function, including any self‑references.
  2. Semantic analysis
    • Ensures the function is declared (forward references are allowed where the language permits).
    • Checks type consistency for parameters and return values.
    • Detects illegal constructs such as calling a function before it is in scope (in languages that require prototypes).
    • May perform a simple static check for obvious infinite recursion (rare in exam‑level compilers).
  3. Activation‑record (stack‑frame) design
    • Decides what information each call must store: return address, saved frame pointer, parameters, local variables, temporaries, and any saved registers.
    • Determines which items can be kept in registers and which must be placed on the run‑time stack.
  4. Code generation
    • Emits instructions to evaluate arguments, pass them (via stack or registers), and invoke the function.
    • Generates prologue code that creates a new activation record and saves the caller’s context.
    • Generates epilogue code that restores the caller’s context and returns to the saved address.
  5. Optimisation (optional but common)
    • Tail‑call optimisation (TCO) – replaces a final recursive call with a jump, re‑using the current frame.
    • Inlining of small recursive calls when it does not change semantics.
    • Transformation of simple linear recursion into an equivalent loop.
  6. Target‑code emission
    • Produces machine code, assembly, or an intermediate representation (IR) that the linker/loader can turn into an executable.

2 Activation‑Record (Stack‑Frame) Layout

Each invocation of a recursive function receives its own activation record. The typical layout on a word‑addressable CPU is shown below.

Field Purpose Typical size (bytes)
Return address Instruction to continue after the call returns Word size (4 or 8)
Saved frame pointer (FP) Link to the caller’s activation record Word size
Parameters Values passed to the recursive call (or space for them) Variable
Local variables Variables declared inside the function Variable
Temporaries / saved registers Intermediate results and caller‑saved registers Variable

3 Managing the Call Stack

  • Every recursive call pushes a new activation record onto the stack.
  • Most CPUs grow the stack downwards (high‑address → low‑address), but the compiler must obey whatever convention the target architecture defines.
  • When a call returns, the epilogue restores the saved FP, removes the arguments (if required), and jumps to the saved return address.
  • Additional responsibilities:
    • Preserve caller‑saved registers according to the calling convention.
    • Insert run‑time checks (or rely on the OS) for stack‑overflow conditions.

4 Tail‑Call Optimisation (TCO)

If the recursive call is the last operation performed by a function, the compiler can replace the normal call/return sequence with a simple jump. This re‑uses the current activation record, turning an otherwise unbounded recursion into an iterative loop at the machine level.

A function f is tail‑recursive when it can be written as:

$$ f(x)= \begin{cases} g(x) & \text{if base case}\\[4pt] f\bigl(h(x)\bigr) & \text{otherwise (tail call)} \end{cases} $$

After TCO the stack depth is constant: O(1) instead of O(n) for n recursive calls.

5 Worked Example – Factorial

int factorial(int n) {
    if (n == 0) return 1;
    return n * factorial(n - 1);
}
  1. Parsing & AST – recognises the if statement, the multiplication, and the self‑call.
  2. Semantic analysis – confirms that n is an int, the return type matches, and the function is visible to its own body.
  3. Activation‑record layout – needs space for:
    • Return address
    • Saved FP
    • Parameter n
    • Temporary for the intermediate result of factorial(n‑1)
  4. Code generation (simplified pseudo‑assembly)
    ; ----- prologue -----
    push    FP                ; save caller's frame pointer
    mov     FP, SP           ; establish new frame pointer
    sub     SP, #locals      ; allocate space for locals
    
    ; ----- base case -----
    cmp     n, #0
    jne     recurse
    mov     R0, #1           ; return 1
    jmp     end
    
    ; ----- recursive case -----
    recurse:
        mov     R1, n        ; preserve original n
        dec     n, #1        ; n‑1 as argument
        push    n            ; pass argument
        call    factorial    ; recursive call
        add     SP, #4       ; clean argument
        mul     R0, R1       ; R0 = n_original * factorial(n‑1)
    
    ; ----- epilogue -----
    end:
        mov     SP, FP       ; de‑allocate locals
        pop     FP           ; restore caller's FP
        ret
    
    • The call instruction automatically stores the return address.
    • Each call creates a fresh activation record, giving the recursion its own independent data.
  5. Optimisation – TCO is not applicable because the multiplication occurs after the recursive call.

6 Summary of Compiler Responsibilities for Recursive Functions

Compilation phase What the compiler does Effect on recursion
Lexical analysis & parsing Creates tokens and an AST; recognises self‑references. Allows the compiler to identify a recursive call.
Semantic analysis Type‑checks arguments/return values; checks visibility. Prevents type‑related run‑time errors in recursive calls.
Activation‑record design Allocates stack‑frame layout for parameters, locals, temporaries, saved registers. Provides a separate context for each recursion level.
Code generation Emits instructions for argument evaluation, call/return handling, and stack manipulation. Implements the actual recursion mechanism at machine level.
Optimisation Applies tail‑call optimisation, inlining, or recursion‑to‑iteration transformation. Can reduce stack usage (O(1) instead of O(n)) or replace recursion with a loop.
Target‑code emission Outputs machine code, assembly, or IR ready for the linker/loader. Delivers an executable that correctly handles any depth of recursion (subject to hardware limits).

7 Diagram – Stack Growth During Recursion

Stack growth with successive recursive calls, each box representing an activation record containing return address, saved FP, parameters, locals, and temporaries.
Each recursive call pushes a new activation record; the stack is unwound as calls return.

8 Cambridge Computer Science Syllabus – Overview

To place the recursion topic in context, the table below lists the AS‑Level (units 1‑12) and A‑Level (units 13‑20) components of the Cambridge International Computer Science specification. The notes that follow give a brief “key concepts” snapshot for each unit, together with the assessment objectives (AO1–AO3) that the material supports.

Unit Title (AS) Key concepts Relevant AO(s)
1 Information Representation Binary, hexadecimal, two’s‑complement, floating‑point, character encodings, data compression. AO1, AO2
2 Communication and Networking OSI model, protocols (TCP/IP, HTTP), bandwidth, latency, security basics. AO1, AO2
3 Hardware Fundamentals CPU architecture, registers, ALU, cache, storage hierarchy, I/O devices. AO1, AO2
4 Processor Fundamentals Instruction set, pipelining, addressing modes, basic assembly language. AO1, AO2
5 System Software Operating system functions, file systems, compilers, interpreters, virtual machines. AO1, AO2
6 Security Encryption, authentication, threats, counter‑measures, ethical considerations. AO1, AO2, AO3
7 Ethics and Legal Issues Intellectual property, privacy, professional codes of practice. AO1, AO3
8 Databases Relational model, SQL, normalisation, indexing, data integrity. AO1, AO2, AO3
9 Algorithm Design Flowcharts, pseudocode, control structures, complexity (Big‑O), searching & sorting. AO1, AO2, AO3
10 Data Types & Structures Arrays, records, linked lists, trees, stacks, queues, abstract data types. AO1, AO2, AO3
11 Programming Procedural programming, recursion, iteration, debugging, testing. AO1, AO2, AO3
12 Software Development System life‑cycle, methodologies (waterfall, agile), documentation, version control. AO1, AO2, AO3

The A‑Level units (13‑20) extend these foundations with more depth (e.g., advanced data representation, virtual machines, artificial intelligence, computational thinking). The same structure—key concepts plus AO mapping—applies, and teachers can expand each unit into a dedicated set of notes.


9 Practical Component (Paper 4) – Quick Guide

  • Programming language: Java, Visual Basic, or Python (as specified by the exam board).
  • Typical tasks:
    • Implement a recursive algorithm (e.g., factorial, Fibonacci, binary search).
    • Write a non‑recursive version of the same algorithm.
    • Develop a simple database program using SQL.
    • Design a user‑interface that validates input and handles errors.
  • Testing checklist:
    • Identify normal‑case, boundary, and error inputs.
    • Use systematic testing (black‑box or white‑box) and record results.
    • Include comments that explain the purpose of each test.
  • Submission format:
    • Source code files with clear naming conventions.
    • Separate test‑plan document (max 1 page) showing inputs, expected outputs, and actual results.
    • Optional flowchart or pseudocode that demonstrates algorithmic thinking (AO2).

10 Enrichment Box – Deeper Compiler Insight (Optional)

For students who wish to go beyond the syllabus requirement, the following topics provide a glimpse of real‑world compiler engineering:

  • Intermediate representations (IR) – three‑address code, SSA form, and how recursive calls appear in IR.
  • Register allocation – graph‑colouring algorithms and the impact on recursive functions that use many temporaries.
  • Stack‑overflow detection – guard pages, canaries, and OS‑level signal handling.
  • Mutual recursion – handling two or more functions that call each other, and the need for forward declarations.
  • Garbage‑collected languages – how activation records are managed on the heap rather than the stack (e.g., in Java).

These topics are **not** required for the Cambridge exam but can enrich a student’s understanding and are useful for further study in computer science.

Create an account or Login to take a Quiz

87 views
0 improvement suggestions

Log in to suggest improvements to this note.