Show understanding of how Reverse Polish Notation (RPN) can be used to carry out the evaluation of expressions

16.2 Translation Software – Reverse Polish Notation (RPN)

1. Where RPN fits in the Cambridge Computer Science syllabus

The syllabus for AS & A‑Level Computer Science (Core 1) contains several topics under the heading Translation Software. RPN is only one part of this wider picture.

StageKey ideas
Lexical analysisTokenising source code – identifiers, literals, operators, delimiters.
ParsingBuilding a parse tree / abstract syntax tree (AST) using a grammar (e.g. BNF). Detecting syntax errors.
Syntax‑directed translationWalking the AST to produce an intermediate representation – often post‑fix (RPN).
Code generationTurning the intermediate form into target‑machine instructions or byte‑code.
Optimisation (optional)Eliminating redundant operations, constant folding, dead‑code removal.
Interpreter vs. compilerInterpreters evaluate the intermediate form directly; compilers translate it to native code.

All of the above stages are examined in the exam. The notes below concentrate on the post‑fix (RPN) representation and its evaluation, but the introductory roadmap shows how it connects to the rest of the translation pipeline.

2. What is Reverse Polish Notation?

  • Definition: RPN (also called post‑fix notation) writes the operator after its operands.
  • Infix example a + b × c → RPN a b c × +
  • All binary operators are written in the order operand₁ operand₂ operator.

3. Why use RPN?

  • No parentheses are required – the order of evaluation is implicit.
  • Evaluation can be performed with a very simple stack algorithm (O(n) time, O(n) space).
  • It matches the way most hardware instruction sets and virtual‑machine byte‑codes operate (e.g. JVM operand stack).
  • Widely used in calculators (HP, some scientific models), the stack‑oriented language Forth, and many compiler back‑ends.

4. Converting infix to postfix (Shunting‑Yard algorithm)

Dijkstra’s algorithm converts a fully parenthesised or un‑parenthesised infix expression into RPN while respecting precedence and associativity.

4.1 Algorithm steps

  1. Read the infix tokens from left to right.
  2. If the token is an operand, append it to the output list.
  3. If the token is an operator op₁:

    • While the stack top is an operator op₂ that has

      • higher precedence than op₁, or
      • the same precedence and op₁ is left‑associative

      pop op₂ to the output.

    • Push op₁ onto the stack.

  4. If the token is ‘(’, push it onto the stack.
  5. If the token is ‘)’, pop operators to the output until the matching ‘(’ is removed (discard the parentheses).
  6. When the input is exhausted, pop any remaining operators to the output.

4.2 Handling associativity

  • All binary operators required by the syllabus (+, –, ×, ÷, %) are left‑associative. Example: a – b – ca b – c –.
  • Right‑associative operators (e.g. exponentiation ^) are not in the core syllabus, but if required they are treated as having lower precedence when the same operator is on the stack.
  • Unary operators are excluded from the exam; a note is given to avoid confusion.

4.3 Worked example

Infix expression: ( 3 + 4 ) × 5 – 6 ÷ 2

TokenActionOutput (post‑fix)Stack (top → bottom)
(push(
3output3(
+push3+ (
4output3 4+ (
)pop until ‘(’3 4 +
×push3 4 +×
5output3 4 + 5×
pop higher precedence (×) → output, then push –3 4 + 5 ×
6output3 4 + 5 × 6
÷push (same precedence, left‑assoc → pop – first)3 4 + 5 × 6 –÷
2output3 4 + 5 × 6 – 2÷
Endpop remaining operators3 4 + 5 × 6 2 ÷ –

Resulting RPN: 3 4 + 5 × 6 2 ÷ –

5. Evaluating an RPN expression

The evaluation uses a single stack. For each token:

  • Operand – push onto the stack.
  • Operator – pop the required number of operands (for binary operators pop the right‑hand operand first), apply the operation, push the result.

5.1 Pseudocode (binary operators only)

function evaluateRPN(tokens):

stack ← empty list

for token in tokens:

if token is a number:

stack.push(token)

else if token is a binary operator:

// right operand is popped first

operand2 ← stack.pop()

operand1 ← stack.pop()

result ← applyOperator(token, operand1, operand2)

stack.push(result)

else:

raise Error("Invalid token")

if stack.size() ≠ 1:

raise Error("Malformed expression")

return stack.pop()

Extension notes (not required for the exam):

  • Unary operators (e.g. unary minus) would pop a single operand.
  • Functions such as sin or max can be handled by storing the arity with the token.

5.2 Error handling (useful for AO3)

  • Stack under‑flow – an operator is encountered with fewer than the required operands.
  • Extra operands – after processing all tokens, more than one value remains on the stack.
  • Invalid token – a symbol that is neither a recognised operand nor a supported operator.

In an exam answer a brief comment such as “If the stack under‑flows the expression is malformed” demonstrates understanding and gains marks.

5.3 Step‑by‑step evaluation of the example

TokenActionStack (top → bottom)
3push3
4push4, 3
+pop 4,3 → 3+4=7, push 77
5push5, 7
×pop 5,7 → 7×5=35, push 3535
6push6, 35
2push2, 6, 35
÷pop 2,6 → 6÷2=3, push 33, 35
pop 3,35 → 35–3=32, push 3232

Final result: 32

6. RPN in the wider translation pipeline

  1. Lexical analysis – source code is broken into tokens (identifiers, numbers, operators).
  2. Parsing – tokens are organised according to the language grammar; a parse tree/AST is produced.
  3. Syntax‑directed translation – walking the AST generates an intermediate representation. For arithmetic expressions the most convenient form is RPN because it can be emitted in a single left‑to‑right pass.
  4. Code generation – the RPN is translated into target‑machine instructions or byte‑code. Each operator becomes a single instruction that implicitly uses the processor’s operand stack.
  5. Optimisation (optional) – constant folding can be performed directly on the RPN (e.g., replace 3 4 + with 7 before code generation).

7. Real‑world relevance

  • HP and many scientific calculators evaluate user input directly in RPN.
  • The stack‑oriented language Forth uses postfix notation for all operations.
  • Java Virtual Machine (JVM) byte‑code, .NET IL, and many GPU shader languages rely on an operand stack that behaves exactly like RPN evaluation.

8. Key points for the examination (AO1‑AO3)

ConceptAO1 – KnowledgeAO2 – ApplicationAO3 – Design/Implementation
Definition of RPNState that operators follow their operands.Identify the RPN form of a given infix expression.Explain why RPN eliminates the need for parentheses.
Conversion (Shunting‑Yard)List the five algorithmic steps and the role of precedence/associativity.Convert an infix expression to RPN, showing the stack at each step.Write (or comment on) a short routine that implements the algorithm.
Evaluation using a stackDescribe push/pop actions for operands and operators.Trace a supplied RPN expression, recording the stack after each token.Provide pseudocode (or a real program fragment) that evaluates RPN.
Error handlingIdentify the three common malformed‑expression errors.Comment on what would happen if an under‑flow occurs.Include a simple error‑check in the evaluation routine.
Context within translation softwareRecall the stages: lexical analysis → parsing → RPN → code generation.Explain how RPN simplifies code generation for a stack‑based target.Design a mini‑compiler that translates arithmetic statements to RPN and then to a hypothetical stack machine.

9. Suggested Paper 4 mini‑project (practical component)

Write a small program (Python, Java or C++) that reads a line containing an arithmetic expression in infix form, converts it to RPN using the Shunting‑Yard algorithm, and then evaluates the RPN. Include:

  • Clear input validation (reject non‑numeric characters other than the allowed operators).
  • A test plan with at least five test cases covering:

    • simple binary operations,
    • mixed precedence,
    • parentheses,
    • division by zero (error handling),
    • malformed expressions (stack under‑flow).

  • Comments that link each part of the program to the translation‑software stages (lexical analysis, parsing, RPN generation, evaluation).

10. Quick revision checklist

  • RPN = post‑fix; operators after operands.
  • Stack algorithm: push numbers, pop for operators, push result.
  • Shunting‑Yard steps – remember the precedence/associativity rule.
  • Common errors: under‑flow, extra operands, illegal tokens.
  • RPN’s role: intermediate form that makes code generation trivial for stack‑based machines.

Flowchart of the RPN evaluation algorithm showing token reading, stack push/pop, operator handling, and final result output

Flowchart of the RPN evaluation algorithm (useful for AO2).