AI and Graphs – Cambridge International AS & A Level Computer Science (9618)
Learning Outcome
Show understanding of how graphs can be used to aid Artificial Intelligence (AI) and describe the main machine‑learning techniques that are part of the A‑Level syllabus.
Syllabus‑Map Overview
| Syllabus Unit | Covered in these notes? | Key points to remember |
|---|---|---|
| 1. Data representation, binary, data structures | ✓ (brief reminder in Section 1) | Graphs model states (vertices) and actions (edges). |
1. Graphs as Models for AI Problems
1.1 States and Transitions
AI problems can often be expressed as states and the transitions that move between them.
These naturally form a graph \(G=(V,E)\):
\(V\) – vertices = possible states (board positions, robot poses, map locations, etc.)
\(E\) – edges = actions or moves that change one state into another.
Graph models enable systematic exploration, optimisation and reasoning about solutions.
1.2 Graph Representations (exam focus)
| Representation | When to use | Exam‑relevant property |
|---|---|---|
| Adjacency matrix | Dense graphs (most vertices connected) | O(1) test for edge existence; uses O(\|V\|²) space. |
| Adjacency list | Sparse graphs (e.g., maps, knowledge graphs) | O(\|V\|+\|E\|) space; efficient iteration over neighbours. |
Exam tip: For AI search questions you will usually be given a list of vertices and edges – treat this as an adjacency list unless the question presents the graph as a dense matrix.
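To make the two representations concrete, here is a minimal Python sketch of the same small weighted graph stored both ways (the vertex names A–D and the edge costs are invented for illustration):

```python
# The same 4-vertex weighted graph stored two ways.
# Edges: A-B (cost 2), A-C (cost 5), B-D (cost 1).

# Adjacency list: dict mapping each vertex to its (neighbour, cost) pairs.
# Space grows with |V| + |E| -- good for sparse graphs.
adj_list = {
    "A": [("B", 2), ("C", 5)],
    "B": [("A", 2), ("D", 1)],
    "C": [("A", 5)],
    "D": [("B", 1)],
}

# Adjacency matrix: 2-D table indexed by vertex position; None = no edge.
# O(1) edge test, but always uses |V| * |V| cells -- good for dense graphs.
index = {"A": 0, "B": 1, "C": 2, "D": 3}
adj_matrix = [
    [None, 2,    5,    None],
    [2,    None, None, 1],
    [5,    None, None, None],
    [None, 1,    None, None],
]

# O(1) edge-existence test with the matrix:
print(adj_matrix[index["A"]][index["B"]] is not None)  # True

# Iterating over a vertex's neighbours is natural with the list:
for neighbour, cost in adj_list["A"]:
    print(neighbour, cost)
```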
2. Graph‑Based Search Algorithms
2.1 Dijkstra’s Algorithm
Finds the shortest‑cost path from a single source to every other vertex.
All edge costs must be non‑negative.
Typical implementation uses a priority queue.
Time complexity: O(|E| log |V|)
Space complexity: O(|V|)
Used when no good heuristic is available (e.g., routing in communication networks).
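A minimal Python sketch of Dijkstra's algorithm using a binary‑heap priority queue, assuming the adjacency‑list format from Section 1.2 (the example graph is the same illustrative one):

```python
import heapq

def dijkstra(adj_list, source):
    """Shortest-path costs from source to every reachable vertex.

    adj_list maps each vertex to a list of (neighbour, cost) pairs;
    all costs must be non-negative.
    """
    dist = {source: 0}
    pq = [(0, source)]          # priority queue of (cost-so-far, vertex)
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue            # stale entry -- u already settled more cheaply
        for v, cost in adj_list[u]:
            new_d = d + cost
            if new_d < dist.get(v, float("inf")):
                dist[v] = new_d
                heapq.heappush(pq, (new_d, v))
    return dist

# Using the illustrative graph from Section 1.2:
graph = {"A": [("B", 2), ("C", 5)], "B": [("A", 2), ("D", 1)],
         "C": [("A", 5)], "D": [("B", 1)]}
print(dijkstra(graph, "A"))    # {'A': 0, 'B': 2, 'C': 5, 'D': 3}
```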
2.2 A* Search
Combines Dijkstra’s cost‑to‑come g(n) with a heuristic estimate h(n) of the remaining cost.
Evaluation function: f(n) = g(n) + h(n)
Admissible heuristic – never over‑estimates the true cheapest cost.
Consistent (monotonic) heuristic – satisfies h(n) ≤ c(n,n′) + h(n′) for every edge; guarantees each node is expanded at most once.
Common heuristics:
Euclidean distance – straight‑line distance on a grid.
Manhattan distance – sum of horizontal and vertical moves.
Domain‑specific counts (e.g., minimum moves for a chess piece).
Complexity depends on heuristic quality:
Weak heuristic (h(n) = 0 everywhere) → A* degenerates to Dijkstra and explores many nodes (on unit‑cost graphs this is breadth‑first search).
Accurate heuristic → far fewer nodes expanded; a perfect heuristic expands only the nodes on an optimal path.
Exam tip: To prove optimality of A* you must state that the heuristic is admissible (and, for single‑expansion, consistent).
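The sketch below runs A* on a small unit‑cost grid using the Manhattan heuristic from the list above; the grid size and wall positions are invented for illustration:

```python
import heapq

def manhattan(a, b):
    """Admissible heuristic on a 4-connected grid: sum of horizontal
    and vertical moves, which never over-estimates the true cost."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def a_star(start, goal, walls, width, height):
    """A* on a unit-cost grid; cells are (x, y) tuples, walls is a set."""
    g = {start: 0}                          # cost-to-come g(n)
    pq = [(manhattan(start, goal), start)]  # ordered by f(n) = g(n) + h(n)
    came_from = {}
    while pq:
        _, current = heapq.heappop(pq)
        if current == goal:                 # reconstruct the path
            path = [current]
            while current in came_from:
                current = came_from[current]
                path.append(current)
            return path[::-1]
        x, y = current
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (0 <= nxt[0] < width and 0 <= nxt[1] < height
                    and nxt not in walls):
                new_g = g[current] + 1      # every move costs 1
                if new_g < g.get(nxt, float("inf")):
                    g[nxt] = new_g
                    came_from[nxt] = current
                    heapq.heappush(pq, (new_g + manhattan(nxt, goal), nxt))
    return None                             # no path exists

print(a_star((0, 0), (3, 3), walls={(1, 1), (2, 1)}, width=4, height=4))
```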
2.3 Comparison at a Glance
| Property | Dijkstra | A* |
|---|---|---|
| Goal‑directed? | No – explores all vertices | Yes – guided by h |
| Heuristic required? | No | Yes (admissible, preferably consistent) |
| Completeness | Yes (non‑negative costs) | Yes (admissible h) |
| Optimality | Yes (non‑negative weights) | Yes (admissible h) |
| Typical use‑case | Network routing, map services without a good heuristic | Robot navigation, game‑AI path‑finding, any problem with a useful distance estimate |
3. Graph‑Based AI Applications
| Application | Graph model | Typical algorithm |
|---|---|---|
| Path‑finding (games, robotics) | Weighted grid or navigation mesh; vertices = walkable positions, edges = moves with cost (distance, terrain) | A* (or Dijkstra when no useful heuristic exists) |
| Game playing | Game tree; vertices = board positions, edges = legal moves | Minimax with alpha‑beta pruning (depth‑first search variant) |
4. Machine Learning – Enabling AI to Learn from Data
4.1 What is Machine Learning?
ML is a sub‑field of AI that allows a computer to improve its performance on a task through experience (data) rather than by being explicitly programmed.
4.2 Categories (syllabus focus)
| Category | Data | Goal | Typical examples |
|---|---|---|---|
| Supervised learning | Labelled examples (input + desired output) | Learn a mapping \(f:X\rightarrow Y\) to predict labels for new inputs. | Spam filtering, image classification. |
| Unsupervised learning | Unlabelled examples (input only) | Discover structure or patterns in the data. | Clustering (e.g., grouping similar customers). |
Exam tip: The syllabus only requires you to name the two categories and give one example of each; detailed algorithmic derivations are not required.
5. Artificial Neural Networks (ANNs)
5.1 Basic Structure
Layers: input, one or more hidden, and output.
Neurons (nodes) receive weighted inputs, add a bias, and apply an activation function (e.g., sigmoid, ReLU).
Connections have **weights** \(w\) that determine signal strength.
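A single neuron's computation can be captured in a few lines; the sketch below uses the sigmoid activation, with all numbers invented for illustration:

```python
import math

def sigmoid(z):
    """Activation function squashing any input into the range (0, 1)."""
    return 1 / (1 + math.exp(-z))

def neuron(inputs, weights, bias):
    """One neuron: weighted sum of inputs, plus bias, through the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

# Illustrative values only -- in practice the weights are learned by training.
print(neuron(inputs=[0.5, 0.8], weights=[0.4, -0.6], bias=0.1))
```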
5.2 Training – Back‑Propagation (high‑level)
Forward pass: compute outputs from the inputs.
Compute error: compare with target (e.g., mean‑squared error for regression, cross‑entropy for classification).
Backward pass: propagate the error backwards to obtain the gradient \(\frac{\partial E}{\partial w}\) for each weight.
Weight update: \(w \leftarrow w - \eta \frac{\partial E}{\partial w}\) where \(\eta\) is the learning rate.
Repeat over many epochs until the error stabilises.
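The following sketch applies those four steps to train a single sigmoid neuron on one training example (all numbers are illustrative; a real network repeats this across many neurons, layers and examples):

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

# One training example (illustrative numbers): inputs x, target t.
x, t = [0.5, 0.8], 1.0
w, b = [0.4, -0.6], 0.1          # initial weights and bias
eta = 0.5                        # learning rate (eta)

for epoch in range(200):
    # 1. Forward pass: compute the neuron's output.
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    y = sigmoid(z)
    # 2. Compute the error (squared error for this single example).
    error = (y - t) ** 2
    # 3. Backward pass: chain rule gives dE/dw for each weight.
    #    dE/dy = 2(y - t); dy/dz = y(1 - y) for sigmoid; dz/dwi = xi.
    dE_dz = 2 * (y - t) * y * (1 - y)
    # 4. Weight update: w <- w - eta * dE/dw  (and likewise for the bias).
    w = [wi - eta * dE_dz * xi for wi, xi in zip(w, x)]
    b = b - eta * dE_dz

print(round(error, 4))           # the error shrinks towards 0 over the epochs
```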
5.3 Key Points for the Exam
Learning rate controls step size – too large → divergence, too small → slow learning.
Training proceeds by gradient descent; it may get stuck in local minima.
Only a conceptual description is required – you do **not** need to derive the full gradient formulas.
Exam tip: When asked to explain back‑propagation, describe the forward pass, error calculation, backward pass (gradient), and weight update. No need for detailed calculus.
6. Deep Learning (DL)
6.1 What Makes a Network “Deep”?
Contains several hidden layers (often > 5). Each layer learns increasingly abstract features (edges → shapes → objects).
Requires large training sets and substantial computational resources (GPUs) to train effectively.
6.2 Common Deep Architectures (high‑level)
| Architecture | Typical use‑case |
|---|---|
| Convolutional Neural Network (CNN) | Image and video recognition. |
| Recurrent Neural Network (RNN) / LSTM | Sequential data – speech, language modelling. |
| Deep Feed‑forward Network | General classification/regression on tabular data. |
6.3 Why DL Matters for AI (exam focus)
Automates feature extraction – no need for hand‑crafted descriptors.
Achieves state‑of‑the‑art performance in vision, natural language processing and game playing (e.g., AlphaGo).
7. Reinforcement Learning (RL)
7.1 Core Idea
An agent interacts with an environment, receives a reward after each action, and learns a policy that maximises the cumulative reward.
7.2 Key Concepts (syllabus level)
State (s) – description of the current situation.
Action (a) – move the agent can take.
Policy (π) – rule mapping states to actions.
Reward (r) – immediate feedback.
Value function (V) / Q‑function – expected total future reward from a state (or state‑action pair).
7.3 Typical Algorithms (overview)
| Algorithm | Approach | Typical application |
|---|---|---|
| Q‑learning | Model‑free; learns a table of Q‑values for state‑action pairs. | Simple grid‑world navigation, basic game agents. |
| Policy Gradient (e.g., REINFORCE) | Optimises the policy directly using gradient ascent. | Continuous‑control tasks such as robotic arm movement. |
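As a concrete illustration of tabular Q‑learning, the sketch below learns Q‑values for a made‑up one‑dimensional corridor; the environment, rewards and parameter values are all invented for illustration:

```python
import random
from collections import defaultdict

# Toy environment: states 0..4 in a corridor, reward 1 for reaching state 4.
ACTIONS = [-1, +1]                     # step left or step right
alpha, gamma, epsilon = 0.5, 0.9, 0.2  # learning rate, discount, exploration

Q = defaultdict(float)                 # Q-values for (state, action) pairs

for episode in range(500):
    s = 0
    while s != 4:                      # episode ends at the goal state
        # Epsilon-greedy: mostly exploit the best known action, sometimes explore.
        if random.random() < epsilon:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s_next = min(max(s + a, 0), 4)  # environment transition (clamped)
        r = 1.0 if s_next == 4 else 0.0
        # Core update: Q(s,a) <- Q(s,a) + alpha * (r + gamma * max Q(s',a') - Q(s,a))
        best_next = max(Q[(s_next, act)] for act in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s_next

# The learned policy should prefer moving right (+1) in every state:
print([max(ACTIONS, key=lambda act: Q[(st, act)]) for st in range(4)])
```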
7.4 Link to Graphs
When the set of states and actions is known, they form a directed graph. Planning algorithms such as Dijkstra or A* can be applied if a model of the environment is available. RL, on the other hand, learns the optimal policy when the model is unknown.
Exam tip: In RL questions you may be asked to identify the components (state, action, reward, policy) or to explain why a graph representation is useful for planning.
8. Regression – Predicting Continuous Values
8.1 Linear Regression (core syllabus idea)
Model: \(y = \mathbf{w}\cdot\mathbf{x} + b\) where \(\mathbf{x}\) is the input vector.
Training finds \(\mathbf{w}\) and \(b\) that minimise the sum of squared errors (SSE):
\[
\text{SSE} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
\]
where \(\hat{y}_i = \mathbf{w}\cdot\mathbf{x}_i + b\) is the model's prediction for example \(i\).
Used for simple forecasting (e.g., predicting temperature from time of day).
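A minimal sketch of fitting \(y = wx + b\) for a single input variable, using the standard closed‑form least‑squares solution (the temperature data is invented for illustration):

```python
def fit_line(xs, ys):
    """Least-squares fit of y = w*x + b for a single input variable.

    Minimises the SSE using the standard closed-form solution."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    w = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
         / sum((x - x_mean) ** 2 for x in xs))
    b = y_mean - w * x_mean
    return w, b

# Illustrative data: temperature (°C) against hour of day.
hours = [6, 9, 12, 15, 18]
temps = [10, 14, 19, 21, 17]
w, b = fit_line(hours, temps)
print(f"y = {w:.2f}x + {b:.2f}")   # prediction: temp ≈ w*hour + b
```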
8.2 Non‑Linear Regression (brief)
If data are not well‑fit by a straight line, polynomial or kernel‑based methods can be used. The syllabus expects only the idea that a more complex model can reduce error.
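To illustrate how a more complex model can reduce the error, this short NumPy sketch compares a straight‑line fit with a quadratic fit on the same invented data as above:

```python
import numpy as np

# Same invented temperature data as in the Section 8.1 sketch.
hours = np.array([6, 9, 12, 15, 18])
temps = np.array([10, 14, 19, 21, 17])

line_coeffs = np.polyfit(hours, temps, deg=1)   # straight line
poly_coeffs = np.polyfit(hours, temps, deg=2)   # quadratic curve

for coeffs in (line_coeffs, poly_coeffs):
    preds = np.polyval(coeffs, hours)
    print("SSE:", np.sum((temps - preds) ** 2))  # the quadratic gives a lower SSE
```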
Exam tip: You may be asked to write the linear regression equation and state that the goal is to minimise SSE.
9. Ethical & Societal Considerations in AI
Bias & fairness – training data may reflect societal prejudices; models can amplify them.
Privacy – AI systems often require large personal datasets; handling must comply with data‑protection laws.
Transparency & accountability – black‑box models (deep nets) can be hard to interpret; important for safety‑critical applications.
Environmental impact – training large deep models consumes significant energy.
Exam tip: The Cambridge exam may ask you to list two ethical issues associated with AI and suggest a mitigation strategy (e.g., bias auditing, data anonymisation).
10. Summary of Key Points
Graphs model AI problems as states (vertices) and actions (edges), enabling systematic search.
Dijkstra finds optimal paths without a heuristic; A* adds an admissible (and preferably consistent) heuristic to guide the search and usually expands fewer nodes.
Artificial neural networks consist of layers of weighted connections; training uses back‑propagation to reduce prediction error.
Deep learning extends neural networks with many hidden layers, achieving top performance in vision, language and game playing.
Reinforcement learning trains agents to maximise cumulative reward; the state‑action space can be viewed as a graph for planning.
Linear regression fits a straight line to data by minimising the sum of squared errors; more complex regression models handle non‑linear patterns.
Ethical issues such as bias, privacy and transparency must be considered when deploying AI systems.
Suggested Diagram
Left: a weighted graph showing start node S, goal node G, edge costs, the optimal path found by Dijkstra (blue) and the heuristic‑guided path found by A* (red). Right: a simple three‑layer neural network with arrows indicating forward data flow and backward error‑gradient flow.