AI and Graphs – Cambridge International AS & A Level Computer Science (9618)
Learning Outcome
Show understanding of how graphs can be used to aid Artificial Intelligence (AI) and describe the main machine‑learning techniques that are part of the A‑Level syllabus.
Syllabus‑Map Overview
| Syllabus Unit | Covered in these notes? | Key points to remember |
|---|---|---|
| 1. Data representation, binary, data structures | ✓ (brief reminder in Section 1) | Graphs model states (vertices) and actions (edges). |
1. Graphs as Models for AI Problems
1.1 States and Transitions
AI problems can often be expressed as states and the transitions that move between them.
These naturally form a graph \(G=(V,E)\):
\(V\) – vertices = possible states (board positions, robot poses, map locations, etc.)
\(E\) – edges = actions or moves that change one state into another.
Graph models enable systematic exploration, optimisation and reasoning about solutions.
1.2 Graph Representations (exam focus)
| Representation | When to use | Exam‑relevant property |
|---|---|---|
| Adjacency matrix | Dense graphs (most vertices connected) | O(1) test for edge existence; uses O(\|V\|²) space. |
| Adjacency list | Sparse graphs (e.g., maps, knowledge graphs) | O(\|V\|+\|E\|) space; efficient iteration over neighbours. |
Exam tip: For AI search questions you will usually be given a list of vertices and edges – treat this as an adjacency list unless the question presents the graph as a dense matrix.
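To make the two representations concrete, here is a minimal Python sketch of the same small weighted graph stored both ways (the vertex names A–D and the edge costs are invented for illustration):

```python
# The same 4-vertex weighted graph stored two ways.
# Edges: A-B (cost 2), A-C (cost 5), B-D (cost 1).

# Adjacency list: dict mapping each vertex to its (neighbour, cost) pairs.
# Space grows with |V| + |E| -- good for sparse graphs.
adj_list = {
    "A": [("B", 2), ("C", 5)],
    "B": [("A", 2), ("D", 1)],
    "C": [("A", 5)],
    "D": [("B", 1)],
}

# Adjacency matrix: 2-D table indexed by vertex position; None = no edge.
# O(1) edge test, but always uses |V| * |V| cells -- good for dense graphs.
index = {"A": 0, "B": 1, "C": 2, "D": 3}
adj_matrix = [
    [None, 2,    5,    None],
    [2,    None, None, 1],
    [5,    None, None, None],
    [None, 1,    None, None],
]

# O(1) edge-existence test with the matrix:
print(adj_matrix[index["A"]][index["B"]] is not None)  # True

# Iterating over a vertex's neighbours is natural with the list:
for neighbour, cost in adj_list["A"]:
    print(neighbour, cost)
```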
2. Graph‑Based Search Algorithms
2.1 Dijkstra’s Algorithm
Finds the shortest‑cost path from a single source to every other vertex.
All edge costs must be non‑negative.
Typical implementation uses a priority queue.
Time complexity: O(|E| log |V|)
Space complexity: O(|V|)
Used when no good heuristic is available (e.g., routing in communication networks).
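A minimal Python sketch of Dijkstra's algorithm using a binary‑heap priority queue, assuming the adjacency‑list format from Section 1.2 (the example graph is the same illustrative one):

```python
import heapq

def dijkstra(adj_list, source):
    """Shortest-path costs from source to every reachable vertex.

    adj_list maps each vertex to a list of (neighbour, cost) pairs;
    all costs must be non-negative.
    """
    dist = {source: 0}
    pq = [(0, source)]          # priority queue of (cost-so-far, vertex)
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue            # stale entry -- u already settled more cheaply
        for v, cost in adj_list[u]:
            new_d = d + cost
            if new_d < dist.get(v, float("inf")):
                dist[v] = new_d
                heapq.heappush(pq, (new_d, v))
    return dist

# Using the illustrative graph from Section 1.2:
graph = {"A": [("B", 2), ("C", 5)], "B": [("A", 2), ("D", 1)],
         "C": [("A", 5)], "D": [("B", 1)]}
print(dijkstra(graph, "A"))    # {'A': 0, 'B': 2, 'C': 5, 'D': 3}
```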
2.2 A* Search
Combines Dijkstra’s cost‑to‑come g(n) with a heuristic estimate h(n) of the remaining cost.
Evaluation function: f(n) = g(n) + h(n)
Admissible heuristic – never over‑estimates the true cheapest cost.
Consistent (monotonic) heuristic – satisfies h(n) ≤ c(n,n′) + h(n′) for every edge; guarantees each node is expanded at most once.
Common heuristics:
Euclidean distance – straight‑line distance on a grid.
Manhattan distance – sum of horizontal and vertical moves.
Domain‑specific counts (e.g., minimum moves for a chess piece).
Complexity depends on heuristic quality:
Weak heuristic (h(n) = 0 everywhere) → A* degenerates to Dijkstra and explores many nodes (on unit‑cost graphs this is breadth‑first search).
Accurate heuristic → far fewer nodes expanded; a perfect heuristic expands only the nodes on an optimal path.
Exam tip: To prove optimality of A* you must state that the heuristic is admissible (and, for single‑expansion, consistent).
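The sketch below runs A* on a small unit‑cost grid using the Manhattan heuristic from the list above; the grid size and wall positions are invented for illustration:

```python
import heapq

def manhattan(a, b):
    """Admissible heuristic on a 4-connected grid: sum of horizontal
    and vertical moves, which never over-estimates the true cost."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def a_star(start, goal, walls, width, height):
    """A* on a unit-cost grid; cells are (x, y) tuples, walls is a set."""
    g = {start: 0}                          # cost-to-come g(n)
    pq = [(manhattan(start, goal), start)]  # ordered by f(n) = g(n) + h(n)
    came_from = {}
    while pq:
        _, current = heapq.heappop(pq)
        if current == goal:                 # reconstruct the path
            path = [current]
            while current in came_from:
                current = came_from[current]
                path.append(current)
            return path[::-1]
        x, y = current
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (0 <= nxt[0] < width and 0 <= nxt[1] < height
                    and nxt not in walls):
                new_g = g[current] + 1      # every move costs 1
                if new_g < g.get(nxt, float("inf")):
                    g[nxt] = new_g
                    came_from[nxt] = current
                    heapq.heappush(pq, (new_g + manhattan(nxt, goal), nxt))
    return None                             # no path exists

print(a_star((0, 0), (3, 3), walls={(1, 1), (2, 1)}, width=4, height=4))
```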
2.3 Comparison at a Glance
| Property | Dijkstra | A* |
|---|---|---|
| Goal‑directed? | No – explores all vertices | Yes – guided by h |
| Heuristic required? | No | Yes (admissible, preferably consistent) |
| Completeness | Yes (non‑negative costs) | Yes (admissible h) |
| Optimality | Yes (non‑negative weights) | Yes (admissible h) |
| Typical use‑case | Network routing, map services without a good heuristic | Robot navigation, game‑AI path‑finding, any problem with a useful distance estimate |
3. Graph‑Based AI Applications
| Application | Graph model | Typical algorithm |
|---|---|---|
| Path‑finding (games, robotics) | Weighted grid or navigation mesh; vertices = walkable positions, edges = moves with cost (distance, terrain) | A* (or Dijkstra when no useful heuristic exists) |
| Game playing | Game tree; vertices = board positions, edges = legal moves | Minimax with alpha‑beta pruning (depth‑first search variant) |
4. Machine Learning – Enabling AI to Learn from Data
4.1 What is Machine Learning?
ML is a sub‑field of AI that allows a computer to improve its performance on a task through experience (data) rather than by being explicitly programmed.
4.2 Categories (syllabus focus)
| Category | Data | Goal | Typical examples |
|---|---|---|---|
| Supervised learning | Labelled examples (input + desired output) | Learn a mapping \(f:X\rightarrow Y\) to predict labels for new inputs. | Spam filtering, image classification. |
| Unsupervised learning | Unlabelled examples (input only) | Discover structure or patterns in the data. | Clustering (e.g., grouping similar customers). |
Exam tip: The syllabus only requires you to name the two categories and give one example of each; detailed algorithmic derivations are not required.
5. Artificial Neural Networks (ANNs)
5.1 Basic Structure
Layers: input, one or more hidden, and output.
Neurons (nodes) receive weighted inputs, add a bias, and apply an activation function (e.g., sigmoid, ReLU).
Connections have **weights** \(w\) that determine signal strength.
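A single neuron's computation can be captured in a few lines; the sketch below uses the sigmoid activation, with all numbers invented for illustration:

```python
import math

def sigmoid(z):
    """Activation function squashing any input into the range (0, 1)."""
    return 1 / (1 + math.exp(-z))

def neuron(inputs, weights, bias):
    """One neuron: weighted sum of inputs, plus bias, through the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

# Illustrative values only -- in practice the weights are learned by training.
print(neuron(inputs=[0.5, 0.8], weights=[0.4, -0.6], bias=0.1))
```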
5.2 Training – Back‑Propagation (high‑level)
Forward pass: compute outputs from the inputs.
Compute error: compare with target (e.g., mean‑squared error for regression, cross‑entropy for classification).
Backward pass: propagate the error backwards to obtain the gradient \(\frac{\partial E}{\partial w}\) for each weight.
Weight update: \(w \leftarrow w - \eta \frac{\partial E}{\partial w}\) where \(\eta\) is the learning rate.
Repeat over many epochs until the error stabilises.
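The following sketch applies those four steps to train a single sigmoid neuron on one training example (all numbers are illustrative; a real network repeats this across many neurons, layers and examples):

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

# One training example (illustrative numbers): inputs x, target t.
x, t = [0.5, 0.8], 1.0
w, b = [0.4, -0.6], 0.1          # initial weights and bias
eta = 0.5                        # learning rate (eta)

for epoch in range(200):
    # 1. Forward pass: compute the neuron's output.
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    y = sigmoid(z)
    # 2. Compute the error (squared error for this single example).
    error = (y - t) ** 2
    # 3. Backward pass: chain rule gives dE/dw for each weight.
    #    dE/dy = 2(y - t); dy/dz = y(1 - y) for sigmoid; dz/dwi = xi.
    dE_dz = 2 * (y - t) * y * (1 - y)
    # 4. Weight update: w <- w - eta * dE/dw  (and likewise for the bias).
    w = [wi - eta * dE_dz * xi for wi, xi in zip(w, x)]
    b = b - eta * dE_dz

print(round(error, 4))           # the error shrinks towards 0 over the epochs
```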
5.3 Key Points for the Exam
Learning rate controls step size – too large → divergence, too small → slow learning.
Training proceeds by gradient descent; it may get stuck in local minima.
Only a conceptual description is required – you do **not** need to derive the full gradient formulas.
Exam tip: When asked to explain back‑propagation, describe the forward pass, error calculation, backward pass (gradient), and weight update. No need for detailed calculus.
6. Deep Learning (DL)
6.1 What Makes a Network “Deep”?
Contains several hidden layers (often > 5). Each layer learns increasingly abstract features (edges → shapes → objects).
Requires large training sets and substantial computational resources (GPUs) to train effectively.
6.2 Common Deep Architectures (high‑level)
| Architecture | Typical use‑case |
|---|---|
| Convolutional Neural Network (CNN) | Image and video recognition. |
| Recurrent Neural Network (RNN) / LSTM | Sequential data – speech, language modelling. |
| Deep Feed‑forward Network | General classification/regression on tabular data. |
6.3 Why DL Matters for AI (exam focus)
Automates feature extraction – no need for hand‑crafted descriptors.
Achieves state‑of‑the‑art performance in vision, natural language processing and game playing (e.g., AlphaGo).
7. Reinforcement Learning (RL)
7.1 Core Idea
An agent interacts with an environment, receives a reward after each action, and learns a policy that maximises the cumulative reward.
7.2 Key Concepts (syllabus level)
State (s) – description of the current situation.
Action (a) – move the agent can take.
Policy (π) – rule mapping states to actions.
Reward (r) – immediate feedback.
Value function (V) / Q‑function – expected total future reward from a state (or state‑action pair).
7.3 Typical Algorithms (overview)
| Algorithm | Approach | Typical application |
|---|---|---|
| Q‑learning | Model‑free; learns a table of Q‑values for state‑action pairs. | Simple grid‑world navigation, basic game agents. |
| Policy Gradient (e.g., REINFORCE) | Optimises the policy directly using gradient ascent. | Continuous‑control tasks such as robotic arm movement. |
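As a concrete illustration of tabular Q‑learning, the sketch below learns Q‑values for a made‑up one‑dimensional corridor; the environment, rewards and parameter values are all invented for illustration:

```python
import random
from collections import defaultdict

# Toy environment: states 0..4 in a corridor, reward 1 for reaching state 4.
ACTIONS = [-1, +1]                     # step left or step right
alpha, gamma, epsilon = 0.5, 0.9, 0.2  # learning rate, discount, exploration

Q = defaultdict(float)                 # Q-values for (state, action) pairs

for episode in range(500):
    s = 0
    while s != 4:                      # episode ends at the goal state
        # Epsilon-greedy: mostly exploit the best known action, sometimes explore.
        if random.random() < epsilon:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s_next = min(max(s + a, 0), 4)  # environment transition (clamped)
        r = 1.0 if s_next == 4 else 0.0
        # Core update: Q(s,a) <- Q(s,a) + alpha * (r + gamma * max Q(s',a') - Q(s,a))
        best_next = max(Q[(s_next, act)] for act in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s_next

# The learned policy should prefer moving right (+1) in every state:
print([max(ACTIONS, key=lambda act: Q[(st, act)]) for st in range(4)])
```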
7.4 Link to Graphs
When the set of states and actions is known, they form a directed graph. Planning algorithms such as Dijkstra or A* can be applied if a model of the environment is available. RL, on the other hand, learns the optimal policy when the model is unknown.
Exam tip: In RL questions you may be asked to identify the components (state, action, reward, policy) or to explain why a graph representation is useful for planning.
8. Regression – Predicting Continuous Values
8.1 Linear Regression (core syllabus idea)
Model: \(y = \mathbf{w}\cdot\mathbf{x} + b\) where \(\mathbf{x}\) is the input vector.
Training finds \(\mathbf{w}\) and \(b\) that minimise the sum of squared errors (SSE):
\[
\text{SSE} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
\]
where \(\hat{y}_i = \mathbf{w}\cdot\mathbf{x}_i + b\) is the model's prediction for example \(i\).
Used for simple forecasting (e.g., predicting temperature from time of day).
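A minimal sketch of fitting \(y = wx + b\) for a single input variable, using the standard closed‑form least‑squares solution (the temperature data is invented for illustration):

```python
def fit_line(xs, ys):
    """Least-squares fit of y = w*x + b for a single input variable.

    Minimises the SSE using the standard closed-form solution."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    w = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
         / sum((x - x_mean) ** 2 for x in xs))
    b = y_mean - w * x_mean
    return w, b

# Illustrative data: temperature (°C) against hour of day.
hours = [6, 9, 12, 15, 18]
temps = [10, 14, 19, 21, 17]
w, b = fit_line(hours, temps)
print(f"y = {w:.2f}x + {b:.2f}")   # prediction: temp ≈ w*hour + b
```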
8.2 Non‑Linear Regression (brief)
If data are not well‑fit by a straight line, polynomial or kernel‑based methods can be used. The syllabus expects only the idea that a more complex model can reduce error.
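To illustrate how a more complex model can reduce the error, this short NumPy sketch compares a straight‑line fit with a quadratic fit on the same invented data as above:

```python
import numpy as np

# Same invented temperature data as in the Section 8.1 sketch.
hours = np.array([6, 9, 12, 15, 18])
temps = np.array([10, 14, 19, 21, 17])

line_coeffs = np.polyfit(hours, temps, deg=1)   # straight line
poly_coeffs = np.polyfit(hours, temps, deg=2)   # quadratic curve

for coeffs in (line_coeffs, poly_coeffs):
    preds = np.polyval(coeffs, hours)
    print("SSE:", np.sum((temps - preds) ** 2))  # the quadratic gives a lower SSE
```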
Exam tip: You may be asked to write the linear regression equation and state that the goal is to minimise SSE.
9. Ethical & Societal Considerations in AI
Bias & fairness – training data may reflect societal prejudices; models can amplify them.
Privacy – AI systems often require large personal datasets; handling must comply with data‑protection laws.
Transparency & accountability – black‑box models (deep nets) can be hard to interpret; important for safety‑critical applications.
Environmental impact – training large deep models consumes significant energy.
Exam tip: The Cambridge exam may ask you to list two ethical issues associated with AI and suggest a mitigation strategy (e.g., bias auditing, data anonymisation).
10. Summary of Key Points
Graphs model AI problems as states (vertices) and actions (edges), enabling systematic search.
Dijkstra finds optimal paths without a heuristic; A* adds an admissible (and preferably consistent) heuristic to guide the search and usually expands fewer nodes.
Artificial neural networks consist of layers of weighted connections; training uses back‑propagation to reduce prediction error.
Deep learning extends neural networks with many hidden layers, achieving top performance in vision, language and game playing.
Reinforcement learning trains agents to maximise cumulative reward; the state‑action space can be viewed as a graph for planning.
Linear regression fits a straight line to data by minimising the sum of squared errors; more complex regression models handle non‑linear patterns.
Ethical issues such as bias, privacy and transparency must be considered when deploying AI systems.
Suggested Diagram
Left: a weighted graph showing start node S, goal node G, edge costs, the optimal path found by Dijkstra (blue) and the heuristic‑guided path found by A* (red). Right: a simple three‑layer neural network with arrows indicating forward data flow and backward error‑gradient flow.