Show understanding of Deep Learning, Machine Learning and Reinforcement Learning and the reasons for using these methods

Published by Patrick Mutisya

Cambridge A-Level Computer Science 9618 – Artificial Intelligence (AI)

18.1 Artificial Intelligence (AI)

Artificial Intelligence (AI) is the field of computer science that aims to create systems capable of performing tasks that normally require human intelligence. Modern AI is largely driven by three inter‑related approaches:

  • Machine Learning (ML)
  • Deep Learning (DL)
  • Reinforcement Learning (RL)

Machine Learning (ML)

Machine Learning is a set of algorithms that enable a computer to improve its performance on a specific task through experience, without being explicitly programmed for every possible scenario.

  • Supervised learning: The algorithm is trained on labelled data pairs \$(x, y)\$, learning a mapping \$y = f(x)\$ (a minimal sketch follows this list).
  • Unsupervised learning: The algorithm discovers structure in unlabelled data, e.g., clustering or dimensionality reduction.
  • Semi‑supervised & self‑training: Combines small amounts of labelled data with large amounts of unlabelled data.
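
The supervised case can be made concrete with a short sketch. The snippet below is a minimal illustration using Python and NumPy (neither is prescribed by the syllabus, and the data and learning rate are invented): it fits a straight line \$y = wx + b\$ to labelled pairs by gradient descent on the mean squared error.

```python
import numpy as np

# Labelled training pairs (x, y); the true rule is y = 2x + 1 plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 2.0 * x + 1.0 + rng.normal(0, 0.5, size=50)

# Model: y_hat = w*x + b. Learn w and b by minimising mean squared error.
w, b = 0.0, 0.0
lr = 0.01                                # learning rate (chosen for illustration)
for _ in range(2000):
    y_hat = w * x + b
    error = y_hat - y
    # Gradients of the MSE loss with respect to w and b.
    w -= lr * 2 * np.mean(error * x)
    b -= lr * 2 * np.mean(error)

print(f"learned w={w:.2f}, b={b:.2f}")   # close to the true values w=2, b=1
```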

Deep Learning (DL)

Deep Learning is a sub‑field of Machine Learning that uses artificial neural networks with many hidden layers (deep neural networks) to model complex, hierarchical patterns in data.

  • Each neuron computes a weighted sum of its inputs and applies a non‑linear activation function, e.g., \$\sigma(z)=\frac{1}{1+e^{-z}}\$ (see the forward‑pass sketch after this list).
  • Training is performed by minimising a loss function \$L\$ with gradient‑based optimisation: back‑propagation computes the gradient of the loss with respect to each weight, and gradient descent updates each weight in the opposite direction:

    \$w \leftarrow w - \eta \frac{\partial L}{\partial w}\$, where \$\eta\$ is the learning rate.

  • Common architectures:

    • Convolutional Neural Networks (CNNs) – excel at image and spatial data.
    • Recurrent Neural Networks (RNNs) and Long Short‑Term Memory (LSTM) – suited for sequential data such as text or speech.
    • Transformers – state‑of‑the‑art for natural language processing.
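
To make the neuron computation above concrete, here is a minimal forward‑pass sketch in Python/NumPy. The network shape, weights, and input are invented for illustration; a real network would learn its weights by back‑propagation rather than drawing them at random.

```python
import numpy as np

def sigmoid(z):
    """Non-linear activation: sigma(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

# Tiny network: 3 inputs -> 4 hidden neurons -> 1 output (shapes invented).
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)   # hidden layer weights/biases
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)   # output layer weights/biases

x = np.array([0.5, -1.2, 3.0])                  # one example input vector

h = sigmoid(W1 @ x + b1)      # each hidden neuron: weighted sum + activation
y_hat = sigmoid(W2 @ h + b2)  # the output neuron repeats the same computation
print(y_hat)
```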

Reinforcement Learning (RL)

Reinforcement Learning is a learning paradigm where an agent interacts with an environment, taking actions to maximise a cumulative reward.

  • The problem is formalised as a Markov Decision Process (MDP) defined by the tuple \$(S, A, P, R, \gamma)\$:

    • \$S\$ – set of states
    • \$A\$ – set of actions
    • \$P(s'|s,a)\$ – transition probability
    • \$R(s,a)\$ – immediate reward
    • \$\gamma\$ – discount factor (\$0 \le \gamma < 1\$)

  • The goal is to learn a policy \$\pi(a|s)\$ that maximises the expected return:

    \$G_t = \sum_{k=0}^{\infty} \gamma^{k} R_{t+k+1}\$

  • Key algorithms:

    • Q‑learning – learns an action‑value function \$Q(s,a)\$ (see the sketch after this list).
    • Policy Gradient methods – directly optimise the policy \$\pi\$.
    • Actor‑Critic – combines value‑based and policy‑based approaches.
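
As a concrete illustration of Q‑learning, the sketch below runs tabular updates on a hypothetical five‑state corridor environment. The `step` function, the reward of +1 at the goal state, and all hyperparameters are invented for illustration.

```python
import numpy as np

# Hypothetical corridor: states 0..4, actions 0=left, 1=right.
# Reaching state 4 gives reward +1 and ends the episode.
n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.9, 0.1   # learning rate, discount, exploration
rng = np.random.default_rng(0)

def step(s, a):
    """Environment dynamics (invented): move left or right along the corridor."""
    s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    reward = 1.0 if s_next == n_states - 1 else 0.0
    done = s_next == n_states - 1
    return s_next, reward, done

for _ in range(500):                     # episodes of trial and error
    s, done = 0, False
    while not done:
        # Epsilon-greedy: mostly exploit the best known action, sometimes explore.
        a = rng.integers(n_actions) if rng.random() < epsilon else int(np.argmax(Q[s]))
        s_next, r, done = step(s, a)
        # Q-learning update: move Q(s,a) towards r + gamma * max_a' Q(s', a').
        Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])
        s = s_next

print(np.argmax(Q, axis=1))   # greedy policy: action 1 (right) in states 0-3
```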

Reasons for Using These Methods

  1. Ability to handle large, complex data sets: Deep Learning can automatically extract hierarchical features from raw data, reducing the need for manual feature engineering.
  2. Adaptability and continual improvement: Machine Learning models improve as more data becomes available, making them suitable for dynamic environments.
  3. Decision‑making under uncertainty: Reinforcement Learning enables agents to learn optimal strategies through trial‑and‑error, even when the environment is stochastic.
  4. Generalisation across domains: The same underlying algorithms can be applied to vision, speech, text, games, robotics, and finance.
  5. Automation of repetitive tasks: AI methods can perform classification, prediction, and control tasks faster and more consistently than humans.
  6. Scalability with hardware advances: GPUs and specialised AI accelerators make training deep networks feasible on very large data sets.

Comparison of ML, DL and RL

| Aspect | Machine Learning (ML) | Deep Learning (DL) | Reinforcement Learning (RL) |
| --- | --- | --- | --- |
| Primary Goal | Learn a mapping from inputs to outputs using labelled or unlabelled data. | Learn hierarchical representations automatically via deep neural networks. | Learn a policy that maximises cumulative reward through interaction. |
| Typical Data | Structured tables, feature vectors. | High‑dimensional raw data (images, audio, text). | Sequences of states, actions, and rewards. |
| Training Paradigm | Supervised, unsupervised, semi‑supervised. | Often supervised with large labelled data sets; also unsupervised pre‑training. | Trial‑and‑error, often with simulated environments. |
| Key Algorithms | Decision trees, SVM, k‑NN, linear regression. | CNN, RNN, Transformer, Autoencoder. | Q‑learning, Policy Gradient, Actor‑Critic. |
| Strengths | Interpretability; works with modest data. | State‑of‑the‑art performance on perception tasks. | Handles sequential decision problems and delayed rewards. |
| Limitations | Feature engineering required; limited on raw high‑dimensional data. | Data‑hungry; computationally intensive; less interpretable. | Requires many interactions; exploration‑exploitation trade‑off. |

Suggested diagram: Flowchart showing the relationship between AI, Machine Learning, Deep Learning, and Reinforcement Learning, with example applications for each.