Continuous Random Variables – Probability & Statistics 2 (S2)
1. Quick Review of Prerequisite Material
1.1 Representation of Data (Syllabus 5.1)
Stem‑and‑leaf plot: splits each observation into a “stem’’ (all but the last digit) and a “leaf’’ (the last digit). Example for the data 12, 13, 14, 21, 22:
1 | 2 3 4
2 | 1 2
Box‑plot (five‑number summary):
Minimum, Q₁, Median, Q₃, Maximum.
Shows centre (median) and spread (IQR = Q₃‑Q₁).
Histogram:
Class intervals on the horizontal axis, frequency density (frequency ÷ class width) on the vertical axis.
Area of each bar is proportional to the frequency, so area represents the proportion of observations in that class.
Cumulative frequency / ogive: plots cumulative proportion against the upper class boundary; useful for estimating percentiles.
1.2 Permutations & Combinations (Syllabus 5.2)
Counting formulas that appear in discrete‑distribution questions:
\[
{}^{n}P_{r}=\frac{n!}{(n-r)!},\qquad
{}^{n}C_{r}=\binom{n}{r}=\frac{n!}{r!\,(n-r)!}.
\]
Tree diagrams are a quick way to organise multi‑stage experiments.
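These counting rules can be checked directly with Python's standard library, which provides `math.perm` and `math.comb` (the example numbers below are illustrative):

```python
# Counting ordered and unordered selections of r objects from n,
# using the standard-library math module (Python 3.8+).
import math

n, r = 5, 2
n_perm = math.perm(n, r)   # permutations: 5!/(5-2)! = 20
n_comb = math.comb(n, r)   # combinations: 5!/(2! * 3!) = 10

print(n_perm, n_comb)      # 20 10
```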
1.4 Discrete Random Variables (Syllabus 5.4 & 6.1)
Binomial distribution \(B(n,p)\):
\[
P(X=k)=\binom{n}{k}p^{k}(1-p)^{\,n-k},\qquad k=0,\dots ,n.
\]
Mean \(E[X]=np\), variance \(\operatorname{Var}(X)=np(1-p)\).
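As a sketch, the binomial pmf, mean and variance can be computed straight from the formulas above (the values n = 10, p = 0.3 are illustrative):

```python
# Binomial pmf, mean and variance from first principles (standard library only).
import math

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.3
mean = n * p                 # E[X] = np
var = n * p * (1 - p)        # Var(X) = np(1 - p)

# The probabilities over k = 0, ..., n must sum to 1.
total = sum(binom_pmf(k, n, p) for k in range(n + 1))
```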
Geometric distribution (number of trials until the first success):
\[
P(X=k)=(1-p)^{k-1}p,\qquad k=1,2,\dots
\]
Mean \(E[X]=\dfrac1p\), variance \(\operatorname{Var}(X)=\dfrac{1-p}{p^{2}}\).
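A quick numerical check of the geometric mean formula, truncating the infinite sum at a large cutoff (an approximation; p = 0.25 is an illustrative choice):

```python
# Geometric pmf and a truncated-sum check that E[X] = 1/p.
p = 0.25

def geom_pmf(k):
    """P(X = k) = (1-p)^(k-1) * p for k = 1, 2, ..."""
    return (1 - p)**(k - 1) * p

# Truncate the infinite sum; the tail beyond k = 10000 is negligible here.
approx_mean = sum(k * geom_pmf(k) for k in range(1, 10_000))
exact_mean = 1 / p   # = 4
```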
Poisson distribution (counts of events occurring in continuous time) – see Section 2.6.
2. Continuous Random Variables
2.1 Definition
A random variable \(X\) is continuous when its possible values form an interval (or a union of intervals) on the real line and
\[
P(X=x)=0\qquad\text{for every single value }x.
\]
2.2 Probability Density Function (pdf)
The pdf \(f(x)\) satisfies
\[
P(a\le X\le b)=\int_{a}^{b} f(x)\,dx .
\]
\(f(x)\ge 0\) for all \(x\).
\(\displaystyle\int_{-\infty}^{\infty} f(x)\,dx =1\) (total area = 1).
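Both conditions can be verified numerically for a concrete density. The sketch below uses the hypothetical pdf f(x) = 2x on [0, 1] (not from the notes) and a simple midpoint-rule integration:

```python
# Checking the two pdf conditions numerically for f(x) = 2x on [0, 1].
def f(x):
    """A valid pdf: f(x) = 2x on [0, 1], zero elsewhere."""
    return 2 * x if 0 <= x <= 1 else 0.0

def integrate(g, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

total_area = integrate(f, 0, 1)            # should be ~1
# P(0.25 <= X <= 0.5) = 0.5^2 - 0.25^2 = 0.1875
p_quarter_half = integrate(f, 0.25, 0.5)
```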
2.5 The Normal Distribution – Recap (Syllabus 5.5)
Standard normal \(Z\sim N(0,1)\) with pdf \(\displaystyle \phi(z)=\frac{1}{\sqrt{2\pi}}e^{-z^{2}/2}\).
Standardisation:
\[
Z=\frac{X-\mu}{\sigma}\quad\Longrightarrow\quad Z\sim N(0,1)
\]
for any \(X\sim N(\mu,\sigma^{2})\).
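Standardisation can be reproduced in code. The sketch below evaluates the standard normal cdf via `math.erf` (using the identity Φ(z) = ½(1 + erf(z/√2))); the values μ = 50, σ = 4 are illustrative:

```python
# Standardising X ~ N(mu, sigma^2) and evaluating Phi with math.erf.
import math

def phi(z):
    """Standard normal cdf: Phi(z) = 0.5 * (1 + erf(z / sqrt(2)))."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu, sigma = 50, 4
x = 54.92
z = (x - mu) / sigma   # standardise: z = 1.23
prob = phi(z)          # P(X <= 54.92) = P(Z <= 1.23), tabulated as 0.8907
```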
Reading a Z‑table (excerpt, values of \(\Phi(z)=P(Z\le z)\)):
z     0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09
0.0   0.5000  0.5040  0.5080  0.5120  0.5160  0.5199  0.5239  0.5279  0.5319  0.5359
0.1   0.5398  0.5438  0.5478  0.5517  0.5557  0.5596  0.5636  0.5675  0.5714  0.5753
0.2   0.5793  0.5832  0.5871  0.5910  0.5948  0.5987  0.6026  0.6064  0.6103  0.6141
To find \(P(Z\le 1.23)\): look up row 1.2, column 0.03 → 0.8907.
Continuity correction when a normal distribution approximates a discrete count:
\[
P(X\le k)\approx P\!\left(Z\le\frac{k+0.5-\mu}{\sigma}\right),\qquad
P(X\ge k)\approx P\!\left(Z\ge\frac{k-0.5-\mu}{\sigma}\right).
\]
Normal approximation to binomial (when \(np\ge5\) and \(n(1-p)\ge5\)):
\[
X\sim B(n,p)\;\approx\;N\bigl(np,\;np(1-p)\bigr).
\]
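The approximation with continuity correction can be compared against the exact binomial sum; the sketch below uses the illustrative case X ~ B(40, 0.5) and asks for P(X ≤ 24):

```python
# Normal approximation to B(n, p) with continuity correction,
# compared against the exact binomial probability.
import math

def phi(z):
    """Standard normal cdf via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

n, p, k = 40, 0.5, 24
mu = n * p                          # 20
sigma = math.sqrt(n * p * (1 - p))  # sqrt(10)

exact = sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))
approx = phi((k + 0.5 - mu) / sigma)   # continuity-corrected P(X <= 24)
```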
Normal approximation to Poisson (when \(\lambda\ge10\)):
\[
X\sim\text{Poisson}(\lambda)\;\approx\;N\bigl(\lambda,\;\lambda\bigr).
\]
2.6 Poisson Distribution (Events in Continuous Time) – Syllabus 6.1
If events occur independently at a constant average rate \(\lambda\) per unit time, the number of events \(X\) in a fixed unit interval follows a Poisson distribution:
\[
P(X=k)=\frac{e^{-\lambda}\lambda^{k}}{k!},\qquad k=0,1,2,\dots
\]
Mean and variance are equal: \(\displaystyle E[X]=\operatorname{Var}(X)=\lambda.\)
The time between successive events has an exponential distribution \(\operatorname{Exp}(\lambda)\) (memoryless property).
Derivation as the limit of a binomial:
\[
\lim_{n\to\infty}\! \binom{n}{k}\Bigl(\frac{\lambda}{n}\Bigr)^{k}
\Bigl(1-\frac{\lambda}{n}\Bigr)^{n-k}= \frac{e^{-\lambda}\lambda^{k}}{k!}.
\]
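This limit can be illustrated numerically: holding np = λ fixed, the binomial probabilities approach the Poisson pmf as n grows (λ = 3, k = 2 below are illustrative choices):

```python
# Numerical illustration of the binomial-to-Poisson limit.
import math

lam, k = 3.0, 2
poisson = math.exp(-lam) * lam**k / math.factorial(k)

def binom_pmf(n):
    """B(n, lam/n) probability of exactly k events."""
    p = lam / n
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

gap_small = abs(binom_pmf(10) - poisson)       # crude for small n
gap_large = abs(binom_pmf(10_000) - poisson)   # much closer to the limit
```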
2.7 Linear Combinations of Random Variables – Syllabus 6.2
If \(X\) and \(Y\) are independent,
\[
E[aX+bY]=aE[X]+bE[Y],\qquad
\operatorname{Var}(aX+bY)=a^{2}\operatorname{Var}(X)+b^{2}\operatorname{Var}(Y).
\]
For normal variables, any linear combination is also normal:
\[
X\sim N(\mu_X,\sigma_X^{2}),\;Y\sim N(\mu_Y,\sigma_Y^{2})\;\Longrightarrow\;
aX+bY\sim N\bigl(a\mu_X+b\mu_Y,\;a^{2}\sigma_X^{2}+b^{2}\sigma_Y^{2}\bigr).
\]
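A Monte Carlo sketch of these rules, using only the standard library (the constants a = 2, b = -3 and the two normal populations are illustrative; the tolerances allow for simulation noise):

```python
# Simulation check: mean and variance of aX + bY for independent normals.
import random
import statistics

random.seed(1)
a, b = 2.0, -3.0
mu_x, sd_x = 5.0, 2.0
mu_y, sd_y = 1.0, 0.5

samples = [a * random.gauss(mu_x, sd_x) + b * random.gauss(mu_y, sd_y)
           for _ in range(200_000)]

emp_mean = statistics.fmean(samples)     # ~ a*mu_x + b*mu_y = 7
emp_var = statistics.pvariance(samples)  # ~ a^2*sd_x^2 + b^2*sd_y^2 = 18.25
```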
3. Expectation and Variance for Continuous Variables
3.1 Expectation (Mean)
\[
E[X]=\int_{-\infty}^{\infty} x\,f(x)\,dx
\qquad\text{(provided the integral converges).}
\]
Linearity: \(E[aX+b]=aE[X]+b\) for constants \(a,b\).
If \(X\) and \(Y\) are independent, \(E[XY]=E[X]E[Y]\).
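As a sketch, E[X] for a concrete density can be found by numerical integration. Using the hypothetical pdf f(x) = 2x on [0, 1] (not from the notes), the exact mean is ∫₀¹ x·2x dx = 2/3:

```python
# E[X] for f(x) = 2x on [0, 1] by midpoint-rule integration; exact value 2/3.
def f(x):
    """The pdf f(x) = 2x on [0, 1]."""
    return 2 * x

def integrate(g, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

mean = integrate(lambda x: x * f(x), 0, 1)   # ~ 2/3
```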
4. Sampling, Estimation and Hypothesis Testing (Syllabus 5.6 & 5.7)
4.1 Sampling Distributions
If a sample of size \(n\) is taken from a population with mean \(\mu\) and known standard deviation \(\sigma\), the sampling distribution of the sample mean \(\bar{X}\) is
\[
\bar{X}\sim N\!\left(\mu,\;\frac{\sigma^{2}}{n}\right)
\]
(Central Limit Theorem) when \(n\) is large (or the population is normal).
For a proportion \(p\) based on a binomial sample, \(\hat{p}\) is approximately normal with
\[
\hat{p}\sim N\!\left(p,\;\frac{p(1-p)}{n}\right)
\]
provided \(np\ge5\) and \(n(1-p)\ge5\).
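The CLT statement for the sample mean can be illustrated by simulation from a deliberately non-normal population (a Uniform(0, 1), with μ = 0.5 and σ² = 1/12; sample size n = 30 is an illustrative choice):

```python
# Simulated sampling distribution of the sample mean from Uniform(0, 1):
# its mean should be ~mu and its variance ~sigma^2 / n.
import random
import statistics

random.seed(2)
n = 30                           # sample size
pop_mean, pop_var = 0.5, 1 / 12  # mean and variance of Uniform(0, 1)

means = [statistics.fmean(random.random() for _ in range(n))
         for _ in range(50_000)]

emp_mean = statistics.fmean(means)     # ~ 0.5
emp_var = statistics.pvariance(means)  # ~ (1/12) / 30 = 1/360
```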
4.2 Point & Interval Estimates
Unbiased estimator: an estimator \(\hat\theta\) with \(E[\hat\theta]=\theta\). Example: \(\bar{X}\) is an unbiased estimator of the population mean \(\mu\).
Confidence interval for a mean (σ known):
\[
\bar{x}\;\pm\;z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}},
\]
where \(z_{\alpha/2}\) is the critical value from the standard normal table.
Example (95 % CI): \(\bar{x}=12,\; \sigma=3,\; n=36\).
\[
12\pm1.96\frac{3}{6}=12\pm0.98\;\Rightarrow\;(11.02,\;12.98).
\]
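The worked interval above can be reproduced in a few lines:

```python
# Reproducing the 95% confidence interval: x_bar = 12, sigma = 3, n = 36.
import math

x_bar, sigma, n = 12, 3, 36
z = 1.96                                 # z_{alpha/2} for a 95% interval
half_width = z * sigma / math.sqrt(n)    # 1.96 * 3 / 6 = 0.98
ci = (x_bar - half_width, x_bar + half_width)   # (11.02, 12.98)
```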
4.3 Hypothesis Testing (Syllabus 5.7)
State hypotheses:
Null hypothesis \(H_{0}\): the statement to be tested (e.g. \(\mu=10\)).
Alternative hypothesis \(H_{1}\): what we would conclude if \(H_{0}\) is rejected (e.g. \(\mu\neq10\)).
Select significance level \(\alpha\) (commonly 0.05 or 0.01).
Choose test statistic:
For a mean with known \(\sigma\): \(Z=\dfrac{\bar{X}-\mu_{0}}{\sigma/\sqrt{n}}\).
For a mean with unknown \(\sigma\): \(t=\dfrac{\bar{X}-\mu_{0}}{s/\sqrt{n}}\) (use \(t\)-distribution with \(n-1\) df).
For a proportion: \(Z=\dfrac{\hat{p}-p_{0}}{\sqrt{p_{0}(1-p_{0})/n}}\).
Determine critical region (using Z‑ or t‑tables) or compute the p‑value.
Decision:
If test statistic lies in the critical region (or p‑value < α), reject \(H_{0}\).
Otherwise, do not reject \(H_{0}\).
Interpretation in the context of the problem.
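The steps above can be sketched as a two-sided z-test for a mean with known σ. All the numbers here (μ₀ = 10, x̄ = 10.8, σ = 2, n = 25) are hypothetical:

```python
# Two-sided z-test for H0: mu = 10 against H1: mu != 10, sigma known.
import math

def phi(z):
    """Standard normal cdf via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu0, x_bar, sigma, n = 10, 10.8, 2, 25
z = (x_bar - mu0) / (sigma / math.sqrt(n))   # test statistic: 0.8 / 0.4 = 2.0
p_value = 2 * (1 - phi(z))                   # two-sided p-value ~ 0.0455
reject = p_value < 0.05                      # reject H0 at the 5% level
```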
(Figure: decision flowchart for a hypothesis test.)
5. Summary Checklist (Paper 6 – S2)
Identify the type of random variable (discrete vs continuous) and write down its support.
For a continuous variable, write the pdf \(f(x)\) and verify \(\displaystyle\int f(x)\,dx=1\).
Compute the mean:
\[
E[X]=\int x\,f(x)\,dx.
\]
Find the second moment \(E[X^{2}]\) (or use a known formula) and obtain the variance:
\[
\operatorname{Var}(X)=E[X^{2}]-(E[X])^{2}.
\]
Apply linearity, scaling, and independence rules for transformed variables or sums.
For normal‑related questions:
Standardise to a \(Z\)‑score.
Read the required area from a Z‑table (or use a calculator).
Apply continuity correction when approximating a discrete count.
Recall the binomial, geometric and Poisson distributions, their means and variances, and when normal approximation is appropriate.
For sampling problems, use the appropriate sampling distribution (often normal by the CLT).
When required, construct a point estimate, a confidence interval, or carry out a hypothesis test using the steps in Section 4.3.
For sums or linear combinations of independent variables, add means and add variances (or use the normal‑sum rule).
(Figure: sketches of three common pdfs: Uniform, Exponential, Normal.)