Probability & Statistics 1 (S1) – Cambridge AS & A Level Mathematics 9709
1. Data Representation & Summary Measures
1.1 Graphical displays
- Stem‑and‑leaf plot – retains the original values; useful for small to medium raw data sets.
- Box‑and‑whisker diagram – shows minimum, Q₁, median, Q₃ and maximum (and outliers); ideal for comparing several groups.
- Histogram – frequency of class intervals; the area of each bar is proportional to the frequency.
- Cumulative frequency graph (ogive) – displays the number of observations ≤ a given value; useful for estimating percentiles.
| When to use which display | Data type / size | Key information shown |
|---|---|---|
| Stem‑and‑leaf | Raw (ungrouped) data, n ≈ 10–30 | Exact values, shape of distribution |
| Box‑and‑whisker | Grouped or raw data, any n; especially for comparing several data sets | Five‑number summary, outliers, comparison |
| Histogram | Grouped data, n ≥ 20, many classes | Shape, modality, skewness, approximate density |
| Ogive | Grouped data when cumulative information is needed | Percentiles, median and quartiles read off the graph |
1.2 Numerical summary measures
| Measure | Formula (ungrouped data) | Formula (grouped data) |
|---|---|---|
| Mean \(\bar{x}\) | \(\displaystyle \bar{x}=\frac{1}{n}\sum_{i=1}^{n}x_i\) | \(\displaystyle \bar{x}=\frac{\sum f_i\,c_i}{\sum f_i}\) (\(c_i\) = class midpoint) |
| Median | Middle value after ordering (average of the two middle values if \(n\) is even) | Locate the class containing the \(\frac{n}{2}\)‑th observation and interpolate: \(\displaystyle \text{median}\approx L+\frac{n/2-F}{f}\,h\), where \(L\) = lower class boundary, \(F\) = cumulative frequency before the class, \(f\) = class frequency, \(h\) = class width |
| Mode | Most frequent value (or values) | Class with highest frequency (modal class) |
| Range | \(\text{Range}=x_{\max}-x_{\min}\) | Same formula using class limits |
| Inter‑quartile range (IQR) | \(\text{IQR}=Q_3-Q_1\) | Interpolate within class intervals for \(Q_1\) and \(Q_3\) |
| Standard deviation (sample) \(s\) | \(\displaystyle s=\sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2}\) | \(\displaystyle s=\sqrt{\frac{1}{\sum f_i-1}\sum f_i(c_i-\bar{x})^2}\) |
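The grouped-data formulas translate directly from a frequency table. A minimal sketch using only the Python standard library; the class midpoints and frequencies below are hypothetical, invented purely for illustration:

```python
# Grouped-data mean and sample standard deviation from class midpoints.
# The midpoints c_i and frequencies f_i below are hypothetical.
import math

midpoints = [5, 15, 25, 35]   # c_i: class midpoints
freqs     = [2, 7, 8, 3]      # f_i: class frequencies

n = sum(freqs)                # total frequency, sum of f_i
mean = sum(f * c for f, c in zip(freqs, midpoints)) / n
s = math.sqrt(sum(f * (c - mean) ** 2
                  for f, c in zip(freqs, midpoints)) / (n - 1))

print(round(mean, 2), round(s, 2))
```

This follows the table exactly: the mean weights each midpoint by its frequency, and the standard deviation uses the \(\sum f_i - 1\) divisor.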
1.3 Worked example – class test scores
Scores (out of 20): 12, 15, 15, 16, 17, 18, 19, 19, 20 (n = 9)
- Mean: \(\displaystyle \bar{x}= \frac{12+15+15+16+17+18+19+19+20}{9}=\frac{151}{9}\approx16.78\)
- Median: 17 (5th value in the ordered list)
- Mode: 15 and 19 (both appear twice)
- Range: \(20-12=8\)
- Standard deviation: \(s\approx2.54\) (using the \(n-1\) divisor)
- Probability of scoring at least 18: \(\displaystyle P(X\ge18)=\frac{4}{9}\) (the scores 18, 19, 19 and 20)
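These summary values can be cross-checked with Python's standard-library `statistics` module; a short sketch:

```python
# Summary measures for the class test scores, via the statistics module.
import statistics

scores = [12, 15, 15, 16, 17, 18, 19, 19, 20]

mean = statistics.mean(scores)        # 151/9
median = statistics.median(scores)    # middle (5th) ordered value
modes = statistics.multimode(scores)  # all values tied for most frequent
s = statistics.stdev(scores)          # sample sd, n-1 divisor
p_at_least_18 = sum(x >= 18 for x in scores) / len(scores)

print(round(mean, 2), median, modes, round(s, 2), round(p_at_least_18, 3))
```

Note that `statistics.stdev` uses the sample (\(n-1\)) divisor, matching the formula in the table above; `statistics.pstdev` would give the population version.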
2. Counting Techniques – Permutations & Combinations
2.1 Basic definitions
- Permutation – ordered arrangement of distinct objects.
- Combination – selection of objects where order does not matter.
2.2 Formulas
\[
\begin{aligned}
\text{Permutations of }n\text{ objects taken }r\text{ at a time:}&\qquad {}^{n}P_{r}= \frac{n!}{(n-r)!} \\
\text{Combinations of }n\text{ objects taken }r\text{ at a time:}&\qquad {}^{n}C_{r}= \binom{n}{r}= \frac{n!}{r!\,(n-r)!}
\end{aligned}
\]
2.3 Typical examples
- Ordering three different books on a shelf: \({}^{3}P_{3}=3!=6\) ways.
- Choosing a committee of 3 from 8 pupils: \({}^{8}C_{3}= \binom{8}{3}=56\) ways.
- Forming a 4‑digit code using the digits 0–9 without repetition: \({}^{10}P_{4}=5040\) codes.
- Number of ways to draw 2 red cards from a pack of 26 red cards: \({}^{26}C_{2}=325\).
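All four counts can be reproduced with the standard-library `math` module, which provides `perm` and `comb` directly (Python 3.8+):

```python
# The counting examples above, via math.perm and math.comb.
import math

book_orders = math.perm(3, 3)   # ordered arrangements of 3 books
committees  = math.comb(8, 3)   # committees of 3 from 8 pupils
codes       = math.perm(10, 4)  # 4-digit codes, digits 0-9, no repetition
red_pairs   = math.comb(26, 2)  # 2 red cards drawn from the 26 red cards

print(book_orders, committees, codes, red_pairs)
```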
3. Probability – Basic Concepts & Notation
- Sample space \(S\) – the set of all possible outcomes.
- Event \(A\) – any subset of \(S\) (including \(\varnothing\) and \(S\)).
- Probability function \(P\) satisfies:
- \(0\le P(A)\le1\)
- \(P(S)=1\)
- \(P(\varnothing)=0\)
| Symbol | Meaning |
|---|---|
| \(P(A)\) | Probability of event \(A\) |
| \(P(A^{c})\) or \(P(A')\) | Probability of the complement of \(A\) |
| \(P(A\cup B)\) | Probability that at least one of \(A\) or \(B\) occurs |
| \(P(A\cap B)\) | Probability that both \(A\) and \(B\) occur |
| \(P(A\mid B)\) | Conditional probability of \(A\) given that \(B\) has occurred |
4. Core Probability Rules
4.1 Complement rule
\[
P(A^{c}) = 1 - P(A).
\]
4.2 Addition rule (general form)
\[
P(A\cup B)=P(A)+P(B)-P(A\cap B).
\]
- If \(A\) and \(B\) are mutually exclusive (\(A\cap B=\varnothing\)):
\[
P(A\cup B)=P(A)+P(B).
\]
4.3 Multiplication rule (general form)
\[
P(A\cap B)=P(A)\,P(B\mid A)=P(B)\,P(A\mid B).
\]
- If \(A\) and \(B\) are independent (\(P(B\mid A)=P(B)\)):
\[
P(A\cap B)=P(A)P(B).
\]
4.4 Inclusion–exclusion for three events
\[
\begin{aligned}
P(A\cup B\cup C)=&\;P(A)+P(B)+P(C)\\
&-\bigl[P(A\cap B)+P(A\cap C)+P(B\cap C)\bigr]\\
&+P(A\cap B\cap C).
\end{aligned}
\]
4.5 Worked three‑event example (standard deck of 52 cards)
Let
- \(A\): “the card is a heart”, \(P(A)=\frac{13}{52}\).
- \(B\): “the card is a face card”, \(P(B)=\frac{12}{52}\).
- \(C\): “the card is a red ace”, \(P(C)=\frac{2}{52}\).
Intersections
- \(P(A\cap B)=\frac{3}{52}\) (J, Q, K of hearts).
- \(P(A\cap C)=\frac{1}{52}\) (ace of hearts).
- \(P(B\cap C)=0\) (a red ace is not a face card).
- \(P(A\cap B\cap C)=0\).
Therefore
\[
P(A\cup B\cup C)=\frac{13}{52}+\frac{12}{52}+\frac{2}{52}
-\left(\frac{3}{52}+\frac{1}{52}+0\right)+0
=\frac{23}{52}.
\]
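Because the sample space is small, the result can be brute-force checked by enumerating all 52 cards; a sketch (the card representation is an arbitrary choice):

```python
# Brute-force check of the three-event inclusion-exclusion example.
from fractions import Fraction
from itertools import product

ranks = ['A', '2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K']
suits = ['hearts', 'diamonds', 'clubs', 'spades']
deck = list(product(ranks, suits))  # 52 cards

def in_union(card):
    rank, suit = card
    heart   = suit == 'hearts'                               # event A
    face    = rank in {'J', 'Q', 'K'}                        # event B
    red_ace = rank == 'A' and suit in {'hearts', 'diamonds'} # event C
    return heart or face or red_ace

p = Fraction(sum(in_union(c) for c in deck), len(deck))
print(p)
```

Counting directly: 13 hearts, 9 further face cards outside hearts, and the ace of diamonds give 23 favourable cards, agreeing with the formula.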
5. Conditional Probability & Bayes’ Theorem
5.1 Definition
\[
P(B\mid A)=\frac{P(A\cap B)}{P(A)}\qquad\text{(provided }P(A)>0\text{)}.
\]
5.2 Interpretation
- The denominator “renormalises” the sample space to the outcomes where \(A\) occurs.
- If \(A\) and \(B\) are independent, \(P(B\mid A)=P(B)\).
5.3 Law of total probability
If \(\{A_1,\dots,A_k\}\) form a partition of \(S\) (mutually exclusive and exhaustive), then for any event \(B\):
\[
P(B)=\sum_{i=1}^{k}P(B\mid A_i)\,P(A_i).
\]
5.4 Bayes’ theorem
\[
P(A_i\mid B)=\frac{P(B\mid A_i)\,P(A_i)}{\displaystyle\sum_{j=1}^{k}P(B\mid A_j)\,P(A_j)}\qquad(P(B)>0).
\]
5.5 Worked Bayes example – medical test
Population prevalence of a disease: \(P(D)=0.01\).
Test characteristics: sensitivity \(P(T\mid D)=0.95\); specificity \(P(T^{c}\mid D^{c})=0.90\) ⇒ \(P(T\mid D^{c})=0.10\).
Step 1 – total probability of a positive test:
\[
P(T)=P(T\mid D)P(D)+P(T\mid D^{c})P(D^{c})
=0.95(0.01)+0.10(0.99)=0.1085.
\]
Step 2 – apply Bayes:
\[
P(D\mid T)=\frac{0.95\times0.01}{0.1085}\approx0.088\;(8.8\%).
\]
Interpretation: even with a positive result the chance of actually having the disease is still below 10 % because the disease is rare.
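The two steps translate directly into a few lines of Python; a sketch using the numbers above:

```python
# Medical-test example: law of total probability, then Bayes' theorem.
p_d = 0.01              # prevalence P(D)
p_t_given_d = 0.95      # sensitivity P(T | D)
p_t_given_not_d = 0.10  # false-positive rate, 1 - specificity

# Step 1: total probability of a positive test.
p_t = p_t_given_d * p_d + p_t_given_not_d * (1 - p_d)

# Step 2: Bayes' theorem.
p_d_given_t = p_t_given_d * p_d / p_t

print(round(p_t, 4), round(p_d_given_t, 4))
```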
6. Mutually Exclusive vs. Independent Events
6.1 Definitions
- Mutually exclusive (disjoint): \(A\cap B=\varnothing\). The occurrence of one prevents the other.
- Independent: \(P(A\cap B)=P(A)P(B)\) (equivalently \(P(B\mid A)=P(B)\) and \(P(A\mid B)=P(A)\)).
6.2 Comparison table
| Property | Mutually exclusive | Independent |
|---|---|---|
| Definition | \(A\cap B=\varnothing\) | \(P(A\cap B)=P(A)P(B)\) |
| Addition rule | \(P(A\cup B)=P(A)+P(B)\) | \(P(A\cup B)=P(A)+P(B)-P(A)P(B)\) |
| Conditional probability | \(P(B\mid A)=0\) (if \(P(A)>0\)) | \(P(B\mid A)=P(B)\) |
| Can both occur? | No, by definition | Yes (unless one event has probability 0) |
| Typical example | Rolling a 1 and rolling a 2 on a single die | Heads on each of two successive tosses of a fair coin |
6.3 Independence for more than two events
Events \(A,B,C\) are **mutually independent** if:
- All pairwise products hold:
\[
P(A\cap B)=P(A)P(B),\;P(A\cap C)=P(A)P(C),\;P(B\cap C)=P(B)P(C);
\]
- and the joint product holds:
\[
P(A\cap B\cap C)=P(A)P(B)P(C).
\]
Pairwise independence alone does **not** guarantee joint independence. A classic counter‑example uses the outcomes of tossing two fair coins and defining:
- \(A\): “first coin is heads’’
- \(B\): “second coin is heads’’
- \(C\): “the two coins show the same face’’
Each pair is independent, but \(P(A\cap B\cap C)=\tfrac14\neq\tfrac12\cdot\tfrac12\cdot\tfrac12=\tfrac18\), so the three are not jointly independent.
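The counter-example is small enough to verify by enumerating the four equally likely outcomes of the two tosses; a sketch:

```python
# Pairwise independence without joint independence: two fair coins.
from fractions import Fraction
from itertools import product

outcomes = list(product('HT', repeat=2))  # HH, HT, TH, TT

def prob(event):
    """Exact probability over the four equally likely outcomes."""
    return Fraction(sum(event(o) for o in outcomes), len(outcomes))

A = lambda o: o[0] == 'H'   # first coin heads
B = lambda o: o[1] == 'H'   # second coin heads
C = lambda o: o[0] == o[1]  # both coins show the same face

# Every pair satisfies the product rule...
pairwise_ok = all(prob(lambda o: X(o) and Y(o)) == prob(X) * prob(Y)
                  for X, Y in [(A, B), (A, C), (B, C)])

# ...but the triple intersection does not.
joint = prob(lambda o: A(o) and B(o) and C(o))
print(pairwise_ok, joint, prob(A) * prob(B) * prob(C))
```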
7. Venn Diagrams – Visualising Relationships
- Two non‑overlapping circles → mutually exclusive events.
- Two overlapping circles → possible dependence; the overlap area represents \(P(A\cap B)\).
- Three‑circle Venn diagram → illustrates the inclusion–exclusion formula; the seven regions inside the circles, together with the exterior region, correspond to the distinct combinations of the three events occurring or not.
8. Common Pitfalls & How to Avoid Them
- Confusing mutually exclusive with independent. Mutually exclusive events can never occur together, so they are independent only when one has probability 0.
- Omitting the intersection term in the addition rule. Forgetting \(-P(A\cap B)\) can produce a total > 1.
- Applying the two‑event multiplication rule to three or more events without checking joint independence. Verify all pairwise and the full‑joint products.
- Conditioning on an event of probability 0. The definition of \(P(B\mid A)\) is undefined; re‑examine the problem.
- Skipping the law of total probability before using Bayes. Compute \(P(B)\) first, otherwise the denominator is missing.
9. Checklist for Solving Probability Problems
- Identify the sample space \(S\) and list all relevant events.
- Write the required probability using correct notation (e.g., \(P(A\cup B),\;P(A\mid B)\)).
- Classify the relationship between events:
- Mutually exclusive?
- Independent?
- Neither?
- Select the appropriate rule:
- Complement – \(1-P(A)\).
- Addition – use the general formula; simplify if mutually exclusive.
- Multiplication – use the conditional form; simplify if independent.
- Inclusion–exclusion – for three or more events.
- If a conditional probability is required, first find the intersection \(P(A\cap B)\) then divide by \(P(\text{given event})\).
- For “reverse” conditioning, apply Bayes’ theorem together with the law of total probability.
- Check that the final answer lies between 0 and 1 and that any exhaustive set of outcomes sums to 1.
10. Quick Reference – Probability Rules
| Rule | Formula | When applicable |
|---|---|---|
| Complement | \(P(A^{c})=1-P(A)\) | Any event |
| Addition (general) | \(P(A\cup B)=P(A)+P(B)-P(A\cap B)\) | Any two events |
| Addition (mutually exclusive) | \(P(A\cup B)=P(A)+P(B)\) | \(A\cap B=\varnothing\) |
| Multiplication (general) | \(P(A\cap B)=P(A)P(B\mid A)\) | Any two events |
| Multiplication (independent) | \(P(A\cap B)=P(A)P(B)\) | \(P(B\mid A)=P(B)\) |
| Inclusion–exclusion (three events) | \(P(A\cup B\cup C)=P(A)+P(B)+P(C)-[P(A\cap B)+P(A\cap C)+P(B\cap C)]+P(A\cap B\cap C)\) | Three events (any relationship) |
| Conditional probability | \(P(B\mid A)=\dfrac{P(A\cap B)}{P(A)}\) | \(P(A)>0\) |
| Bayes’ theorem | \(P(A_i\mid B)=\dfrac{P(B\mid A_i)P(A_i)}{\sum_j P(B\mid A_j)P(A_j)}\) | Reversing conditioning; events \(\{A_i\}\) form a partition |