use the t-test to compare the means of two different samples (the formula for the t-test will be provided, as shown in the Mathematical requirements)

Statistical Analysis of Biological Variation (Optional Supplement)

Learning Objectives (with Assessment Objectives)

  • AO1 – Knowledge: understand the biological reasons for measuring variation and the statistical concepts behind the t‑test.
  • AO2 – Data handling: select, organise, analyse and evaluate data using the independent‑samples t‑test (or Welch’s t‑test) and report results correctly.
  • AO3 – Experimentation: plan and carry out a simple experiment, check assumptions, and evaluate the reliability and limitations of the statistical conclusions.

Why Study Variation in Biology?

  • Variation is the raw material for natural selection (Topics 17 & 18).
  • Differences in genotype, phenotype, disease susceptibility or treatment response are quantified using statistical tools.
  • Statistical tests tell us whether an observed difference is likely to be a real biological effect or just random chance.

Link to the Cambridge 9700 Syllabus

Syllabus Area | Requirement | How the Supplement Meets It
Mathematical requirements (Statistical tools) | Use appropriate statistical tests; interpret p‑values; recognise limitations. | Independent‑samples t‑test, Welch’s t‑test, effect size (Cohen’s d), checklist of assumptions, non‑parametric alternative (Mann‑Whitney U), brief overview of chi‑square, ANOVA and regression.
Topic 10 – Infectious diseases | Explain how variation in pathogen load influences disease outcome. | Example: compare mean bacterial counts in two treatment groups using a t‑test.
Topic 12 – Energy & respiration | Analyse experimental data on enzyme activity or metabolic rate. | Worked example on enzyme activity (see below) links directly to this topic.
Topic 17 & 18 – Selection & evolution | Use statistical evidence to support arguments about evolutionary change. | Classroom activity on seed‑germination time demonstrates how a t‑test can test hypotheses about adaptive traits.
Data handling (AO2) | Select, organise, present, evaluate and interpret data. | Step‑by‑step procedure, data‑visualisation templates (box‑plot, dot‑plot), and a summary table of what students should know, do and evaluate.

Statistical Foundations (AO1)

p‑value: the probability of obtaining a test statistic at least as extreme as the observed value, assuming the null hypothesis is true. A small p‑value (≤ α, usually 0.05) leads to rejection of H₀.

Type I & Type II errors:

  • Type I – false positive (rejecting a true H₀). Risk = α.
  • Type II – false negative (failing to reject a false H₀). Risk = β; power = 1 – β.
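The link between α and the Type I error risk can be checked by simulation. The sketch below is an illustration under stated assumptions, not syllabus material: both groups are drawn from the same simulated normal population (so H₀ is true by construction), each group has six values, and 2.228 is the standard two‑tailed critical t for df = 10 at α = 0.05. The function names, seed and population parameters are arbitrary choices.

```python
import math
import random
import statistics


def pooled_t(a, b):
    """Student's pooled t statistic for two independent samples."""
    n1, n2 = len(a), len(b)
    sp2 = ((n1 - 1) * statistics.variance(a)
           + (n2 - 1) * statistics.variance(b)) / (n1 + n2 - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))


def type_i_error_rate(n_trials=2000, n=6, t_crit=2.228, seed=1):
    """Fraction of trials that reject H0 although both groups share one population."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rejections = 0
    for _ in range(n_trials):
        a = [rng.gauss(10.0, 1.0) for _ in range(n)]  # simulated data, not real measurements
        b = [rng.gauss(10.0, 1.0) for _ in range(n)]
        if abs(pooled_t(a, b)) > t_crit:
            rejections += 1  # a false positive: H0 is true but was rejected
    return rejections / n_trials


rate = type_i_error_rate()
print(f"Observed false-positive rate: {rate:.3f}")
```

With H₀ true, the observed rejection rate should sit close to α = 0.05; the exact figure depends on the seed and the number of trials.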

Confidence interval (CI): a range of values that is likely to contain the true population mean difference with a given confidence (usually 95 %). Reporting a CI alongside a p‑value gives information about the magnitude and precision of the effect.

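Effect size (Cohen’s d): expresses the difference between two means in units of the pooled standard deviation, so it indicates how large a difference is, not just whether it is statistically significant.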

\[
d = \frac{\bar{x}_1-\bar{x}_2}{s_{\text{pooled}}}
\]

where \(s_{\text{pooled}}=\sqrt{s_p^{2}}\). Interpretation: small (≈0.2), medium (≈0.5), large (≥0.8).
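As a minimal sketch of the definition above, Cohen’s d can be computed from two lists of measurements. The function name `cohens_d` and the two short data lists are hypothetical, chosen only so the arithmetic is easy to follow by hand (both sample variances equal 1).

```python
import math
import statistics


def cohens_d(a, b):
    """Difference in means divided by the pooled standard deviation."""
    n1, n2 = len(a), len(b)
    # Pooled variance: sample variances weighted by their degrees of freedom
    sp2 = ((n1 - 1) * statistics.variance(a)
           + (n2 - 1) * statistics.variance(b)) / (n1 + n2 - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(sp2)


# Hypothetical values for illustration only
group_a = [4.0, 5.0, 6.0]
group_b = [5.0, 6.0, 7.0]
d = cohens_d(group_a, group_b)
print(f"d = {d:.2f}")
```

The sign of d simply reflects which group is subtracted from which; the magnitude is what is compared with the small/medium/large guidelines.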

Multiple‑testing correction: when several hypotheses are tested on the same data set, the chance of at least one Type I error rises. Simple methods such as the Bonferroni correction (divide α by the number of tests) keep the overall risk at or below α.
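The Bonferroni rule is simple enough to sketch in a few lines. The function name and the three p‑values below are hypothetical, used only to show how the adjusted threshold changes the decisions.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the adjusted threshold and a reject (True) / retain (False) flag per test."""
    threshold = alpha / len(p_values)  # divide alpha by the number of tests
    return threshold, [p <= threshold for p in p_values]


# Hypothetical p-values from three comparisons on the same data set
threshold, decisions = bonferroni([0.004, 0.020, 0.300])
print(f"Adjusted threshold: {threshold:.4f}")
print(decisions)
```

In this illustration the second p‑value (0.020) would count as significant at α = 0.05 on its own, but not against the adjusted threshold of 0.05 / 3.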

Non‑parametric alternative: if normality or homogeneity of variance cannot be satisfied, the Mann‑Whitney U test (also called Wilcoxon rank‑sum) may be used.
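To show what the Mann‑Whitney U test actually counts, the sketch below computes the U statistics by direct pairwise comparison. It is a teaching illustration only: it returns the two U values but no p‑value, which would still have to be read from a table or obtained from software. The function name and the germination times are hypothetical.

```python
def mann_whitney_u(a, b):
    """Return (U_a, U_b): U_a counts pairs where a value from a exceeds one from b.

    Tied pairs contribute 0.5 to each statistic.
    """
    u_a = 0.0
    for x in a:
        for y in b:
            if x > y:
                u_a += 1.0
            elif x == y:
                u_a += 0.5
    u_b = len(a) * len(b) - u_a  # the two statistics always sum to n_a * n_b
    return u_a, u_b


# Hypothetical germination times (hours) for illustration only
genotype_a = [30, 32, 35, 38]
genotype_b = [33, 36, 40, 41]
u_a, u_b = mann_whitney_u(genotype_a, genotype_b)
print(f"U_a = {u_a}, U_b = {u_b}, test statistic = {min(u_a, u_b)}")
```

Because the test uses only the ordering of the values, it does not depend on the data being normally distributed.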

Data visualisation: box‑plots or dot‑plots give a quick visual check of normality, spread and outliers before running a t‑test.
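A box‑plot is built from the five‑number summary, so the same check can be sketched numerically. The example below uses the wild‑type enzyme activities from the worked example later in this supplement; the function name, the "inclusive" quartile method and the 1.5 × IQR outlier rule are assumptions of this sketch, and other quartile conventions give slightly different values.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) and any points beyond 1.5 x IQR from the box."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return (min(values), q1, median, q3, max(values)), outliers


wild_type = [12.1, 11.8, 12.5, 12.0, 11.9, 12.3]  # U mg^-1, from the worked example
summary, outliers = five_number_summary(wild_type)
print("min, Q1, median, Q3, max:", summary)
print("possible outliers:", outliers)
```

A median that sits far from the middle of the box, or several flagged points, would be a prompt to look more closely at normality before running a t‑test.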

When to Use the Independent‑Samples t‑Test (AO2)

  • Two independent groups (e.g., two genotypes, two treatments, two populations).
  • The response variable is continuous and approximately normally distributed in each group.
  • Variances are equal (homoscedastic). If not, use Welch’s t‑test.
  • Sample sizes are moderate (≥ 5 per group) – with samples this small the result is reliable only if the assumptions above hold reasonably well.

Statistical Checklist (Before Running the Test) – AO2

  1. Independence: Ensure observations in one group do not influence the other.
  2. Normality: Examine histograms, Q‑Q plots, or perform Shapiro‑Wilk/Kolmogorov‑Smirnov tests.
  3. Equality of variances: Conduct Levene’s test or an F‑test.
  4. Choose the test:

    • Equal variances → Student’s (pooled) t‑test.
    • Unequal variances → Welch’s t‑test.
    • Severe non‑normality → Mann‑Whitney U.
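The checklist above amounts to a short decision rule, sketched here. The function name and its three yes/no arguments are hypothetical; the answers themselves still come from the plots and tests listed in steps 1 to 3.

```python
def choose_test(independent, approx_normal, equal_variances):
    """Map the outcome of the assumption checks to a two-sample test."""
    if not independent:
        # None of the tests in this supplement applies to dependent groups
        raise ValueError("Independent-samples tests need independent groups")
    if not approx_normal:
        return "Mann-Whitney U"
    if equal_variances:
        return "Student's (pooled) t-test"
    return "Welch's t-test"


print(choose_test(independent=True, approx_normal=True, equal_variances=True))
print(choose_test(independent=True, approx_normal=True, equal_variances=False))
print(choose_test(independent=True, approx_normal=False, equal_variances=True))
```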

Statistical Toolbox Overview (AO1)

  • Student’s (pooled) independent‑samples t‑test – equal variances.
  • Welch’s t‑test – unequal variances.
  • Chi‑square test – association between categorical variables.
  • One‑way ANOVA – compare >2 group means.
  • Simple linear regression – relationship between two continuous variables.
  • Mann‑Whitney U test – non‑parametric alternative for two independent samples.

Formulae (AO1)

Student’s (pooled) independent‑samples t‑test

\[
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^{2}\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}}
\]

\[
s_p^{2}= \frac{(n_1-1)s_1^{2}+(n_2-1)s_2^{2}}{n_1+n_2-2}
\]
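The two pooled formulae translate directly into code. This is a sketch working from summary statistics; the function name and the example numbers (two groups of five with equal variances) are hypothetical and chosen for easy checking by hand.

```python
import math


def pooled_t_from_summary(mean1, var1, n1, mean2, var2, n2):
    """Return (t, pooled variance, df) for Student's independent-samples t-test."""
    sp2 = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    t = (mean1 - mean2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, sp2, n1 + n2 - 2


# Hypothetical summary statistics for illustration only
t, sp2, df = pooled_t_from_summary(mean1=10.0, var1=4.0, n1=5,
                                   mean2=12.0, var2=4.0, n2=5)
print(f"pooled variance = {sp2:.2f}, t({df}) = {t:.3f}")
```

When the two sample variances are equal, the pooled variance is simply that common value, which makes a quick sanity check.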

Welch’s t‑test (unequal variances)

\[
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^{2}}{n_1}+\frac{s_2^{2}}{n_2}}}
\]

\[
df = \frac{\left(\frac{s_1^{2}}{n_1}+\frac{s_2^{2}}{n_2}\right)^{2}}{\frac{(s_1^{2}/n_1)^{2}}{n_1-1}+\frac{(s_2^{2}/n_2)^{2}}{n_2-1}}
\]
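Welch’s statistic and its approximate degrees of freedom can be sketched the same way. The function name and the example summary statistics (unequal variances and unequal group sizes) are hypothetical.

```python
import math


def welch_t_from_summary(mean1, var1, n1, mean2, var2, n2):
    """Return (t, df) for Welch's t-test; df is usually not a whole number."""
    v1, v2 = var1 / n1, var2 / n2  # squared standard error of each mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical summary statistics for illustration only
t, df = welch_t_from_summary(mean1=10.0, var1=1.0, n1=5,
                             mean2=12.0, var2=9.0, n2=10)
print(f"t = {t:.3f}, df = {df:.1f}")
```

When the variances and group sizes are equal, this df reduces to \(n_1+n_2-2\), the same as for Student’s test; otherwise it is smaller, which makes the test more conservative.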

Effect size (Cohen’s d)

\[
d = \frac{\bar{x}_1-\bar{x}_2}{\sqrt{s_p^{2}}}
\]

Step‑by‑Step Procedure (Student’s t‑test) – AO2 & AO3

  1. State hypotheses:


    \(H_{0}:\mu_{1}=\mu_{2}\) (no difference)

    \(H_{A}:\mu_{1}\neq\mu_{2}\) (two‑tailed), or \(\mu_{1}>\mu_{2}\) / \(\mu_{1}<\mu_{2}\) (one‑tailed).

  2. Collect data and enter into a spreadsheet.
  3. Calculate descriptive statistics – means \(\bar{x}_1,\bar{x}_2\) and variances \(s_1^{2},s_2^{2}\).
  4. Check assumptions using the checklist. If the variances differ, carry out steps 5‑9 with Welch’s formulae instead.
  5. Compute pooled variance \(s_p^{2}\) (or skip this for Welch’s).
  6. Calculate the t‑value using the appropriate formula.
  7. Determine degrees of freedom:


    Student’s: \(df = n_1+n_2-2\)


    Welch’s: use the approximation formula above.

  8. Obtain the critical t (or p‑value) from a t‑distribution table or software for the chosen \(\alpha\) (usually 0.05) and the appropriate tail(s).
  9. Decision:

    • If \(|t| > t_{\text{crit}}\) (or \(p \le \alpha\)) → reject \(H_{0}\).

    • Otherwise, fail to reject \(H_{0}\).

  10. Report the result in the standard format, e.g. (illustrative values for two groups of six)

    t(10) = -3.42, p = 0.007, d = -1.97.

  11. Interpretation (AO3): translate the statistical conclusion into a biological statement, discuss effect size, confidence interval and any limitations.
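Steps 8 to 10 can be sketched as a table lookup followed by a comparison. The dictionary below holds standard two‑tailed critical values for α = 0.05 at a few degrees of freedom only; the names `T_CRIT_05_TWO_TAILED`, `decide` and `report` are hypothetical, and in practice a full t‑table or statistical software would be used.

```python
# Standard two-tailed critical t values for alpha = 0.05 (selected df only)
T_CRIT_05_TWO_TAILED = {
    8: 2.306, 10: 2.228, 12: 2.179, 14: 2.145,
    16: 2.120, 18: 2.101, 20: 2.086,
}


def decide(t, df, table=T_CRIT_05_TWO_TAILED):
    """Return (reject_H0, critical value) for a two-tailed test at alpha = 0.05."""
    t_crit = table[df]  # raises KeyError if df is not in this short table
    return abs(t) > t_crit, t_crit


def report(t, df, reject):
    """Format the outcome in the t(df) = value style used in step 10."""
    verdict = "reject H0" if reject else "fail to reject H0"
    return f"t({df}) = {t:.2f}, {verdict} at alpha = 0.05"


reject, t_crit = decide(t=-3.42, df=10)
print(report(-3.42, 10, reject))
```

Comparing \(|t|\) with the critical value gives the same decision as comparing p with α, but only the p‑value route tells the reader how strong the evidence is.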

Worked Biological Example (AO1–AO3)

Biological context (Topic 12 – Enzymes): A mutant allele is suspected to increase the activity of a key metabolic enzyme.

Wild‑type activity, U mg⁻¹ (n = 6) | Mutant activity, U mg⁻¹ (n = 6)
12.1 | 15.4
11.8 | 14.9
12.5 | 15.1
12.0 | 14.7
11.9 | 15.3
12.3 | 15.0

  • \(n_1=n_2=6\)
  • \(\bar{x}_1 = 12.10\) U mg⁻¹, \(\bar{x}_2 = 15.07\) U mg⁻¹, so \(\bar{x}_1-\bar{x}_2 = -2.97\) U mg⁻¹
  • \(s_1^{2}=0.0680\), \(s_2^{2}=0.0667\)
  • Levene’s test: p = 0.62 → equal variances → use Student’s t‑test.
  • Pooled variance: \(s_p^{2}= \frac{5(0.0680)+5(0.0667)}{10}=0.0673\)
  • \[
    t = \frac{12.10-15.07}{\sqrt{0.0673\left(\frac{1}{6}+\frac{1}{6}\right)}}=
    \frac{-2.97}{\sqrt{0.0673\times0.333}}=
    \frac{-2.97}{0.150}= -19.8
    \]

  • Degrees of freedom: \(df = 6+6-2 = 10\)
  • Critical t (two‑tailed, α = 0.05, df = 10) = 2.228.
  • \(|t| = 19.8 > 2.228\) → reject \(H_{0}\).
  • Effect size: \(d = \dfrac{-2.97}{\sqrt{0.0673}} = -11.4\) (very large).
  • 95 % CI for the mean difference: \(-2.97 \pm 2.228\times0.150\) ≈ –3.3 to –2.6 U mg⁻¹.

Result reporting: t(10) = -19.8, p < 0.001, d = -11.4, 95 % CI = [-3.3, -2.6]
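The whole calculation can be reproduced from the raw activities in the table. This is a sketch using only Python’s standard library; the variable names are arbitrary, and 2.228 is the tabulated two‑tailed critical value for df = 10. It works from the unrounded means and variances, so hand calculations that round these values early can give noticeably different t and d.

```python
import math
import statistics

wild_type = [12.1, 11.8, 12.5, 12.0, 11.9, 12.3]  # U mg^-1
mutant = [15.4, 14.9, 15.1, 14.7, 15.3, 15.0]     # U mg^-1

n1, n2 = len(wild_type), len(mutant)
mean1, mean2 = statistics.mean(wild_type), statistics.mean(mutant)
var1, var2 = statistics.variance(wild_type), statistics.variance(mutant)  # sample variances

# Pooled variance and standard error of the difference in means
sp2 = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
se = math.sqrt(sp2 * (1 / n1 + 1 / n2))

diff = mean1 - mean2
t = diff / se
df = n1 + n2 - 2
d = diff / math.sqrt(sp2)  # Cohen's d

t_crit = 2.228  # two-tailed, alpha = 0.05, df = 10 (from a t-table)
ci_low, ci_high = diff - t_crit * se, diff + t_crit * se

print(f"means: {mean1:.2f}, {mean2:.2f}; variances: {var1:.4f}, {var2:.4f}")
print(f"t({df}) = {t:.1f}, d = {d:.1f}, 95% CI = [{ci_low:.1f}, {ci_high:.1f}]")
```

Keeping full precision until the final step and rounding only the reported values is the safer habit, because the standard error here is small and t is sensitive to it.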

Interpretation & Evaluation (AO3)

  • Biological meaning: The mutant allele produces a markedly higher enzyme activity; the effect is both statistically significant and biologically large.
  • Assumption check: Shapiro‑Wilk p > 0.2 for both groups (normality); Levene’s test p = 0.62 (equal variances). Sample size meets the minimum requirement for a t‑test.
  • Limitations:

    • Only six replicates per group – limited power to detect smaller effects.
    • Experiment performed under a single set of growth conditions; results may not generalise to other environments.
    • Potential hidden confounders (e.g., slight differences in protein extraction efficiency).

  • Further work: increase n, test additional mutant alleles, perform a dose‑response assay, or use ANOVA to compare more than two genotypes.

Suggested Classroom Activity (AO3)

Objective: Test whether two seed genotypes differ in mean germination time.

  1. Formulate a hypothesis (e.g., “Genotype A germinates faster than Genotype B”).
  2. Plant at least 8 seeds of each genotype under identical conditions.
  3. Record the time (hours) each seed takes to germinate.
  4. Enter data into a spreadsheet; produce a box‑plot to visualise spread and check normality.
  5. Run the appropriate t‑test (Student’s or Welch’s) or Mann‑Whitney U if assumptions fail.
  6. Report the result in the format t(df) = value, p = …, d = … and discuss the biological implication.

Key Take‑aways (AO1 & AO2)

  • The independent‑samples t‑test (or Welch’s version) is the standard method for comparing two group means in biological research.
  • Checking assumptions (independence, normality, equal variances) is essential before interpreting the test.
  • Report the test statistic, degrees of freedom, p‑value, effect size and, where possible, a confidence interval.
  • Always link the statistical conclusion back to the underlying biological question and evaluate the reliability of the result.

Summary Table – What Students Should Be Able to Do

AO | What to Know (AO1) | What to Do (AO2) | What to Evaluate (AO3)
AO1 | Concept of variation, normal distribution, p‑value, Type I/II errors, effect size, when to use parametric vs non‑parametric tests. | |
AO2 | | Select the correct test, calculate means, variances, t‑value, df, p‑value and effect size; produce a box‑plot. |
AO3 | | | Assess whether assumptions are met, discuss biological relevance of the effect size, comment on sample size, possible biases and suggestions for further investigation.

Suggested Flowchart (to be drawn on board or slide)

Hypothesis → Data collection → Visual check (box‑plot) → Test assumptions (normality, equal variance) → Choose Student’s, Welch’s or Mann‑Whitney U → Calculate test statistic & df → Obtain p‑value (or CI) → Decision (reject/fail to reject H₀) → Biological interpretation & evaluation.