Cambridge A-Level Biology – Evolution: DNA Sequence Data and Evolutionary Relationships
Evolution – Using DNA Sequence Data to Infer Relationships
Learning Objective
Discuss how DNA sequence data can show evolutionary relationships between species.
Key Concepts
DNA as a molecular record of evolutionary history.
Homology vs. analogy at the molecular level.
Sequence similarity and divergence.
Phylogenetic trees constructed from sequence data.
Why DNA Sequences are Useful
DNA is composed of four nucleotides (A, T, C, G). Mutations accumulate over time, so the number of sequence differences between two lineages tends to increase with the time since they last shared a common ancestor (the "molecular clock" idea). By comparing the order of nucleotides (or the amino‑acid sequences they encode) between species, we can estimate how closely related they are.
Steps in Using DNA Data for Phylogeny
Choose a gene or genomic region that is present in all taxa of interest (e.g., cytochrome c oxidase I, ribosomal RNA genes).
Extract, amplify (PCR) and sequence the region from each species.
Align the sequences to identify homologous positions.
Calculate a measure of similarity/distance (e.g., % identity, or the estimated number of substitutions per site).
Construct a phylogenetic tree using an algorithm (Neighbour‑Joining, Maximum Likelihood, etc.).
Interpret the tree in an evolutionary context.
Measuring Sequence Similarity
Two common metrics are:
Percentage identity: proportion of identical nucleotides after alignment.
Genetic distance $d$, often estimated using models such as Jukes‑Cantor:
$$
d = -\frac{3}{4}\ln\left(1-\frac{4}{3}p\right)
$$
where $p$ is the proportion of observed differences.
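The two metrics above can be sketched in a few lines of Python. This is a minimal illustration, assuming the sequences are already aligned to equal length with `-` marking gaps; the function names `p_distance`, `percent_identity` and `jukes_cantor` are illustrative and do not come from any library.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of compared sites that differ between two aligned sequences.

    Assumption: alignment columns with a gap ('-') in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of compared sites that are identical."""
    return 100.0 * (1.0 - p_distance(seq_a, seq_b))


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor distance d = -(3/4) ln(1 - (4/3) p)."""
    if p >= 0.75:
        raise ValueError("Jukes-Cantor distance is undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


# Toy comparison of two 10-base fragments (far too short for a real analysis).
p = p_distance("ATGCCGTAGC", "ATGCCGTAGT")
print(f"p = {p:.3f}, identity = {100 * (1 - p):.1f} %, d = {jukes_cantor(p):.4f}")
```

The corrected distance is always at least as large as $p$, because the model allows for sites that have changed more than once.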
Example Data Table
| Species | Sequence (5'‑3') | Length (bp) |
|---|---|---|
| Human (Homo sapiens) | ATGCCGTAGC… | 658 |
| Chimpanzee (Pan troglodytes) | ATGCCGTAGT… | 658 |
| Mouse (Mus musculus) | ATGTCGTAGC… | 658 |
| Frog (Rana temporaria) | ATGACGTAGC… | 658 |
Interpreting the Table
Compared with the human sequence over the full 658 bp (the table shows only the first ten bases of each), the chimpanzee sequence differs at only 2 positions (≈99.7 % identity), indicating a recent common ancestor. The mouse sequence differs from the human at 12 positions (≈98.2 % identity), and the frog sequence at 30 positions (≈95.4 % identity), reflecting progressively deeper divergence.
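As a worked check of these figures, the difference counts can be converted to Jukes‑Cantor distances. This is a sketch that assumes the Jukes‑Cantor formula given earlier applies, with $p$ taken as differences divided by 658; each $d$ is computed from the unrounded $p$.

$$
\begin{aligned}
\text{Human vs chimpanzee:}\quad p &= \tfrac{2}{658} \approx 0.0030, & d &= -\tfrac{3}{4}\ln\left(1-\tfrac{4}{3}p\right) \approx 0.0030 \\
\text{Human vs mouse:}\quad p &= \tfrac{12}{658} \approx 0.0182, & d &\approx 0.0185 \\
\text{Human vs frog:}\quad p &= \tfrac{30}{658} \approx 0.0456, & d &\approx 0.0470
\end{aligned}
$$

The correction is negligible for the closest pair and grows with divergence, because more sites are expected to have changed more than once.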
Constructing a Phylogenetic Tree
Using the genetic distances calculated from the table, a simple neighbour‑joining tree might be:
Suggested diagram: A rooted phylogenetic tree showing Human and Chimpanzee as sister taxa, with Mouse branching earlier, and Frog as the outgroup.
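To make the tree‑building step concrete, here is a minimal neighbour‑joining sketch in plain Python. It is an illustration under stated assumptions, not a substitute for a phylogenetics package: the function name `neighbour_joining` is made up for this example, the three human‑based distances are rounded Jukes‑Cantor values for 2, 12 and 30 differences in 658 bp, and the other three distances are hypothetical because the notes do not give them.

```python
from itertools import combinations


def neighbour_joining(names, dist):
    """Build an unrooted tree and return it as a Newick string.

    names: list of taxon labels.
    dist:  dict mapping frozenset({a, b}) -> pairwise distance.
    """
    nodes = list(names)
    d = dict(dist)
    while len(nodes) > 3:
        n = len(nodes)
        # Sum of distances from each node to every other node.
        r = {i: sum(d[frozenset((i, k))] for k in nodes if k != i) for i in nodes}
        best_pair, best_q = None, None
        for i, j in combinations(nodes, 2):
            # Q criterion; rounded so floating-point noise cannot break ties,
            # which then resolve to the first pair in input order.
            q = round((n - 2) * d[frozenset((i, j))] - r[i] - r[j], 12)
            if best_q is None or q < best_q:
                best_pair, best_q = (i, j), q
        i, j = best_pair
        dij = d[frozenset((i, j))]
        # Branch lengths from the joined pair to their new internal node.
        li = dij / 2 + (r[i] - r[j]) / (2 * (n - 2))
        lj = dij - li
        u = f"({i}:{li:.4f},{j}:{lj:.4f})"
        # Distances from the new node to every remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((u, k))] = (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - dij
                ) / 2
        nodes = [k for k in nodes if k not in (i, j)] + [u]
    # Join the last three nodes at a single central node.
    a, b, c = nodes
    dab, dac, dbc = d[frozenset((a, b))], d[frozenset((a, c))], d[frozenset((b, c))]
    la = (dab + dac - dbc) / 2
    lb = (dab + dbc - dac) / 2
    lc = (dac + dbc - dab) / 2
    return f"({a}:{la:.4f},{b}:{lb:.4f},{c}:{lc:.4f});"


names = ["Human", "Chimp", "Mouse", "Frog"]
pairs = {
    ("Human", "Chimp"): 0.003,  # rounded from 2 differences in 658 bp
    ("Human", "Mouse"): 0.018,  # rounded from 12 differences in 658 bp
    ("Human", "Frog"): 0.047,   # rounded from 30 differences in 658 bp
    ("Chimp", "Mouse"): 0.018,  # hypothetical value
    ("Chimp", "Frog"): 0.047,   # hypothetical value
    ("Mouse", "Frog"): 0.049,   # hypothetical value
}
dist = {frozenset(pair): value for pair, value in pairs.items()}
tree = neighbour_joining(names, dist)
print(tree)
```

Neighbour‑joining returns an unrooted tree. Rooting it on the frog branch (the outgroup) gives the arrangement described above, with human and chimpanzee as sister taxa and mouse branching earlier. With only four taxa the two possible pairings always tie on the Q criterion and lead to the same unrooted tree, which is why the sketch rounds Q and keeps the first pair it finds.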
Factors That Can Mislead Molecular Phylogenies
Horizontal gene transfer (common in bacteria).
Gene duplication and paralogy – comparing non‑orthologous genes can give false relationships.
Variable mutation rates among lineages (rate heterogeneity).
Incomplete lineage sorting.
Linking Molecular Data to Classical Evidence
DNA‑based trees often corroborate morphological and fossil evidence, but can also reveal hidden relationships. For example, molecular data placed whales within the even‑toed ungulates (order Artiodactyla), with hippos as their closest living relatives, a relationship not obvious from anatomy alone.
Summary Checklist
Identify an appropriate genetic marker.
Obtain high‑quality sequences from each species.
Perform accurate multiple sequence alignment.
Choose a suitable model of sequence evolution.
Construct and test phylogenetic trees.
Compare molecular results with morphological/fossil data.
Exam‑Style Question
Q: Explain how the percentage identity of a mitochondrial gene can be used to infer the evolutionary relationship between two species. Include a brief description of how a phylogenetic tree would be derived from the data.
Answer Outline
State that higher percentage identity generally indicates more recent common ancestry, because fewer mutations have accumulated since the lineages split.
Describe alignment and calculation of % identity.
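Explain conversion of % identity to genetic distance: take the proportion of differing sites (1 minus the identity expressed as a fraction) and apply a model (e.g., Jukes‑Cantor) to correct for multiple substitutions at the same site.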
Outline tree construction (e.g., neighbour‑joining) using the distance matrix.
Interpret the resulting topology in evolutionary terms.
Further Reading (Suggested)
“Molecular Evolution and Phylogenetics” – Nei & Kumar (chapters on phylogenetic methods).
Cambridge International AS & A Level Biology syllabus – Section on Evolution.