Cambridge Notes, Past Papers, Revision Questions

8.1 Database Concepts – Normalisation (Cambridge AS/A Level 9618)

Learning Objective

Show a thorough understanding of the normalisation process, how it is used to design efficient, anomaly‑free relational databases and how it connects with other parts of the syllabus – DBMS functions, SQL, ER‑diagrams, data‑integrity and security.

1. Core Relational Terminology

Term	Definition (Cambridge)
Relation (table)	A set of tuples (rows) that share the same attributes (columns).
Tuple (record)	A single row of a relation.
Attribute (field)	A column; a property of the entity the relation represents.
Domain	The set of all possible values an attribute may take.
Primary key	A minimal set of attributes that uniquely identifies each tuple.
Candidate key	Any minimal set of attributes that could serve as a primary key.
Prime attribute	An attribute that belongs to a candidate key.
Foreign key	An attribute (or set) that references the primary key of another relation.
Determinant	An attribute (or set) that appears on the left‑hand side of a functional dependency.
Functional dependency (FD)	Written `X → Y`; the value of `X` uniquely determines the value of `Y`.
Partial dependency	An FD in which a proper subset of a composite key determines a non‑prime attribute.
Transitive dependency	An FD in which a non‑prime attribute determines another non‑prime attribute.

2. DBMS Responsibilities (Syllabus 8.2)

Data definition & storage – data dictionary, schema management.

Data manipulation – DML commands (SELECT, INSERT, UPDATE, DELETE).

Security & access control – user accounts, privileges, authentication.

Integrity enforcement – primary‑key, foreign‑key, unique, check constraints.

Backup & recovery – regular backups, transaction logs, restore procedures.

Concurrency control – locking, isolation levels.

3. Data Definition Language (DDL) – Sample SQL

-- three tables used in the normalisation example
CREATE TABLE Student (
StudentID   CHAR(3) PRIMARY KEY,
StudentName VARCHAR(30) NOT NULL
);
CREATE TABLE Course (
CourseCode  CHAR(4) PRIMARY KEY,
CourseTitle VARCHAR(40) NOT NULL,
Lecturer    VARCHAR(30) NOT NULL
);
CREATE TABLE Enrolment (
StudentID   CHAR(3),
CourseCode  CHAR(4),
Semester    VARCHAR(10) NOT NULL,
PRIMARY KEY (StudentID, CourseCode),
FOREIGN KEY (StudentID) REFERENCES Student(StudentID),
FOREIGN KEY (CourseCode) REFERENCES Course(CourseCode)
);

4. Data Manipulation Language (DML) – Sample SQL

INSERT INTO Student VALUES ('101','Alice'),('102','Bob');
INSERT INTO Course VALUES 
('C101','Programming','Dr. Smith'),
('C202','Databases','Prof. Lee');
INSERT INTO Enrolment VALUES
('101','C101','Fall'),
('102','C101','Fall'),
('101','C202','Spring');
-- list all courses taken by Alice
SELECT e.CourseCode, c.CourseTitle, e.Semester
FROM Enrolment e
JOIN Student s   ON e.StudentID = s.StudentID
JOIN Course c    ON e.CourseCode = c.CourseCode
WHERE s.StudentName = 'Alice';

5. ER‑Diagram Notation (Syllabus 8.3)

Entity – rectangle; attributes – ovals (key attributes underlined).

Relationship – diamond; cardinality shown by (1, N, M) or crow’s‑foot notation.

Mapping rules:
- One entity → one table.
- A many‑to‑many relationship → an associative table containing the primary keys of the participating entities as foreign keys.

ER diagram for the example (Student 1‑* Enrolment *‑1 Course).

6. Normalisation – Theory and Forms

6.1 Why Normalise?

Eliminate redundancy – each fact stored once.

Prevent the three classic anomalies:
- Update anomaly – a change must be made in many places.
- Insertion anomaly – some data cannot be added without other data.
- Deletion anomaly – removal of one fact unintentionally deletes another.

Make the schema easier to maintain, extend and query.

6.2 Functional Dependencies (FDs)

An FD X → Y holds in a relation R if, for any two tuples of R, equality of the X‑values guarantees equality of the Y‑values.

Key concepts:

Closure – the set of attributes that can be functionally derived from a given set.

Minimal cover – a simplified, equivalent set of FDs used for decomposition.

Lossless‑join decomposition – after splitting a relation, joining the resulting tables on their common attributes reproduces the original relation without spurious tuples.

Dependency preservation – all original FDs can be enforced by constraints on the decomposed tables.

6.3 Normal Forms up to BCNF

Normal Form	Requirements	Typical Anomalies Removed
1NF	All attribute values are atomic; no repeating groups; a primary key is defined.	None (pre‑condition for higher forms).
2NF	1NF + every non‑prime attribute is fully functionally dependent on the whole primary key (no partial dependencies).	Partial‑dependency update anomalies.
3NF	2NF + no transitive dependencies of non‑prime attributes on other non‑prime attributes.	Transitive‑dependency update anomalies.
BCNF	Every determinant is a candidate key (stronger than 3NF).	Remaining anomalies where a non‑key determinant exists.

6.4 Higher Normal Forms (optional for A‑Level)

4NF – eliminates multi‑valued dependencies.

5NF (PJ/NF) – eliminates join dependencies that are not implied by candidate keys.

7. Step‑by‑Step Normalisation Process (Checklist)

Ensure 1NF – verify atomicity and define a primary key.

Identify all functional dependencies – use domain knowledge, business rules or ER‑diagram cardinalities.

Determine candidate keys and prime attributes.

Apply 2NF – if a composite key exists, remove any partial dependencies by creating separate tables.

Apply 3NF – eliminate transitive dependencies; each non‑prime attribute must depend directly on a candidate key.

Check BCNF – if a determinant is not a candidate key, decompose further (ensure lossless join and, where possible, dependency preservation).

Validate the design – test with sample data, verify that the three anomalies are gone, and confirm that all original FDs are enforceable.

Implement – write DDL, add primary‑key, foreign‑key and CHECK constraints, and grant appropriate privileges.

8. Detailed Normalisation Example

8.1 Original (Unnormalised) Table

StudentID	StudentName	CourseCode	CourseTitle	Lecturer	Semester
101	Alice	C101	Programming	Dr. Smith	Fall
102	Bob	C101	Programming	Dr. Smith	Fall
101	Alice	C202	Databases	Prof. Lee	Spring

8.2 Functional Dependencies

StudentID → StudentName

CourseCode → CourseTitle, Lecturer

(StudentID, CourseCode) → Semester

8.3 1NF Check

All columns contain single, indivisible values, so the table already satisfies 1NF.

8.4 2NF Decomposition (remove partial dependencies)

Because the primary key is composite (StudentID, CourseCode), the dependencies StudentID → StudentName and CourseCode → CourseTitle, Lecturer are partial. Split into three relations:

Student (PK = StudentID)
StudentID	StudentName
101	Alice
102	Bob

Course (PK = CourseCode)
CourseCode	CourseTitle	Lecturer
C101	Programming	Dr. Smith
C202	Databases	Prof. Lee

Enrolment (PK = StudentID, CourseCode)
StudentID	CourseCode	Semester
101	C101	Fall
102	C101	Fall
101	C202	Spring

All non‑prime attributes now depend on the whole primary key of their own table → 2NF achieved.

8.5 3NF Check (remove transitive dependencies)

In each of the three tables the only determinants are the primary keys, so there are no transitive dependencies. The design is therefore already in 3NF.

8.6 BCNF Verification

Student: determinant StudentID is the sole candidate key → BCNF.

Course: determinant CourseCode is the candidate key → BCNF.

Enrolment: determinant (StudentID, CourseCode) is the composite candidate key → BCNF.

8.7 Anomaly Elimination

Update anomaly – changing a lecturer’s name now requires a single update in Course instead of many rows.

Insertion anomaly – a new student can be added to Student without needing an enrolment record.

Deletion anomaly – removing the last enrolment of a course does not delete the course information, because it resides in Course.

9. Linking Normalisation to the Rest of the Syllabus

DBMS Functions – the DBMS enforces the primary‑key, foreign‑key and CHECK constraints that result from a normalised design.

SQL – DDL statements create the tables; DML statements manipulate data while preserving integrity.

ER‑Diagrams – a well‑drawn ER diagram makes functional dependencies explicit, guiding the normalisation process.

Data Integrity – normalisation reduces redundancy, making integrity constraints easier to maintain.

Security – fewer tables and clearer relationships simplify the assignment of privileges (GRANT/REVOKE) and auditing.

10. Summary of the Normalisation Process

Confirm the relation is in 1NF (atomic values, primary key).

List all functional dependencies; identify candidate keys and prime attributes.

Apply 2NF – split off tables to eliminate partial dependencies.

Apply 3NF – split further to remove transitive dependencies.

Check BCNF – if any determinant is not a candidate key, decompose while ensuring a lossless join.

Validate the design with sample data, confirm that update/insertion/deletion anomalies are gone, and that original FDs are enforceable.

Implement the schema with DDL, add integrity constraints, and secure the database using appropriate privileges.

Key Take‑aways

Normalisation is a systematic, syllabus‑driven method for eliminating redundancy and preventing the three classic anomalies.

Mastering functional dependencies, candidate keys, and the distinction between partial and transitive dependencies is essential for moving through the normal forms.

A well‑normalised schema maps cleanly from an ER diagram to a set of SQL tables, making future extensions, testing and maintenance straightforward.

DBMS features (DDL/DML, integrity constraints, security, backup) complement the design process and are part of the same assessment criteria.

Show understanding of the normalisation process

8.1 Database Concepts – Normalisation (Cambridge AS/A Level 9618)