Show understanding of the normalisation process

8.1 Database Concepts – Normalisation (Cambridge AS/A Level 9618)

Learning Objective

Show a thorough understanding of the normalisation process, how it is used to design efficient, anomaly‑free relational databases and how it connects with other parts of the syllabus – DBMS functions, SQL, ER‑diagrams, data‑integrity and security.

1. Core Relational Terminology

TermDefinition (Cambridge)
Relation (table)A set of tuples (rows) that share the same attributes (columns).
Tuple (record)A single row of a relation.
Attribute (field)A column; a property of the entity the relation represents.
DomainThe set of all possible values an attribute may take.
Primary keyA minimal set of attributes that uniquely identifies each tuple.
Candidate keyAny minimal set of attributes that could serve as a primary key.
Prime attributeAn attribute that belongs to a candidate key.
Foreign keyAn attribute (or set) that references the primary key of another relation.
DeterminantAn attribute (or set) that appears on the left‑hand side of a functional dependency.
Functional dependency (FD)Written X → Y; the value of X uniquely determines the value of Y.
Partial dependencyAn FD in which a proper subset of a composite key determines a non‑prime attribute.
Transitive dependencyAn FD in which a non‑prime attribute determines another non‑prime attribute.

2. DBMS Responsibilities (Syllabus 8.2)

  • Data definition & storage – data dictionary, schema management.
  • Data manipulation – DML commands (SELECT, INSERT, UPDATE, DELETE).
  • Security & access control – user accounts, privileges, authentication.
  • Integrity enforcement – primary‑key, foreign‑key, unique, check constraints.
  • Backup & recovery – regular backups, transaction logs, restore procedures.
  • Concurrency control – locking, isolation levels.

3. Data Definition Language (DDL) – Sample SQL

-- three tables used in the normalisation example
CREATE TABLE Student (
    StudentID   CHAR(3) PRIMARY KEY,
    StudentName VARCHAR(30) NOT NULL
);

CREATE TABLE Course (
    CourseCode  CHAR(4) PRIMARY KEY,
    CourseTitle VARCHAR(40) NOT NULL,
    Lecturer    VARCHAR(30) NOT NULL
);

CREATE TABLE Enrolment (
    StudentID   CHAR(3),
    CourseCode  CHAR(4),
    Semester    VARCHAR(10) NOT NULL,
    PRIMARY KEY (StudentID, CourseCode),
    FOREIGN KEY (StudentID) REFERENCES Student(StudentID),
    FOREIGN KEY (CourseCode) REFERENCES Course(CourseCode)
);

4. Data Manipulation Language (DML) – Sample SQL

INSERT INTO Student VALUES ('101','Alice'),('102','Bob');

INSERT INTO Course VALUES 
    ('C101','Programming','Dr. Smith'),
    ('C202','Databases','Prof. Lee');

INSERT INTO Enrolment VALUES
    ('101','C101','Fall'),
    ('102','C101','Fall'),
    ('101','C202','Spring');

-- list all courses taken by Alice
SELECT e.CourseCode, c.CourseTitle, e.Semester
FROM Enrolment e
JOIN Student s   ON e.StudentID = s.StudentID
JOIN Course c    ON e.CourseCode = c.CourseCode
WHERE s.StudentName = 'Alice';

5. ER‑Diagram Notation (Syllabus 8.3)

  • Entity – rectangle; attributes – ovals (key attributes underlined).
  • Relationship – diamond; cardinality shown by (1, N, M) or crow’s‑foot notation.
  • Mapping rules:
    • One entity → one table.
    • A many‑to‑many relationship → an associative table containing the primary keys of the participating entities as foreign keys.
ER diagram for the example (Student 1‑* Enrolment *‑1 Course).

6. Normalisation – Theory and Forms

6.1 Why Normalise?

  • Eliminate redundancy – each fact stored once.
  • Prevent the three classic anomalies:
    • Update anomaly – a change must be made in many places.
    • Insertion anomaly – some data cannot be added without other data.
    • Deletion anomaly – removal of one fact unintentionally deletes another.
  • Make the schema easier to maintain, extend and query.

6.2 Functional Dependencies (FDs)

An FD X → Y holds in a relation R if, for any two tuples of R, equality of the X‑values guarantees equality of the Y‑values.

Key concepts:

  • Closure – the set of attributes that can be functionally derived from a given set.
  • Minimal cover – a simplified, equivalent set of FDs used for decomposition.
  • Lossless‑join decomposition – after splitting a relation, joining the resulting tables on their common attributes reproduces the original relation without spurious tuples.
  • Dependency preservation – all original FDs can be enforced by constraints on the decomposed tables.

6.3 Normal Forms up to BCNF

Normal FormRequirementsTypical Anomalies Removed
1NF All attribute values are atomic; no repeating groups; a primary key is defined. None (pre‑condition for higher forms).
2NF 1NF + every non‑prime attribute is fully functionally dependent on the whole primary key (no partial dependencies). Partial‑dependency update anomalies.
3NF 2NF + no transitive dependencies of non‑prime attributes on other non‑prime attributes. Transitive‑dependency update anomalies.
BCNF Every determinant is a candidate key (stronger than 3NF). Remaining anomalies where a non‑key determinant exists.

6.4 Higher Normal Forms (optional for A‑Level)

  • 4NF – eliminates multi‑valued dependencies.
  • 5NF (PJ/NF) – eliminates join dependencies that are not implied by candidate keys.

7. Step‑by‑Step Normalisation Process (Checklist)

  1. Ensure 1NF – verify atomicity and define a primary key.
  2. Identify all functional dependencies – use domain knowledge, business rules or ER‑diagram cardinalities.
  3. Determine candidate keys and prime attributes.
  4. Apply 2NF – if a composite key exists, remove any partial dependencies by creating separate tables.
  5. Apply 3NF – eliminate transitive dependencies; each non‑prime attribute must depend directly on a candidate key.
  6. Check BCNF – if a determinant is not a candidate key, decompose further (ensure lossless join and, where possible, dependency preservation).
  7. Validate the design – test with sample data, verify that the three anomalies are gone, and confirm that all original FDs are enforceable.
  8. Implement – write DDL, add primary‑key, foreign‑key and CHECK constraints, and grant appropriate privileges.

8. Detailed Normalisation Example

8.1 Original (Unnormalised) Table

StudentIDStudentNameCourseCodeCourseTitleLecturerSemester
101AliceC101ProgrammingDr. SmithFall
102BobC101ProgrammingDr. SmithFall
101AliceC202DatabasesProf. LeeSpring

8.2 Functional Dependencies

  • StudentID → StudentName
  • CourseCode → CourseTitle, Lecturer
  • (StudentID, CourseCode) → Semester

8.3 1NF Check

All columns contain single, indivisible values, so the table already satisfies 1NF.

8.4 2NF Decomposition (remove partial dependencies)

Because the primary key is composite (StudentID, CourseCode), the dependencies StudentID → StudentName and CourseCode → CourseTitle, Lecturer are partial. Split into three relations:

Student (PK = StudentID)
StudentIDStudentName
101Alice
102Bob
Course (PK = CourseCode)
CourseCodeCourseTitleLecturer
C101ProgrammingDr. Smith
C202DatabasesProf. Lee
Enrolment (PK = StudentID, CourseCode)
StudentIDCourseCodeSemester
101C101Fall
102C101Fall
101C202Spring

All non‑prime attributes now depend on the whole primary key of their own table → **2NF achieved**.

8.5 3NF Check (remove transitive dependencies)

In each of the three tables the only determinants are the primary keys, so there are no transitive dependencies. The design is therefore already in **3NF**.

8.6 BCNF Verification

  • Student: determinant StudentID is the sole candidate key → BCNF.
  • Course: determinant CourseCode is the candidate key → BCNF.
  • Enrolment: determinant (StudentID, CourseCode) is the composite candidate key → BCNF.

8.7 Anomaly Elimination

  1. Update anomaly – changing a lecturer’s name now requires a single update in Course instead of many rows.
  2. Insertion anomaly – a new student can be added to Student without needing an enrolment record.
  3. Deletion anomaly – removing the last enrolment of a course does not delete the course information, because it resides in Course.

9. Linking Normalisation to the Rest of the Syllabus

  • DBMS Functions – the DBMS enforces the primary‑key, foreign‑key and CHECK constraints that result from a normalised design.
  • SQL – DDL statements create the tables; DML statements manipulate data while preserving integrity.
  • ER‑Diagrams – a well‑drawn ER diagram makes functional dependencies explicit, guiding the normalisation process.
  • Data Integrity – normalisation reduces redundancy, making integrity constraints easier to maintain.
  • Security – fewer tables and clearer relationships simplify the assignment of privileges (GRANT/REVOKE) and auditing.

10. Summary of the Normalisation Process

  1. Confirm the relation is in 1NF (atomic values, primary key).
  2. List **all** functional dependencies; identify candidate keys and prime attributes.
  3. Apply 2NF – split off tables to eliminate partial dependencies.
  4. Apply 3NF – split further to remove transitive dependencies.
  5. Check BCNF – if any determinant is not a candidate key, decompose while ensuring a lossless join.
  6. Validate the design with sample data, confirm that update/insertion/deletion anomalies are gone, and that original FDs are enforceable.
  7. Implement the schema with DDL, add integrity constraints, and secure the database using appropriate privileges.

Key Take‑aways

  • Normalisation is a systematic, syllabus‑driven method for eliminating redundancy and preventing the three classic anomalies.
  • Mastering functional dependencies, candidate keys, and the distinction between partial and transitive dependencies is essential for moving through the normal forms.
  • A well‑normalised schema maps cleanly from an ER diagram to a set of SQL tables, making future extensions, testing and maintenance straightforward.
  • DBMS features (DDL/DML, integrity constraints, security, backup) complement the design process and are part of the same assessment criteria.

Create an account or Login to take a Quiz

84 views
0 improvement suggestions

Log in to suggest improvements to this note.