Write an SQL script to query or modify data (DML) which are stored in (at most two) database tables

8.3 Data Definition Language (DDL) & Data Manipulation Language (DML)

Learning objective

Write an SQL script that queries or modifies data (DML) stored in no more than two database tables.


1. Key ideas you must know

  • DDL – statements that define or change the structure of database objects (tables, indexes, views, constraints, etc.).
  • DML – statements that retrieve or change the data stored in those objects.
  • Referential integrity – foreign‑key relationships that keep related tables consistent.
  • Normalisation – organising tables so they satisfy 1NF, 2NF and 3NF, thereby removing redundancy.
  • Relational‑algebra operators – the logical basis of SQL (selection, projection, join, aggregation).


2. Relational‑algebra reminder

AlgebraSQL equivalent
σcondition(R) (selection)SELECT … FROM R WHERE condition
πattributes(R) (projection)SELECT attribute1, attribute2 … FROM R
R ⋈condition S (join)FROM R JOIN S ON condition
γgroup, agg(R) (aggregation)SELECT … FROM R GROUP BY … HAVING …


3. Typical DDL statements

CommandPurposeExample
CREATE TABLEDefine a new table, its columns and constraints.
CREATE TABLE Student (

StudentID INT PRIMARY KEY,

Name VARCHAR(50) NOT NULL,

Age INT CHECK (Age >= 16),

Phone VARCHAR(15) DEFAULT 'UNKNOWN'

);

ALTER TABLEModify an existing table – add, drop or change columns / constraints.
ALTER TABLE Student

ADD Email VARCHAR(100) UNIQUE;

DROP TABLEDelete a table and all its data permanently.
DROP TABLE Student;
TRUNCATE TABLERemove all rows quickly without dropping the table definition.
TRUNCATE TABLE Enrolment;

Common constraint types

ConstraintPurposeSQL syntax
PRIMARY KEYUniquely identifies each row.StudentID INT PRIMARY KEY
UNIQUEEnsures all values in a column are different.Email VARCHAR(100) UNIQUE
NOT NULLDisallows NULL values.Name VARCHAR(50) NOT NULL
CHECKRestricts values to a Boolean expression.Age INT CHECK (Age >= 16)
DEFAULTProvides a default value when none is supplied.Phone VARCHAR(15) DEFAULT 'UNKNOWN'
FOREIGN KEY … REFERENCES … ON DELETE/UPDATE …Enforces referential integrity between tables.
FOREIGN KEY (StudentID)

REFERENCES Student(StudentID)

ON DELETE CASCADE

ON UPDATE CASCADE


4. Typical DML statements

CommandPurposeExample
SELECTRetrieve rows that satisfy a condition.
SELECT Name, Age FROM Student WHERE Age > 18;
INSERTAdd new rows to a table.
INSERT INTO Student (StudentID, Name, Age)

VALUES (101, 'Alice', 19);

UPDATEModify existing rows.
UPDATE Student

SET Age = Age + 1

WHERE StudentID = 101;

DELETERemove rows from a table.
DELETE FROM Student

WHERE StudentID = 101;

Advanced SELECT features required for Paper 4

FeatureSQL syntaxTypical use
JOIN types
-- INNER (default)

FROM Student s INNER JOIN Enrolment e

ON s.StudentID = e.StudentID

-- LEFT OUTER

FROM Student s LEFT JOIN Enrolment e

ON s.StudentID = e.StudentID

Combine rows from two tables; LEFT shows all students even if they have no enrolments.
Aggregate functions
SELECT CourseCode, COUNT(*) AS NumStudents

FROM Enrolment

GROUP BY CourseCode

HAVING COUNT(*) > 5;

Summarise data – e.g., number of students per course.
DISTINCT
SELECT DISTINCT Semester FROM Enrolment;
Remove duplicate rows.
IN / BETWEEN / LIKE
WHERE CourseCode IN ('CS101','MA102')

WHERE Age BETWEEN 18 AND 22

WHERE Name LIKE 'A%'

Various ways to filter rows.
SET operations (UNION, INTERSECT, EXCEPT)
SELECT StudentID FROM Enrolment WHERE Semester='Fall21'

UNION

SELECT StudentID FROM Enrolment WHERE Semester='Spring22';

Combine results from two queries.
Sub‑query in WHERE
SELECT Name FROM Student

WHERE StudentID IN (SELECT StudentID FROM Enrolment

WHERE CourseCode='CS101');

Find students enrolled in a particular course.


5. Example scenario – two tables only

The following design satisfies 3NF and demonstrates every DML operation you may be asked to write.

5.1 DDL – create the tables

CREATE TABLE Student (

StudentID INT PRIMARY KEY,

Name VARCHAR(50) NOT NULL,

Age INT CHECK (Age >= 16),

Phone VARCHAR(15) DEFAULT 'UNKNOWN'

);

CREATE TABLE Enrolment (

EnrolmentID INT PRIMARY KEY,

StudentID INT,

CourseCode VARCHAR(10) NOT NULL,

Semester VARCHAR(6) NOT NULL,

CONSTRAINT FKEnrolStudent

FOREIGN KEY (StudentID)

REFERENCES Student(StudentID)

ON DELETE CASCADE

ON UPDATE CASCADE

);

5.2 DML – a complete script (uses only the two tables)

-- 1. Insert new students

INSERT INTO Student (StudentID, Name, Age, Phone) VALUES

(201, 'Bob', 20, '01234 567890'),

(202, 'Clara', 19, '09876 543210');

-- 2. Enrol the students in courses

INSERT INTO Enrolment (EnrolmentID, StudentID, CourseCode, Semester) VALUES

(301, 201, 'CS101', 'Fall21'),

(302, 202, 'CS101', 'Fall21'),

(303, 202, 'MA102', 'Fall21');

-- 3. Update a student's age (Bob has a birthday)

UPDATE Student

SET Age = Age + 1

WHERE StudentID = 201;

-- 4. Delete an enrolment (Clara drops MA102)

DELETE FROM Enrolment

WHERE EnrolmentID = 303;

-- 5. List all students together with the courses they are currently enrolled in

SELECT s.Name, e.CourseCode, e.Semester

FROM Student s

JOIN Enrolment e ON s.StudentID = e.StudentID

ORDER BY s.Name;

-- 6. Find students who are not enrolled in any course

SELECT s.Name

FROM Student s

LEFT JOIN Enrolment e ON s.StudentID = e.StudentID

WHERE e.StudentID IS NULL;

-- 7. Number of students per course (aggregation)

SELECT CourseCode,

COUNT(*) AS NumStudents

FROM Enrolment

GROUP BY CourseCode

HAVING COUNT(*) > 0

ORDER BY NumStudents DESC;

5.3 Explanation of each step

  • Step 1‑2: INSERT adds data to both tables. The foreign key in Enrolment guarantees that only existing StudentID values can be used.
  • Step 3: UPDATE changes a single row because the WHERE clause uniquely identifies the student.
  • Step 4: DELETE removes exactly one enrolment; the WHERE clause prevents accidental loss of all rows.
  • Step 5: An INNER JOIN – only students with at least one enrolment appear.
  • Step 6: A LEFT JOIN plus IS NULL finds students with no enrolments.
  • Step 7: Demonstrates GROUP BY, COUNT and HAVING to produce a summary.


6. Normalisation note

The design separates Student (entity data) from Enrolment (relationship data). This achieves:

  • 1NF – each column holds atomic values; no repeating groups.
  • 2NF – the primary key of Enrolment is a single column (EnrolmentID), so no non‑key attribute depends only on part of a composite key.
  • 3NF – there are no transitive dependencies; all non‑key attributes depend directly on the primary key of their own table.

This is the type of model the Cambridge syllabus expects you to recognise and use when writing DML scripts.


7. Common pitfalls (and how to avoid them)

  • Forgetting NOT NULL or CHECK constraints – leads to illegal or meaningless data.
  • Omitting the foreign‑key REFERENCES clause – destroys referential integrity.
  • Using DELETE without a WHERE clause – deletes every row in the table.
  • Writing a too‑broad WHERE clause in UPDATE – may modify many rows unintentionally.
  • Not specifying ON DELETE/UPDATE CASCADE when you want dependent rows removed or updated automatically.
  • Relying on SELECT * in scripts – makes the query fragile if the table definition changes.


8. Suggested diagram

ER diagram (one‑to‑many) – Student (1)Enrolment (M). The foreign‑key from Enrolment to Student enforces referential integrity.


9. Self‑check questions

  1. What DDL command would you use to add a new column Phone to the Student table, and ensure it cannot contain NULL values?
  2. Write an SQL statement that lists the names of students who are not enrolled in any course.
  3. Explain why the ON DELETE CASCADE clause in the foreign‑key definition is useful.
  4. Give an example of a query that uses GROUP BY and HAVING to show courses with more than three students.
  5. Identify two normal‑form violations that would occur if the CourseCode were stored directly in the Student table.

Answers (teacher reference)

  1. ALTER TABLE Student ADD Phone VARCHAR(15) NOT NULL;
  2. SELECT s.Name

    FROM Student s

    LEFT JOIN Enrolment e ON s.StudentID = e.StudentID

    WHERE e.StudentID IS NULL;

  3. ON DELETE CASCADE automatically removes all enrolment rows that reference a student when that student is deleted, keeping the database consistent without manual clean‑up.
  4. SELECT CourseCode, COUNT(*) AS NumStudents

    FROM Enrolment

    GROUP BY CourseCode

    HAVING COUNT(*) > 3;

    • Storing CourseCode in Student creates a repeating group (violates 1NF) because a student can take many courses.
    • It also introduces a partial dependency on a non‑key attribute (violates 2NF) and a transitive dependency if additional course details are added (violates 3NF).