Perform database query operations (create, update, delete)

Topic 10 – Database and File Concepts

Learning Objectives

  • Explain the purpose, types and file‑access methods of Database Management Systems (DBMS).
  • Distinguish flat‑file storage from relational databases and justify the use of a relational model.
  • Design a relational schema from an Entity‑Relationship Diagram (ERD) – conceptual, logical and physical stages.
  • Apply appropriate data types, field sizes and constraints (PK, FK, composite, UNIQUE, NOT NULL, CHECK).
  • Normalise tables to Third Normal Form (3NF) and understand why normalisation matters.
  • Write and execute all CRUD statements (CREATE, INSERT, UPDATE, DELETE) and a range of query types required by the Cambridge 9626 syllabus.
  • Use transactions, validation and verification to maintain data integrity.
  • Describe the role of a data dictionary, file‑type taxonomy and common file‑access methods.
  • Compare the main DBMS models (relational, hierarchical, network, object‑oriented) and discuss their advantages/disadvantages.
  • Explain how databases support a Management Information System (MIS).

1. DBMS Overview

  • Database Management System (DBMS) – software that defines, stores, retrieves, secures and manipulates data efficiently.
  • Key functions: data definition (DDL), data manipulation (DML), security, backup/recovery, concurrency control, transaction management.

1.1 Types of DBMS (Cambridge Syllabus)

ModelStructureTypical Use‑caseAdvantagesDisadvantages
Relational Tables (rows & columns) Business applications, MIS, web‑based systems SQL, strong data integrity, widely supported Less efficient for deep hierarchies
Hierarchical Tree‑like parent‑child structure File systems, XML storage Fast parent‑child navigation Rigid, difficult many‑to‑many queries
Network Graph of records linked by sets Complex engineering data, CAD Efficient many‑to‑many navigation Complex design, limited portability
Object‑Oriented Objects, classes, inheritance Multimedia, software engineering Stores data + behaviour, aligns with OO programming Immature query language, steeper learning curve

2. Flat‑File vs. Relational Database

AspectFlat‑File (e.g., CSV)Relational DB
StructureSingle table, no relationshipsMultiple tables linked by keys
RedundancyHigh – same data repeatedLow – normalisation removes duplication
Query CapabilityLimited (text search, filters)Powerful SQL (joins, aggregates, sub‑queries)
Data IntegrityNone – no constraintsEnforced by primary/foreign keys, CHECK, UNIQUE
ScalabilityPoor for large data setsDesigned for large, concurrent users

Scenario: A school wants to store students, courses and enrolments. With a flat file each enrolment row would repeat student and course details, leading to inconsistencies. A relational design stores each entity once and links them via keys, guaranteeing consistency.


3. Data Types & Field Sizes (Cambridge Syllabus)

SQL TypeTypical UseSize / Precision
INTWhole numbers (e.g., StudentID)‑2 147 483 648 to 2 147 483 647
SMALLINTSmall whole numbers‑32 768 to 32 767
DECIMAL(p,s)Currency, precise valuesp = total digits, s = digits after decimal
FLOATApproximate numeric valuesVariable
CHAR(n)Fixed‑length text (e.g., gender code)n characters, padded with spaces
VARCHAR(n)Variable‑length text (e.g., names)up to n characters
DATECalendar datesYYYY‑MM‑DD
BOOLEANTrue/False valuesTRUE / FALSE
TEXTLong free‑form strings (e.g., comments)Unlimited length

4. Keys and Referential Integrity

4.1 Types of Keys

  • Primary Key (PK) – uniquely identifies each row; cannot be NULL.
  • Composite (or Compound) Key – two or more columns together form a unique identifier (e.g., (StudentID, CourseCode) in an enrolment table).
  • Foreign Key (FK) – a column (or set) that references the PK of another table, enforcing referential integrity.
  • Unique Key – ensures column values are distinct but may allow a single NULL.

4.2 Referential Integrity Example

CREATE TABLE Students (
    StudentID INT PRIMARY KEY,
    FirstName VARCHAR(30) NOT NULL,
    LastName  VARCHAR(30) NOT NULL,
    Email     VARCHAR(50) UNIQUE
);

CREATE TABLE Enrolments (
    StudentID   INT,
    CourseCode  CHAR(6),
    EnrolDate   DATE,
    PRIMARY KEY (StudentID, CourseCode),
    FOREIGN KEY (StudentID) REFERENCES Students(StudentID)
);

If we try to insert an enrolment for a non‑existent student, the DBMS rejects it:

INSERT INTO Enrolments (StudentID, CourseCode, EnrolDate)
VALUES (999, 'CS101', '2023-09-01');   -- ERROR: foreign key violation

5. Entity‑Relationship Diagrams (ERDs)

5.1 Three Levels of ERDs

  1. Conceptual ERD – shows entities and relationships only; no attributes, no keys.
  2. Logical ERD – adds attributes, identifies primary keys, and shows cardinalities.
  3. Physical ERD – converts logical design into a relational schema: data types, field sizes, constraints.

5.2 Sample ERD (Student‑Course‑Enrolment)

Conceptual: Entities = Student, Course, Enrolment; Relationships = Enrols (Student ↔ Enrolment ↔ Course).

Logical (textual description):

  • Student: StudentID (PK), FirstName, LastName, Email
  • Course: CourseCode (PK), CourseName, Lecturer, Credits
  • Enrolment: StudentID (FK), CourseCode (FK), EnrolDate – cardinality 1:M from Student to Enrolment and 1:M from Course to Enrolment.

Physical (SQL DDL):

CREATE TABLE Students (
    StudentID   INT PRIMARY KEY,
    FirstName   VARCHAR(30) NOT NULL,
    LastName    VARCHAR(30) NOT NULL,
    Email       VARCHAR(50) UNIQUE,
    CHECK (Email LIKE '%@%')
);

CREATE TABLE Courses (
    CourseCode  CHAR(6) PRIMARY KEY,
    CourseName  VARCHAR(50) NOT NULL,
    Lecturer    VARCHAR(30),
    Credits     SMALLINT CHECK (Credits BETWEEN 1 AND 6)
);

CREATE TABLE Enrolments (
    StudentID   INT,
    CourseCode  CHAR(6),
    EnrolDate   DATE,
    PRIMARY KEY (StudentID, CourseCode),
    FOREIGN KEY (StudentID) REFERENCES Students(StudentID),
    FOREIGN KEY (CourseCode) REFERENCES Courses(CourseCode)
);

5.3 Mini‑Exercise

Given the above conceptual diagram, list the three ERD stages and write the CREATE TABLE statements that include appropriate keys and constraints.


6. Normalisation to Third Normal Form (3NF)

6.1 Quick Reference

  • 1NF – atomic columns, no repeating groups.
  • 2NF – 1NF + every non‑key attribute fully depends on the whole primary key (eliminate partial dependencies).
  • 3NF – 2NF + no transitive dependencies (non‑key attributes depend only on the PK).

6.2 Worked Example

-- Un‑normalised table (student enrolments)
StudentCourses
--------------------------------------------------------------
StudentID | StudentName | CourseCode | CourseName | Lecturer | Credits
--------------------------------------------------------------
101       | Alice Brown | CS101      | Computing  | Dr. Lee  | 3
101       | Alice Brown | MA101      | Maths      | Dr. Kim  | 4
102       | Bob Smith   | CS101      | Computing  | Dr. Lee  | 3
  1. 1NF – already atomic.
  2. 2NF – remove partial dependency on StudentID:
    • Students(StudentID PK, StudentName)
    • Courses(CourseCode PK, CourseName, Lecturer, Credits)
    • Enrollments(StudentID FK, CourseCode FK)
  3. 3NF – check for transitive dependencies:
    • In Courses, Lecturer depends on CourseCode only, so 3NF satisfied.

The resulting schema eliminates duplicate student and course data and satisfies 3NF.


7. Data Dictionary

A data dictionary records the definition of every field used in a database. It is a key reference for developers and users.

Field NameData TypeSizeDescriptionConstraints
StudentIDINT4 bytesUnique identifier for each studentPK, NOT NULL
FirstNameVARCHAR30Student’s given nameNOT NULL
LastNameVARCHAR30Student’s family nameNOT NULL
EmailVARCHAR50Student’s email addressUNIQUE, CHECK (Email LIKE '%@%')
DOBDATEDate of birthCHECK (DOB < CURRENT_DATE)
CourseCodeCHAR6Code of the coursePK, NOT NULL
CreditsSMALLINT2 bytesCredit value of the courseCHECK (Credits BETWEEN 1 AND 6)

8. SQL – DDL and DML (All CRUD Operations)

8.1 Data Definition Language (DDL)

-- Create tables with full set of constraints
CREATE TABLE Students (
    StudentID   INT PRIMARY KEY,
    FirstName   VARCHAR(30) NOT NULL,
    LastName    VARCHAR(30) NOT NULL,
    DOB         DATE CHECK (DOB < CURRENT_DATE),
    Email       VARCHAR(50) UNIQUE CHECK (Email LIKE '%@%')
);

CREATE TABLE Courses (
    CourseCode  CHAR(6) PRIMARY KEY,
    CourseName  VARCHAR(50) NOT NULL,
    Lecturer    VARCHAR(30),
    Credits     SMALLINT CHECK (Credits BETWEEN 1 AND 6)
);

CREATE TABLE Enrolments (
    StudentID   INT,
    CourseCode  CHAR(6),
    EnrolDate   DATE DEFAULT CURRENT_DATE,
    PRIMARY KEY (StudentID, CourseCode),
    FOREIGN KEY (StudentID) REFERENCES Students(StudentID),
    FOREIGN KEY (CourseCode) REFERENCES Courses(CourseCode)
);

8.2 Data Manipulation Language (DML) – CRUD

OperationSQL StatementPurpose
Insert INSERT INTO Students (StudentID, FirstName, LastName, DOB, Email) VALUES (101, 'Alice', 'Brown', '2002-04-15', 'alice.brown@example.com'); Add a new record.
Update UPDATE Students SET Email = 'alice.b@example.org' WHERE StudentID = 101; Correct or change existing data.
Delete DELETE FROM Enrolments WHERE StudentID = 101 AND CourseCode = 'CS101'; Remove records that are no longer required.
Select (simple) SELECT FirstName, LastName, Email FROM Students WHERE DOB > '2000-01-01' ORDER BY LastName; Retrieve specific columns with a filter and sorting.

8.3 Advanced Query Types (Cambridge Requirements)

Static vs. Dynamic Queries
  • Static query – values are hard‑coded in the statement.
    SELECT * FROM Courses WHERE Credits = 3;
  • Dynamic (parameterised) query – placeholders are supplied at run‑time (prevents SQL injection).
    SELECT * FROM Courses WHERE Credits = ?;
Simple vs. Complex Queries
  • Simple – single table, no aggregation.
    SELECT FirstName FROM Students WHERE StudentID = 101;
  • Complex – joins, sub‑queries, aggregation, grouping, having.
    SELECT s.FirstName, c.CourseName FROM Students s JOIN Enrolments e ON s.StudentID = e.StudentID JOIN Courses c ON e.CourseCode = c.CourseCode WHERE c.Credits > 3 ORDER BY s.LastName;
Nested (Sub‑Query) Example
SELECT StudentID, FirstName
FROM Students
WHERE StudentID IN (
    SELECT StudentID
    FROM Enrolments
    WHERE CourseCode = 'CS101'
);
Summary (Aggregate) Queries
SELECT CourseCode, COUNT(*) AS EnrolCount
FROM Enrolments
GROUP BY CourseCode
HAVING COUNT(*) > 10;
Cross‑Tab / Pivot Query (summary by semester)
SELECT CourseCode,
       SUM(CASE WHEN Semester='Spring' THEN 1 ELSE 0 END) AS Spring,
       SUM(CASE WHEN Semester='Fall'   THEN 1 ELSE 0 END) AS Fall
FROM Enrolments
GROUP BY CourseCode;
Searching & Query Operators
OperatorPurposeExample
LIKEPattern matching with wildcardsWHERE Email LIKE '%@school.edu'
%Wildcard – any sequence of charactersLIKE 'A%'
_Wildcard – single characterLIKE 'A_'
AND, OR, NOTBoolean logicWHERE Credits > 3 AND Lecturer = 'Dr. Lee'
>, <, =, <>ComparisonWHERE Credits >= 4
Sorting (single & multi‑column)
SELECT CourseCode, CourseName, Credits
FROM Courses
ORDER BY Credits DESC, CourseName ASC;

9. Validation, Verification and Transactions

9.1 Validation vs. Verification

  • Validation – checks that data conforms to format, range or business rule (implemented with CHECK constraints or application‑side code).
    Example: CHECK (Email LIKE '%@%')
  • Verification – ensures that data is correct and consistent with other data (e.g., confirming a foreign‑key reference exists).
    Example: attempting to insert an enrolment for a non‑existent StudentID triggers a referential‑integrity error.

9.2 Transactions – ACID Principles

A transaction groups several DML statements so that they either all succeed (COMMIT) or all fail (ROLLBACK).

BEGIN TRANSACTION;

INSERT INTO Orders (OrderID, StudentID, OrderDate)
VALUES (5001, 101, CURRENT_DATE);

UPDATE Products
SET Stock = Stock - 1
WHERE ProductID = 200;

-- Validation step: stock must stay non‑negative
IF (SELECT Stock FROM Products WHERE ProductID = 200) < 0
    ROLLBACK;   -- abort whole transaction
ELSE
    COMMIT;     -- make all changes permanent

10. File‑Type Taxonomy & Access Methods

10.1 Generic vs. Proprietary File Formats

CategoryTypical ExtensionCharacteristics
Generic (plain text).txt, .csv, .xmlHuman‑readable, easy to import/export, platform independent.
Proprietary (application‑specific).mdb, .accdb, .dbf, .sqliteOptimised for a particular DBMS; may need that DBMS to read/write.

10.2 File‑Access Methods

  • Sequential Access – records are read in order; simple but slow for random look‑ups.
  • Indexed Sequential Access – an index (e.g., B‑tree) allows fast location of a record while still supporting range scans.
  • Direct (Random) Access – records can be retrieved directly via a physical address or hash; fastest for single‑record queries.

10.3 Import / Export (Cambridge Syllabus)

  • CSV / TXT – import/export of tabular data; widely supported by spreadsheets and DBMS tools.
  • XML – hierarchical data exchange; useful for web services.
  • PDF – export of reports or query results for printing; generated from reporting tools (e.g., JasperReports, Crystal Reports).
  • Graphs / Charts – visual representation of summary queries; often produced by business‑intelligence tools that read from the database.

11. Role of Databases in a Management Information System (MIS)

  • An MIS collects, stores and processes data to produce information that supports decision‑making.
  • Databases provide the central repository, ensuring data is consistent, secure and available to multiple users.
  • Typical MIS functions that rely on databases:
    • Student information system – enrolments, grades, timetables.
    • Inventory control – stock levels, purchase orders.
    • Financial reporting – income, expenses, budgets.
  • SQL queries retrieve the exact data required for reports, dashboards or ad‑hoc analysis.

12. Summary Checklist for Cambridge 9626 – Database and File Concepts

  • Define DBMS, differentiate flat‑file and relational storage.
  • Identify and create primary, composite, foreign and unique keys; explain referential integrity.
  • Design conceptual, logical and physical ERDs; convert an ERD into SQL CREATE TABLE statements.
  • Apply correct data types, field sizes and constraints (PK, FK, NOT NULL, UNIQUE, CHECK).
  • Normalise tables to 3NF and justify each step.
  • Write CRUD statements and at least three of the following query types:
    • Static & dynamic (parameterised) queries
    • Simple vs. complex (joins, sub‑queries)
    • Summary (GROUP BY, HAVING)
    • Cross‑tab / pivot
    • Search operators (LIKE, wildcards, Boolean)
    • Sorting on one or more columns
  • Demonstrate a transaction with BEGIN, COMMIT and ROLLBACK.
  • Distinguish validation (format/range) from verification (referential checks).
  • Describe generic vs. proprietary file formats and the three main file‑access methods.
  • Explain how a database underpins an MIS and supports reporting/analysis.

Create an account or Login to take a Quiz

46 views
0 improvement suggestions

Log in to suggest improvements to this note.