8.2 Database Management Systems (DBMS)
Learning Objective
Show understanding of the features provided by a Database Management System (DBMS) that address the shortcomings of a file‑based approach (Cambridge AS & A Level Computer Science 9618, AO1 & AO2).
1. Problems with a File‑Based Approach
- Data redundancy & inconsistency – The same fact is stored in several files; an update must be made in every copy or contradictory values appear.
- Lack of data independence – Application programmes are written to the exact physical layout of the files. Any change to file format or storage medium forces a programme rewrite.
- Limited data sharing (concurrency problems) – Only one programme can safely access a file at a time. Simultaneous users cause lost updates, dirty reads or uncommitted‑data anomalies.
- Poor security & integrity – No central mechanism to control who can view or modify data; integrity checks must be coded manually and are easy to bypass.
- Complex data retrieval – Procedural code is required to locate, join and filter data; any change to the file structure requires changes to the retrieval code.
- Difficulty in recovery – After a crash or corruption there is no systematic way to bring the files back to a consistent state.
- Backup & maintenance overhead – Each file has to be backed up individually; version control and documentation are manual.
2. DBMS Features that Resolve These Issues
| File‑Based Issue | DBMS Feature | How It Helps (exam‑relevant point) |
|---|
| Data redundancy & inconsistency | Normalization & Centralised storage | Each fact is stored once in a well‑designed table; normal forms eliminate duplicate copies and prevent update anomalies (AO1). |
| Lack of data independence | Logical & Physical Data Independence | Applications use a logical schema (DDL). Changes to physical storage (indexes, file organisation) do not require programme modification (AO2). |
| Limited data sharing (concurrency) | Concurrency control – locking, timestamping, MVCC | Multiple users can read/write safely; the DBMS prevents lost updates, dirty reads and uncommitted‑data problems (AO2). |
| Poor security & integrity | Access control & integrity constraints (PK, FK, UNIQUE, NOT NULL, CHECK) | Roles and privileges restrict SELECT/INSERT/UPDATE/DELETE; declarative constraints enforce data correctness automatically (AO2). |
| Complex data retrieval | High‑level, declarative query language – SQL | SELECT, JOIN, GROUP BY retrieve data without procedural code; exam questions often ask you to write or interpret simple SQL statements (AO1‑AO3). |
| Difficulty in recovery | Transaction management (ACID) & write‑ahead logging | Atomic, Consistent, Isolated, Durable transactions guarantee that either all changes are made or none are; logs enable rollback to the last committed state after a failure (AO2). |
| Backup & maintenance overhead | Automated backup & recovery utilities + data dictionary | System‑wide tools (full, incremental, online) and a central metadata repository simplify version control and restore operations (AO2). |
3. Core DBMS Capabilities – Summary
- Data independence – Logical schema separates applications from physical storage.
- Reduced redundancy – Normalised tables store each fact once.
- Concurrency control – Locks, timestamps or MVCC allow safe multi‑user access.
- Transaction support (ACID) – Guarantees reliability and recoverability of grouped operations.
- Security & integrity – Authentication, role‑based authorisation and declarative constraints protect data.
- Powerful query language (SQL) – Enables expressive, non‑procedural data manipulation.
- Data dictionary (metadata) – Stores information about tables, columns, constraints, users and privileges.
- Backup & recovery tools – Automated utilities and logging simplify data protection and restoration.
4. Illustrative Example – Normalisation
File‑based layout (single text file)
StudentID, StudentName, CourseID, CourseName, Lecturer, Marks
If a student enrols in three courses, StudentName and the student’s address are repeated three times – a classic redundancy.
After normalising to 3NF
- Student(StudentID, StudentName, …)
- Course(CourseID, CourseName, Lecturer)
- Enrolment(StudentID, CourseID, Marks)
Each fact now appears only once. Updating a lecturer’s name requires a change in a single row, eliminating inconsistency – a point frequently examined in Paper 1.
5. Transaction Management & ACID
| Property | Meaning | Relevance to File‑Based Issues |
|---|
| Atomicity | All parts of a transaction succeed or none do. | Prevents partially‑written files after a crash. |
| Consistency | Transaction leaves the database in a valid state. | Enforces integrity constraints that file programmes lack. |
| Isolation | Concurrent transactions do not interfere with each other. | Eliminates lost updates and dirty reads. |
| Durability | Committed changes survive system failures. | Ensures that data is not lost after a power failure. |
6. Concurrency Control Techniques (Brief)
- Lock‑based protocols – Shared (read) and exclusive (write) locks; two‑phase locking (2PL) guarantees serialisability.
- Timestamp ordering – Each transaction receives a unique timestamp; conflicts are resolved by the earlier timestamp winning.
- Multiversion Concurrency Control (MVCC) – The DBMS keeps multiple versions of a row, allowing readers to see a snapshot while writers create a new version.
7. Security & Integrity Mechanisms
- Authentication – Verification of user identity (user‑ID & password, biometrics, smart cards).
- Authorization (Roles & Privileges) – SELECT, INSERT, UPDATE, DELETE rights can be granted or revoked per user or role.
- Integrity constraints
- Primary Key – unique identifier for each row.
- Foreign Key – enforces referential integrity between tables.
- UNIQUE, NOT NULL, CHECK – additional business rules.
- Auditing & logging – Records of who performed which operation and when.
8. Data Dictionary & Metadata
The data dictionary is a system‑maintained repository that records:
- Table and column names.
- Data types, lengths, default values.
- Constraints (PK, FK, UNIQUE, CHECK).
- Indexes and storage parameters.
- Users, roles and granted privileges.
- Stored procedures, triggers and views.
Because this information is centralised, developers and administrators can understand the database structure without examining individual files – a point often asked in AO2 questions.
9. Backup & Recovery Utilities
- Full backup – Copies the entire database at a specific point in time.
- Incremental / differential backup – Stores only changes since the last backup, reducing storage requirements.
- Online (hot) backup – Performed while the DBMS remains available to users.
- Write‑ahead log (WAL) – Records every change before it is written to the data files; used to replay committed transactions and roll back uncommitted ones after a crash.
- Recovery tools (examples) –
mysqldump, exp/imp, Oracle RMAN, PostgreSQL pg_basebackup. Exam questions may ask you to name or briefly describe one such tool.
10. Assessment Alignment (Typical Exam Questions)
Paper 1 (non‑programming)
- Define data redundancy and give one disadvantage of a file‑based system.
- Explain logical data independence with an example of a change to the physical storage that does not affect an application programme.
- List and describe two ACID properties.
- State three security measures provided by a DBMS and explain why they are better than file‑level protection.
Paper 2 (programming)
11. Suggested Diagram for Classroom Use
Figure: Comparison of a file‑based system (many isolated files) with a DBMS architecture (central DBMS, applications, users, utilities communicating through SQL). The diagram should show:
- Multiple independent files on the left.
- Central DBMS block in the centre with arrows to:
- Application programmes (via SQL)
- Users (via client tools)
- Utilities (backup, recovery, security manager)
- Label the DBMS components: Query processor, Transaction manager, Storage manager, Data dictionary.