Show understanding of the features provided by a Database Management System (DBMS) that address the issues of a file-based approach

8.2 Database Management Systems (DBMS)

Learning Objective

Show understanding of the features provided by a Database Management System (DBMS) that address the shortcomings of a file‑based approach (Cambridge AS & A Level Computer Science 9618, AO1 & AO2).


1. Problems with a File‑Based Approach

  • Data redundancy & inconsistency – The same fact is stored in several files; an update must be made in every copy or contradictory values appear.
  • Lack of data independence – Application programmes are written to the exact physical layout of the files. Any change to file format or storage medium forces a programme rewrite.
  • Limited data sharing (concurrency problems) – Only one programme can safely access a file at a time. Simultaneous users cause lost updates, dirty reads or uncommitted‑data anomalies.
  • Poor security & integrity – No central mechanism to control who can view or modify data; integrity checks must be coded manually and are easy to bypass.
  • Complex data retrieval – Procedural code is required to locate, join and filter data; any change to the file structure requires changes to the retrieval code.
  • Difficulty in recovery – After a crash or corruption there is no systematic way to bring the files back to a consistent state.
  • Backup & maintenance overhead – Each file has to be backed up individually; version control and documentation are manual.


2. DBMS Features that Resolve These Issues

File‑Based IssueDBMS FeatureHow It Helps (exam‑relevant point)
Data redundancy & inconsistencyNormalization & Centralised storageEach fact is stored once in a well‑designed table; normal forms eliminate duplicate copies and prevent update anomalies (AO1).
Lack of data independenceLogical & Physical Data IndependenceApplications use a logical schema (DDL). Changes to physical storage (indexes, file organisation) do not require programme modification (AO2).
Limited data sharing (concurrency)Concurrency control – locking, timestamping, MVCCMultiple users can read/write safely; the DBMS prevents lost updates, dirty reads and uncommitted‑data problems (AO2).
Poor security & integrityAccess control & integrity constraints (PK, FK, UNIQUE, NOT NULL, CHECK)Roles and privileges restrict SELECT/INSERT/UPDATE/DELETE; declarative constraints enforce data correctness automatically (AO2).
Complex data retrievalHigh‑level, declarative query language – SQLSELECT, JOIN, GROUP BY retrieve data without procedural code; exam questions often ask you to write or interpret simple SQL statements (AO1‑AO3).
Difficulty in recoveryTransaction management (ACID) & write‑ahead loggingAtomic, Consistent, Isolated, Durable transactions guarantee that either all changes are made or none are; logs enable rollback to the last committed state after a failure (AO2).
Backup & maintenance overheadAutomated backup & recovery utilities + data dictionarySystem‑wide tools (full, incremental, online) and a central metadata repository simplify version control and restore operations (AO2).


3. Core DBMS Capabilities – Summary

  1. Data independence – Logical schema separates applications from physical storage.
  2. Reduced redundancy – Normalised tables store each fact once.
  3. Concurrency control – Locks, timestamps or MVCC allow safe multi‑user access.
  4. Transaction support (ACID) – Guarantees reliability and recoverability of grouped operations.
  5. Security & integrity – Authentication, role‑based authorisation and declarative constraints protect data.
  6. Powerful query language (SQL) – Enables expressive, non‑procedural data manipulation.
  7. Data dictionary (metadata) – Stores information about tables, columns, constraints, users and privileges.
  8. Backup & recovery tools – Automated utilities and logging simplify data protection and restoration.


4. Illustrative Example – Normalisation

File‑based layout (single text file)

StudentID, StudentName, CourseID, CourseName, Lecturer, Marks

If a student enrols in three courses, StudentName and the student’s address are repeated three times – a classic redundancy.

After normalising to 3NF

  • Student(StudentID, StudentName, …)
  • Course(CourseID, CourseName, Lecturer)
  • Enrolment(StudentID, CourseID, Marks)

Each fact now appears only once. Updating a lecturer’s name requires a change in a single row, eliminating inconsistency – a point frequently examined in Paper 1.


5. Transaction Management & ACID

PropertyMeaningRelevance to File‑Based Issues
AtomicityAll parts of a transaction succeed or none do.Prevents partially‑written files after a crash.
ConsistencyTransaction leaves the database in a valid state.Enforces integrity constraints that file programmes lack.
IsolationConcurrent transactions do not interfere with each other.Eliminates lost updates and dirty reads.
DurabilityCommitted changes survive system failures.Ensures that data is not lost after a power failure.


6. Concurrency Control Techniques (Brief)

  • Lock‑based protocols – Shared (read) and exclusive (write) locks; two‑phase locking (2PL) guarantees serialisability.
  • Timestamp ordering – Each transaction receives a unique timestamp; conflicts are resolved by the earlier timestamp winning.
  • Multiversion Concurrency Control (MVCC) – The DBMS keeps multiple versions of a row, allowing readers to see a snapshot while writers create a new version.


7. Security & Integrity Mechanisms

  • Authentication – Verification of user identity (user‑ID & password, biometrics, smart cards).
  • Authorization (Roles & Privileges) – SELECT, INSERT, UPDATE, DELETE rights can be granted or revoked per user or role.
  • Integrity constraints

    • Primary Key – unique identifier for each row.
    • Foreign Key – enforces referential integrity between tables.
    • UNIQUE, NOT NULL, CHECK – additional business rules.

  • Auditing & logging – Records of who performed which operation and when.


8. Data Dictionary & Metadata

The data dictionary is a system‑maintained repository that records:

  • Table and column names.
  • Data types, lengths, default values.
  • Constraints (PK, FK, UNIQUE, CHECK).
  • Indexes and storage parameters.
  • Users, roles and granted privileges.
  • Stored procedures, triggers and views.

Because this information is centralised, developers and administrators can understand the database structure without examining individual files – a point often asked in AO2 questions.


9. Backup & Recovery Utilities

  • Full backup – Copies the entire database at a specific point in time.
  • Incremental / differential backup – Stores only changes since the last backup, reducing storage requirements.
  • Online (hot) backup – Performed while the DBMS remains available to users.
  • Write‑ahead log (WAL) – Records every change before it is written to the data files; used to replay committed transactions and roll back uncommitted ones after a crash.
  • Recovery tools (examples)mysqldump, exp/imp, Oracle RMAN, PostgreSQL pg_basebackup. Exam questions may ask you to name or briefly describe one such tool.


10. Assessment Alignment (Typical Exam Questions)

Paper 1 (non‑programming)

  • Define data redundancy and give one disadvantage of a file‑based system.
  • Explain logical data independence with an example of a change to the physical storage that does not affect an application programme.
  • List and describe two ACID properties.
  • State three security measures provided by a DBMS and explain why they are better than file‑level protection.

Paper 2 (programming)

  • Write an SQL query to retrieve the names of students who scored > 80 in Mathematics.

    SELECT StudentName FROM Enrolment E JOIN Course C ON E.CourseID = C.CourseID WHERE C.CourseName = 'Mathematics' AND E.Marks > 80;

  • Given a set of unnormalised tables, normalise them to 3NF and explain how the new design reduces redundancy.
  • Describe how a DBMS would recover from a system crash using a transaction log (mention WAL, rollback of uncommitted transactions and replay of committed ones).


11. Suggested Diagram for Classroom Use

Figure: Comparison of a file‑based system (many isolated files) with a DBMS architecture (central DBMS, applications, users, utilities communicating through SQL). The diagram should show:

  • Multiple independent files on the left.
  • Central DBMS block in the centre with arrows to:

    • Application programmes (via SQL)
    • Users (via client tools)
    • Utilities (backup, recovery, security manager)

  • Label the DBMS components: Query processor, Transaction manager, Storage manager, Data dictionary.