File Organisation and Access – Cambridge A‑Level Computer Science 961813.2 File Organisation and Access
File organisation determines how records are stored on secondary storage (disk, tape, etc.).
The chosen organisation influences the speed and method of accessing data.
Understanding the different access methods is essential for designing efficient programs and databases.
Methods of File Access
Three principal methods are used to retrieve records from a file:
- Sequential Access – records are read in the order they are stored.
- Random (Direct) Access – any record can be retrieved directly without reading preceding records.
- Indexed Access – an index structure provides a map from a key value to the location of the record.
Sequential Access
In sequential access the file is processed from the beginning to the end.
It is simple to implement and works well when the program needs to process every record, e.g., generating a report.
Typical operations:
- Read the first record.
- Read the next record until the required record is found or the end of file is reached.
- Write records in order (append at the end).
Limitations:
- Finding a specific record may require reading many unrelated records – \$O(n)\$ time.
- Updating a record in the middle of a large file can be costly.
Random (Direct) Access
Random access uses a formula or a fixed record length to calculate the position of a record on the disk.
For a file with fixed‑length records, the address of record \$r\$ (starting from 0) can be computed as:
\$\text{byte\address} = \text{base\address} + r \times \text{record\_size}\$
Advantages:
- Immediate retrieval of any record – \$O(1)\$ time.
- Efficient for applications that need frequent look‑ups by record number.
Disadvantages:
- Only works when records are of uniform size.
- Insertion or deletion of records may require shifting large portions of the file.
Indexed Access
An index is a separate structure that stores pairs (key, address).
The index can be searched (often using binary search) to locate the address of the required record.
Two common types of indexes:
- Primary index – built on a unique key (e.g., student ID).
- Secondary index – built on a non‑unique attribute (e.g., surname).
Typical steps for indexed access:
- Search the index for the key value.
- Obtain the disk address of the record.
- Perform a direct read of the record at that address.
Benefits:
- Fast look‑up even for large files – \$O(\log m)\$ where \$m\$ is the number of index entries.
- Supports variable‑length records.
Drawbacks:
- Additional storage required for the index.
- Index must be maintained when records are added, deleted or updated.
Comparison of Access Methods
| Access Method | Typical Use‑Case | Time Complexity (search) | Record Size Requirement | Storage Overhead |
|---|
| Sequential | Processing entire file (e.g., batch reports) | \$O(n)\$ | Any (fixed or variable) | None |
| Random (Direct) | Frequent look‑ups by record number | \$O(1)\$ | Fixed length | None |
| Indexed | Searches by key value (e.g., customer ID) | \$O(\log m)\$ | Variable or fixed | Index structure (extra blocks) |
Choosing an Access Method
When selecting an access method, consider the following factors:
- Nature of queries: Are most operations scans, look‑ups, or updates?
- Record size: Fixed vs. variable length.
- File size: Large files benefit more from indexed or direct access.
- Update frequency: High update rates may favour sequential files with periodic re‑organisation.
- Storage constraints: Indexes require extra space.
Suggested diagram: Flowchart showing the three access methods (Sequential → read next, Direct → calculate address, Indexed → search index then read).