Show understanding of methods of file access

Published by Patrick Mutisya · 8 days ago

File Organisation and Access – Cambridge A‑Level Computer Science 9618

13.2 File Organisation and Access

File organisation determines how records are stored on secondary storage (disk, tape, etc.).

The chosen organisation influences the speed and method of accessing data.

Understanding the different access methods is essential for designing efficient programs and databases.

Methods of File Access

Three principal methods are used to retrieve records from a file:

  1. Sequential Access – records are read in the order they are stored.
  2. Random (Direct) Access – any record can be retrieved directly without reading preceding records.
  3. Indexed Access – an index structure provides a map from a key value to the location of the record.

Sequential Access

In sequential access the file is processed from the beginning to the end.

It is simple to implement and works well when the program needs to process every record, e.g., generating a report.

Typical operations:

  • Read the first record.
  • Read the next record until the required record is found or the end of file is reached.
  • Write records in order (append at the end).

Limitations:

  • Finding a specific record may require reading many unrelated records – \$O(n)\$ time.
  • Updating a record in the middle of a large file can be costly.

Random (Direct) Access

Random access uses a formula or a fixed record length to calculate the position of a record on the disk.

For a file with fixed‑length records, the address of record \$r\$ (starting from 0) can be computed as:

\$\text{byte\address} = \text{base\address} + r \times \text{record\_size}\$

Advantages:

  • Immediate retrieval of any record – \$O(1)\$ time.
  • Efficient for applications that need frequent look‑ups by record number.

Disadvantages:

  • Only works when records are of uniform size.
  • Insertion or deletion of records may require shifting large portions of the file.

Indexed Access

An index is a separate structure that stores pairs (key, address).

The index can be searched (often using binary search) to locate the address of the required record.

Two common types of indexes:

  • Primary index – built on a unique key (e.g., student ID).
  • Secondary index – built on a non‑unique attribute (e.g., surname).

Typical steps for indexed access:

  1. Search the index for the key value.
  2. Obtain the disk address of the record.
  3. Perform a direct read of the record at that address.

Benefits:

  • Fast look‑up even for large files – \$O(\log m)\$ where \$m\$ is the number of index entries.
  • Supports variable‑length records.

Drawbacks:

  • Additional storage required for the index.
  • Index must be maintained when records are added, deleted or updated.

Comparison of Access Methods

Access MethodTypical Use‑CaseTime Complexity (search)Record Size RequirementStorage Overhead
SequentialProcessing entire file (e.g., batch reports)\$O(n)\$Any (fixed or variable)None
Random (Direct)Frequent look‑ups by record number\$O(1)\$Fixed lengthNone
IndexedSearches by key value (e.g., customer ID)\$O(\log m)\$Variable or fixedIndex structure (extra blocks)

Choosing an Access Method

When selecting an access method, consider the following factors:

  • Nature of queries: Are most operations scans, look‑ups, or updates?
  • Record size: Fixed vs. variable length.
  • File size: Large files benefit more from indexed or direct access.
  • Update frequency: High update rates may favour sequential files with periodic re‑organisation.
  • Storage constraints: Indexes require extra space.

Suggested diagram: Flowchart showing the three access methods (Sequential → read next, Direct → calculate address, Indexed → search index then read).