Computer Science – 6.2 Data Integrity | e-Consult
6.2 Data Integrity
Data verification is crucial to ensure the accuracy and reliability of data entered into a system. Here are three methods, along with their advantages and disadvantages:
- Data Validation Rules: This involves defining rules that data must adhere to when entered. These rules can check for data type (e.g., ensuring a field only accepts numbers), range (e.g., age must be between 0 and 120), format (e.g., email address must follow a specific pattern), and required fields (e.g., a name field cannot be left blank).
- Double Data Entry: The same data is entered by two different people. The entries are then compared, and discrepancies are investigated and corrected.
- Checksums/Hashing: A checksum or hash value is calculated from the data and stored with the data. When the data is read, the checksum is recalculated and compared to the stored value. If they don't match, it indicates that the data has been corrupted or altered.
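Data Validation Rules
Advantages: Prevents invalid data from being entered in the first place, reducing the need for later correction. Can be automated within the software system. Gives the user immediate feedback, so mistakes can be fixed at the point of entry.
Disadvantages: Requires careful planning and definition of rules. Can be bypassed if the software does not enforce the rules consistently. May frustrate users if rules are overly restrictive or poorly designed. Requires ongoing maintenance as data requirements change. Only confirms that data is plausible, not that it is correct: an age of 35 passes the range check even if the person is actually 53.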
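The four kinds of validation rule described in the list above (presence, type, range and format) can be sketched in a few lines. This is a minimal illustration, not a production validator: the function name `validate_record`, the field names and the deliberately simple email pattern are all assumptions made for the example.

```python
import re

# Hypothetical rules for a sign-up record; names and limits are illustrative only.
# The pattern is a deliberately simple format check, not a full email validator.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_record(record):
    """Return a list of error messages; an empty list means every rule passed."""
    errors = []

    # Presence check: the name field cannot be left blank.
    name = record.get("name") or ""
    if not str(name).strip():
        errors.append("name is required")

    # Type check, then range check: age must be a whole number from 0 to 120.
    age = record.get("age")
    if isinstance(age, bool) or not isinstance(age, int):
        errors.append("age must be a whole number")
    elif not 0 <= age <= 120:
        errors.append("age must be between 0 and 120")

    # Format check: the email must match the pattern above.
    email = record.get("email")
    if not isinstance(email, str) or not EMAIL_PATTERN.match(email):
        errors.append("email is not in a valid format")

    return errors


print(validate_record({"name": "Ada", "age": 36, "email": "ada@example.com"}))
print(validate_record({"name": " ", "age": 130, "email": "nope"}))
```

The first call prints an empty list; the second reports one error per broken rule. Collecting every error, rather than stopping at the first, lets the user correct all fields in one pass.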
Double Data Entry
Advantages: Highly effective at catching human errors such as mistyped or transposed characters. Simple to implement. Provides a level of accountability.
Disadvantages: Time-consuming and expensive, because every record is keyed twice. Does not catch an error if both people make the same mistake, or if the source document itself is wrong. Requires a careful reconciliation process. Can lead to disagreements between data entry clerks.
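The comparison step in double data entry can be sketched as a field-by-field check of the two versions of a record. The function name `compare_entries` and the sample fields are hypothetical, and the sketch assumes each entry is held as a dictionary.

```python
def compare_entries(first_entry, second_entry):
    """Return {field: (first_value, second_value)} for every field that differs."""
    discrepancies = {}
    # Look at every field that appears in either entry, in a stable order.
    for field in sorted(set(first_entry) | set(second_entry)):
        first_value = first_entry.get(field)
        second_value = second_entry.get(field)
        if first_value != second_value:
            discrepancies[field] = (first_value, second_value)
    return discrepancies


# Two operators key in the same paper form; the second transposes two digits.
first = {"name": "Ada Lovelace", "dob": "1815-12-10", "postcode": "W1U 6TU"}
second = {"name": "Ada Lovelace", "dob": "1815-12-01", "postcode": "W1U 6TU"}

print(compare_entries(first, second))  # {'dob': ('1815-12-10', '1815-12-01')}
```

The output flags the field for investigation but cannot say which version is right; someone still has to check the source document. If both operators had typed the same wrong date, the result would be an empty dictionary and the error would pass unnoticed.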
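Checksums/Hashing
Advantages: Detects accidental data corruption during transfer or storage. Relatively quick to calculate and verify. Fully automatic, so no human checking is needed.
Disadvantages: Only detects that data has changed since the checksum was calculated, not errors made when the data was first entered. Doesn't identify which part of the data is wrong. Requires a reliable checksum algorithm, since a weak one can give the same value for different data. Less convenient for data that changes often, because the stored value must be recalculated after every legitimate update.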
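The store-then-recalculate cycle described for checksums can be sketched with Python's standard `hashlib` module. SHA-256 is used here as one reasonable choice of algorithm, not the only one; the function names `compute_checksum` and `verify` and the sample data are assumptions for illustration.

```python
import hashlib


def compute_checksum(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def verify(data: bytes, stored_checksum: str) -> bool:
    """Recalculate the checksum and compare it with the stored value."""
    return compute_checksum(data) == stored_checksum


# When the data is written, the checksum is calculated and stored with it.
original = b"account=12345;balance=100"
stored = compute_checksum(original)

# When the data is read back, the checksum is recalculated and compared.
print(verify(original, stored))                        # True
print(verify(b"account=12345;balance=900", stored))    # False
```

A single changed character produces a different digest, so the second check fails. The result says only that the data no longer matches, not where the change occurred.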