Describe how data validation and data verification help protect the integrity of data

6.2 Data Integrity

Learning Objective (Syllabus 6.2)

Describe how data validation and data verification help protect the integrity of data.

What Is Data Integrity?

Data integrity is the accuracy, consistency and reliability of data throughout its life‑cycle. It means that data is not altered in an unauthorised or accidental way.

Why Integrity Matters (Link to 6.1 Data Security & 8 Databases)

  • Integrity checks work with authentication (who you are) and encryption (confidentiality) to detect and deter tampering.
  • In databases, integrity is enforced by constraints such as NOT NULL, CHECK, UNIQUE, and foreign‑key rules.
  • In networking, integrity ensures that packets received are exactly those that were sent.
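
The database constraints named above can be seen rejecting bad rows in a small sketch. It assumes Python's built-in sqlite3 module and an in-memory database; the table and column names (product, orders, qty) and the helper try_insert are hypothetical and chosen only for illustration.

```python
import sqlite3

def try_insert(conn, sql, params):
    """Run one INSERT and report whether the database accepted it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when enabled
conn.execute(
    "CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)"
)
conn.execute(
    """
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        product_id INTEGER NOT NULL REFERENCES product(id),
        qty INTEGER NOT NULL CHECK (qty BETWEEN 1 AND 100)
    )
    """
)

add_product = "INSERT INTO product (id, name) VALUES (?, ?)"
add_order = "INSERT INTO orders (product_id, qty) VALUES (?, ?)"

results = {
    "valid product": try_insert(conn, add_product, (1, "Pen")),
    "duplicate name (UNIQUE)": try_insert(conn, add_product, (2, "Pen")),
    "missing name (NOT NULL)": try_insert(conn, add_product, (3, None)),
    "valid order": try_insert(conn, add_order, (1, 5)),
    "unknown product (foreign key)": try_insert(conn, add_order, (99, 5)),
    "qty out of range (CHECK)": try_insert(conn, add_order, (1, 500)),
}

for label, outcome in results.items():
    print(f"{label}: {outcome}")
```

Only the two valid rows are stored; each of the other four violates one constraint and is refused by the database itself, regardless of what the application code checked beforehand.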

Data Validation (AO1 – Knowledge)

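Validation is performed before data is stored or processed. It stops unreasonable or malformed data from entering the system, thereby preserving logical integrity. Validation can only confirm that a value is acceptable, not that it is true: an age mistyped as 34 instead of 43 passes every check.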

| Technique (syllabus wording) | What It Checks | Typical Example |
|---|---|---|
| Type checking | Value must be of the required data type. | Age must be an integer. |
| Range checking | Numeric value lies within a permitted interval. | Score 0 – 100. |
| Format checking | Value matches a pattern (regular expression). | Email address user@example.com. |
| Length checking | Number of characters/digits is within limits. | Password 8‑12 characters. |
| Presence / Mandatory‑field checking | Required fields cannot be left blank. | Customer name must be entered. |
| Existence checking | Referenced data already exists (foreign key). | Order must refer to an existing product ID. |
| Uniqueness checking | Value must be unique across records. | Username must not duplicate an existing one. |
| Limit checking | Restricts number of occurrences. | At most three active sessions per user. |
| Check‑digit verification | Calculated digit matches the supplied digit. | ISBN‑13 check digit (weights 1 and 3, modulo 10). |
| Cross‑field validation | Logical relationship between two or more fields. | Start‑date must be earlier than end‑date. |
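
Several of the checks in the table can be sketched as small Python functions. This is a minimal illustration, not a complete validation library: the function names are hypothetical, and range_check also performs a type check because a range test on a non-integer is meaningless. The ISBN‑13 check uses the standard weighting of 1 and 3 on alternate digits, modulo 10.

```python
from datetime import date

def presence_check(value):
    """Presence check: the field must not be missing or blank."""
    return value is not None and str(value).strip() != ""

def range_check(value, low, high):
    """Type and range check: an integer within low..high inclusive."""
    # bool is a subclass of int in Python, so exclude it explicitly
    is_integer = isinstance(value, int) and not isinstance(value, bool)
    return is_integer and low <= value <= high

def length_check(text, min_len, max_len):
    """Length check: number of characters within the limits."""
    return min_len <= len(text) <= max_len

def isbn13_check_digit_ok(isbn):
    """Check digit: recompute the 13th digit from the first 12 and compare."""
    cleaned = isbn.replace("-", "").replace(" ", "")
    if len(cleaned) != 13 or not all(ch in "0123456789" for ch in cleaned):
        return False
    digits = [int(ch) for ch in cleaned]
    # Weights alternate 1, 3, 1, 3, ... across the first 12 digits
    total = sum(d * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits[:12]))
    expected = (10 - total % 10) % 10
    return expected == digits[12]

def cross_field_check(start, end):
    """Cross-field check: the start date must be earlier than the end date."""
    return start < end

print(range_check(100, 0, 100))                    # score at the upper boundary
print(length_check("secret", 8, 12))               # password too short
print(isbn13_check_digit_ok("978-0-306-40615-7"))  # well-known valid ISBN-13
print(cross_field_check(date(2024, 5, 1), date(2024, 4, 1)))
```

Each function returns True or False so that the caller can decide how to report the failure to the user.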

Pseudocode Example (AO3 – Design/Implementation)

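function isValidEmail(email):
    // Type check: the value must be a string
    if type(email) is not string:
        return false
    // Format check: the value must match the email pattern
    pattern = r'^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$'
    if not match(pattern, email):
        return false
    // Length check: at most 254 characters
    if length(email) > 254:
        return false
    return true

This function demonstrates type, format and length checking for an email address.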
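
For comparison, the same checks can be written as runnable Python. This is a sketch under stated assumptions: the names is_valid_email, EMAIL_PATTERN and MAX_EMAIL_LENGTH are hypothetical, and the pattern is the simplified one from the pseudocode, so it accepts some addresses that are not deliverable and rejects some unusual but legal ones.

```python
import re

# Same simplified pattern as the pseudocode; fullmatch() anchors both ends
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
MAX_EMAIL_LENGTH = 254

def is_valid_email(email):
    """Return True if email passes the type, length and format checks."""
    if not isinstance(email, str):      # type check
        return False
    if len(email) > MAX_EMAIL_LENGTH:   # length check
        return False
    return EMAIL_PATTERN.fullmatch(email) is not None  # format check

for candidate in ["user@example.com", "user@example", 12345]:
    print(repr(candidate), is_valid_email(candidate))
```

Passing these checks shows only that the value looks like an email address; it does not show that the mailbox exists.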

Data Verification (AO1 – Knowledge)

Verification is performed after data has been stored or transmitted. It detects accidental corruption or deliberate tampering, thereby preserving physical integrity.

| Technique (syllabus wording) | How It Works | Typical Use‑case |
|---|---|---|
| Parity bit | A single extra bit set so that the total number of 1‑bits is even (even parity) or odd (odd parity). | Simple error detection in serial communication. |
| Checksum | Sum of all bytes modulo 256 (or 2ⁿ). Algorithm: set sum = 0; for each byte b in data, set sum = (sum + b) mod 256; the final sum is the checksum. | Verifying log files or small data blocks. |
| CRC (Cyclic Redundancy Check) | Data treated as a binary polynomial and divided by a generator polynomial (e.g. 0x04C11DB7). The remainder of this division is the CRC value. A matching remainder on receipt means no error was detected. | Network packets, disk sectors, USB transfers. |
| Cryptographic hash | Fixed‑size digest (e.g. SHA‑256). Any change to the input produces a dramatically different hash. | File‑integrity verification, version control. |
| Digital signature | Hash of the data encrypted with the sender’s private key; the receiver decrypts with the public key and compares hashes. | Legal documents, software distribution. |
| Message‑Authentication Code (MAC) | Hash‑based code generated using a secret key (e.g. HMAC‑SHA‑256). Guarantees integrity and authenticity. | Secure API messages, VPN tunnels. |
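
The three simplest techniques in the table can be compared side by side in Python. This is a minimal sketch: even_parity_bit and checksum_mod256 are hypothetical helper names, and zlib.crc32 from the standard library stands in for a hand-written CRC (it implements the common CRC‑32 based on the generator polynomial 0x04C11DB7).

```python
import zlib

def even_parity_bit(byte):
    """Return the bit that makes the total number of 1-bits even."""
    return bin(byte & 0xFF).count("1") % 2

def checksum_mod256(data):
    """Sum of all bytes modulo 256, as in the checksum algorithm above."""
    total = 0
    for b in data:
        total = (total + b) % 256
    return total

# Parity detects one flipped bit but misses two flipped bits
original = 0b10110010
one_flip = original ^ 0b00000001
two_flips = original ^ 0b00000011
print(even_parity_bit(original) != even_parity_bit(one_flip))   # change detected
print(even_parity_bit(original) != even_parity_bit(two_flips))  # change missed

# A simple checksum cannot see reordered bytes; a CRC can
sent = b"AB"
received = b"BA"
print(checksum_mod256(sent) == checksum_mod256(received))  # same checksum
print(zlib.crc32(sent) == zlib.crc32(received))            # different CRC
```

The comparison illustrates why stronger techniques are chosen as the cost of an undetected error rises: each step down the table catches classes of change that the previous one misses.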

Comparison of Validation and Verification

| Aspect | Data Validation | Data Verification |
|---|---|---|
| When performed | At data entry, before storage or processing | After storage, during transmission or on retrieval |
| Primary purpose | Prevent bad or malformed data from entering the system | Detect accidental or malicious alteration of data |
| Typical techniques (syllabus wording) | Type, range, format, length, presence, existence, uniqueness, limit, check‑digit, cross‑field | Parity, checksum, CRC, cryptographic hash, digital signature, MAC |
| Where the checks are applied | Client‑side UI, server‑side logic, database constraints (NOT NULL, CHECK, UNIQUE, FOREIGN KEY) | File systems, communication protocols, backup/restore processes, secure logs |
| Effect on data integrity | Maintains logical integrity by ensuring only correct data is stored | Maintains physical integrity by detecting corruption or tampering |

Floating‑Point Rounding Errors (Additional AO2 Insight)

Binary floating‑point numbers can only approximate many decimal fractions. This can cause small rounding errors that accumulate in calculations, e.g.

0.1 + 0.2  →  0.30000000000000004  (not exactly 0.3)

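Validation cannot prevent such errors, because they arise during processing rather than at input. Verification (e.g., using a checksum of the stored result) can detect that a value changed after it was computed, but it cannot show that the computed value was inexact in the first place.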
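
The effect is easy to reproduce. The Python sketch below also shows two common ways of handling it, offered as illustrations rather than as part of the syllabus: comparing with a tolerance via math.isclose (the tolerance 1e-9 is an arbitrary assumption) and using decimal arithmetic via decimal.Decimal.

```python
import math
from decimal import Decimal

total = 0.1 + 0.2
print(total)          # 0.30000000000000004
print(total == 0.3)   # exact comparison fails

# Errors accumulate when the same inexact value is added repeatedly
running = 0.0
for _ in range(10):
    running += 0.1
print(running == 1.0)  # exact comparison fails again

# Option 1: compare with a tolerance instead of exact equality
print(math.isclose(total, 0.3, rel_tol=1e-9))

# Option 2: use decimal arithmetic when exact decimal fractions matter
decimal_total = Decimal("0.1") + Decimal("0.2")
print(decimal_total == Decimal("0.3"))
```

Note that the Decimal values are built from strings; building them from the float 0.1 would carry the binary approximation into the decimal value.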

Why Both Validation and Verification Are Needed (AO2 – Analysis)

  • Validation stops malformed inputs that could cause wrong calculations, security breaches or database constraint violations.
  • Verification catches corruption that occurs after data has been stored, backed‑up, or transmitted across a network.
  • Together they provide a defence‑in‑depth approach, reducing the risk of:

    • Incorrect results caused by bad inputs.
    • Data loss or silent corruption during backup, transmission, or hardware failure.
    • Unauthorised modification of critical records.

Practical Implementation Tips (AO3 – Design/Implementation)

  1. Validate on both client and server sides; never rely solely on client‑side checks.
  2. Use built‑in database constraints (NOT NULL, CHECK, UNIQUE, foreign keys) as a second line of defence.
  3. Select a verification method appropriate to the data’s sensitivity:

    • Simple checksum for routine log files.
    • CRC for network packets or large storage blocks.
    • SHA‑256 or a digital signature for confidential or legally important records.
    • MAC when both integrity and authentication are required.

  4. Store verification values (checksums, hashes, signatures) in a secure, tamper‑evident location – e.g., a separate integrity‑log table.
  5. Audit validation and verification failures regularly; patterns may reveal misuse, software bugs, or hardware problems.
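
Tips 3 and 4 can be sketched with Python's standard hashlib and hmac modules. The helper names, the sample record and the key below are hypothetical; in practice the key would come from a secure store, never from source code.

```python
import hashlib
import hmac

def sha256_hex(data):
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()

def make_tag(key, data):
    """Return an HMAC-SHA-256 tag for data under the secret key."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()

def verify_tag(key, data, expected_tag):
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, data), expected_tag)

record = b"account=42;balance=100"
tampered = b"account=42;balance=900"
secret_key = b"demo-key-for-illustration-only"  # assumption: placeholder key

# Plain hash: detects the change if the stored digest itself is trustworthy
stored_digest = sha256_hex(record)
print(sha256_hex(tampered) == stored_digest)  # digests differ

# MAC: anyone without the key cannot produce a matching tag for altered data
stored_tag = make_tag(secret_key, record)
print(verify_tag(secret_key, record, stored_tag))    # unchanged record
print(verify_tag(secret_key, tampered, stored_tag))  # altered record
```

A plain hash stored next to the data can simply be recomputed by whoever alters the record, which is why tip 4 recommends keeping verification values in a separate, tamper‑evident location or using a keyed MAC.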

Suggested Diagram

Flowchart showing two parallel streams:

Validation – Input → Client‑side checks → Server‑side checks → Database constraints.

Verification – Data stored → Integrity‑value generation (checksum/CRC/hash) → Backup/Transmission → Re‑calculation & comparison on receipt.

Icons for client, server, DB, CRC module, and digital‑signature verification are recommended.

Assessment Objective Alignment

  • AO1 – Knowledge: Definitions, terminology, and lists of validation and verification techniques (including uniqueness checking and MAC).
  • AO2 – Analysis: Comparison table, explanation of why both are required, and discussion of floating‑point rounding errors.
  • AO3 – Design/Implementation: Practical tips, database constraint examples, pseudocode snippet, and diagram suggestion.

Summary

Data integrity is essential for reliable computing. Data validation ensures that only correctly formatted, logically consistent data enters the system, protecting logical integrity. Data verification confirms that data remains unchanged after storage or transmission, protecting physical integrity. Using both techniques together gives a robust defence‑in‑depth strategy that supports accurate processing, trustworthy reporting, and strong security.