Use verification methods (visual checking, double entry, parity check)

Data processing and information

1. Data, information and sources

  • Data – raw facts, figures or symbols that have not yet been interpreted.
  • Information – data that have been processed, organised or presented so that they are useful for decision‑making.

Sources of data

Source type | Examples | When to use
Direct | Questionnaire, interview, sensor, on‑line form | When the most up‑to‑date, specific data are required.
Indirect | Census tables, published statistics, third‑party databases | When cost or time constraints make primary collection impractical.

2. Quality of information

Criterion | What it means | Effect on verification
Accuracy | Correctness of the data | Needs rigorous checks (double entry, checksums, control totals).
Relevance | Fit for the intended purpose | Irrelevant fields can be omitted early.
Age | How up‑to‑date the data are | Older data may require re‑validation or updating.
Detail | Level of granularity required | More fields → more validation rules.
Completeness | All required items are present | Control totals, mandatory‑field checks.

3. Encryption (brief)

  • Symmetric – same key encrypts and decrypts (e.g., AES). Fast, good for bulk data.
  • Asymmetric – public key encrypts, private key decrypts (e.g., RSA). Used for key exchange and digital signatures.

Encryption is usually combined with an integrity check because encryption alone does not detect corruption. A checksum or plain hash catches accidental corruption; detecting malicious changes needs a keyed check such as a MAC.
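As a rough illustration of a keyed integrity check, the sketch below uses Python's standard hmac and hashlib modules to attach a MAC to a message and verify it later. The key, the message and the function names make_tag and is_intact are assumptions made for the example; encryption itself is not shown.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """Return an HMAC-SHA256 tag for the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def is_intact(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it with the one received."""
    return hmac.compare_digest(make_tag(key, message), tag)


key = b"shared-secret-key"           # hypothetical key, for illustration only
message = b"ciphertext bytes here"   # stands in for the encrypted data
tag = make_tag(key, message)

print(is_intact(key, message, tag))          # message unchanged
print(is_intact(key, message + b"x", tag))   # message altered in transit
```

Anyone without the key cannot produce a matching tag, which is what separates a MAC from a plain checksum.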

4. Checking the accuracy of data – validation & verification

4.1 Validation checks (required by the syllabus)

Check type | Purpose | Simple example
Presence | Field must not be blank | Customer name entered.
Range | Value must lie between two limits | Age 0 – 120.
Type | Correct data kind | Numeric field cannot contain letters.
Length | Exact number of characters/bytes | Postcode exactly 6 characters.
Format | Specific pattern | DD/MM/YYYY for dates.
Check‑digit | Mathematical digit that validates a number | ISBN‑13 uses modulo‑10.
Lookup | Value must exist in a reference list | Country code found in ISO‑3166.
Consistency | Related fields must agree | Start‑date earlier than end‑date.
Limit | Value must not pass a single boundary (upper or lower) | No more than 500 items per order.
Control total / Hash total | Sum of a numeric column to detect missing/extra rows (a hash total sums a field, such as ID numbers, whose total has no meaning in itself) | Total of invoice amounts = £12 345.67.
Checksum | Simple arithmetic total of a data block | 8‑bit sum modulo 256 for a packet.
Parity check | Detects single‑bit errors in binary data | Even parity added to each transmitted byte.
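A few of the checks in the table can be sketched in Python as follows. The function names are hypothetical and the rules (for example, what counts as a valid date) are simplified assumptions, not a full validation library.

```python
import re
from datetime import datetime


def presence_check(value: str) -> bool:
    """Field must not be blank (whitespace only counts as blank)."""
    return value.strip() != ""


def range_check(value: int, low: int, high: int) -> bool:
    """Value must lie between two limits, inclusive."""
    return low <= value <= high


def length_check(value: str, length: int) -> bool:
    """Value must contain exactly the given number of characters."""
    return len(value) == length


def format_check_date(value: str) -> bool:
    """Accept only DD/MM/YYYY strings that are real calendar dates."""
    if not re.fullmatch(r"\d{2}/\d{2}/\d{4}", value):
        return False
    try:
        datetime.strptime(value, "%d/%m/%Y")
        return True
    except ValueError:
        return False


def isbn13_check(isbn: str) -> bool:
    """ISBN-13 check digit: weights 1,3,1,3,... must sum to a multiple of 10."""
    digits = [int(c) for c in isbn if c.isdigit()]
    if len(digits) != 13:
        return False
    total = sum(d * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits))
    return total % 10 == 0


print(range_check(130, 0, 120))
print(format_check_date("31/02/2024"))
print(isbn13_check("978-0-306-40615-7"))
```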

4.2 Verification methods (expanded)

Verification is the process of confirming that data entered into a system are exactly the same as the original source. The Cambridge syllabus expects students to be able to describe the following three methods in detail.

  1. Visual checking – the operator manually compares each entered item with the original document. Cheap, but only suitable for small, low‑risk data sets.
  2. Double entry – two independent operators input the same source data. The system automatically flags any mismatched fields. Provides a very high level of confidence for high‑value data (e.g., financial statements, census).
  3. Parity check – a single parity bit is added to a group of n bits so that the total number of 1’s is even (even parity) or odd (odd parity). For even parity the bit is:
    $p = \left( \sum_{i=1}^{n} b_i \right) \bmod 2$
    Example (even parity, n = 8): data = 10110010 (four 1’s) → p = 0 → transmitted block = 10110010 0. Detects any single‑bit error; cannot correct the error.
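Double entry and the parity check can both be sketched in a few lines of Python. The names below (double_entry_mismatches, parity_bit, parity_ok) and the sample records are hypothetical; bit groups are held as strings of "0" and "1" purely for readability.

```python
def double_entry_mismatches(first: dict, second: dict) -> list:
    """Return the field names whose two keyed-in values differ."""
    return [field for field in first if first[field] != second.get(field)]


def parity_bit(bits: str, even: bool = True) -> str:
    """Return the parity bit that gives the block an even (or odd) count of 1s."""
    bit = bits.count("1") % 2
    if not even:
        bit ^= 1
    return str(bit)


def parity_ok(block: str, even: bool = True) -> bool:
    """Check a received block (data bits followed by the parity bit)."""
    return block.count("1") % 2 == (0 if even else 1)


entry_a = {"name": "Lee", "amount": "120.50"}
entry_b = {"name": "Lee", "amount": "125.50"}
print(double_entry_mismatches(entry_a, entry_b))

block = "10110010" + parity_bit("10110010")   # the example from the text
print(block, parity_ok(block))
print(parity_ok("001100100"))   # one bit flipped: detected
print(parity_ok("011100100"))   # two bits flipped: passes undetected
```

The last line shows the weakness of a single parity bit: flipping an even number of bits leaves the parity unchanged.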

4.3 Quick checklist for the exam

  • Visual checking – manual, low cost, limited scalability.
  • Double entry – two operators, automatic mismatch detection, high reliability, time‑consuming.
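  • Parity check – usually hardware‑implemented; detects single‑bit errors (any odd number of flipped bits) but misses errors that flip an even number of bits, so two‑bit and many burst errors go undetected.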
  • Checksum – adds all bytes (or words) and sends the total; catches many error patterns.
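  • Hash total – sum of a field whose total has no meaning in itself (e.g., customer ID numbers), recalculated after entry and compared with the original to reveal missing or altered records. Not the same thing as a cryptographic hash such as SHA‑256.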
  • Control total – sum of a numeric field (e.g., total quantity) compared with a pre‑calculated value.
  • Check‑digit – calculated from other digits (mod‑10, Luhn) to verify numbers such as credit‑card or ISBN.
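The totals in this checklist are simple arithmetic, as the sketch below suggests. It assumes an 8‑bit checksum (sum modulo 256) and takes a hash total to be the sum of a field with no meaningful total of its own (here, customer IDs); the records and function names are made up for illustration.

```python
def checksum8(data: bytes) -> int:
    """8-bit checksum: sum of all bytes, modulo 256."""
    return sum(data) % 256


def field_total(records: list, field: str) -> int:
    """Sum one numeric field across all records (control or hash total)."""
    return sum(record[field] for record in records)


records = [  # hypothetical batch of order records
    {"customer_id": 1001, "quantity": 5},
    {"customer_id": 1002, "quantity": 12},
    {"customer_id": 1007, "quantity": 3},
]

control_total = field_total(records, "quantity")    # meaningful total
hash_total = field_total(records, "customer_id")    # meaningless total, used only for checking

print(checksum8(b"ABC"), control_total, hash_total)

# If a record is lost, both totals stop matching the pre-calculated values.
print(field_total(records[:2], "quantity") == control_total)
```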

5. Data processing modes

Mode | Description | Typical examples | Pros / Cons
Batch processing | Data are collected, stored and processed together at a later time. | Payroll, end‑of‑day sales reports. | Efficient for large volumes; results are not immediate.
Online (transaction‑oriented) processing | Data are processed immediately as they are entered. | Online banking, ticket booking. | Fast feedback; requires continuous system availability.
Real‑time processing | Data must be processed within a strict time limit. | Air‑traffic control, industrial control systems. | Critical for safety; higher hardware/software cost.

Topic 2 – Hardware & Software (Cambridge 9626)

2.1 Types of hardware

Hardware type | Examples | Pros | Cons
Mainframe | IBM Z series, UNIVAC | Very high processing power, massive storage, reliable. | Expensive, specialised staff required.
Mini‑computer / Mid‑range | DEC VAX, HP 3000 | Cheaper than mainframes, still multi‑user. | Less scalable than mainframes.
Micro‑computer (PC) | Desktop, laptop, tablet | Low cost, widely available, easy to upgrade. | Limited processing for very large data sets.

2.2 Types of software

Software category | Examples | Pros | Cons
System software | Operating systems (Windows, Linux), device drivers | Controls hardware, provides platform for applications. | Complex; bugs can affect the whole system.
Utility software | Antivirus, backup tools, disk defragmenter | Supports system maintenance, improves security. | May consume resources; occasional false positives.
Custom (in‑house) software | Company‑specific inventory system | Tailored exactly to business needs. | Higher development cost; requires ongoing support.
Off‑the‑shelf software | Microsoft Office, QuickBooks | Ready to use, lower cost, often well‑documented. | May not fit all requirements; licensing restrictions.

2.3 User‑interface types

UI type | Characteristics | Typical use
Command‑line | Text‑based commands; fast for expert users. | System administration, programming.
Menu‑driven | Hierarchical menus; reduces need to remember commands. | Retail POS, ATMs.
Form‑based | Fields for data entry; good for structured input. | Online applications, surveys.
Graphical (GUI) | Icons, windows, mouse interaction. | Desktop applications, spreadsheets.

Topic 3 – Monitoring & Control

3.1 Sensors and transducers

  • Temperature (thermocouple, RTD)
  • Pressure (piezo‑electric, strain‑gauge)
  • Proximity (inductive, ultrasonic)
  • Light (photodiode, LDR)
  • Motion (accelerometer, PIR)

3.2 Control technologies

Technology | How it works | Typical application
On‑off control | Device is either fully on or fully off. | Heaters, simple alarms.
Proportional control | Output varies in proportion to the error signal. | Motor speed regulation.
PID control | Combines Proportional, Integral and Derivative actions for precise regulation. | Industrial temperature control.
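The three techniques differ only in how the output is calculated from the error (setpoint minus measured value). The following is a minimal sketch under simplifying assumptions: the gains kp, ki and kd, the time step dt and all function and class names are illustrative, and practical details such as output limits are left out.

```python
def on_off(setpoint: float, measured: float) -> float:
    """On-off control: fully on below the setpoint, otherwise fully off."""
    return 1.0 if measured < setpoint else 0.0


def proportional(setpoint: float, measured: float, kp: float) -> float:
    """Proportional control: output is the error scaled by the gain kp."""
    return kp * (setpoint - measured)


class PID:
    """PID control: sum of proportional, integral and derivative terms."""

    def __init__(self, kp: float, ki: float, kd: float):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, setpoint: float, measured: float, dt: float) -> float:
        error = setpoint - measured
        self.integral += error * dt                      # accumulated error
        if self.prev_error is None:
            derivative = 0.0                             # no history on the first call
        else:
            derivative = (error - self.prev_error) / dt  # rate of change of error
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


controller = PID(kp=2.0, ki=0.5, kd=0.1)
print(on_off(20.0, 18.0), proportional(20.0, 18.0, kp=2.0))
print(controller.update(20.0, 18.0, dt=1.0))
```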

3.3 Calibration methods

Method | Procedure | When to use
One‑point calibration | Adjust sensor so that a single known reference value reads correctly. | When the sensor’s response is linear and the operating range is narrow.
Two‑point calibration | Set the sensor at two known values (low & high) and adjust slope and offset. | Standard for most temperature, pressure and voltage sensors.
Multi‑point calibration | Measure several points across the full range and fit a curve. | Required for non‑linear devices or when high accuracy over a wide range is needed.
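Two‑point calibration amounts to fitting a straight line through two (raw reading, reference value) pairs. The sketch below assumes a linear sensor; the readings (2.0 in an ice bath, 98.0 in boiling water) and the function names are invented for the example.

```python
def two_point_calibration(raw_low, raw_high, ref_low, ref_high):
    """Return (slope, offset) mapping raw readings onto reference values."""
    slope = (ref_high - ref_low) / (raw_high - raw_low)
    offset = ref_low - slope * raw_low
    return slope, offset


def apply_calibration(raw, slope, offset):
    """Convert a raw reading into a corrected value."""
    return slope * raw + offset


# Hypothetical thermometer: reads 2.0 at 0 degrees C and 98.0 at 100 degrees C.
slope, offset = two_point_calibration(2.0, 98.0, 0.0, 100.0)
print(round(apply_calibration(50.0, slope, offset), 2))
```

One‑point calibration is the special case where only the offset is adjusted and the slope is assumed to be correct already.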

Topic 4 – Algorithms & Flowcharts

4.1 Required flowchart symbols (Cambridge approved)

Symbol | Name | Purpose
Rounded rectangle | Terminator (Start/End) | Marks the beginning and end of the process.
Rectangle | Process | Indicates an operation or instruction.
Diamond | Decision | Shows a yes/no (true/false) test.
Small circle | Connector | Links separate parts of a large flowchart.
Parallelogram | Input/Output | Data entry or display.

4.2 Common flowchart errors (quick checklist)

  • Missing terminator symbols.
  • Decision diamonds without two clearly labelled arrows.
  • Arrows that cross without a connector.
  • Using the wrong shape for an operation (e.g., a rectangle for input).
  • Unclear start‑point – the flow must begin with a single “Start”.

4.3 Example verification routine (pseudo‑code)

total_amount ← 0

FOR each record R in input_file
    error_flag ← FALSE

    /* 1. Presence checks */
    IF R.name = "" OR R.id = "" THEN
        error_flag ← TRUE
    END IF

    /* 2. Range checks */
    IF R.age < 0 OR R.age > 120 THEN
        error_flag ← TRUE
    END IF

    /* 3. Check‑digit (Luhn) */
    IF NOT valid_check_digit(R.account_number) THEN
        error_flag ← TRUE
    END IF

    /* 4. Parity check for binary field */
    IF parity(R.binary_field) ≠ EXPECTED_PARITY THEN
        error_flag ← TRUE
    END IF

    /* 5. Update control total (every record counts, valid or not) */
    total_amount ← total_amount + R.amount

    IF error_flag = TRUE THEN
        WRITE R TO error_log
    ELSE
        WRITE R TO good_file
    END IF
END FOR

/* Final control‑total comparison */
IF total_amount ≠ expected_total THEN
    WRITE "Control total mismatch" TO error_log
END IF
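For readers who want to run the routine, here is one possible Python rendering. It is a sketch under assumptions: records are dictionaries with the field names used in the pseudo‑code, the binary field is a string of 0s and 1s checked for even parity, and the function names luhn_valid and verify_records are hypothetical.

```python
def luhn_valid(number: str) -> bool:
    """Luhn check: double every second digit from the right, sum, test mod 10."""
    if not number.isdigit():
        return False
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def verify_records(records, expected_total):
    """Split records into good and error lists and compare the control total."""
    good, errors = [], []
    total_amount = 0.0
    for r in records:
        ok = (
            r["name"] != "" and r["id"] != ""            # presence
            and 0 <= r["age"] <= 120                     # range
            and luhn_valid(r["account_number"])          # check digit
            and r["binary_field"].count("1") % 2 == 0    # even parity assumed
        )
        total_amount += r["amount"]                      # control total
        (good if ok else errors).append(r)
    total_matches = abs(total_amount - expected_total) < 0.005
    return good, errors, total_matches


sample = [  # hypothetical input records
    {"name": "Lee", "id": "A1", "age": 34, "account_number": "79927398713",
     "binary_field": "101100100", "amount": 100.0},
    {"name": "", "id": "A2", "age": 150, "account_number": "79927398710",
     "binary_field": "101100101", "amount": 50.5},
]
good, errors, total_matches = verify_records(sample, expected_total=150.5)
print(len(good), len(errors), total_matches)
```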

4.4 Flowchart for the routine (textual description)

  1. Start (terminator).
  2. Read next record – Input symbol.
  3. Decision: Presence check – if No → Write to error log.
  4. Decision: Range check – if No → Write to error log.
  5. Decision: Check‑digit – if No → Write to error log.
  6. Decision: Parity check – if No → Write to error log.
  7. Process: Add amount to control total.
  8. Output: Write record to good file (if all checks passed).
  9. Decision: More records? If Yes → connector back to step 2; if No → continue to step 10.
  10. Decision: Does calculated total = expected total?
  11. Output: Write “Control total mismatch” if required.
  12. End (terminator).

Topic 5 – eSecurity

5.1 Personal data & malware

  • Personal data – any information that can identify a living individual (name, DOB, ID number, etc.). Must be stored securely and processed lawfully.
  • Malware – software designed to damage, disrupt or gain unauthorised access (viruses, worms, trojans, ransomware).

5.2 Prevention methods – software vs. physical

Prevention method | How it works | Advantages | Disadvantages
Antivirus / anti‑malware software | Scans files, monitors behaviour, updates signatures. | Automatic, updates regularly, protects many devices. | May miss zero‑day threats; consumes resources.
Firewalls (software or hardware) | Filters incoming/outgoing network traffic based on rules. | Blocks unauthorised connections, can be centrally managed. | Improper configuration can block legitimate traffic.
Air‑gapped systems | Physical separation from any network. | Virtually eliminates remote malware infection. | Inconvenient for data sharing; costly to maintain.
Hardware security tokens (e.g., smart cards) | Store cryptographic keys; required for login. | Strong two‑factor authentication. | Lost or damaged tokens can lock users out.

Topic 6 – Digital Divide

6.1 Definition, causes and effects

  • Definition: The gap between individuals, households or regions that have access to modern information and communication technologies (ICT) and those that do not.
  • Causes: Cost of devices, lack of infrastructure, low digital literacy, geographic isolation, socioeconomic factors.
  • Effects: Unequal educational and employment opportunities, reduced civic participation, widening economic disparity.

6.2 Groups most affected

  • Rural communities
  • Low‑income families
  • Older adults
  • People with disabilities

6.3 Mitigation strategies (exam‑level bullet list)

  • Government‑funded broadband expansion programmes.
  • Community ICT centres and public‑access computers.
  • Subsidised devices or “bring‑your‑own‑device” schemes for schools.
  • Digital‑literacy training courses for adults and seniors.
  • Accessible design standards for software and websites.

Topic 7 – Expert Systems

7.1 Core components

  • Knowledge base – collection of facts and rules (IF‑THEN statements).
  • Inference engine – applies rules to the facts to draw conclusions.
  • User interface – allows the user to input data and receive advice.
  • Explanation facility – tells the user why a conclusion was reached.

7.2 Reasoning styles

Style | Direction of reasoning | Typical use
Forward chaining (data‑driven) | Starts with known facts, applies rules, moves forward to a conclusion. | Diagnostic systems – e.g., “symptom → disease”.
Backward chaining (goal‑driven) | Starts with a goal, works backwards to see if required facts exist. | Troubleshooting – e.g., “Is the printer jammed? → check paper feed”.
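The two reasoning styles can be contrasted with a toy inference engine. The rule set below (a made‑up printer fault example) and the function names are assumptions for illustration, and the backward chainer assumes the rules contain no cycles.

```python
def forward_chain(facts, rules):
    """Data-driven: keep firing rules until no new conclusions appear."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conclusion not in known and set(conditions) <= known:
                known.add(conclusion)
                changed = True
    return known


def backward_chain(goal, facts, rules):
    """Goal-driven: a goal holds if it is a fact or some rule for it can be proved."""
    if goal in facts:
        return True
    return any(
        all(backward_chain(c, facts, rules) for c in conditions)
        for conditions, conclusion in rules
        if conclusion == goal
    )


rules = [  # (IF all of these, THEN this), hypothetical knowledge base
    (("no_power_light",), "check_power_cable"),
    (("power_light", "paper_loaded", "no_output"), "possible_paper_jam"),
    (("possible_paper_jam",), "check_paper_feed"),
]
facts = {"power_light", "paper_loaded", "no_output"}

print(sorted(forward_chain(facts, rules) - facts))
print(backward_chain("check_paper_feed", facts, rules))
```

Forward chaining derives everything that follows from the facts; backward chaining only explores the rules needed to settle one goal.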

7.3 Advantages and disadvantages

Advantage | Disadvantage
Provides consistent, expert‑level advice. | Knowledge acquisition can be time‑consuming.
Can handle large rule sets quickly. | May not cope well with ambiguous or incomplete data.
Explanation facility aids learning. | Maintenance required when domain knowledge changes.

Topic 8 – Spreadsheets

8.1 Creating a spreadsheet

  • Plan the layout – decide on rows (records) and columns (fields).
  • Enter data – use appropriate data types (text, numbers, dates).
  • Apply formulas – e.g., =SUM(B2:B20), =AVERAGE(C2:C20).
  • Use absolute references ($A$1) when a constant is needed.
  • Format cells – number formats, conditional formatting for alerts.

8.2 Testing and validation

  • Check for missing or duplicate entries (use COUNTIF).
  • Validate ranges with Data → Data Validation (e.g., 0–100%).
  • Use IFERROR to trap calculation errors.

8.3 Using charts

  • Column chart – compare quantities (e.g., sales per month).
  • Line chart – show trends over time.
  • Pie chart – illustrate parts of a whole (budget percentages).
  • Scatter plot – display relationships between two variables.

8.4 Case brief – AO2 (design a solution)

Brief: A school wants a budgeting spreadsheet for its annual sports day. The spreadsheet must record each expense (item, quantity, unit cost), calculate total cost, compare it with a budget limit (£2 500) and highlight any overspend.

Required features:

  1. Input table with columns: Item, Quantity, Unit Cost, Sub‑total (=B2*C2).
  2. Grand total in cell D31 using =SUM(D2:D30).
  3. Cell showing “Within budget” or “Over budget” using =IF(D31<=2500,"Within budget","Over budget").
  4. Conditional formatting to colour the total cell red when over budget.
  5. A simple bar chart showing expense categories.
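The calculation behind the brief can be checked outside the spreadsheet with a short Python sketch. The expense rows are invented sample data and budget_report is a hypothetical helper; it mirrors the sub‑total, grand total and IF test listed above.

```python
BUDGET_LIMIT = 2500.00

expenses = [  # hypothetical rows: (item, quantity, unit cost in pounds)
    ("Medals", 120, 2.50),
    ("Marquee hire", 1, 850.00),
    ("Refreshments", 300, 1.20),
]


def budget_report(rows, limit):
    """Return per-item sub-totals, the grand total and a budget status."""
    sub_totals = [(item, qty * cost) for item, qty, cost in rows]
    grand_total = sum(sub for _, sub in sub_totals)
    status = "Within budget" if grand_total <= limit else "Over budget"
    return sub_totals, grand_total, status


sub_totals, grand_total, status = budget_report(expenses, BUDGET_LIMIT)
print(round(grand_total, 2), status)
```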

8.5 Limitations of spreadsheet models

  • Scalability – performance degrades with very large data sets.
  • Risk of hidden errors – formula mistakes can propagate unnoticed.
  • Version control – multiple copies can lead to inconsistent data.
  • Limited data‑validation compared with dedicated database systems.

Topic 9 – Modelling with Spreadsheets

9.1 What‑if analysis

  • Change a single input (e.g., price) and observe the effect on profit.
  • Use Data → What‑If Analysis → Scenario Manager for multiple scenarios.

9.2 Goal‑Seek

Goal‑Seek finds the input value that produces a desired result. Example: “What sales figure is needed to achieve a profit of £5 000?” – set the profit cell as the “Set cell”, £5 000 as the “To value”, and the sales figure cell as the “By changing cell”.
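The idea behind Goal‑Seek can be imitated with a simple search. The sketch below uses bisection as a stand‑in (it is not a description of how any particular spreadsheet implements the feature) and assumes a made‑up profit model in which profit rises steadily with units sold.

```python
def profit(units_sold, price=12.0, unit_cost=7.0, fixed_costs=3000.0):
    """Hypothetical model: contribution per unit times units, less fixed costs."""
    return units_sold * (price - unit_cost) - fixed_costs


def goal_seek(func, target, low, high, tolerance=1e-6, max_iter=200):
    """Bisection search; assumes func increases on [low, high] and brackets target."""
    if not (func(low) <= target <= func(high)):
        raise ValueError("target is not bracketed by low and high")
    for _ in range(max_iter):
        mid = (low + high) / 2
        if func(mid) < target:
            low = mid      # answer lies in the upper half
        else:
            high = mid     # answer lies in the lower half
        if high - low < tolerance:
            break
    return (low + high) / 2


units_needed = goal_seek(profit, target=5000.0, low=0.0, high=10000.0)
print(round(units_needed))
```

Here profit plays the role of the “Set cell”, 5000 is the “To value” and units_sold is the “By changing cell”.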

9.3 Simulation (Monte Carlo)

  • Generate random inputs (e.g., demand) using =RANDBETWEEN or =NORMINV(RAND(),mean,sd).
  • Run the model many times (via Data Table) to obtain a distribution of outcomes.
  • Analyse results with histograms or summary statistics.
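The same three steps can be sketched in Python with the standard random and statistics modules. The profit model, the demand distribution (normal, mean 1600, standard deviation 200) and the function name simulate_profit are assumptions chosen for the example.

```python
import random
import statistics


def simulate_profit(runs, mean_demand, sd_demand, price, unit_cost,
                    fixed_costs, seed=None):
    """Run the model many times with random demand; return the list of profits."""
    rng = random.Random(seed)          # seeded so a run can be repeated
    results = []
    for _ in range(runs):
        demand = max(0.0, rng.gauss(mean_demand, sd_demand))  # no negative demand
        results.append(demand * (price - unit_cost) - fixed_costs)
    return results


outcomes = simulate_profit(10_000, 1600, 200, 12.0, 7.0, 3000.0, seed=1)

# Summary statistics of the simulated distribution
print("mean profit:", round(statistics.mean(outcomes)))
print("std deviation:", round(statistics.stdev(outcomes)))
print("chance of a loss:", sum(p < 0 for p in outcomes) / len(outcomes))
```

Increasing the number of runs narrows the uncertainty in the summary figures, at the cost of a longer calculation.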

9.4 Model validation and limitations (recap)

  • Validate by checking against known data or using control totals.
  • Document assumptions (e.g., linear cost behaviour).
  • Recognise that models simplify reality – they cannot predict unexpected events or complex interactions.

9.5 Key points to remember (overall)

  • Data → information conversion must be supported by both validation (type, range, format, etc.) and verification (visual checking, double entry, parity, checksum, control totals, check‑digits).
  • Select verification methods that match data volume, criticality and available resources.
  • Document any verification strategy in an algorithm and illustrate it with a correctly‑symbolised flowchart.
  • Hardware, software and UI choices each have distinct pros and cons – justify the most appropriate for the task.
  • Monitoring & control systems require suitable sensors, calibration and a clear control technique (on‑off, proportional, PID).
  • Expert systems use forward or backward chaining; understand when each is appropriate.
  • Spreadsheets are powerful for calculation, visualisation and simple modelling, but be aware of their limitations and test thoroughly.
  • Address the digital divide by promoting access, infrastructure and digital‑literacy programmes.
