Create data visualization (pivot tables, charts)

Topic 17 – Data Analysis & Visualisation (Cambridge IGCSE/A‑Level IT 9626)

Learning Objectives

  • Create, modify and interpret pivot tables – including grouping, calculated fields, custom aggregations, multiple value fields and refreshing data sources.
  • Apply essential spreadsheet functions (SUM, AVERAGE, COUNT, COUNTIF, COUNTIFS, COUNT DISTINCT, MAX, MIN, LOOKUP, VLOOKUP, IF, SUMIFS, TEXT, CONCATENATE, etc.) to analyse data.
  • Select, design and evaluate the most appropriate chart type for a given question.
  • Assess the effectiveness of visualisations for decision‑making and avoid common pitfalls.
  • Prepare data (cleaning, validation, outlier handling) before visualising to ensure accuracy.

1. Syllabus Checklist – Quick Mapping

Syllabus Requirement Notes Section Action Needed (if any)
1.1 Create & modify pivot tables (rows, columns, values, filters) §4.2 – Building a Basic Pivot Table
1.2 Grouping (dates, numeric ranges) §4.3 – Grouping
1.3 Calculated fields & custom aggregations (e.g., % of total, distinct count) §4.3 – Calculated Fields; §4.3 – Custom Aggregations
1.4 Refreshing pivot tables after source changes §4.3 – Refreshing Data
1.5 Multiple value fields side‑by‑side §4.3 – Multiple Value Fields (expanded)
2.1 Use core spreadsheet functions for analysis §3 – Function Quick‑Reference (expanded)
2.2 Choose appropriate chart type for the question §5 – Chart Types – Matching Visual Form to Question Type
2.3 Evaluate visualisations (effectiveness, pitfalls) §6 – Designing an Effective Chart; §9 – Common Pitfalls
3.1 Prepare data (cleaning, validation, outlier detection) §2 – Preparing Data for Analysis; §2 – Extended Cleaning Steps

2. Why Visualise Data?

Raw tables often conceal patterns. Visualisation helps you to:

  1. Identify trends, outliers and relationships.
  2. Compare categories quickly and accurately.
  3. Communicate findings to non‑technical stakeholders.
  4. Support evidence‑based decision‑making.

3. Preparing Data for Analysis

  • Identify blanks or missing values – use IFERROR, ISBLANK or filter to locate them.
  • Remove duplicates – Excel: Data → Remove Duplicates; Google Sheets: Data → Data cleanup → Remove duplicates.
  • Validate entries – set data‑validation rules (list, whole number, date range, custom formula).
  • Detect outliers – conditional formatting with a Z‑score formula or MEDIAN/QUARTILE analysis.
  • Standardise formats – ensure dates, currencies and percentages share a consistent format before creating a pivot.

4. Spreadsheet Functions – Expanded Quick‑Reference

Function Purpose Example (sample data)
SUM(range)Adds all numbers in a range.SUM(F2:F20) – total revenue.
AVERAGE(range)Calculates the mean.AVERAGE(E2:E20) – average units sold.
COUNT(range)Counts numeric entries.COUNT(A2:A20) – number of orders.
COUNTIF(range,criteria)Counts entries that meet a single condition.COUNTIF(B2:B20,"North") – orders from North.
COUNTIFS(range1,crit1,range2,crit2,…)Counts entries that meet multiple conditions.COUNTIFS(B2:B20,"North",D2:D20,"Jan")
COUNTDISTINCT(range) (via Data Model)Counts unique values (e.g., distinct order IDs).Enable Data Model → add field → set aggregation to Distinct Count.
MAX(range)Largest value.MAX(F2:F20) – highest revenue.
MIN(range)Smallest value.MIN(F2:F20) – lowest revenue.
VLOOKUP(lookup_value,table_array,col_index,FALSE)Finds a value in the first column of a table.VLOOKUP("001",A2:F20,6,FALSE) – revenue for Order 001.
INDEX(range, row_num, [col_num])Returns a value from a specific row/column.INDEX(F2:F20,5) – revenue in the 5th row.
IF(logical_test, value_if_true, value_if_false)Creates conditional results.IF(E2>100,"High","Low") – flag high sales.
SUMIFS(sum_range, criteria_range1, criteria1, …)Sum with multiple conditions.SUMIFS(F2:F20,B2:B20,"North",D2:D20,"Jan")
TEXT(value,"format")Formats numbers as text.TEXT(F2,"£#,##0")
CONCATENATE(text1, text2, …) or &Joins strings.CONCATENATE(A2," – ",B2)

5. Pivot Tables – Summarising Large Data Sets

5.1 Core Components

  • Rows – vertical categories (e.g., Region).
  • Columns – horizontal categories (e.g., Month).
  • Values – numeric field(s) to aggregate (Sum, Average, Count, Distinct Count, % of Total, etc.).
  • Filters – optional criteria to limit the view.

5.2 Building a Basic Pivot Table

  1. Select the full data range (including headers).
  2. Insert ► PivotTable (Excel) or Data ► Pivot table (Google Sheets).
  3. Drag Region to the Rows area.
  4. Drag Month to the Columns area.
  5. Drag Revenue (£) to the Values area – set aggregation to Sum.

5.3 Advanced Features Required by the Syllabus

  • Grouping – Right‑click a date field → Group… → choose Months, Quarters, or custom intervals.
  • Calculated FieldsPivotTable Analyze ► Fields, Items & Sets ► Calculated Field. Example:
    PctRevenue = Revenue / TotalRevenue
  • Custom Aggregations (Distinct Count) – Add the data to the Excel Data Model (check “Add this data to the Data Model” when creating the pivot). Then, in the Values area, choose Value Field Settings → Distinct Count.
  • Multiple Value Fields Side‑by‑Side – Drag more than one numeric field into the Values area. Example:
    • Sum of Revenue
    • Average of Units Sold
    The pivot will display two columns for each row field (e.g., Region) – one for each aggregation.
  • Refreshing Data – After editing the source table, click Refresh (Excel) or Refresh all (Google Sheets) to update the pivot.

5.4 Sample Pivot Table – Total Revenue by Region & Month

Region \ Month Jan Feb Mar Total % of Total Revenue
North£3,600£3,300£6,90048.9 %
South£2,550£2,55018.1 %
East£2,850£2,85020.2 %
West£1,800£1,80012.8 %
Total£6,150£4,650£3,300£14,100100 %

5.5 Example: Multiple Value Fields

Using the same pivot, add both Sum of Revenue and Average of Units Sold to the Values area. The resulting layout (simplified) looks like:

RegionSum of RevenueAverage of Units Sold
North£6,900112
South£2,55095
East£2,85078
West£1,80064

6. Chart Types – Matching Visual Form to Question Type

Chart Type Best for… (question focus) Advantages Disadvantages / When to Avoid
Column / Bar Comparing quantities across discrete categories (e.g., revenue by region). Clear hierarchy; easy to read exact values. Cluttered with >10 categories; not ideal for trends.
Line Showing change over time or continuous data (trend analysis). Emphasises direction and rate of change. Hard to compare many series when they cross frequently.
Pie Illustrating part‑to‑whole relationships with few categories (≤5‑6). Intuitive for proportion. Distorts perception when slices are similar; avoid for many categories.
Stacked Column / Area Showing contribution of sub‑categories to a total over time (product mix each month). Combines total and breakdown in one chart. Difficult to compare individual sub‑category values across periods.
Scatter Plot Exploring correlation between two numeric variables (units sold vs. advertising spend). Reveals patterns, clusters, outliers. Requires clear axis labels; not suitable for categorical data.

7. Designing an Effective Chart – Checklist

  1. Define the purpose. What specific question are you answering?
  2. Select the appropriate chart type using the table above.
  3. Title, axis labels & units. Include a concise, descriptive title and label both axes with units.
  4. Colour scheme. Use a consistent palette; reserve one colour per data series; avoid overly bright or decorative colours.
  5. Data‑ink ratio. Remove unnecessary gridlines, background fills, 3‑D effects, and chartjunk.
  6. Readability. Ensure data points, markers and labels do not overlap; use data labels only when they aid clarity.
  7. Audience awareness. Tailor the level of detail and terminology to the intended viewers.
  8. Scale integrity. Y‑axis should start at zero unless a justified break is shown; keep scales consistent across comparable charts.

8. Example: Revenue Trend by Month (Line Chart)

Using the pivot‑table totals from §5.4, plot a line chart:

  • X‑axis: Month (Jan, Feb, Mar)
  • Y‑axis: Total Revenue (£)
  • Data points: (Jan, £6,150), (Feb, £4,650), (Mar, £3,300)
Line chart showing a downward revenue trend from Jan to Mar
Figure 1 – Revenue trend (line chart) derived from the pivot table.

Interpretation

The chart shows a clear decline of 46 % in revenue over the three‑month period. Possible reasons include seasonal demand, reduced marketing spend, or supply‑chain constraints. Management should investigate month‑by‑month product performance and compare with previous years to determine whether the dip is an anomaly or part of a longer trend.

9. Mathematical Considerations for Analysis

When interpreting pivot data and charts you may need to calculate:

  • Percentage contribution of a region: $$\text{Pct}_{\text{region}} = \frac{\text{Revenue}_{\text{region}}}{\text{Total Revenue}} \times 100\%$$
  • Month‑over‑Month growth: $$\text{Growth}_{\text{Feb→Mar}} = \frac{\text{Revenue}_{\text{Mar}} - \text{Revenue}_{\text{Feb}}}{\text{Revenue}_{\text{Feb}}} \times 100\%$$
  • Average units per order: $$\text{AvgUnits} = \frac{\sum \text{Units Sold}}{\text{Number of Orders}}$$
  • Distinct count of customers (using the Distinct Count aggregation): $$\text{UniqueCustomers} = \text{COUNTDISTINCT(CustomerID)}$$

10. Common Pitfalls & How to Avoid Them

  • Too many slices in a pie chart. Limit to 5–6; otherwise switch to a bar/column chart.
  • Misleading axes. Y‑axis should start at zero unless a break is clearly indicated and justified.
  • Over‑crowded legends. Prefer direct labeling on the chart where space permits.
  • Inconsistent scales. When comparing multiple charts, keep axes identical to avoid visual distortion.
  • Data‑ink overload. Remove decorative 3‑D effects, excessive shading, and background images.
  • Out‑of‑date pivot tables. Always refresh after editing the source data.
  • Using inappropriate chart types. Match the visual form to the question focus (refer to the chart‑type matrix).
  • Ignoring data cleaning. Unclean data leads to inaccurate summaries and misleading visualisations.

11. Practical Exercise (Exam‑Style)

Using the sample data set provided, complete the tasks below. Show all steps, screenshots (or hand‑drawn sketches), and brief written commentary. This mirrors the AO2/AO3 requirements of the Cambridge exam.

  1. Pivot Table – Units Sold by Product & Month
    • Rows: Product
    • Columns: Month
    • Values: Sum of Units Sold
  2. Stacked Column Chart
    • Base the chart on the pivot table from (1).
    • Each column represents a month; each colour segment represents a product.
    • Title the chart “Monthly Units Sold by Product”.
  3. Calculated Field – % of Total Revenue by Region
    • Add a calculated field PctRevenue = Revenue / TotalRevenue to the pivot table in Section 5.
    • Format the field as a percentage with one decimal place.
  4. Interpretation (150–200 words)
    • Describe the main trends shown in the stacked column chart.
    • Identify which product(s) drive the monthly changes.
    • Suggest two actionable recommendations for management (e.g., stock adjustments, promotional focus).
  5. Data‑validation Check
    • Apply data‑validation to the Region column to allow only “North, South, East, West”.
    • Highlight any cells that violate the rule and explain how you would correct them.

12. Summary

Mastering pivot tables and appropriate charting techniques enables you to:

  • Efficiently summarise large data sets without manual calculations.
  • Choose visualisations that directly answer the question posed.
  • Present findings clearly, accurately, and persuasively to a range of audiences.
  • Meet the AO2 (application) and AO3 (analysis) criteria of the Cambridge IT examination.

13. Further Reading & Resources

  • Cambridge International AS & A Level IT (9626) – Official syllabus and past papers.
  • Microsoft Office Support – “Create a PivotTable to analyze worksheet data”.
  • Google Workspace Learning Center – “Pivot tables in Google Sheets”.
  • Data‑visualisation best‑practice guides (e.g., “The Visual Display of Quantitative Information” by Edward Tufte).

Create an account or Login to take a Quiz

41 views
0 improvement suggestions

Log in to suggest improvements to this note.