4.1 Central Processing Unit (CPU) Architecture
Learning Objective
Show understanding of how various factors contribute to the overall performance of a computer system.
1. Von Neumann Model & Stored‑Program Concept
The CPU, memory and I/O share a single bus system and the same memory stores both data and the program instructions – the stored‑program principle.
Execution proceeds by repeatedly fetching the next instruction from memory, decoding it, executing it and writing back any result (the fetch‑decode‑execute‑write‑back cycle).
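The cycle above can be sketched as a tiny interpreter for a toy accumulator machine. The opcodes, the tuple instruction format and the memory layout are illustrative inventions, not a real instruction set; the register-transfer steps appear as comments:

```python
# Minimal fetch-decode-execute sketch of a toy accumulator machine.
# The opcodes, tuple instruction format and memory layout are illustrative only.

LOAD, ADD, STORE, HALT = 0, 1, 2, 3

def run(memory):
    pc = 0    # Program Counter
    acc = 0   # Accumulator
    while True:
        # Fetch: MAR <- [PC]; MDR <- [[MAR]]; IR <- [MDR]; PC <- [PC] + 1
        opcode, operand = memory[pc]
        pc += 1
        # Decode and execute
        if opcode == LOAD:
            acc = memory[operand]        # ACC <- [operand address]
        elif opcode == ADD:
            acc = acc + memory[operand]  # ACC <- [ACC] + [operand address]
        elif opcode == STORE:
            memory[operand] = acc        # Write-back: [operand address] <- [ACC]
        elif opcode == HALT:
            return memory

# Program: load mem[4] (value 2), add mem[5] (value 3), store into mem[6], halt.
mem = [(LOAD, 4), (ADD, 5), (STORE, 6), (HALT, 0), 2, 3, 0]
result = run(mem)
print(result[6])  # -> 5
```

Each loop iteration is one complete fetch-decode-execute-write-back pass, which is why instruction count and cycles per instruction dominate execution time.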
2. Core CPU Components (Von Neumann Model)
Control Unit (CU) – Decodes the opcode in the Instruction Register and generates the control signals that direct data movement, ALU operation and timing.
Arithmetic‑Logic Unit (ALU) – Performs arithmetic, logical and shift operations on the operands supplied by registers or memory.
Registers – Fast, on‑chip storage. They are divided into:
Program Counter (PC) – Holds the address of the next instruction to fetch.
Instruction Register (IR) – Holds the currently fetched instruction.
Memory Address Register (MAR) – Provides the address for a memory read or write.
Memory Data Register (MDR) – Temporarily stores data transferred to/from memory.
Accumulator (ACC) – Primary arithmetic result register.
General‑purpose registers (R0‑R7, etc.) – Hold operands and intermediate results.
Index Register (IX) – Used for address calculation in indexed addressing modes.
Pipelining – Splits instruction processing into separate stages (fetch, decode, execute, write‑back), allowing overlapping execution of several instructions.
Cache Memory – Multi‑level hierarchy (L1, L2, L3) of high‑speed memory that stores frequently accessed instructions and data, reducing main‑memory latency.
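The benefit of a cache can be quantified with the standard average-memory-access-time formula, AMAT = hit time + miss rate × miss penalty. The latencies below are illustrative figures, not measurements of any real CPU:

```python
# Average Memory Access Time: AMAT = hit_time + miss_rate * miss_penalty.
# All latencies are illustrative, not figures for a real processor.

def amat(hit_time_ns, miss_rate, miss_penalty_ns):
    return hit_time_ns + miss_rate * miss_penalty_ns

# Assume a 1 ns L1 hit time and a 100 ns main-memory miss penalty.
print(amat(1.0, 0.05, 100.0))  # 95% hit rate -> 6.0 ns average
print(amat(1.0, 0.20, 100.0))  # 80% hit rate -> 21.0 ns average
```

Even a modest drop in hit rate multiplies the effective memory latency, which is why cache design has such a strong effect on CPI.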
Clock Generator – Produces a regular timing pulse; the clock rate (Hz) determines the duration of each cycle – a higher clock rate enlarges the denominator of the CPU‑time equation, reducing execution time.
Bus Architecture
Address bus – Carries memory addresses from the PC, MAR or other units to memory.
Data bus – Transfers the actual data between CPU, cache, main memory and I/O.
Control bus – Carries control signals such as read/write, interrupt request, and clock.
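One consequence of the bus design worth remembering: an address bus with n lines can select 2^n distinct memory locations. A quick check:

```python
# An address bus with n lines can distinguish 2**n memory locations.

def addressable_locations(bus_width_bits):
    return 2 ** bus_width_bits

print(addressable_locations(16))  # -> 65536 (64 KiB if byte-addressable)
print(addressable_locations(32))  # -> 4294967296 (4 GiB if byte-addressable)
```

Widening the data bus, by contrast, increases how much data moves per transfer rather than how much memory can be addressed.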
Suggested diagram: Block diagram of a CPU showing the CU, ALU, registers (including PC, IR, MAR, MDR, ACC), cache hierarchy (L1/L2/L3), clock generator, and the three buses (address, data, control). Highlight the stored‑program concept.
CPU time is given by the performance equation CPU time = (IC × CPI) / Clock Rate, whose three terms are:
Instruction Count (IC) – Total number of instructions executed for a program.
CPI (Cycles per Instruction) – Average number of clock cycles required per instruction; it varies with instruction mix, pipeline depth, cache hit rate, branch‑prediction accuracy, etc.
Clock Rate – Frequency of the CPU clock (Hz). A higher clock rate reduces the duration of each cycle, directly lowering CPU time.
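A worked check of the equation CPU time = (IC × CPI) / clock rate, with all figures illustrative: lowering CPI from 2.0 to 1.5 at a fixed 1 GHz clock cuts execution time by a quarter.

```python
# CPU time = (instruction count * CPI) / clock rate.  Figures are illustrative.

def cpu_time(instruction_count, cpi, clock_hz):
    return instruction_count * cpi / clock_hz

ic = 2_000_000          # instructions executed
clock = 1_000_000_000   # 1 GHz clock

before = cpu_time(ic, 2.0, clock)
after = cpu_time(ic, 1.5, clock)   # e.g. a higher cache hit rate lowers CPI
print(before, after)  # -> 0.004 0.003 (seconds)
```

The same function covers the "before and after a CPI change" calculation listed in the summary checklist.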
6. Factors Influencing Performance
Clock Speed – Higher frequency shortens each cycle → lower CPU time. Typical design / mitigation: smaller transistor geometries, dynamic frequency scaling (Turbo Boost), improved cooling.
CPI (Cycles per Instruction) – Lower CPI → fewer cycles per instruction → faster execution. Typical design / mitigation: pipelining, larger caches with higher hit rates, better branch prediction.
When a substantial portion of a program can be parallelised, adding cores can yield a larger performance gain than a modest CPI improvement.
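Amdahl's Law, speedup = 1 / ((1 − p) + p/n) for a parallelisable fraction p running on n cores, makes this concrete (figures illustrative):

```python
# Amdahl's Law: speedup = 1 / ((1 - p) + p / n)
# p = parallelisable fraction of the program, n = number of cores.

def amdahl_speedup(p, n):
    return 1.0 / ((1.0 - p) + p / n)

print(round(amdahl_speedup(0.90, 8), 2))     # 90% parallel, 8 cores -> 4.71
print(round(amdahl_speedup(0.90, 1000), 2))  # -> 9.91; the limit is 1/(1-p) = 10
```

Note the ceiling: however many cores are added, the serial fraction (1 − p) bounds the overall speedup.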
8. Summary Checklist
Define the Von Neumann model and the stored‑program principle.
Identify and describe all core CPU components: CU, ALU, the full set of special and general‑purpose registers, cache hierarchy, clock generator, and the three buses (address, data, control).
Draw and explain the fetch‑decode‑execute‑write‑back cycle using register‑transfer notation; describe how an interrupt modifies the flow.
Recall the CPU performance equation and explain why CPI varies with architecture and workload.
Discuss how clock speed, CPI, instruction count, cache design, pipelining, branch prediction, ISA choice, multi‑core and SIMD parallelism affect overall speed.
Perform calculations:
Compute CPU time before and after a CPI change.
Apply Amdahl’s Law to compare sequential and parallel execution.