Averages and measures of spread
To summarise a set of data you need a central value (mean, median or mode) and a spread (range or interquartile range). Picking the right pair makes comparisons fair and informative.
Three measures of central tendency
Mean
Mean = (sum of values) / (number of values)
Uses every value, so it is sensitive to outliers: a single extreme value pulls the mean towards it.
Worked example: 4, 7, 5, 8, 6.
- Sum = 30; mean = 30/5 = 6.
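The same calculation can be checked in a few lines of Python. This is only a minimal sketch; the function name `mean` is an illustrative choice, not something the notes define.

```python
def mean(values):
    """Return the arithmetic mean: sum of values divided by the number of values."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


data = [4, 7, 5, 8, 6]
print(sum(data))   # 30
print(mean(data))  # 6.0
```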
Median
The middle value when the data is sorted. For an even count, average the two middle values.
Worked example: 4, 7, 5, 8, 6, 12.
- Sorted: 4, 5, 6, 7, 8, 12. Middle two: 6 and 7. Median = 6.5.
Robust to outliers (a single extreme value barely shifts the median).
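A short Python sketch of the rule above, handling both odd and even counts. The second data set, where 12 is replaced by 120, is a hypothetical example added here to illustrate robustness; it is not part of the worked example.

```python
def median(values):
    """Return the middle value of the sorted data (mean of the two middle values if the count is even)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


original = [4, 7, 5, 8, 6, 12]
with_outlier = [4, 7, 5, 8, 6, 120]  # hypothetical: 12 replaced by an extreme value

print(median(original))      # 6.5
print(median(with_outlier))  # 6.5 (unchanged)

# The mean, by contrast, moves a long way.
print(sum(original) / len(original))          # 7.0
print(sum(with_outlier) / len(with_outlier))  # 25.0
```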
Mode
The most frequent value. Some data sets have several modes (bimodal, etc.) or none.
For categorical data with no natural order (for example, eye colour), the mode is the only meaningful "average".
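One way to find every mode, including the "several" and "none" cases, is to count occurrences. The sketch below assumes one common convention: if every value appears exactly once, the data set is treated as having no mode. The function name `modes` and the sample lists are illustrative only.

```python
from collections import Counter


def modes(values):
    """Return a list of the most frequent value(s); empty if no value repeats."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1 and len(counts) > 1:
        return []  # assumed convention: all values appear once, so there is no mode
    return [value for value, count in counts.items() if count == highest]


print(modes([5, 6, 8, 8, 9, 10, 10, 10]))                # [10]
print(modes(["red", "blue", "red", "green", "blue"]))    # ['red', 'blue'] (bimodal)
print(modes([1, 2, 3]))                                  # [] (no mode)
```

The second call shows that the approach works for categorical data, where a mean or median cannot be calculated.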
Mean for a frequency table
Mean = Σ(fx) / Σf
where x is the value (or class midpoint for grouped data) and f is its frequency.
Worked example:
| Score | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| Freq | 4 | 6 | 8 | 2 |
- Σf = 20.
- Σfx = 1×4 + 2×6 + 3×8 + 4×2 = 4 + 12 + 24 + 8 = 48.
- Mean = 48/20 = 2.4.
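The Σ(fx) / Σf formula maps directly onto code. In this sketch the table is stored as a dictionary from value to frequency; that representation and the name `mean_from_frequencies` are assumptions made for illustration.

```python
def mean_from_frequencies(table):
    """Return sum(f * x) / sum(f) for a {value: frequency} table."""
    total_f = sum(table.values())
    if total_f == 0:
        raise ValueError("total frequency must be greater than zero")
    total_fx = sum(x * f for x, f in table.items())
    return total_fx / total_f


scores = {1: 4, 2: 6, 3: 8, 4: 2}  # score -> frequency, from the table above
print(sum(scores.values()))                     # 20
print(sum(x * f for x, f in scores.items()))    # 48
print(mean_from_frequencies(scores))            # 2.4
```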
Mean for grouped data (estimate)
Use the midpoint of each class, (lower boundary + upper boundary) / 2, as x. The result is an estimate, not an exact value, because the individual values within each class are unknown.
| Class | 0–10 | 10–20 | 20–30 |
|---|---|---|---|
| Mid x | 5 | 15 | 25 |
| Freq | 6 | 8 | 6 |
- Σf = 20.
- Σfx = 5(6) + 15(8) + 25(6) = 30 + 120 + 150 = 300.
- Estimated mean = 300/20 = 15.
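The grouped-data estimate can be sketched the same way, with the midpoint calculated from the class boundaries. Storing each class as a `(lower, upper, frequency)` tuple is an assumption made for this example.

```python
def estimated_mean(classes):
    """Estimate the mean of grouped data using class midpoints as x."""
    total_f = sum(f for _, _, f in classes)
    if total_f == 0:
        raise ValueError("total frequency must be greater than zero")
    # Midpoint of each class = (lower boundary + upper boundary) / 2
    total_fx = sum(((lower + upper) / 2) * f for lower, upper, f in classes)
    return total_fx / total_f


grouped = [(0, 10, 6), (10, 20, 8), (20, 30, 6)]  # (lower, upper, frequency)
print(estimated_mean(grouped))  # 15.0 (an estimate, not an exact mean)
```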
Spread — range and IQR
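- Range = maximum − minimum. Easy to calculate, but sensitive to outliers because it depends only on the two most extreme values.
- Interquartile range (IQR) = Q3 − Q1. Measures the spread of the middle 50% of the data, so it is robust to outliers.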
For grouped data, use the cumulative frequency graph (S3) to estimate Q1 and Q3.
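For a small list of raw values, both measures can be computed directly. Quartile conventions differ between textbooks and software, so treat this as a sketch under one stated assumption: Q1 and Q3 are the medians of the lower and upper halves of the sorted data, with the middle value left out when the count is odd. The function names are illustrative.

```python
def _median_of_sorted(ordered):
    """Median of a list that is already sorted."""
    n = len(ordered)
    mid = n // 2
    return ordered[mid] if n % 2 == 1 else (ordered[mid - 1] + ordered[mid]) / 2


def spread(values):
    """Return range, Q1, Q3 and IQR using the median-of-halves convention (an assumption)."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two values to measure spread")
    lower_half = ordered[: n // 2]        # values below the median position
    upper_half = ordered[(n + 1) // 2 :]  # values above the median position
    q1 = _median_of_sorted(lower_half)
    q3 = _median_of_sorted(upper_half)
    return {"range": ordered[-1] - ordered[0], "q1": q1, "q3": q3, "iqr": q3 - q1}


print(spread([4, 7, 5, 8, 6, 12]))
# {'range': 8, 'q1': 5, 'q3': 8, 'iqr': 3}
```

Other conventions (for example, interpolating between values) can give slightly different quartiles for the same data, so state the method used.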
Choosing measures
| Situation | Best central measure | Best spread |
|---|---|---|
| Symmetric, no outliers | Mean | Range or IQR |
| Skewed or outliers | Median | IQR |
| Categorical | Mode | n/a |
Comparing two distributions
The standard exam question is: "Compare the two distributions."
Always include:
- A statement comparing central tendency (median or mean, in context).
- A statement comparing spread (range or IQR, in context).
Example: "Class A had a higher median score (65 vs 60), so on average they did better. However, Class B had a smaller IQR (15 vs 20), so their scores were more consistent."
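The two-statement structure can be expressed as a small helper that turns summary statistics into sentences. This is purely illustrative: the function name `compare` and its wording are assumptions, and in an exam the context (what was measured, and in what units) still has to be written by hand.

```python
def compare(name_a, median_a, iqr_a, name_b, median_b, iqr_b):
    """Return two sentences: one comparing central tendency, one comparing spread."""
    if median_a == median_b:
        centre = f"{name_a} and {name_b} had the same median ({median_a})."
    else:
        # Sort so the group with the higher median comes first.
        (top_name, top), (_, bottom) = sorted(
            [(name_a, median_a), (name_b, median_b)],
            key=lambda pair: pair[1],
            reverse=True,
        )
        centre = f"{top_name} had the higher median ({top} vs {bottom}), so it did better on average."

    if iqr_a == iqr_b:
        consistency = f"{name_a} and {name_b} had the same IQR ({iqr_a})."
    else:
        # Sort so the group with the smaller IQR comes first.
        (tight_name, tight), (_, wide) = sorted(
            [(name_a, iqr_a), (name_b, iqr_b)],
            key=lambda pair: pair[1],
        )
        consistency = f"{tight_name} had the smaller IQR ({tight} vs {wide}), so its scores were more consistent."

    return [centre, consistency]


result = compare("Class A", 65, 20, "Class B", 60, 15)
for sentence in result:
    print(sentence)
```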
Common mistakes (examiner traps)
- Confusing mean with median for skewed data — always state which you're using.
- Forgetting to put the answer in context when comparing.
- Computing the range with the wrong endpoints (using a class midpoint instead of a class boundary).
- Using grouped-data midpoints to claim an exact answer — it's always an estimate.
- Skipping the units in a comparison statement.
Try this: quick check
10 students' marks: 5, 6, 8, 8, 9, 10, 10, 10, 12, 14.
- Mean = 92/10 = 9.2.
- Median = average of 5th and 6th = (9 + 10)/2 = 9.5.
- Mode = 10 (appears 3 times).
- Range = 14 − 5 = 9.
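These answers can be confirmed with Python's standard `statistics` module, which provides `mean`, `median` and `mode`. The variable name `marks` is an illustrative choice.

```python
import statistics

marks = [5, 6, 8, 8, 9, 10, 10, 10, 12, 14]

print(statistics.mean(marks))    # 9.2
print(statistics.median(marks))  # 9.5
print(statistics.mode(marks))    # 10
print(max(marks) - min(marks))   # 9
```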