Histograms and cumulative frequency graphs
When you have a lot of numerical data, group it into classes (intervals). Then choose between a histogram, which shows the shape of the distribution, and a cumulative frequency graph, which shows a running total and is used to estimate medians and quartiles.
Histograms with equal class widths
For equal-width classes, the height of each bar is just the frequency. There are no gaps between bars (because the data is continuous).
| Class | Frequency |
|---|---|
| 0–10 | 5 |
| 10–20 | 12 |
| 20–30 | 18 |
| 30–40 | 9 |
| 40–50 | 6 |
Plot bars 0–10, 10–20, etc., with heights 5, 12, 18, 9, 6.
Histograms with unequal class widths [Higher tier]
When class widths vary, the bar height represents frequency density, not raw frequency.
Frequency density = frequency / class width
This way, the area of each bar equals the frequency, and the chart honestly shows the distribution.
Worked example:
| Class | Frequency | Width | Density |
|---|---|---|---|
| 0–10 | 8 | 10 | 0.8 |
| 10–20 | 18 | 10 | 1.8 |
| 20–40 | 30 | 20 | 1.5 |
| 40–60 | 16 | 20 | 0.8 |
| 60–100 | 8 | 40 | 0.2 |
The bar for 20–40 is wider than for 10–20, but its density (1.5 < 1.8) is lower — making clear that 10–20 has the most concentrated data.
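The density column can be checked with a few lines of Python. This is a minimal sketch using the figures from the table; the function name `frequency_density` is only an illustrative choice.

```python
# (lower boundary, upper boundary, frequency) for each class in the worked example.
classes = [(0, 10, 8), (10, 20, 18), (20, 40, 30), (40, 60, 16), (60, 100, 8)]

def frequency_density(lower, upper, frequency):
    """Frequency density = frequency / class width."""
    return frequency / (upper - lower)

densities = [frequency_density(lo, hi, f) for lo, hi, f in classes]

for (lo, hi, f), d in zip(classes, densities):
    # Bar area (density x width) should reproduce the original frequency.
    print(f"{lo}-{hi}: density {d:.1f}, area {d * (hi - lo):.0f}")
```

Each printed area matches the frequency for that class, which is exactly the property that makes the histogram fair when widths differ.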
Reading a histogram
To find a frequency for any class:
Frequency = density × width
To estimate frequencies for a sub-interval inside one class, assume uniform distribution: split the class proportionally.
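As a rough illustration of the proportional split, the sketch below estimates how many values fall between 25 and 40 in the 20–40 class (frequency 30) from the worked example. The helper name `estimate_in_subinterval` is hypothetical, and the result is only as good as the uniform-distribution assumption.

```python
def estimate_in_subinterval(lower, upper, frequency, a, b):
    """Estimate the count of values in [a, b] inside the class [lower, upper],
    assuming the data is spread evenly across the class."""
    if not (lower <= a <= b <= upper):
        raise ValueError("sub-interval must lie inside the class")
    # The share of the class width covered by [a, b] is the share of the frequency.
    return frequency * (b - a) / (upper - lower)

# Class 20-40 has frequency 30; the sub-interval 25-40 covers 15/20 of the width.
estimate = estimate_in_subinterval(20, 40, 30, 25, 40)
print(estimate)  # 22.5
```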
Cumulative frequency
A cumulative frequency table adds up frequencies from the lowest class onwards.
| Class | Freq | Cum Freq |
|---|---|---|
| 0–10 | 5 | 5 |
| 10–20 | 12 | 17 |
| 20–30 | 18 | 35 |
| 30–40 | 9 | 44 |
| 40–50 | 6 | 50 |
The total cum freq at the end equals the total number of data points.
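The running total in the table can be reproduced with `itertools.accumulate` from Python's standard library. The variable names here are illustrative only.

```python
from itertools import accumulate

upper_boundaries = [10, 20, 30, 40, 50]
frequencies = [5, 12, 18, 9, 6]

# Running total of the frequencies, lowest class first.
cumulative = list(accumulate(frequencies))
print(cumulative)  # [5, 17, 35, 44, 50]

# Pair each running total with its upper class boundary.
points = list(zip(upper_boundaries, cumulative))
print(points)
```

The last running total (50) matches the sum of the frequency column, which is a quick way to check the table for arithmetic slips.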
Cumulative frequency graphs (ogive)
Plot (upper class boundary, cumulative frequency) points and join smoothly. The classic shape is an "S" (sigmoid).
Use the graph to estimate:
- Median (50th percentile): read across at half the total cum freq, drop down to the x-axis.
- Lower quartile (Q1): at 25% of total.
- Upper quartile (Q3): at 75% of total.
- IQR = Q3 − Q1.
- Number above a value: total − cum freq at that value.
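Worked example: total = 50, so the median sits at cum freq = 25. For the table above, that lies between the plotted points (20, 17) and (30, 35), so the graph gives roughly x = 24 (the exact reading depends on how the curve is drawn).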
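Reading values off the graph can be approximated numerically. The sketch below joins the plotted points from the cumulative frequency table with straight lines and interpolates between them, which is an assumption: a smooth hand-drawn curve may give slightly different readings. The starting point (0, 0) and the helper name `value_at_cum_freq` are also assumptions made for illustration.

```python
def value_at_cum_freq(points, target):
    """Return the x-value where the cumulative frequency reaches `target`,
    using straight-line interpolation between neighbouring points."""
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c0 <= target <= c1 and c1 > c0:
            return x0 + (target - c0) * (x1 - x0) / (c1 - c0)
    raise ValueError("target is outside the range of the graph")

# (upper class boundary, cumulative frequency), starting from an assumed (0, 0).
points = [(0, 0), (10, 5), (20, 17), (30, 35), (40, 44), (50, 50)]
total = points[-1][1]

q1 = value_at_cum_freq(points, 0.25 * total)
median = value_at_cum_freq(points, 0.50 * total)
q3 = value_at_cum_freq(points, 0.75 * total)
iqr = q3 - q1

print(f"Q1 = {q1:.1f}, median = {median:.1f}, Q3 = {q3:.1f}, IQR = {iqr:.1f}")
```

With these points the interpolated median comes out at about 24.4, Q1 at about 16.3 and Q3 at about 32.8.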
Box plots
A box plot summarises a distribution in five values; the median and quartiles can be read from the cumulative frequency curve:
- Min, Q1, Median, Q3, Max plotted on a number line.
- Box from Q1 to Q3, line at the median, "whiskers" out to min and max.
Box plots make it easy to compare two distributions side-by-side.
Common mistakes (examiner traps)
- Using frequency as bar height in a histogram with unequal widths.
- Plotting cumulative frequency at the lower boundary instead of the upper.
- Drawing straight-line steps between points instead of a smooth curve.
- Reading the median off the histogram — use the cum-freq graph instead.
- Misreading interpolation within a class — always assume uniform distribution.
Quick check
In a cumulative frequency graph for 60 students' test scores, the y-coordinate at x = 40 reads 18 and at x = 60 reads 42.
(a) Estimate the number of students scoring between 40 and 60. (b) Estimate the lower quartile (Q1).
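(a) 42 − 18 = 24 students. (b) Q1 is at cum freq 60 ÷ 4 = 15. The curve has already reached 18 at x = 40, so Q1 lies a little below 40; the exact reading depends on the curve, typically around 35.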
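The arithmetic in the quick check can be laid out as follows. This sketch uses only the two readings given in the question, so it can locate Q1 relative to x = 40 but cannot pin down its value.

```python
total = 60
# Cumulative frequency readings given in the question: {score: cum freq}.
cum_freq = {40: 18, 60: 42}

# (a) Students scoring between 40 and 60: difference of the two readings.
between_40_and_60 = cum_freq[60] - cum_freq[40]

# Number above a value: total minus the cumulative frequency at that value.
above_60 = total - cum_freq[60]

# (b) Q1 sits at a quarter of the total on the cumulative frequency axis.
q1_position = total / 4
q1_is_below_40 = q1_position < cum_freq[40]

print(between_40_and_60, above_60, q1_position, q1_is_below_40)  # 24 18 15.0 True
```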