Correlation and lines of best fit
What correlation describes
Correlation describes how strongly, and in which direction, two variables tend to change together, as shown by the pattern of points on a scatter graph. It does not prove causation.
| Correlation | Pattern | Description |
|---|---|---|
| Strong positive | Tight upward trend | As x increases, y increases reliably |
| Weak positive | Loose upward trend | y tends to increase as x increases, but with a lot of scatter |
| Strong negative | Tight downward trend | As x increases, y decreases reliably |
| Weak negative | Loose downward trend | y tends to decrease as x increases, but with a lot of scatter |
| No correlation | Random scatter | No relationship visible |
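At this level the strength of a correlation is judged by eye from the scatter graph. As an optional aside, the table above can be sketched numerically with the product-moment correlation coefficient r, which runs from -1 to 1. The cut-off values `strong=0.7` and `weak=0.3`, the function names and the sample data are all illustrative assumptions, not CCEA definitions.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment correlation coefficient of paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def describe(r, strong=0.7, weak=0.3):
    """Turn r into one of the labels in the table.

    The cut-offs are illustrative assumptions only.
    """
    if abs(r) < weak:
        return "no correlation"
    strength = "strong" if abs(r) >= strong else "weak"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction}"


# Made-up data: hours of revision against test mark.
hours = [1, 2, 3, 4, 5]
marks = [52, 55, 61, 64, 70]
print(describe(pearson_r(hours, marks)))
```

Reversing the marks so that they fall as the hours rise gives the same strength with the opposite sign, which matches the "strong negative" row.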
Drawing a line of best fit
A line of best fit:
- Passes through the mean point (mean of x, mean of y) if it is known; otherwise through the middle of the data.
- Has roughly equal numbers of points above and below.
- Follows the general trend.
Use a ruler. Draw a single straight line. Do not join the dots.
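In an exam the line is drawn by eye, but the checklist above can be tested against a calculated line. The sketch below uses the least-squares method, which is a substitute for drawing by hand and not part of the drawing technique itself. It shows that such a line passes through the mean point and leaves a similar number of points on each side. The function names and the data are made up for illustration.

```python
def mean_point(xs, ys):
    """Return (mean of x, mean of y)."""
    return sum(xs) / len(xs), sum(ys) / len(ys)


def least_squares_line(xs, ys):
    """Gradient and intercept of the least-squares line y = gradient*x + intercept."""
    mx, my = mean_point(xs, ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    gradient = sxy / sxx
    intercept = my - gradient * mx  # forces the line through the mean point
    return gradient, intercept


# Made-up data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [52, 55, 61, 64, 70]

gradient, intercept = least_squares_line(xs, ys)
mx, my = mean_point(xs, ys)

# Count the points lying above and below the line.
above = sum(1 for x, y in zip(xs, ys) if y > gradient * x + intercept)
below = sum(1 for x, y in zip(xs, ys) if y < gradient * x + intercept)

print("mean point:", (mx, my))
print("line value at the mean of x:", round(gradient * mx + intercept, 2))
print("points above:", above, "points below:", below)
```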
Using a line of best fit
To estimate y given x: read up from the x value on the horizontal axis to the line, then across to the y-axis. To estimate x given y: read across from the y value to the line, then down to the x-axis.
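If the equation of the line is known, the two readings above correspond to substituting into y = mx + c and rearranging it. The sketch below assumes a hypothetical line with gradient 2 and intercept 10; the function names are illustrative.

```python
def estimate_y(x, gradient, intercept):
    """Read up from x to the line, then across: y = gradient*x + intercept."""
    return gradient * x + intercept


def estimate_x(y, gradient, intercept):
    """Read across from y to the line, then down: x = (y - intercept) / gradient."""
    if gradient == 0:
        raise ValueError("a horizontal line cannot be read back to a single x value")
    return (y - intercept) / gradient


# Hypothetical line of best fit: y = 2x + 10
print(estimate_y(6, gradient=2, intercept=10))   # 22
print(estimate_x(22, gradient=2, intercept=10))  # 6.0
```

On the exam graph itself the readings are taken with a ruler, so an answer within the accuracy of the grid is what is expected.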
Interpolation: estimating within the data range. This is generally reasonable.
Extrapolation: estimating beyond the data range. This is risky and usually unreliable, because the trend may not continue outside the values that were observed. CCEA frequently asks "explain why this estimate may not be accurate"; the answer is "extrapolation outside the data range".
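The distinction comes down to a range check on the value being read from. A minimal sketch, with a hypothetical function name and made-up data:

```python
def classify_estimate(value, data_values):
    """Label an estimate by whether the value used lies inside the data range."""
    lowest, highest = min(data_values), max(data_values)
    if lowest <= value <= highest:
        return "interpolation"
    return "extrapolation"


# Made-up x values from a scatter graph.
plotted_x = [1, 2, 3, 4, 5]
print(classify_estimate(3.5, plotted_x))  # interpolation
print(classify_estimate(9, plotted_x))    # extrapolation
```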
Outliers
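An outlier is a point that lies clearly far from the trend. Investigate it before discarding it: it may be a data error or a genuine unusual case. When drawing the line of best fit, follow the trend of the remaining points and do not let an outlier pull the line towards it.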
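To see why an outlier is left out of the line, compare a calculated line with and without it. The sketch below uses a least-squares gradient as a stand-in for a line drawn by eye. The data are made up: five points on a perfect upward trend plus one outlier at (6, 2).

```python
def gradient_of_best_fit(xs, ys):
    """Least-squares gradient, used here as a stand-in for a line drawn by eye."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Made-up data: the last point (6, 2) sits far below the trend.
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 4, 6, 8, 10, 2]

with_outlier = gradient_of_best_fit(xs, ys)
without_outlier = gradient_of_best_fit(xs[:-1], ys[:-1])

print("gradient with the outlier:", round(with_outlier, 2))
print("gradient without the outlier:", round(without_outlier, 2))
```

A single point is enough to flatten the line noticeably, so estimates read from a line that includes the outlier would be misleading for the rest of the data.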
Causation vs correlation
A strong correlation does not prove that one variable causes the other. There may be a third "lurking" variable that affects both. CCEA mark schemes credit a description of the correlation only, so never claim "x causes y" without justification.
Common CCEA exam tip
When asked to "comment on the reliability of the estimate", check whether the value you are estimating from (usually x) lies inside the plotted data range. If it does, answer "reliable, because it is interpolation". If it does not, answer "unreliable, because it is extrapolation".
AI-generated · claude-opus-4-7 · v3-ccea-maths-leaves