TopMyGrade

Notes

Describing a population using statistics

Once you've collected and summarised data, you need to communicate the population's characteristics in plain English supported by figures.

A description checklist

A good descriptive statement covers:

  1. Centre — where the typical value sits (mean or median).
  2. Spread — how variable the data is (range or IQR).
  3. Shape — symmetric, skewed, bimodal, peaked.
  4. Outliers / unusual values — comment if present.
  5. Context — what the data measures and the units.

Example: "The median commute time was 28 minutes (IQR 12 minutes), with most journeys lying between 20 and 40 minutes. A small number of commutes longer than 60 minutes suggests a few people travel from far outside the local area."
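
As a rough illustration, the checklist can be turned into a short Python sketch that builds a sentence like the one above from raw data. The commute times below are made up for the example and the function name `describe` is hypothetical. The quartiles use the standard library's default "exclusive" method, which may differ slightly from the convention in your textbook.

```python
from statistics import median, quantiles

# Hypothetical sample of 11 commute times in minutes (invented for illustration).
commutes = [15, 18, 22, 24, 26, 28, 30, 33, 34, 40, 65]


def describe(values, what, unit):
    """Return a one-sentence description covering centre, spread and context."""
    # quantiles(..., n=4) returns the three cut points Q1, Q2 (median) and Q3.
    q1, _, q3 = quantiles(values, n=4)
    centre = median(values)
    iqr = q3 - q1
    return (
        f"The median {what} was {centre:g} {unit} (IQR {iqr:g} {unit}), "
        f"with the middle half of values between {q1:g} and {q3:g} {unit}."
    )


print(describe(commutes, "commute time", "minutes"))
```

The sketch only handles centre, spread and context. Shape and unusual values (such as the 65-minute commute in this made-up sample) still need a comment in words.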

Skewness — how to spot it

  • Symmetric: mean ≈ median, both quartiles roughly equidistant from the median.
  • Right-skewed (positive): long tail to the right; mean > median.
  • Left-skewed (negative): long tail to the left; mean < median.

For income data, right-skew is typical: a few very-high earners drag the mean above the median.
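
The mean-versus-median comparison above can be sketched as a small Python function. This is a rule of thumb only: the function name `skew_from_averages` and the 5% tolerance for "roughly equal" are assumptions chosen for illustration, not an official threshold, and the income figures are invented.

```python
def skew_from_averages(mean, median, tolerance=0.05):
    """Suggest a shape by comparing the mean with the median.

    tolerance is a hypothetical relative threshold: gaps smaller than
    tolerance * |median| are treated as "mean is about equal to median".
    """
    gap = mean - median
    if abs(gap) <= tolerance * abs(median):
        return "roughly symmetric"
    # A mean above the median suggests a long tail to the right, and vice versa.
    return "right-skewed" if gap > 0 else "left-skewed"


# Invented income example: a few high earners pull the mean above the median.
print(skew_from_averages(mean=32000, median=26000))
```

A diagram is still the safer check, because the mean and median can be close even when the shape is not symmetric.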

Estimating from samples — implied claims

Saying "the median commute is 28 minutes" is shorthand for "the sample median is 28; we estimate the population median is around 28."

For a defensible claim:

  • The sample must be representative (S1).
  • The sample must be large enough that random fluctuation is small.
  • Outliers should be checked, not silently removed.
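
One way to check for outliers without removing them is the 1.5 × IQR fence rule stated in practice question 6. The Python sketch below flags values for investigation and leaves the data untouched. The function names and the commute data are hypothetical, and the quartiles are supplied by hand rather than computed.

```python
def fences(q1, q3, k=1.5):
    """Return the lower and upper fences Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_unusual(values, q1, q3):
    """List the values outside the fences; the original data is not changed."""
    lower, upper = fences(q1, q3)
    return [v for v in values if v < lower or v > upper]


# Invented commute times in minutes, with Q1 = 22 and Q3 = 34 assumed known.
commutes = [15, 18, 22, 24, 26, 28, 30, 33, 34, 40, 65]
flagged = flag_unusual(commutes, q1=22, q3=34)
print(flagged)  # values to investigate and comment on, not to delete
```

With these numbers the fences are 4 and 52, so only the 65-minute commute is flagged. Whether it is a recording error or a genuine long journey is a question of context.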

Inferring from comparison

Often you'll compare two populations using their summary statistics. Mark schemes typically expect:

  • A central comparison (medians).
  • A spread comparison (IQRs).
  • Context (what does this mean for the situation?).
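
The first two comparison points can be sketched in Python as below. The function name `compare_groups`, the wording of the sentences and the two bus routes are all invented for illustration. The third point, interpreting the result in the situation, still has to be written by hand.

```python
def compare_groups(name_a, median_a, iqr_a, name_b, median_b, iqr_b):
    """Return a centre comparison and a spread comparison as two sentences."""
    if median_a == median_b:
        centre = f"{name_a} and {name_b} have the same median."
    else:
        higher = name_a if median_a > median_b else name_b
        centre = f"{higher} has the higher median, so its typical value is larger."

    if iqr_a == iqr_b:
        spread = f"{name_a} and {name_b} have the same IQR."
    else:
        wider = name_a if iqr_a > iqr_b else name_b
        spread = f"{wider} has the larger IQR, so its values are less consistent."

    return centre, spread


# Invented journey times in minutes for two hypothetical bus routes.
for sentence in compare_groups("Route X", 28, 12, "Route Y", 35, 6):
    print(sentence)
```

For a full answer, add the context: here, Route Y journeys are typically longer but more predictable.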

Writing in context

  • Bad: "The median is 28."
  • Better: "The median commute time was 28 minutes, indicating that half of the people in the sample took less than 28 minutes."

Plotting the description

Often a single diagram supports the description:

  • Box plot: shows median, IQR, range — perfect for skew identification.
  • Histogram / cumulative frequency curve: shows the shape of the distribution.

Common mistakes (examiner traps)

  1. Numbers without context — "the mean is 7.4" means nothing without units.
  2. Citing only the mean. Always include a measure of spread and a comment on shape if relevant.
  3. Comparing without context — "A is higher" — higher what?
  4. Treating sample statistics as exact population values.
  5. Missing skewness. Unequal whiskers in a box plot, or a median sitting off-centre in the box, are an immediate clue.

Try this: quick check

A box plot of 100 students' weights (kg): min 38, Q1 52, median 60, Q3 75, max 88. Describe the population.

  • Median 60 kg, IQR 23 kg.
  • Q3 − median = 15; median − Q1 = 8 → right-skewed.
  • "The middle half of the students weighed between 52 and 75 kg, with a median of 60 kg. The distribution is skewed toward heavier weights, with the heaviest student weighing 88 kg."
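
The arithmetic in this quick check can be reproduced with a short Python sketch. The function name `box_plot_facts` is hypothetical, and comparing the two halves of the box is one simple rule of thumb for skew rather than a formal test.

```python
def box_plot_facts(minimum, q1, med, q3, maximum):
    """Derive range, IQR and a suggested shape from a five-number summary."""
    lower_half = med - q1   # distance from Q1 up to the median
    upper_half = q3 - med   # distance from the median up to Q3
    if upper_half > lower_half:
        shape = "right-skewed"
    elif upper_half < lower_half:
        shape = "left-skewed"
    else:
        shape = "roughly symmetric"
    return {
        "range": maximum - minimum,
        "iqr": q3 - q1,
        "lower_half": lower_half,
        "upper_half": upper_half,
        "shape": shape,
    }


# Five-number summary from the quick check (weights in kg).
facts = box_plot_facts(38, 52, 60, 75, 88)
print(facts)
```

For the weights data this gives an IQR of 23 kg, box halves of 8 and 15, and a right-skewed shape, matching the working above.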

AI-generated · claude-opus-4-7 · v3-deep-statistics

Practice questions

Try each before peeking at the worked solution.

  1. Question 1 (2 marks)

    Describe a sample

    (F1) A sample of 50 daily temperatures has mean 14.2 °C and range 18 °C. Write a sentence describing the data.

    [Foundation tier]

  2. Question 2 (2 marks)

    Skewness from box plot

    (F/H2) A box plot has min 5, Q1 10, median 12, Q3 25, max 50. Comment on the skewness.

    [Crossover tier]

  3. Question 3 (2 marks)

    Compare populations

    (F/H3) School A: median test score 60, IQR 14. School B: median 55, IQR 8. Make two comparisons.

    [Crossover tier]

  4. Question 4 (3 marks)

    Sample-to-population caveat

    (H4) A study of 30 commuters in one street found a median commute of 25 minutes. Comment on whether this is a reliable estimate of all commuters in the city.

    [Higher tier]

  5. Question 5 (2 marks)

    Match summary to histogram

    (H5) A histogram has bars rising sharply from 0 to a peak around 20, then a long thin tail to about 100. Mean = 32, median = 22. State whether this is symmetric, right-skewed or left-skewed, and identify which average best summarises typical values.

    [Higher tier]

  6. Question 6 (3 marks)

    Outlier rule

    (H6) A "fence" rule defines an outlier as a value below Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR. With Q1 = 10 and Q3 = 22, identify any outliers in the values: 4, 12, 16, 19, 21, 24, 30, 50.

    [Higher tier]

  7. Question 7 (6 marks)

    Pull together a description

    (H7) A sample of 200 households reports the number of TVs:

    TVs     0    1    2    3    4
    Freq    8   70   80   30   12

    Write a brief description (mean, median, mode and brief comment).

    [Higher tier]

Flashcards

S5 — Apply statistics to describe a population

10-card SR deck for AQA GCSE Maths topic S5

10 cards · spaced repetition (SM-2)