TopMyGrade

CS3.8 Data compression: lossy vs lossless; benefits, drawbacks and choosing the right compression for a use case

Notes

Data compression

Files take up storage and bandwidth, and both are finite. Compression shrinks file size by removing redundancy or, for lossy methods, by discarding detail that is unlikely to be missed. The two main approaches at GCSE are lossless and lossy compression.

Why compress?

  • Storage — fit more in the same space.
  • Bandwidth — faster downloads / streaming.
  • Cost — less hosting and traffic.

Lossless compression

The compressed file decompresses to an exact copy of the original, with no bits lost. Used when every bit matters: text, source code, executables, lossless image formats (e.g. PNG), zip archives. (BMP files also keep every pixel, but they are usually stored uncompressed.)

How it works (in spirit):

  • Run-length encoding (RLE). Replace runs of repeated values with a count and the value: AAAAAA → 6A. Excellent for images with large flat areas.
  • Dictionary encoding (e.g. LZ77, LZW). Build a dictionary of frequently occurring patterns and replace each with a short code. Used in ZIP, GZIP and PNG, which pair LZ77 with Huffman coding.
  • Huffman coding. Common characters get short codes; rare characters get long codes.

Typical ratio: 30-70% size reduction for text; less for already-random data.
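
To see the "exactly identical" property in practice, here is a minimal sketch using Python's built-in zlib module (the same family of compression used in ZIP and GZIP). The sample text is made up for illustration and is far more repetitive than real writing, so it shrinks much more than the typical figure above.

```python
import zlib

# Hypothetical sample: deliberately repetitive text, so there is plenty of redundancy.
text = ("the quick brown fox jumps over the lazy dog. " * 200).encode("utf-8")

compressed = zlib.compress(text)
restored = zlib.decompress(compressed)

# Lossless: the round trip gives back exactly the same bytes.
print(restored == text)
print(len(text), "bytes ->", len(compressed), "bytes")
```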

Lossy compression

Discards data the human eye/ear can't easily detect. Smaller files at the cost of permanent quality loss: once a file has been compressed this way, the discarded data cannot be recovered.

  • JPEG for photos: discards fine colour detail, blocks of similar pixels are merged.
  • MP3, AAC, Opus for audio: removes inaudible frequencies (e.g. above 16 kHz) and masked sounds.
  • H.264 and H.265 for video (usually stored in an MP4 container): combines lossy image + audio compression with motion prediction.

Typical ratio: often around 90% size reduction for photos and audio, depending on the quality setting chosen.

Which to use?

| Scenario | Best choice | Why |
| --- | --- | --- |
| Source code, legal documents | Lossless | Every bit must be exact |
| Photo album for the web | Lossy | Small files, eye-friendly |
| CT scan medical images | Lossless | Detail must survive |
| Music streaming | Lossy | Bandwidth matters more than perfection |
| Logo / icon | Lossless | Sharp edges and colours preserved |
| Camera raw → social post | Lossy | Acceptable quality loss for size |

Worked example: RLE

Compress AAAABBBCCDAA using RLE. Result: 4A3B2C1D2A. That is 12 characters → 10 characters, only a small saving because the runs are short.

Now: AAAAAAAAAAAAAAAA (16 A's) → 16A — 3 characters from 16. Excellent.

RLE works well only when there are long runs. Text with few repeats actually grows under RLE, because every single character gains a count in front of it (ABCD → 1A1B1C1D).
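
The RLE examples in this section can be checked with a short sketch. The function names rle_encode and rle_decode are illustrative, not from any library, and the decoder assumes the original text contains no digits (otherwise counts and data could not be told apart).

```python
import re
from itertools import groupby


def rle_encode(text: str) -> str:
    """Replace each run of a repeated character with its count followed by the character."""
    return "".join(f"{len(list(run))}{char}" for char, run in groupby(text))


def rle_decode(encoded: str) -> str:
    """Reverse rle_encode. Assumes the original text contained no digits."""
    pairs = re.findall(r"(\d+)(\D)", encoded)
    return "".join(char * int(count) for count, char in pairs)


print(rle_encode("AAAABBBCCDAA"))  # 4A3B2C1D2A
print(rle_encode("A" * 16))        # 16A
print(rle_encode("ABCD"))          # 1A1B1C1D (twice as long as the input)
print(rle_decode("4A3B2C1D2A"))    # AAAABBBCCDAA
```

Decoding gives back the exact input, which is what makes RLE a lossless method.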

Worked example: Huffman idea

Imagine a text where 'E' appears often, 'Z' rarely. Huffman might assign:

  • 'E' → 10 (2 bits)
  • 'Z' → 1110011 (7 bits)

Average bits per character drops, even though some get longer.
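
As a rough sketch of how such codes could be produced, the function below builds a Huffman code table by repeatedly merging the two least frequent groups of symbols. The name huffman_codes and the sample string are made up for illustration; the exact bit patterns depend on the text, so they will not match the 'E' and 'Z' codes shown above.

```python
import heapq
from collections import Counter


def huffman_codes(text: str) -> dict[str, str]:
    """Build a prefix-free code table in which frequent symbols get shorter codes."""
    counts = Counter(text)
    if not counts:
        return {}
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (frequency, unique tie-breaker, {symbol: code built so far}).
    heap = [(freq, i, {sym: ""}) for i, (sym, freq) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)   # least frequent group
        f2, _, right = heapq.heappop(heap)  # next least frequent group
        merged = {sym: "0" + code for sym, code in left.items()}
        merged.update({sym: "1" + code for sym, code in right.items()})
        heapq.heappush(heap, (f1 + f2, next_id, merged))
        next_id += 1
    return heap[0][2]


sample = "EEEEEEEETTTTAAZ"  # hypothetical text: E is common, Z is rare
codes = huffman_codes(sample)
total_bits = sum(len(codes[ch]) for ch in sample)
print(codes)
print(total_bits, "bits with Huffman vs", 8 * len(sample), "bits at 8 bits per character")
```

In this sample 'E' ends up with a 1-bit code and 'Z' with a 3-bit code, so the whole string needs 25 bits instead of 120.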

Common mistakes

  1. Confusing lossy and lossless. Lossless = identical reconstruction; lossy = irreversible discard.
  2. Believing "compressed = smaller always". Already-compressed data (a JPEG, an MP3) doesn't compress further with general-purpose tools.
  3. Re-compressing lossy. Each save loses more quality (generation loss).
  4. Using lossy for the wrong content. A scanned legal document needs every pixel.
  5. Confusing compression with encryption. Compression makes files smaller; encryption makes them unreadable without a key.
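
Pitfall 2 can be illustrated with a small sketch. Well-compressed data looks statistically random, so random bytes are used here as a stand-in for an already-compressed file (an assumption made to keep the example self-contained). A general-purpose lossless compressor cannot shrink it and its own header makes the result slightly larger.

```python
import random
import zlib

# Stand-in for already-compressed data: 10,000 pseudo-random bytes (fixed seed for repeatability).
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(10_000))

packed = zlib.compress(noise)

print(len(noise), "bytes ->", len(packed), "bytes")  # the "compressed" version is not smaller
```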

Visual: lossy is one-way

ORIGINAL ─→ compress ─→ SMALLER LOSSY FILE ─→ decompress ─→ APPROXIMATION

You can't recover the original from the compressed version.
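
A toy illustration of why the step is one-way: the sketch below "compresses" a list of sample values by rounding each to the nearest multiple of a step size. This is a simplified stand-in for what real lossy formats do, not how JPEG or MP3 actually work, and the function names and sample values are made up.

```python
def lossy_compress(samples: list[int], step: int) -> list[int]:
    """Store only which multiple of `step` each sample is nearest to; the remainder is discarded."""
    return [round(s / step) for s in samples]


def lossy_decompress(levels: list[int], step: int) -> list[int]:
    """Rebuild approximate samples from the stored levels."""
    return [level * step for level in levels]


original = [14, 13, 15, 200, 203, 198, 90, 91]
stored = lossy_compress(original, step=8)
restored = lossy_decompress(stored, step=8)

print(stored)    # [2, 2, 2, 25, 25, 25, 11, 11]
print(restored)  # [16, 16, 16, 200, 200, 200, 88, 88]
```

The restored values are close to the originals but not equal to them, and nothing in the stored list says whether 16 used to be 13, 14 or 15. The stored values are also smaller numbers with longer runs, which is why a lossless step afterwards can shrink them further.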

Worked example: choose wisely

A web designer needs to put a 12 MP photo on a webpage. Should they use PNG (lossless) or JPEG (lossy)?

For a photo on a webpage:

  • File size matters (faster page load, less data for visitors).
  • Slight quality loss is invisible at typical viewing distances.
  • Choose JPEG for an order-of-magnitude smaller file.

For a logo with sharp edges and few colours:

  • Lossy compression creates ugly artefacts around edges.
  • Lossless compression already gives a small file, because large areas of flat colour compress very well.
  • Choose PNG for crisp output.
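
For a sense of scale in the photo case, here is a back-of-envelope estimate. It assumes 24-bit colour (3 bytes per pixel) and takes "order of magnitude" to mean roughly 10:1 against the uncompressed size. Both figures are assumptions for illustration, and real JPEG sizes vary with the image and the quality setting.

```python
megapixels = 12
bytes_per_pixel = 3       # assumption: 24-bit colour, one byte each for red, green and blue
assumed_jpeg_ratio = 10   # assumption: roughly an order-of-magnitude reduction

raw_bytes = megapixels * 1_000_000 * bytes_per_pixel
jpeg_bytes = raw_bytes // assumed_jpeg_ratio

print(raw_bytes / 1_000_000, "MB uncompressed")
print(jpeg_bytes / 1_000_000, "MB as JPEG (rough estimate)")
```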

Quick check

State whether each is lossless or lossy:

  • ZIP archive — lossless.
  • JPEG — lossy.
  • MP3 — lossy.
  • PNG — lossless.
  • MP4 video — lossy.
  • FLAC audio — lossless.

AI-generated · claude-opus-4-7 · v3-deep-computer-science

Practice questions

Try each before peeking at the worked solution.

  1. Question 1 (4 marks)

    Lossless vs lossy

    Define lossless and lossy compression and state how they differ in their effect on the original data.

  2. Question 2 (3 marks)

    Run-length encoding

    Use run-length encoding to compress the string AAAAAABBBCCCCCCCC. State the compressed form and the original/compressed lengths.

  3. Question 3 (3 marks)

    Choose compression — photo

    A photographer wants to email a 4000 × 3000 photo. Recommend a suitable compression type and justify.

  4. Question 4 (3 marks)

    Choose compression — code

    A programmer wants to back up source code. Recommend a compression type and justify.

  5. Question 5 (3 marks)

    When RLE fails

    Explain why RLE may increase the size of some files.

  6. Question 6 (3 marks)

    Re-compress and quality

    A pupil opens a JPEG photo, edits it, and saves it as a JPEG repeatedly across many sessions. Explain why the image quality steadily decreases.

  7. Question 7 (2 marks)

    Compression vs encryption

    State two differences between data compression and data encryption.

Flashcards

CS3.8 — Data compression — lossy and lossless

12-card SR deck for AQA GCSE Computer Science topic CS3.8

12 cards · spaced repetition (SM-2)