Fundamentals of data representation
Every piece of data inside a computer is ultimately stored as binary — a sequence of 0s and 1s. This section explains how numbers, text, images and sound are all encoded in binary, and how data can be compressed to save storage or transmission bandwidth.
Number bases (CS3.1 and CS3.2)
Computers work in binary (base 2), whereas people usually count in decimal (base 10) and often write binary values in hexadecimal (base 16).
| Base | Digits used | Example |
|---|---|---|
| Binary (2) | 0, 1 | 1010₂ = 10₁₀ |
| Decimal (10) | 0–9 | 42₁₀ |
| Hexadecimal (16) | 0–9, A–F | 2A₁₆ = 42₁₀ |
Hexadecimal is a compact shorthand for binary: each hex digit represents exactly 4 bits (one nibble), so 2A₁₆ is 0010 1010₂.
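The conversions in the table can be checked with a short sketch using Python's built-in `int` and `format`. The helper names `to_binary`, `to_hex` and `hex_to_nibbles` are illustrative only and are not part of the original notes.

```python
def to_binary(n: int, width: int = 8) -> str:
    """Return n as a binary string, zero-padded to the given width."""
    return format(n, f"0{width}b")


def to_hex(n: int) -> str:
    """Return n as an uppercase hexadecimal string."""
    return format(n, "X")


def hex_to_nibbles(hex_string: str) -> list[str]:
    """Expand each hex digit into the 4 bits (one nibble) it stands for."""
    return [format(int(digit, 16), "04b") for digit in hex_string]


print(int("1010", 2))        # binary 1010 -> 10
print(int("2A", 16))         # hex 2A -> 42
print(to_binary(42))         # 00101010
print(to_hex(42))            # 2A
print(hex_to_nibbles("2A"))  # ['0010', '1010']
```

Reading the last line left to right shows why hex is convenient: the two nibbles joined together are exactly the 8-bit pattern for 42.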
Units of information (CS3.3)
| Unit | Size |
|---|---|
| 1 bit | Single binary digit (0 or 1) |
| 1 nibble | 4 bits |
| 1 byte | 8 bits |
| 1 kilobyte (KB) | 1,024 bytes |
| 1 megabyte (MB) | 1,024 KB |
| 1 gigabyte (GB) | 1,024 MB |
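| 1 terabyte (TB) | 1,024 GB |

These sizes follow the binary convention, where each unit is 1,024 (2¹⁰) times the previous one. Under the SI convention each step is 1,000 instead, and the 1,024-based units are formally named kibibyte (KiB), mebibyte (MiB) and so on, so check which convention your specification expects.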
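As a rough illustration of how the units stack up, the sketch below converts a byte count into the largest sensible unit. It assumes the 1,024-based sizes from the table, and the names `UNITS` and `describe_size` are hypothetical.

```python
UNITS = ["bytes", "KB", "MB", "GB", "TB"]
STEP = 1024  # assumption: each unit is 1,024 times the previous one


def describe_size(num_bytes: int) -> str:
    """Express a byte count in the largest unit that keeps the value >= 1."""
    size = float(num_bytes)
    unit_index = 0
    while size >= STEP and unit_index < len(UNITS) - 1:
        size /= STEP
        unit_index += 1
    return f"{size:g} {UNITS[unit_index]}"


print(describe_size(500))        # 500 bytes
print(describe_size(1536))       # 1.5 KB
print(describe_size(1024 ** 3))  # 1 GB
```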
Binary arithmetic and shifts (CS3.4)
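Adding 8-bit binary numbers follows the same column-by-column rules as decimal addition, except that a carry is generated when a column total reaches 2 rather than 10. A left shift by 1 position doubles the value, provided no 1 is shifted off the left-hand end; a right shift by 1 position halves it, discarding any remainder.
Overflow occurs when a result is too large to fit in the available number of bits. For example, an unsigned 8-bit value can hold at most 255, so 200 + 100 overflows.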
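A minimal sketch of these ideas, assuming unsigned 8-bit values. Python integers do not overflow on their own, so the code masks results to 8 bits to imitate a fixed-width register; `add_8bit`, `shift_left` and `shift_right` are illustrative names.

```python
BITS = 8
MAX_VALUE = (1 << BITS) - 1  # 255, the largest unsigned 8-bit value


def add_8bit(a: int, b: int) -> tuple[int, bool]:
    """Add two unsigned 8-bit values; return (stored result, overflow flag)."""
    total = a + b
    return total & MAX_VALUE, total > MAX_VALUE


def shift_left(value: int, places: int = 1) -> int:
    """Left shift within 8 bits; bits pushed past the left end are lost."""
    return (value << places) & MAX_VALUE


def shift_right(value: int, places: int = 1) -> int:
    """Right shift; bits pushed past the right end are lost."""
    return value >> places


print(add_8bit(0b01100100, 0b00110010))  # 100 + 50 -> (150, False)
print(add_8bit(200, 100))                # 300 does not fit -> (44, True)
print(shift_left(0b00001101))            # 13 doubled -> 26
print(shift_right(0b00001101))           # 13 halved, remainder dropped -> 6
print(shift_left(0b11000000))            # 192 loses its top bit -> 128
```

The last call shows the caveat on doubling: once a 1 is shifted off the left-hand end, the stored result is no longer twice the original.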
Character encoding (CS3.5)
Text is stored as numbers, with each character assigned a numeric code. ASCII uses 7 bits, giving 128 characters (English letters, digits, punctuation and control codes). Unicode encodings use up to 32 bits per character and cover virtually all of the world's writing systems.
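The mapping between characters and numbers can be inspected with Python's built-in `ord`, `chr` and `str.encode`. The sketch below uses UTF-8 and UTF-32 as example Unicode encodings, which is an assumption since the notes do not name a specific one; `ascii_bits` and `utf8_byte_count` are hypothetical helpers.

```python
def ascii_bits(character: str) -> str:
    """Return the 7-bit ASCII pattern for a single character."""
    code = ord(character)
    if code > 127:
        raise ValueError("not an ASCII character")
    return format(code, "07b")


def utf8_byte_count(character: str) -> int:
    """Return how many bytes UTF-8 needs to store the character."""
    return len(character.encode("utf-8"))


print(ord("A"), ascii_bits("A"))              # 65 1000001
print(chr(97))                                # a
print(utf8_byte_count("A"))                   # 1
print(utf8_byte_count("\u20ac"))              # euro sign -> 3
print(utf8_byte_count("\U0001F600"))          # emoji -> 4
print(len("\U0001F600".encode("utf-32-be")))  # 4 bytes, i.e. 32 bits
```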
Images (CS3.6)
A digital image is a grid of pixels. Each pixel is stored as a binary colour value. File size in bits = width × height × colour depth, where width and height are measured in pixels and colour depth in bits per pixel; divide by 8 for bytes.
Higher resolution (more pixels) and greater colour depth (more bits per pixel) produce better quality but larger files.
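The image file size formula can be tried out with a small sketch. It assumes an uncompressed bitmap with no metadata, and the function name `image_size_bytes` is illustrative.

```python
def image_size_bytes(width: int, height: int, colour_depth_bits: int) -> float:
    """Uncompressed image size: width x height x colour depth, in bytes."""
    total_bits = width * height * colour_depth_bits
    return total_bits / 8


small = image_size_bytes(8, 8, 1)           # 8 x 8 pixels, black and white
full_hd = image_size_bytes(1920, 1080, 24)  # 24-bit colour

print(small)                            # 8.0
print(full_hd)                          # 6220800.0
print(round(full_hd / 1024 / 1024, 2))  # 5.93 (MB, using 1,024-based units)
```

Doubling the colour depth doubles the result, while doubling both width and height quadruples it, which is why resolution tends to dominate file size.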
Sound (CS3.7)
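Analogue sound is converted to digital by sampling: measuring the amplitude of the sound wave at regular intervals. File size in bits = sample rate (Hz) × duration (seconds) × bit depth (bits per sample); divide by 8 for bytes. Stereo recordings multiply this by the number of channels.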
Higher sample rate and bit depth reproduce sound more accurately but create larger files.
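The sound file size formula can be sketched the same way. The `channels` parameter is an addition not present in the formula above (it defaults to 1, i.e. mono), the result assumes uncompressed audio, and `sound_size_bytes` is an illustrative name.

```python
def sound_size_bytes(sample_rate_hz: int, duration_s: float,
                     bit_depth: int, channels: int = 1) -> float:
    """Uncompressed audio size: sample rate x duration x bit depth, in bytes."""
    total_bits = sample_rate_hz * duration_s * bit_depth * channels
    return total_bits / 8


mono = sound_size_bytes(44_100, 60, 16)                # one minute, mono
stereo = sound_size_bytes(44_100, 60, 16, channels=2)  # one minute, stereo

print(mono)    # 5292000.0
print(stereo)  # 10584000.0
```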
Data compression (CS3.8)
Compression reduces file size:
- Lossless: the exact original data can be reconstructed (ZIP, PNG, FLAC)
- Lossy: some data is discarded permanently (JPEG, MP3, most MP4 video)
Lossy gives smaller files but reduced quality; lossless preserves quality at the cost of larger files.
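The difference between the two kinds of compression can be illustrated with a short sketch. The lossless part uses Python's standard `zlib` module (the DEFLATE method commonly used inside ZIP files). The lossy part is only a toy that discards the low 4 bits of each 8-bit sample; it is not how JPEG or MP3 actually work. The function names are hypothetical.

```python
import zlib


def lossless_round_trip(data: bytes) -> tuple[bytes, bytes]:
    """Compress with zlib, then decompress; return (compressed, restored)."""
    compressed = zlib.compress(data)
    return compressed, zlib.decompress(compressed)


def toy_lossy(samples: list[int]) -> list[int]:
    """Keep only the top 4 bits of each 8-bit sample (detail is lost for good)."""
    return [(sample >> 4) << 4 for sample in samples]


original = b"AAAAABBBBBCCCCC" * 100
compressed, restored = lossless_round_trip(original)
print(len(compressed) < len(original))  # True: repetitive data shrinks
print(restored == original)             # True: nothing was lost

samples = [200, 201, 202, 203]
print(toy_lossy(samples))               # [192, 192, 192, 192]
```

After the toy lossy step the four different samples are indistinguishable, so the original values cannot be recovered, whereas the zlib round trip returns every byte unchanged.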
Why data representation matters
Understanding how data is encoded lets you:
- Calculate file sizes for images and sound
- Understand why hexadecimal is used in error messages, colour codes and memory addresses
- Choose the right compression format for a use case
- Appreciate the trade-offs between quality and storage
AI-generated · claude-opus-4-7 · v3-deep-computer-science