Explore one of the most fundamental patterns in distributed systems. Learn how WAL (Write-Ahead Log) enables atomicity and durability in everything from SQLite to PostgreSQL.
Professional Knowledge Report
How Computers Save Data Safely
The Magic of the Write-Ahead Log
Published by getbetterat.work
The Big Idea
Computers have a big problem: they need to work extremely fast, but they also must
never lose your data. If the power goes out while saving a file, it can get ruined.
To fix this, engineers created the Write-Ahead Log (WAL). The rule is simple: before a
computer changes a main file, it first writes down exactly what it plans to do in a safe, append-only
list called a log. This simple trick powers almost every database, phone, and cloud server in the world.
1. The Crash Problem
When you save a file, the computer doesn't instantly write it to the hard drive. To be fast, it
holds the changes in its fast memory (RAM) first.
If the power fails before the memory dumps to the slow hard drive, the data is gone forever.
Even worse, if it crashes halfway through saving, the file is half-new and half-old. It is
broken.
Databases follow a set of guarantees called ACID (Atomicity, Consistency, Isolation, Durability)
to prevent this. The one that matters most here is atomicity: a change must be "all or nothing."
You never want to deduct money from one bank account and crash before adding it to the other.
Why Direct Saving Fails
❌ Saves directly to main file
❌ Power out = Half-written file
❌ Result: Permanent Data Corruption
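The failure can be sketched in a few lines. This is only an illustration: the dictionary stands in for the main file, and the `SimulatedCrash` exception, the `crash_midway` flag, and the function name are made up for this example.

```python
class SimulatedCrash(Exception):
    """Stands in for a power failure in this sketch."""


def transfer_in_place(balances, src, dst, amount, crash_midway=False):
    """Update the 'main file' (a dict here) directly, with no log."""
    balances[src] -= amount          # first half of the change is now saved
    if crash_midway:
        raise SimulatedCrash("power lost between the two writes")
    balances[dst] += amount          # second half never happens on a crash


balances = {"alice": 100, "bob": 50}
try:
    transfer_in_place(balances, "alice", "bob", 30, crash_midway=True)
except SimulatedCrash:
    pass

# Money left alice's account but never reached bob's account.
total_after_crash = sum(balances.values())
```

The two accounts started with 150 in total and end with 120, and nothing on disk records what was supposed to happen next.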
2. The Safe List Solution
The Write-Ahead Log acts like a safety net. If the system crashes, the computer restarts, reads
the safe list (the log), and re-does any finished work that had not yet reached the main file. If it
finds a half-finished job, it undoes it.
Feature | Write-Ahead Log | Old Method
How it Writes | Adds to the end of a list | Copies the whole page
Cost to Save | One fast, straight-line save | Many slow, random saves
Crash Recovery | Reads the list and re-does work | Goes back to the old copy
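Here is a minimal sketch of the write-ahead idea, assuming a log stored as one JSON record per line and a "commit" record that marks a job as finished. The record layout and the function names are invented for this example, and the sketch assumes every line reaches the disk whole.

```python
import json
import os


def wal_append(log_path, record):
    """Append one record to the log and force it to stable storage."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
        log.flush()
        os.fsync(log.fileno())   # the record must be durable before we go on


def log_transfer(log_path, txid, src, dst, amount, crash_before_commit=False):
    """Write the plan first; the commit record turns 'nothing' into 'all'."""
    wal_append(log_path, {"tx": txid, "op": "add", "key": src, "delta": -amount})
    wal_append(log_path, {"tx": txid, "op": "add", "key": dst, "delta": amount})
    if crash_before_commit:
        return                   # simulated crash: no commit record is written
    wal_append(log_path, {"tx": txid, "op": "commit"})


def recover(log_path, initial_balances):
    """Rebuild state by replaying only the jobs that reached their commit."""
    with open(log_path, encoding="utf-8") as log:
        records = [json.loads(line) for line in log if line.strip()]
    committed = {r["tx"] for r in records if r["op"] == "commit"}
    balances = dict(initial_balances)
    for r in records:
        if r["op"] == "add" and r["tx"] in committed:
            balances[r["key"]] += r["delta"]
    return balances
```

A transfer that crashed before its commit record is simply skipped during recovery, so the accounts never end up half-updated.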
3. The Memory Trick: Borrow & Relax
Because the log is perfectly safe, the computer can use two clever tricks to run much faster. Engineers
call these rules "Steal" and "No-Force."
1. Borrowing Memory (Steal)
The computer is allowed to "steal" back a slot of fast memory from a task that is still running,
by writing that task's half-finished changes to the main file early. If things go wrong, it just
uses the log to erase the half-finished work.
2. No Rushing (No-Force)
When you hit "save," the computer doesn't need to force all the heavy main files to save
immediately. As long as the tiny log file is saved, the computer can relax and update the big
files later.
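The two rules can be sketched with a toy engine. Everything here is an assumption made for illustration: `TinyEngine` is a made-up class, dictionaries and lists stand in for the disk and the log, and the sketch assumes two unfinished tasks never touch the same key.

```python
class TinyEngine:
    """Toy storage engine: 'disk' and 'log' stand in for real files."""

    def __init__(self):
        self.disk = {}          # main file: key -> value
        self.cache = {}         # changed pages still in RAM: key -> value
        self.log = []           # log records (assumed durable once appended)

    def write(self, tx, key, value):
        old = self.cache.get(key, self.disk.get(key))
        # WAL rule: log the old and new value before touching the page.
        self.log.append({"tx": tx, "key": key, "old": old, "new": value})
        self.cache[key] = value

    def evict(self, key):
        """Steal: a changed page may reach disk before its task commits."""
        self.disk[key] = self.cache.pop(key)

    def commit(self, tx):
        """No-Force: only a commit record is logged; no page is flushed."""
        self.log.append({"tx": tx, "commit": True})

    def crash_and_recover(self):
        self.cache.clear()                       # RAM is lost
        committed = {r["tx"] for r in self.log if r.get("commit")}
        for r in self.log:                       # redo committed work
            if "key" in r and r["tx"] in committed:
                self.disk[r["key"]] = r["new"]
        for r in reversed(self.log):             # undo uncommitted work
            if "key" in r and r["tx"] not in committed:
                if r["old"] is None:
                    self.disk.pop(r["key"], None)
                else:
                    self.disk[r["key"]] = r["old"]
```

After a crash, a committed change that never left RAM is re-done from the log, and an uncommitted change that was already written to disk is rolled back to its old value.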
4. Why Hardware Loves Logs
Why is adding to a list so much faster than updating the main file? Because of how physical hard drives
are built.
Writing in a straight line (Sequential) is far faster than jumping around randomly (Random):
hundreds of times faster on old spinning hard drives, and still roughly ten times faster on modern
SSDs. By turning all random database updates into one straight-line log, the computer can save data
at lightning speeds.
Old Hard Drives (HDD)
Straight-line Speed: ~200 MB/s
Random Speed: ~1 MB/s
Modern Drives (NVMe SSD)
Straight-line Speed: ~7,000 MB/s
Random Speed: ~600 MB/s
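The gap is easy to work out from the rough figures above. The helper names below are made up for this example, and the numbers are the approximate ones quoted in this section, not measurements.

```python
# Approximate throughput figures quoted above, in MB/s.
drives = {
    "HDD": {"sequential": 200, "random": 1},
    "NVMe SSD": {"sequential": 7000, "random": 600},
}


def sequential_advantage(drive):
    """How many times faster sequential writes are than random writes."""
    speeds = drives[drive]
    return speeds["sequential"] / speeds["random"]


def seconds_to_write(megabytes, drive, pattern):
    """Time needed to write the given amount of data at the quoted rate."""
    return megabytes / drives[drive][pattern]
```

With these figures, saving 1,000 MB takes about 5 seconds in a straight line on the old hard drive, but about 1,000 seconds (roughly 17 minutes) with random writes. On the NVMe drive the straight-line advantage is smaller, a little under 12 times.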
5. The History: From IBM to ARIES
This brilliant idea didn't happen overnight. It took decades of trial and error by the world's smartest
computer scientists to perfect.
The 1970s: The First Try
IBM's "System R" project built one of the first relational databases. Its designers
first relied on keeping shadow copies of whole pages, but realized that keeping a simple log of
changes was much faster for multiple users.
1992: The ARIES Breakthrough
C. Mohan and his colleagues at IBM published the ARIES algorithm. It gives every
log record a unique, ever-increasing ID number (a Log Sequence Number, or LSN), and each page
remembers the ID of the last change applied to it. If a crash happens, the computer compares the
ID numbers to see what was saved and what wasn't. Today, many major databases use a variant of this trick.
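The ID-number check can be sketched as follows. This covers only the comparison used when re-doing work; the full ARIES algorithm also has an analysis pass and an undo pass, which are left out. The record and page layouts are assumptions made for this example.

```python
def redo_pass(log_records, pages):
    """Re-apply only the changes a page has not seen yet.

    log_records: list of dicts with 'lsn', 'page', 'key', 'value', in LSN order.
    pages: dict of page_id -> {'page_lsn': int, 'data': dict}, as found on disk.
    Returns the LSNs that actually had to be re-done.
    """
    redone = []
    for rec in log_records:
        page = pages[rec["page"]]
        if page["page_lsn"] >= rec["lsn"]:
            continue                      # this change already reached disk
        page["data"][rec["key"]] = rec["value"]
        page["page_lsn"] = rec["lsn"]     # stamp the page with this record's ID
        redone.append(rec["lsn"])
    return redone
```

Because every page carries the ID of its latest change, running the pass a second time does nothing, so a crash in the middle of recovery is harmless.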
6. PostgreSQL: Saving Checkpoints
A log file can't grow forever, otherwise your hard drive would run out of space. The famous database
PostgreSQL manages this by using "Checkpoints," just like saving your progress in a
video game.
1. Write Log: Save actions quickly
2. CHECKPOINT!: Permanently update big files
3. Delete Old Log: Free up hard drive space
Engineers must balance these checkpoints. If checkpoints happen too often, the computer slows down from
doing too much work. If they happen too rarely, starting the computer back up after a crash takes a very
long time because the log is so long.
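The three steps and the trade-off can be sketched with a toy store. This is not PostgreSQL's code; `CheckpointedStore` is a made-up class in which a dictionary plays the big files and a list plays the log.

```python
class CheckpointedStore:
    """Toy store: a dict as the 'big files' plus a list as the log."""

    def __init__(self):
        self.data_files = {}    # durable main files
        self.memory = {}        # current state in RAM
        self.log = []           # durable log of (key, value) changes

    def write(self, key, value):
        self.log.append((key, value))          # 1. write log
        self.memory[key] = value

    def checkpoint(self):
        self.data_files = dict(self.memory)    # 2. permanently update big files
        freed = len(self.log)
        self.log.clear()                       # 3. delete old log
        return freed

    def recover(self):
        """After a crash: start from the data files, replay the remaining log."""
        self.memory = dict(self.data_files)
        for key, value in self.log:
            self.memory[key] = value
        return len(self.log)    # replay work grows with time since checkpoint
```

The number returned by `recover` is the amount of log that had to be replayed, which is exactly what grows when checkpoints are rare. In PostgreSQL itself, this balance is tuned with settings such as `checkpoint_timeout` and `max_wal_size`.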
7. MySQL: The Two-Log System
Another famous database, MySQL (through its default storage engine, InnoDB), splits the job into
two different logs. One builds things up, and the other tears them down.
The REDO Log
This log remembers what needs to be added. If the power turns off, the computer
reads the Redo log and acts out everything that was supposed to happen, fast-forwarding back to
the exact moment before the crash.
The UNDO Log
This log remembers how to put things back exactly how they were. If you change
your mind and hit "cancel," or if a job crashes halfway, the Undo log acts like a time machine,
rewinding the file back to safety.
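A toy version of the two logs might look like this. It is a simplified sketch, not how MySQL stores its logs: `TwoLogTable` is a made-up class, and dropping a cancelled job's redo entries is a shortcut taken only to keep the example small.

```python
class TwoLogTable:
    """Toy table with a redo log (new values) and an undo log (old values)."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.redo_log = []   # (tx, key, new_value): how to repeat a change
        self.undo_log = []   # (tx, key, old_value): how to reverse a change

    def update(self, tx, key, new_value):
        self.undo_log.append((tx, key, self.rows.get(key)))
        self.redo_log.append((tx, key, new_value))
        self.rows[key] = new_value

    def rollback(self, tx):
        """'Cancel': walk the undo log backwards and restore old values."""
        for entry_tx, key, old_value in reversed(self.undo_log):
            if entry_tx == tx:
                if old_value is None:
                    self.rows.pop(key, None)   # the row did not exist before
                else:
                    self.rows[key] = old_value
        # Toy simplification: forget the cancelled job in both logs.
        self.undo_log = [e for e in self.undo_log if e[0] != tx]
        self.redo_log = [e for e in self.redo_log if e[0] != tx]

    def replay(self, stale_rows):
        """Crash recovery: roll an out-of-date copy forward with the redo log."""
        rows = dict(stale_rows)
        for _tx, key, new_value in self.redo_log:
            rows[key] = new_value
        return rows
```

Calling `rollback` rewinds one job step by step, while `replay` fast-forwards an old copy of the table to the latest state.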
8. SQLite: The Smartphone Database
SQLite is a tiny database that runs inside billions of phones and apps. It uses the log
to solve a frustrating problem: waiting in line.
Usually, if someone is editing a file, everyone else is locked out and has to wait to read it.
But SQLite's WAL mode changes the rules. Because all new edits go into the separate Log
File, other apps can continue reading from the Main File at the
exact same time! Readers never have to wait for the writer (though only one app can write at a time).
App 1 (Reading) -> Main File (Old Data)
App 2 (Writing) -> Log File (New Data)
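This behavior can be tried with Python's built-in `sqlite3` module. The sketch below assumes a throwaway database in a temporary folder; the `notes` table and the `wal_demo` function are made up for the example.

```python
import os
import sqlite3
import tempfile


def wal_demo():
    """Show a reader seeing old data while a writer has uncommitted changes."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo.db")

        writer = sqlite3.connect(db_path)
        # Switch the database to write-ahead logging; SQLite reports the mode.
        mode = writer.execute("PRAGMA journal_mode=WAL").fetchone()[0]
        writer.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
        writer.execute("INSERT INTO notes VALUES (1, 'old')")
        writer.commit()

        reader = sqlite3.connect(db_path)

        # The writer changes the row but does not commit yet.
        writer.execute("UPDATE notes SET body = 'new' WHERE id = 1")

        # The reader is not blocked; it still sees the last committed version.
        during = reader.execute("SELECT body FROM notes WHERE id = 1").fetchall()

        writer.commit()
        after = reader.execute("SELECT body FROM notes WHERE id = 1").fetchall()

        reader.close()
        writer.close()
    return mode, during[0][0], after[0][0]
```

The reader gets the old row while the edit is still in progress and the new row once the writer commits.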
9. Cloud Servers: Voting on the Truth
When you use a big cloud app, your data isn't saved on one computer; it is saved on many at once. To
stop them from getting confused, they use a log to vote on what the truth is, using a system called
Raft.
1. The Leader
One computer is elected boss. It writes your new data in its log first.
2. The Copy
The Leader sends a copy of the log to all the Follower computers.
3. The Vote
Only when a majority vote "We got it!" does the data become permanently
real.
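The three steps can be sketched as a tiny simulation. It shows only the majority rule; real Raft also handles leader elections, terms, retries, and log consistency checks, all of which are left out. The `Node` class and `replicate` function are made up for this example.

```python
class Node:
    """One computer in the cluster, holding its own copy of the log."""

    def __init__(self, name, online=True):
        self.name = name
        self.online = online
        self.log = []

    def append(self, entry):
        """Follower side: store the entry and acknowledge, if reachable."""
        if not self.online:
            return False
        self.log.append(entry)
        return True


def replicate(leader, followers, entry):
    """Return True if the entry is stored on a majority of the cluster."""
    leader.log.append(entry)                 # 1. the leader writes first
    acks = 1                                 # the leader counts itself
    for follower in followers:               # 2. send copies to followers
        if follower.append(entry):
            acks += 1
    cluster_size = 1 + len(followers)
    return acks > cluster_size // 2          # 3. the majority vote
```

With five computers, the data still becomes real when two of them are unreachable, but not when three are.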
10. Operating Systems: Guarding Your Folders
It isn't just databases that use this trick. The file system on your laptop (like ext4 on
Linux) uses it too. They call it Journaling. You can choose how safe you want your
files to be:
Fast Mode (Risky), called "writeback"
Logs only the file system's bookkeeping (names, sizes, locations), and saves the actual file contents whenever it wants.
Normal Mode (Default), called "ordered"
Guarantees the file contents are saved BEFORE the bookkeeping is committed to the log.
Super Safe Mode (Slow), called "journal"
Logs the bookkeeping AND copies the entire file contents into the log. Very safe, very slow.
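The difference between the three modes is mostly about the order of writes. The sketch below models that order under the assumption that each file update boils down to a few named steps; the step names and both functions are made up for illustration and are not part of ext4.

```python
def journal_steps(mode):
    """Return the ordered steps for one file update in the given mode."""
    if mode == "journal":        # Super Safe Mode
        return ["journal file data", "journal metadata", "commit",
                "copy data and metadata to final location"]
    if mode == "ordered":        # Normal Mode
        return ["write file data to final location", "journal metadata",
                "commit", "copy metadata to final location"]
    if mode == "writeback":      # Fast Mode
        # File data is written whenever the system gets to it, so it is
        # not part of the ordered sequence at all.
        return ["journal metadata", "commit", "copy metadata to final location"]
    raise ValueError(f"unknown mode: {mode}")


def data_safe_at_commit(mode):
    """True if the file's contents are on disk by the time the commit lands."""
    steps = journal_steps(mode)
    data_steps = [i for i, step in enumerate(steps) if "file data" in step]
    return bool(data_steps) and data_steps[0] < steps.index("commit")
```

Only the fast mode can commit bookkeeping for a file whose contents have not been written yet, which is why it is the risky one.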
11. Cyber Detectives: The Hidden Windows Log
Windows computers have a deeply hidden list called the USN Journal. Every single time a
file is created, moved, or deleted, it prints a tiny receipt into this log. (The journal has a
size limit, so the oldest receipts are eventually discarded.)
Cyber detectives love this log. If a hacker breaks in and tries to change a file's date to look like
they were never there (a trick called "timestomping"), the hidden Windows log still remembers the exact
truth.
SYSTEM RECEIPT (illustration)
RECEIPT NO: 84920491
FILE: secret_passwords.txt
ACTION: DATE_FAKED
ACTUAL TIME: 11:45 PM
HACKER CAUGHT
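A detective's check could be sketched like this. It assumes the journal has already been parsed into simple dictionaries, which is a made-up layout for this example; only the two reason flags are real values from the Windows headers. The result is a hint rather than proof, since some legitimate tools also preserve old dates.

```python
from datetime import datetime

# Reason flags from the Windows USN change journal (winioctl.h).
USN_REASON_FILE_CREATE = 0x00000100
USN_REASON_BASIC_INFO_CHANGE = 0x00008000   # timestamps or attributes changed


def find_backdated(records, claimed_created):
    """Return files whose claimed creation time predates their journal entry.

    records: list of dicts with 'file', 'reason', 'time' (hypothetical layout).
    claimed_created: dict of file name -> creation time shown by the file itself.
    """
    created_at = {}
    touched = set()
    for rec in records:
        if rec["reason"] & USN_REASON_FILE_CREATE:
            created_at.setdefault(rec["file"], rec["time"])
        if rec["reason"] & USN_REASON_BASIC_INFO_CHANGE:
            touched.add(rec["file"])
    return sorted(
        name for name, journal_time in created_at.items()
        if name in touched
        and name in claimed_created
        and claimed_created[name] < journal_time
    )
```

A file is flagged when the journal saw it being created, later saw its basic information change, and the file now claims to be older than its own creation receipt.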
12. The Future: Brains vs. Storage
Today, logs are getting a massive upgrade. Modern cloud systems are doing something radical: they are
separating the "brains" (the computer processing data) from the "storage" (the hard drives).
Because the Write-Ahead Log is the only thing that truly matters, systems like Amazon Aurora don't
even send database files over the network anymore. They send only the small log records. The
storage drives receive the log and build the database pages themselves. This cuts network traffic
sharply and can make starting a new database server far faster, since there are no big files to copy first.
The Brains (Computer) -> Sends ONLY the Log -> The Storage (Drives)
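The split can be sketched with two toy classes. This is not Amazon Aurora's API; `ComputeNode` and `StorageNode` are made-up names, and the sketch assumes log records arrive in order.

```python
class StorageNode:
    """Receives log records only and builds pages from them on demand."""

    def __init__(self):
        self.log = []      # (lsn, page_id, key, value), in arrival order
        self.pages = {}    # page_id -> (applied_lsn, dict of contents)

    def receive(self, record):
        self.log.append(record)

    def read_page(self, page_id):
        applied_lsn, contents = self.pages.get(page_id, (0, {}))
        contents = dict(contents)
        for lsn, pid, key, value in self.log:
            if pid == page_id and lsn > applied_lsn:
                contents[key] = value          # apply changes not yet seen
                applied_lsn = lsn
        self.pages[page_id] = (applied_lsn, contents)
        return dict(contents)


class ComputeNode:
    """The 'brains': it never ships pages, only numbered log records."""

    def __init__(self, storage):
        self.storage = storage
        self.next_lsn = 1

    def write(self, page_id, key, value):
        self.storage.receive((self.next_lsn, page_id, key, value))
        self.next_lsn += 1
```

The storage side can answer a read for any page by applying the log records it has received for that page, so the compute side never has to send a full page over the network.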
The Ultimate Truth
The Write-Ahead Log is the ultimate source of truth for computers. The main database file is just a
snapshot of the past; the log is the true history of everything that happened. By forcing computers to
write down their plans before acting, engineers built the invisible safety net that holds our digital
world together.