Professional Knowledge Report

How Computers Save Data Safely

The Magic of the Write-Ahead Log

Published by getbetterat.work

The Big Idea

Computers have a big problem: they need to work extremely fast, but they also must never lose your data. If the power goes out while saving a file, it can get ruined.

To fix this, engineers created the Write-Ahead Log (WAL). The rule is simple: before a computer changes a main file, it first writes down exactly what it plans to do in a safe, append-only list called a log. This simple trick powers almost every database, phone, and cloud server in the world.

1. The Crash Problem

When you save a file, the computer doesn't instantly write it to the hard drive. To be fast, it holds the changes in its fast memory (RAM) first.

If the power fails before the memory dumps to the slow hard drive, the data is gone forever. Even worse, if it crashes halfway through saving, the file is half-new and half-old. It is broken.

Databases follow a set of strict guarantees called ACID (Atomicity, Consistency, Isolation, Durability) to prevent this. The first one, atomicity, says that a change must be "all or nothing." You never want to deduct money from one bank account and crash before adding it to the other.

Why Direct Saving Fails

  • ❌ Saves directly to main file
  • ❌ Power out = Half-written file
  • ❌ Result: Permanent Data Corruption

2. The Safe List Solution

The Write-Ahead Log acts like a safety net. If the system crashes, the computer simply wakes up, reads the safe list (the log), and re-does any finished job whose changes had not yet reached the main file. If it finds a half-finished job, it undoes it.
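The rule can be sketched in a few lines of Python. This is a minimal illustration, not how any particular database implements it: the file names, the one-JSON-record-per-line format, and the helper names `log_then_apply`, `save_main_file`, and `recover` are all assumptions made up for the example.

```python
import json
import os
import tempfile

def log_then_apply(log_path, data_path, state, key, value):
    """Append the planned change to the log, flush it, then update the main file."""
    record = json.dumps({"key": key, "value": value})
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(record + "\n")
        log.flush()
        os.fsync(log.fileno())  # ask the OS to push the record to the drive
    state[key] = value
    save_main_file(data_path, state)

def save_main_file(data_path, state):
    """Rewrite the whole main file (the step that is dangerous without a log)."""
    with open(data_path, "w", encoding="utf-8") as f:
        json.dump(state, f)

def recover(log_path):
    """Rebuild the state by replaying every complete record in the log."""
    state = {}
    if not os.path.exists(log_path):
        return state
    with open(log_path, encoding="utf-8") as log:
        for line in log:
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                break  # a torn last record from a crash mid-append is ignored
            state[record["key"]] = record["value"]
    return state

def demo():
    workdir = tempfile.mkdtemp()
    log_path = os.path.join(workdir, "wal.log")     # hypothetical file names
    data_path = os.path.join(workdir, "main.json")
    state = {}
    log_then_apply(log_path, data_path, state, "alice", 90)
    log_then_apply(log_path, data_path, state, "bob", 110)
    # Simulate a crash that left the main file half-written.
    with open(data_path, "w", encoding="utf-8") as f:
        f.write('{"alice": 9')
    return recover(log_path)

print(demo())  # {'alice': 90, 'bob': 110}
```

Even though the main file ends up unreadable in this demo, the log alone is enough to rebuild both balances.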

Feature          Write-Ahead Log                    Old Method
How it Writes    Adds to the end of a list          Copies the whole page
Cost to Save     One fast, straight-line save       Many slow, random saves
Crash Recovery   Reads the list and re-does work    Goes back to the old copy

3. The Memory Trick: Borrow & Relax

Because the log is perfectly safe, the computer can use two clever tricks to run much faster. Engineers call these rules "Steal" and "No-Force."

1. Borrowing Memory (Steal)

The computer is allowed to write half-finished work to the main file early, "stealing" back the fast memory that work was using, even before the task is fully finished. If things go wrong, it just uses the log to erase the half-finished work.

2. No Rushing (No-Force)

When you hit "save," the computer doesn't need to force all the heavy main files to save immediately. As long as the tiny log file is saved, the computer can relax and update the big files later.
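The two rules can be seen working together in a toy model. This is a sketch under stated assumptions, not a real storage engine: the `ToyDatabase` class, its plain dictionaries standing in for disk and RAM, and the transaction names are all invented for illustration, and it assumes two unfinished jobs never touch the same key at once (real systems use locks for that).

```python
class ToyDatabase:
    def __init__(self, initial):
        self.disk = dict(initial)  # "main file": survives a crash
        self.log = []              # log: survives a crash
        self.buffer = {}           # fast memory: lost in a crash

    def write(self, txn, key, new_value):
        old_value = self.buffer.get(key, self.disk.get(key))
        # Write-ahead: record the old and new value before changing anything.
        self.log.append(("update", txn, key, old_value, new_value))
        self.buffer[key] = new_value

    def flush_page(self, key):
        """Steal: a change may reach the main file before its job commits."""
        self.disk[key] = self.buffer[key]

    def commit(self, txn):
        """No-Force: only a commit record is logged; the main file waits."""
        self.log.append(("commit", txn))

    def crash(self):
        self.buffer = {}

    def recover(self):
        committed = {rec[1] for rec in self.log if rec[0] == "commit"}
        # Redo finished jobs in log order (needed because of No-Force).
        for rec in self.log:
            if rec[0] == "update" and rec[1] in committed:
                self.disk[rec[2]] = rec[4]
        # Undo unfinished jobs in reverse order (needed because of Steal).
        for rec in reversed(self.log):
            if rec[0] == "update" and rec[1] not in committed:
                if rec[3] is None:
                    self.disk.pop(rec[2], None)
                else:
                    self.disk[rec[2]] = rec[3]

def demo():
    db = ToyDatabase({"alice": 100, "bob": 100, "carol": 100})
    db.write("T1", "alice", 90)
    db.write("T1", "bob", 110)
    db.commit("T1")            # committed, but the main file was never updated
    db.write("T2", "carol", 50)
    db.flush_page("carol")     # unfinished work reaches the main file early
    db.crash()                 # T2 never commits
    db.recover()
    return db.disk

print(demo())  # {'alice': 90, 'bob': 110, 'carol': 100}
```

After recovery the finished transfer is present even though it was never flushed, and the unfinished deduction from carol is gone even though it had already reached the main file.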

4. Why Hardware Loves Logs

Why is adding to a list so much faster than updating the main file? Because of how physical hard drives are built.

On a spinning hard drive, writing in a straight line (Sequential) can be hundreds of times faster than jumping around randomly (Random). On modern SSDs the gap is smaller, closer to ten times, but it is still there. By turning scattered database updates into one straight-line log, the computer can save data much faster.

Drive Type                  Straight-line Speed    Random Speed
Old Hard Drives (HDD)       ~200 MB/s              ~1 MB/s
Modern Drives (NVMe SSD)    ~7,000 MB/s            ~600 MB/s
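The gap between the two kinds of write can be worked out from the rough figures above. The short calculation below simply divides one number by the other; the dictionary layout and the `speedup` helper are assumptions made for the example.

```python
# Approximate throughput figures quoted above, in MB/s.
drives = {
    "HDD": {"sequential": 200, "random": 1},
    "NVMe SSD": {"sequential": 7000, "random": 600},
}

def speedup(figures):
    """How many times faster straight-line writes are than random writes."""
    return figures["sequential"] / figures["random"]

for name, figures in drives.items():
    print(f"{name}: straight-line is about {speedup(figures):.0f}x faster")
# HDD: straight-line is about 200x faster
# NVMe SSD: straight-line is about 12x faster
```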

5. The History: From IBM to ARIES

This brilliant idea didn't happen overnight. It took decades of trial and error by the world's smartest computer scientists to perfect.

The 1970s: The First Try

IBM's "System R" project built one of the first relational databases. Its engineers tried copying whole pages first (a technique called shadow paging), but realized keeping a simple log of changes was much faster for multiple users.

1992: The ARIES Breakthrough

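A scientist named C. Mohan and his colleagues at IBM published the ARIES algorithm. It gives every log record a unique, ever-increasing ID number called a Log Sequence Number (LSN), and stamps each page with the number of the last change applied to it. If a crash happens, the computer just compares the ID numbers to see what was saved and what wasn't. Today, many major databases use variations of this trick.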
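The ID-number check can be sketched as follows. This is a heavily simplified illustration of the comparison only, not the full ARIES algorithm (which also has analysis and undo phases); the `LogRecord` and `Page` classes and the `redo` function are hypothetical names chosen for the example.

```python
from dataclasses import dataclass

@dataclass
class LogRecord:
    lsn: int         # log sequence number: grows with every record
    page_id: str
    new_value: str

@dataclass
class Page:
    value: str = ""
    page_lsn: int = 0   # LSN of the last change already applied to this page

def redo(log, pages):
    """Reapply only the changes a page has not seen yet; return their LSNs."""
    applied = []
    for record in log:
        page = pages.setdefault(record.page_id, Page())
        if record.lsn > page.page_lsn:   # the page is older than this record
            page.value = record.new_value
            page.page_lsn = record.lsn
            applied.append(record.lsn)
    return applied

log = [
    LogRecord(1, "A", "a1"),
    LogRecord(2, "A", "a2"),
    LogRecord(3, "B", "b1"),
]
# Page A reached the drive with changes 1 and 2; page B never did.
pages = {"A": Page(value="a2", page_lsn=2)}

first_pass = redo(log, pages)
second_pass = redo(log, pages)
print(first_pass)   # [3]
print(second_pass)  # []
```

Because each page remembers the last LSN it absorbed, running recovery a second time changes nothing, which matters if the computer crashes again in the middle of recovering.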

6. PostgreSQL: Saving Checkpoints

A log file can't grow forever, otherwise your hard drive would run out of space. The famous database PostgreSQL manages this by using "Checkpoints," just like saving your progress in a video game.

1. Write Log: Save actions quickly
2. CHECKPOINT!: Permanently update big files
3. Delete Old Log: Free up hard drive space

Engineers must balance these checkpoints. If checkpoints happen too often, the computer slows down from doing too much work. If they happen too rarely, starting the computer back up after a crash takes a very long time because the log is so long.
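The trade-off can be made concrete with a toy model. This sketch is an assumption-laden simplification: it triggers a checkpoint after a fixed number of log records, whereas PostgreSQL decides based on elapsed time and log size (its `checkpoint_timeout` and `max_wal_size` settings). The `CheckpointedLog` class and its fields are invented for the example.

```python
class CheckpointedLog:
    def __init__(self, checkpoint_every):
        self.checkpoint_every = checkpoint_every  # hypothetical trigger
        self.main_file = {}   # durable, updated only at checkpoints
        self.memory = {}      # fast in-memory copy
        self.log = []         # durable, append-only
        self.checkpoints = 0

    def write(self, key, value):
        self.log.append((key, value))   # 1. write log
        self.memory[key] = value
        if len(self.log) >= self.checkpoint_every:
            self.checkpoint()

    def checkpoint(self):
        self.main_file = dict(self.memory)  # 2. update the big files
        self.log.clear()                    # 3. old log is no longer needed
        self.checkpoints += 1

    def records_to_replay_after_crash(self):
        return len(self.log)

def run(checkpoint_every, writes=1000):
    db = CheckpointedLog(checkpoint_every)
    for i in range(writes):
        db.write(f"key{i % 50}", i)
    return db.checkpoints, db.records_to_replay_after_crash()

print(run(10))   # (100, 0)
print(run(400))  # (2, 200)
```

Frequent checkpoints mean 100 rounds of heavy main-file work but nothing to replay; rare checkpoints mean only 2 rounds of work but 200 log records to replay after a crash.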

7. MySQL: The Two-Log System

Another famous database, MySQL (through its default InnoDB storage engine), splits the job into two different logs. One builds things up, and the other tears them down.

The REDO Log

This log remembers what needs to be added. If the power turns off, the computer reads the Redo log and acts out everything that was supposed to happen, fast-forwarding back to the exact moment before the crash.

The UNDO Log

This log remembers how to put things back exactly how they were. If you change your mind and hit "cancel," or if a job crashes halfway, the Undo log acts like a time machine, rewinding the file back to safety.

8. SQLite: The Smartphone Database

SQLite is a tiny database that runs inside billions of phones and apps. It uses the log to solve a frustrating problem: waiting in line.

Usually, if someone is editing a file, everyone else is locked out and has to wait to read it.

But SQLite's WAL mode changes the rules. Because all new edits go into a separate log file, other apps can keep reading the last saved version of the data at the exact same time. Readers never wait for the writer, and the writer never waits for readers. Only writers still take turns: there can be just one writer at a time.

App 1 (Reading) → Main File (Old Data)
App 2 (Writing) → Log File (New Data)
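This behavior can be tried with Python's built-in `sqlite3` module. The sketch below is one small demonstration rather than a benchmark; the table name `notes`, the file name, and the `demo` function are assumptions made for the example, and it expects an ordinary local disk where SQLite can enable WAL mode.

```python
import os
import sqlite3
import tempfile

def demo():
    path = os.path.join(tempfile.mkdtemp(), "demo.db")

    # isolation_level=None: the module sends no implicit BEGIN statements.
    writer = sqlite3.connect(path, isolation_level=None)
    mode = writer.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    writer.execute("CREATE TABLE notes (body TEXT)")
    writer.execute("INSERT INTO notes VALUES ('old data')")

    reader = sqlite3.connect(path, isolation_level=None)

    # App 2 starts writing but has not committed yet.
    writer.execute("BEGIN IMMEDIATE")
    writer.execute("INSERT INTO notes VALUES ('new data')")

    # App 1 reads at the same time and sees the last committed data.
    query = "SELECT body FROM notes ORDER BY rowid"
    during = [row[0] for row in reader.execute(query)]

    writer.execute("COMMIT")
    after = [row[0] for row in reader.execute(query)]

    reader.close()
    writer.close()
    return mode, during, after

mode, during, after = demo()
print(mode)    # wal
print(during)  # ['old data']
print(after)   # ['old data', 'new data']
```

The reader is never blocked: it sees only the old row while the write is in progress, and both rows once the writer commits.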

9. Cloud Servers: Voting on the Truth

When you use a big cloud app, your data isn't saved on one computer; it is saved on many at once. To stop them from getting confused, they use a shared log and a voting scheme to agree on what the truth is. One widely used scheme is called Raft.

1. The Leader

One computer is elected boss. It writes your new data in its log first.

2. The Copy

The Leader sends a copy of the log to all the Follower computers.

3. The Vote

Only when a majority vote "We got it!" does the data become permanently real.
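The three steps above can be sketched as a toy leader. This is only an illustration of the majority rule, not the Raft protocol itself: it leaves out elections, term numbers, and the consistency checks real Raft performs, and the `Leader` class, `majority` helper, and follower names are hypothetical.

```python
def majority(cluster_size):
    """Smallest number of computers that is more than half the cluster."""
    return cluster_size // 2 + 1

class Leader:
    def __init__(self, follower_names):
        self.log = []
        self.follower_logs = {name: [] for name in follower_names}
        self.committed = 0   # how many log entries are permanently real

    def replicate(self, entry, reachable):
        # 1. The Leader writes the new data in its own log first.
        self.log.append(entry)
        acks = 1  # the leader counts as one "We got it!"
        # 2. It sends a copy of the log to every follower it can reach.
        for name in reachable:
            self.follower_logs[name] = list(self.log)
            acks += 1
        # 3. The entry is committed only if a majority has it.
        cluster_size = len(self.follower_logs) + 1
        if acks >= majority(cluster_size):
            self.committed = len(self.log)
            return True
        return False

leader = Leader(["f1", "f2", "f3", "f4"])          # 5 computers in total
first = leader.replicate("x=1", reachable=["f1", "f2"])
second = leader.replicate("x=2", reachable=["f1"])
print(first, second, leader.committed)  # True False 1
```

With five computers, three acknowledgements are enough. The second entry reaches only two computers, so it stays in the log but is not yet treated as permanent.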

10. Operating Systems: Guarding Your Folders

It isn't just databases that use this trick. The file system on your laptop (like ext4 on Linux) uses it too, under the name journaling. With ext4, you can choose how safe you want your files to be by picking a data mode:

Fast Mode (Risky): data=writeback
Logs only the metadata (file names, sizes, and locations), but saves the actual file contents whenever it wants.
Normal Mode (Default): data=ordered
Logs only the metadata too, but guarantees the file contents are saved BEFORE the metadata change is committed to the log.
Super Safe Mode (Slow): data=journal
Logs the metadata AND copies the file contents into the log. Very safe, very slow.

11. Cyber Detectives: The Hidden Windows Log

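Windows computers keep a little-known list on NTFS drives called the USN Journal (also known as the change journal). When a file is created, renamed, changed, or deleted, Windows adds a tiny receipt to this log. The journal has a size limit, so the oldest receipts are eventually dropped to make room for new ones.

Cyber detectives love this log. If a hacker breaks in and changes a file's dates to look like they were never there (a trick called "timestomping"), the journal can still hold a receipt showing when that change was really made, as long as the receipt has not been dropped yet.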

SYSTEM RECEIPT (illustration only, not real journal output)
RECEIPT NO: 84920491
FILE: secret_passwords.txt
ACTION: DATE_FAKED
ACTUAL TIME: 11:45 PM
HACKER CAUGHT

12. The Future: Brains vs. Storage

Today, logs are getting a massive upgrade. Modern cloud systems are doing something radical: they are separating the "brains" (the computer processing data) from the "storage" (the hard drives).

Because the Write-Ahead Log is the only thing that truly matters, systems like Amazon Aurora don't send whole database pages over the network to storage anymore. They only send the small log records. The storage layer receives the log and builds the database pages itself. This cuts network traffic and can make recovering from a crash or adding a new server much faster.

The Brains (Computer) → Sends ONLY the Log → The Storage (Drives)

The Ultimate Truth

The Write-Ahead Log is the ultimate source of truth for computers. The main database file is just a snapshot of the past; the log is the true history of everything that happened. By forcing computers to write down their plans before acting, engineers built the invisible safety net that holds our digital world together.