Technical · January 28, 2026 · 15 min read

PostgreSQL Vacuum: The Complete Guide to Not Losing Your Data

PostgreSQL uses Multi-Version Concurrency Control (MVCC) to handle concurrent access to data. This is fundamental to how PostgreSQL works, and understanding it is essential to understanding why VACUUM exists and why it matters.

How MVCC Creates Dead Tuples

When you UPDATE a row in PostgreSQL, the database does not modify the row in place. Instead, it marks the old version as "dead" and writes a new version. When you DELETE a row, the database marks it as dead but does not physically remove it. These dead row versions are called "dead tuples."

This is a feature, not a bug. Dead tuples allow other transactions that started before the UPDATE/DELETE to still see the old version of the data (read consistency). Once no transaction can see the old version, it becomes eligible for cleanup.
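A quick way to watch a dead tuple appear (a sketch using a scratch table named `demo`; note that `n_dead_tup` comes from the statistics collector and may lag the actual state by a moment):

```sql
CREATE TABLE demo (id int PRIMARY KEY, val text);
INSERT INTO demo VALUES (1, 'old');

-- UPDATE writes a new row version; the old one becomes a dead tuple
UPDATE demo SET val = 'new' WHERE id = 1;

-- ctid, xmin, and xmax expose the row versioning;
-- the stats view counts the dead tuple left behind
SELECT ctid, xmin, xmax, * FROM demo;
SELECT n_dead_tup FROM pg_stat_user_tables WHERE relname = 'demo';
```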

That cleanup process is VACUUM.

What VACUUM Actually Does

VACUUM scans a table and removes dead tuples that are no longer visible to any active transaction. It does several things:

  1. Reclaims storage occupied by dead tuples (marks it as reusable)
  2. Updates the visibility map (which pages are all-visible)
  3. Updates the free space map (which pages have room for new tuples)
  4. Freezes old transaction IDs to prevent wraparound

Note that regular VACUUM does NOT return disk space to the operating system. It marks space as reusable within the table. Only VACUUM FULL (which rewrites the entire table with an ACCESS EXCLUSIVE lock) returns space to the OS.
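You can see this reusable-but-not-returned space directly with the `pgstattuple` contrib extension, which reports dead tuple counts and internal free space for a table (assuming a table named `orders`):

```sql
-- pgstattuple ships with PostgreSQL as a contrib extension
CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- dead_tuple_count and dead_tuple_percent show what VACUUM can reclaim;
-- free_space is reusable space that stays inside the table file
SELECT table_len, tuple_count, dead_tuple_count,
       dead_tuple_percent, free_space, free_percent
FROM pgstattuple('orders');
```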

Autovacuum: Your Best Friend

PostgreSQL's autovacuum daemon automatically runs VACUUM on tables when the number of dead tuples exceeds a threshold. The key parameters:

```
autovacuum_vacuum_threshold = 50         -- minimum dead tuples
autovacuum_vacuum_scale_factor = 0.2     -- 20% of table must be dead
autovacuum_vacuum_cost_delay = 2ms       -- throttling delay
autovacuum_vacuum_cost_limit = 200       -- cost budget per round
```

The formula: vacuum triggers when dead tuples > threshold + (scale_factor * table_rows).

For a table with 1,000,000 rows: VACUUM triggers at 50 + (0.2 * 1,000,000) = 200,050 dead tuples.

For a table with 10 rows: VACUUM triggers at 50 + (0.2 * 10) = 52 dead tuples.
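The same arithmetic can be applied to every table straight from the stats view (a sketch that reads the global settings; it ignores any per-table reloptions overrides):

```sql
-- compute each table's autovacuum trigger point:
-- threshold + scale_factor * live rows
SELECT relname,
       n_live_tup,
       n_dead_tup,
       current_setting('autovacuum_vacuum_threshold')::int
         + current_setting('autovacuum_vacuum_scale_factor')::float * n_live_tup
         AS vacuum_trigger_point
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;
```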

Tuning Autovacuum for Large Tables

The default scale_factor of 0.2 (20%) is problematic for large tables. If your table has 100 million rows, autovacuum waits until there are 20 million dead tuples before running. That is too many.

The fix: set per-table autovacuum parameters.

```sql
ALTER TABLE orders SET (
    autovacuum_vacuum_scale_factor = 0.01,   -- 1% instead of 20%
    autovacuum_vacuum_threshold = 1000,      -- start earlier
    autovacuum_analyze_scale_factor = 0.005  -- analyze more frequently
);
```

Transaction ID Wraparound: The Ticking Clock

PostgreSQL uses 32-bit transaction IDs. That gives you roughly 2 billion usable transaction IDs before "wraparound" would occur. PostgreSQL never lets it actually happen: when the limit approaches, the database stops accepting new commands and effectively shuts down, because wrapped-around IDs would make old rows appear to be from the future and vanish from view.

VACUUM FREEZE converts old transaction IDs to a special "frozen" state, effectively recycling them. Autovacuum handles this automatically with the "anti-wraparound" vacuum, which ignores the normal cost-based throttling and runs with high priority.
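Frozen-ness is tracked per table in `pg_class.relfrozenxid`, so you can also find which specific tables are dragging the oldest unfrozen IDs along (a sketch limited to ordinary tables):

```sql
-- tables with the oldest unfrozen transaction IDs;
-- these are the ones anti-wraparound vacuum will target first
SELECT c.relname,
       age(c.relfrozenxid) AS xid_age
FROM pg_class c
WHERE c.relkind = 'r'
ORDER BY xid_age DESC
LIMIT 10;
```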

Check your wraparound risk:

```sql
SELECT datname,
       age(datfrozenxid) AS xid_age,
       round(100 * age(datfrozenxid)::numeric / 2147483647, 2) AS pct_to_wraparound
FROM pg_database
ORDER BY xid_age DESC;
```

If pct_to_wraparound exceeds 50%, you need to investigate. If it exceeds 75%, you have a problem. If it exceeds 90%, you are in emergency territory. For context: anti-wraparound autovacuum normally kicks in at autovacuum_freeze_max_age (200 million by default, roughly 10%), so anything past 50% means autovacuum has been failing to complete for a long time.

Common Vacuum Problems and Fixes

Problem: Autovacuum is not keeping up

Check dead tuple ratio:

```sql
SELECT schemaname,
       relname,
       n_dead_tup,
       n_live_tup,
       round(n_dead_tup::numeric / nullif(n_live_tup, 0) * 100, 2) AS dead_pct,
       last_autovacuum
FROM pg_stat_user_tables
WHERE n_dead_tup > 10000
ORDER BY n_dead_tup DESC;
```

Problem: VACUUM is being blocked

The most common cause: an idle-in-transaction session holding a snapshot. VACUUM cannot clean dead tuples that might still be visible to an active transaction.

```sql
SELECT pid,
       state,
       backend_xmin,
       query_start,
       now() - query_start AS duration
FROM pg_stat_activity
WHERE state = 'idle in transaction'
  AND now() - query_start > interval '5 minutes';
```

Fix: Set idle_in_transaction_session_timeout:

```sql
ALTER SYSTEM SET idle_in_transaction_session_timeout = '10min';
SELECT pg_reload_conf();
```

Problem: Table bloat after bulk operations

After large UPDATE or DELETE operations, run a manual VACUUM ANALYZE:

```sql
VACUUM ANALYZE orders;
```

For extreme bloat, consider pg_repack (online table reorganization without ACCESS EXCLUSIVE lock) or VACUUM FULL during a maintenance window.

Monitoring Vacuum Health

Key metrics to watch:

  1. Dead tuples per table (n_dead_tup in pg_stat_user_tables)
  2. Time since last vacuum (last_autovacuum, last_vacuum)
  3. Autovacuum worker count vs autovacuum_max_workers
  4. Transaction ID age (age(datfrozenxid))
  5. Table bloat ratio (estimated via pgstattuple or statistical estimation)

Set up alerts for:

  - Dead tuples exceeding 1M on any table
  - XID age exceeding 500M
  - No vacuum on a table with changes in the last 24 hours
  - Autovacuum workers consistently at max
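The worker-saturation check can be scripted against pg_stat_activity (a sketch; `backend_type = 'autovacuum worker'` is how autovacuum processes are labeled in PostgreSQL 10 and later):

```sql
-- if active_workers sits at max_workers most of the time,
-- autovacuum is saturated and falling behind
SELECT count(*) AS active_workers,
       current_setting('autovacuum_max_workers') AS max_workers
FROM pg_stat_activity
WHERE backend_type = 'autovacuum worker';
```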

Vacuum is not glamorous. It is not exciting. But it is the single most important maintenance operation in PostgreSQL. Neglect it, and your database will eventually stop accepting writes. Monitor it, and you will never have that problem.