Skip to main content
the invisible-layer how abstraction is making software engineers dumber

EXPLAIN ANALYZE: Reading the Database's Mind

8 min read Chapter 21 of 56
Summary

Line-by-line walkthrough of PostgreSQL EXPLAIN ANALYZE output, covering...

Line-by-line walkthrough of PostgreSQL EXPLAIN ANALYZE output, covering all major node types, how to spot performance problems from plan divergences, and a concrete before/after optimization from 2400ms to 3ms with index addition.

EXPLAIN ANALYZE: Reading the Database’s Mind

Every query you send to PostgreSQL gets a plan before it gets an answer. EXPLAIN shows you the plan. EXPLAIN ANALYZE runs the query and shows you the plan alongside what actually happened. The gap between those two — estimated versus actual — is where performance bugs live.

The Anatomy of a Plan

Take a realistic query: you’re building a dashboard that shows recent orders with customer details and total amounts.

EXPLAIN ANALYZE
SELECT o.id, o.created_at, o.total_amount,
       c.name, c.email
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE o.status = 'completed'
  AND o.created_at >= '2025-01-01'
ORDER BY o.created_at DESC
LIMIT 50;

Here is the full output on a database with 4.2 million orders and 380,000 customers:

Limit  (cost=0.87..178.32 rows=50 width=76) (actual time=2347.112..2401.883 rows=50 loops=1)
  ->  Nested Loop  (cost=0.87..528941.22 rows=149087 width=76) (actual time=2347.109..2401.862 rows=50 loops=1)
        ->  Index Scan Backward using idx_orders_created_at on orders o  (cost=0.43..389124.71 rows=149087 width=52) (actual time=2346.991..2401.672 rows=50 loops=1)
              Filter: ((status)::text = 'completed'::text)
              Rows Removed by Filter: 38741
        ->  Index Scan using customers_pkey on customers c  (cost=0.42..0.94 rows=1 width=32) (actual time=0.003..0.003 rows=1 loops=50)
              Index Cond: (id = o.customer_id)
Planning Time: 0.284 ms
Execution Time: 2401.952 ms

Read this bottom-up (that’s how PostgreSQL executes it) and inside-out (child nodes feed into parent nodes).

Line 3-5: The inner scan on orders. PostgreSQL uses Index Scan Backward on idx_orders_created_at — it’s walking the index in reverse order because you asked for ORDER BY created_at DESC. But notice: the index only covers created_at. The status = 'completed' condition is applied as a Filter after fetching each row from the index. This means PostgreSQL is pulling rows one by one in date order, checking if each is 'completed', and discarding the ones that aren’t.

Rows Removed by Filter: 38741 tells the story. PostgreSQL examined 38,791 rows (38,741 discarded + 50 kept) to find just 50 matches. That’s a 99.87% discard rate.

Line 6-7: The customer lookup. For each of the 50 matching orders, PostgreSQL does an index scan on customers_pkey (the primary key index). loops=50 confirms this ran 50 times. Each lookup takes 0.003ms — fast, because primary key lookups on indexed data are cheap.

Line 2: The Nested Loop join. This is the join strategy. For each row from the outer side (orders), look up the matching row on the inner side (customers). Nested Loop is the right choice when the outer side produces few rows. The planner estimated 149,087 rows from the orders scan; it actually only needed 50 because of the LIMIT. The estimate was off, but the plan still worked acceptably — it just took a bad path getting there.

Line 1: The Limit node. Stops execution after 50 rows are produced. This is critical — without it, PostgreSQL would scan all 149,087 matching orders.

The costs. (cost=0.87..178.32 rows=50 width=76) — the first number (0.87) is startup cost (work before the first row can be returned). The second (178.32) is total cost (estimated). These are arbitrary units based on seq_page_cost, random_page_cost, and other planner configuration values. They’re useful for comparing plans, not for predicting wall-clock time.

Node Types You Must Recognize

Seq Scan

Seq Scan on large_table  (cost=0.00..184320.00 rows=5000000 width=64)
  Filter: (category = 'electronics')

Reads the entire table sequentially. Not inherently bad — for small tables or queries returning most rows, it’s optimal. Red flag: seeing this on a large table with a selective filter.

Index Scan

Index Scan using idx_orders_customer on orders  (cost=0.43..8.45 rows=1 width=52)
  Index Cond: (customer_id = 12345)

Looks up the index to find matching rows, then fetches each row from the table (the “heap”). The Index Cond shows what part of the WHERE clause the index satisfied directly.

Index Only Scan

Index Only Scan using idx_orders_status_date on orders  (cost=0.43..4.21 rows=10 width=12)
  Index Cond: (status = 'pending')
  Heap Fetches: 0

The entire answer comes from the index — no table access needed. Heap Fetches: 0 confirms the visibility map was up to date (meaning VACUUM has run recently). If Heap Fetches is high, PostgreSQL had to check the table anyway to verify row visibility, which partially defeats the purpose.

Bitmap Heap Scan

Bitmap Heap Scan on orders  (cost=1247.33..48391.42 rows=52840 width=52)
  Recheck Cond: (created_at >= '2025-06-01')
  ->  Bitmap Index Scan on idx_orders_created_at  (cost=0.00..1234.12 rows=52840 width=0)
        Index Cond: (created_at >= '2025-06-01')

Two-phase operation. First, scan the index and build a bitmap of which table pages contain matching rows. Then read those pages in physical order. This bridges the gap between “too many rows for an Index Scan, too few for a Seq Scan.” The Recheck Cond exists because the bitmap tracks pages, not individual rows — PostgreSQL must re-verify each row on the page.

Join Types

Hash Join: Builds a hash table from the smaller table, then probes it with rows from the larger table. Good for large joins without useful indexes.

Hash Join  (cost=12.50..52841.50 rows=52840 width=84)
  Hash Cond: (o.customer_id = c.id)
  ->  Seq Scan on orders o  (...)
  ->  Hash  (cost=8.00..8.00 rows=380 width=32)
        ->  Seq Scan on customers c  (...)

Merge Join: Both inputs must be sorted on the join key. Walks through them in parallel. Efficient when both sides are pre-sorted (via indexes).

Nested Loop: For each row from the outer table, scan the inner table. Efficient when the outer side is small and the inner side has an index. Disastrous when both sides are large — you get O(n×m) behavior.

Spotting Problems

Estimated vs actual rows divergence. If the planner estimates 100 rows but 500,000 come back, every downstream decision is wrong. The planner might choose a Nested Loop expecting 100 iterations, but it runs 500,000 times instead. Fix: run ANALYZE on the table to refresh statistics, or check if you need to increase default_statistics_target for the column.

Filter with high discard rate. In our original example, Rows Removed by Filter: 38741 meant the index wasn’t covering the filter condition. The index on created_at helped with ordering but not with the status filter.

Sort overflowing to disk.

Sort  (cost=391247.33..394747.33 rows=1400000 width=52) (actual time=8341.22..9127.44 rows=1400000 loops=1)
  Sort Key: created_at DESC
  Sort Method: external merge  Disk: 89432kB

external merge Disk: 89432kB means the sort didn’t fit in work_mem and spilled to disk. You can increase work_mem for the session or add an index that provides pre-sorted output.

The Fix: From 2400ms to 3ms

Back to our problem query. The issue: the index on created_at alone forces PostgreSQL to scan rows in date order and discard those that aren’t 'completed'. We need an index that handles both conditions:

CREATE INDEX idx_orders_status_created
  ON orders (status, created_at DESC);

Now run the query again:

Limit  (cost=0.56..5.87 rows=50 width=76) (actual time=0.041..0.089 rows=50 loops=1)
  ->  Nested Loop  (cost=0.56..15847.22 rows=149087 width=76) (actual time=0.039..0.082 rows=50 loops=1)
        ->  Index Scan Backward using idx_orders_status_created on orders o  (cost=0.43..12083.41 rows=149087 width=52) (actual time=0.027..0.042 rows=50 loops=1)
              Index Cond: ((status)::text = 'completed'::text)
        ->  Index Scan using customers_pkey on customers c  (cost=0.42..0.94 rows=1 width=32) (actual time=0.002..0.002 rows=1 loops=50)
              Index Cond: (id = o.customer_id)
Planning Time: 0.311 ms
Execution Time: 0.127 ms

The Filter line is gone. Index Cond now includes the status check — the index handles both filtering and ordering natively. PostgreSQL walks directly to the first 'completed' entry in the index (already sorted by created_at DESC), reads 50 entries, joins each to the customer, and stops. Zero wasted rows. 0.127ms total.

The improvement: 2401ms → 0.127ms. That’s a 19,000x speedup from one CREATE INDEX statement.

EXPLAIN vs EXPLAIN ANALYZE

EXPLAIN shows the planned execution without running the query. Use it when:

  • The query modifies data (UPDATE, DELETE) and you don’t want side effects
  • You want a quick estimate without waiting for a slow query to finish

EXPLAIN ANALYZE runs the query and shows actual timings. Use it when:

  • You need to see real row counts vs estimates
  • You’re diagnosing a specific performance problem
  • You want to measure the impact of an index change

For write queries, wrap them in a transaction you roll back:

BEGIN;
EXPLAIN ANALYZE DELETE FROM orders WHERE created_at < '2020-01-01';
ROLLBACK;

Finding Slow Queries: pg_stat_statements

You don’t always know which queries are slow. pg_stat_statements is a PostgreSQL extension that tracks execution statistics for every query:

CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

SELECT query,
       calls,
       mean_exec_time,
       total_exec_time,
       rows
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 20;

This shows the 20 slowest queries by average execution time. The query column shows a normalized version (with literal values replaced by $1, $2, etc.), so identical queries with different parameters are grouped together.

Sort by total_exec_time instead to find queries that are individually fast but called so often they dominate total database load. A 2ms query called 500,000 times per hour costs more than a 5-second query called once.

This is how you find the queries worth optimizing. Not by guessing, not by reading ORM code and imagining what SQL it generates — by measuring what the database actually spent time on. Then you take those queries, run EXPLAIN ANALYZE, and now you know exactly where every millisecond goes.

The database tells you everything. You just have to ask.