Database Optimization: Indexing, Query Plans, Partitioning & Caching
A database is only as fast as the way you use it. This guide covers the core techniques for squeezing maximum performance out of any database, from index design to connection pooling.
Indexing Strategies
Indexes are the single most impactful performance lever. A missing index turns a 2 ms query into a 20 second full table scan.
B-Tree Indexes (the default)
B-trees are the workhorse index. They maintain sorted data and support equality, range queries, and prefix matching.
-- Equality lookup
CREATE INDEX idx_users_email ON users(email);
SELECT * FROM users WHERE email = 'alice@example.com'; -- index seek: O(log n)
-- Range query
CREATE INDEX idx_orders_date ON orders(created_at);
SELECT * FROM orders WHERE created_at > '2026-01-01'; -- index range scan
How B-trees work internally:
              ┌───────────────┐
              │   [50 | 100]  │ ← root (2 keys, 3 pointers)
              └──┬────┬────┬──┘
                 │    │    │
        ┌────────┘    │    └────────┐
        ▼             ▼             ▼
  ┌──────────┐  ┌──────────┐  ┌──────────┐
  │ [10|30]  │  │ [60|80]  │  │[120|150] │ ← internal nodes
  └─┬──┬──┬──┘  └─┬──┬──┬──┘  └─┬──┬──┬──┘
    │  │  │       │  │  │       │  │  │
    ▼  ▼  ▼       ▼  ▼  ▼       ▼  ▼  ▼
      [leaf pages with actual row pointers]
Pages are typically 8–16 KB. A 3-level B-tree can index billions of rows with only 3 disk reads per lookup.
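A quick back-of-the-envelope check of that claim: reachable rows grow as fan-out raised to the tree depth. The fan-out figure below is an assumption for illustration; real fan-out depends on key size and page size.

```python
def btree_capacity(fanout: int, levels: int) -> int:
    """Max leaf entries reachable in `levels` page reads,
    given `fanout` child pointers per page."""
    return fanout ** levels

# Assuming ~1,000 pointers fit in an 8 KB page (small keys):
print(btree_capacity(1000, 3))  # 1000000000 — a billion rows in 3 page reads
```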
Composite (Multi-Column) Indexes
Column order matters enormously. The leftmost prefix rule applies:
CREATE INDEX idx_orders_multi ON orders(customer_id, status, created_at);
-- ✅ Uses index (leftmost prefix)
SELECT * FROM orders WHERE customer_id = 42;
SELECT * FROM orders WHERE customer_id = 42 AND status = 'shipped';
SELECT * FROM orders WHERE customer_id = 42 AND status = 'shipped' AND created_at > '2026-01-01';
-- ❌ Cannot use index (skips customer_id)
SELECT * FROM orders WHERE status = 'shipped';
SELECT * FROM orders WHERE created_at > '2026-01-01';
Rules for column ordering:
- Equality columns first — columns used with =
- Range column last — the column used with >, <, BETWEEN
- High selectivity first — columns that filter out the most rows
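The leftmost-prefix rule can be sketched as a toy check. This is purely illustrative; `usable_prefix` is a made-up helper, not a database API.

```python
def usable_prefix(index_cols, constrained_cols):
    """How many leading index columns a query can use: the planner walks
    the index left to right and stops at the first unconstrained column."""
    n = 0
    for col in index_cols:
        if col not in constrained_cols:
            break
        n += 1
    return n

idx = ["customer_id", "status", "created_at"]
assert usable_prefix(idx, {"customer_id", "status"}) == 2
assert usable_prefix(idx, {"status", "created_at"}) == 0  # skips customer_id
```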
Covering Indexes
An index that contains all columns needed by a query — the database never touches the table.
-- If your query is:
SELECT customer_id, status, total FROM orders WHERE customer_id = 42 AND status = 'shipped';
-- A covering index:
CREATE INDEX idx_orders_covering ON orders(customer_id, status) INCLUDE (total);
-- PostgreSQL INCLUDE avoids bloating the B-tree keys while still covering the query
This is an index-only scan — dramatically faster for read-heavy workloads.
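A toy model of an index-only scan, as pure illustration: a dict keyed by the indexed columns, with the INCLUDEd column stored in the entry, so the "heap" is never consulted.

```python
# Index entries for a covering index on (customer_id, status) INCLUDE (total).
covering_index = {
    (42, "shipped"): [{"total": 99.50}],
    (42, "pending"): [{"total": 10.00}],
}

def index_only_scan(customer_id, status):
    # Every requested column lives in the index entry — no table access needed.
    return [
        {"customer_id": customer_id, "status": status, "total": e["total"]}
        for e in covering_index.get((customer_id, status), [])
    ]

assert index_only_scan(42, "shipped") == [
    {"customer_id": 42, "status": "shipped", "total": 99.50}
]
```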
Hash Indexes
O(1) exact-match lookups, but no range queries, no ordering, no prefix matching.
-- PostgreSQL hash index
CREATE INDEX idx_sessions_token ON sessions USING hash(token);
-- Only useful for: WHERE token = 'abc123'
Use case: session tokens, UUID lookups, exact-match-only patterns.
GIN & GiST Indexes (PostgreSQL)
| Type | For | Example |
|---|---|---|
| GIN (Generalized Inverted Index) | Arrays, JSONB, full-text search | CREATE INDEX ON docs USING gin(metadata jsonb_path_ops) |
| GiST (Generalized Search Tree) | Geometric data, ranges, nearest-neighbor | CREATE INDEX ON places USING gist(location) |
| BRIN (Block Range Index) | Naturally ordered data (timestamps) | CREATE INDEX ON events USING brin(created_at) — tiny index, huge table |
Partial Indexes
Index only the rows that matter:
-- Only index active orders (80% of queries, 5% of rows)
CREATE INDEX idx_active_orders ON orders(customer_id, created_at)
WHERE status != 'archived';
Smaller index → faster lookups, less storage, faster writes.
Index Anti-Patterns
| Anti-pattern | Problem | Fix |
|---|---|---|
| Index on every column | Write amplification, storage bloat | Index based on actual query patterns |
| Missing composite index | Several single-column indexes are usually slower than one composite index | Create one multi-column index |
| Wrong column order | Index exists but can't be used | Equality first, range last |
| Index on low-cardinality column | WHERE gender = 'M' still scans 50% of rows | Only index selective columns or use partial indexes |
| Never analyzing | Stale statistics → bad query plans | Run ANALYZE regularly (PostgreSQL's autovacuum handles this automatically) |
Reading Query Plans
If you can read an EXPLAIN output, you can fix 90% of slow queries.
PostgreSQL EXPLAIN
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT o.id, c.name, o.total
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.status = 'pending'
AND o.created_at > '2026-01-01';
Nested Loop  (cost=0.85..124.50 rows=15 width=52) (actual time=0.05..0.30 rows=12 loops=1)
  Buffers: shared hit=48
  ->  Index Scan using idx_orders_status_date on orders o  (cost=0.43..62.15 rows=15 width=20)
        Index Cond: ((status = 'pending') AND (created_at > '2026-01-01'))
        Buffers: shared hit=12
  ->  Index Scan using customers_pkey on customers c  (cost=0.42..4.15 rows=1 width=36)
        Index Cond: (id = o.customer_id)
        Buffers: shared hit=36
Planning Time: 0.15 ms
Execution Time: 0.35 ms
Key things to look for
| What you see | What it means | Action |
|---|---|---|
| Seq Scan on a large table | Full table scan, no index used | Add an index or check if one exists but isn't being used |
| actual rows ≫ rows (estimated) | Statistics are stale | Run ANALYZE tablename |
| Nested Loop with high outer rows | O(n×m) join | Consider hash join (increase work_mem) or add index on join column |
| Sort with high cost | Sorting in memory/disk | Add an index that provides the order, or increase work_mem |
| Buffers: shared read (not hit) | Data not in cache, disk I/O | Increase shared_buffers or investigate if working set fits in RAM |
| Rows Removed by Filter ≫ actual rows | Index is too broad | Use a more selective index or add conditions to index |
MySQL EXPLAIN
EXPLAIN FORMAT=TREE
SELECT * FROM orders WHERE customer_id = 42 AND status = 'pending';
Key columns: type (ALL = full scan, ref = index, const = PK lookup), key (which index), rows (estimated), Extra (Using filesort, Using temporary = red flags).
Partitioning
Split large tables into smaller, more manageable pieces.
Horizontal Partitioning
Each partition contains a subset of rows. (Sharding applies the same idea across separate servers; PostgreSQL's declarative partitioning keeps every partition inside one database.)
-- PostgreSQL declarative partitioning
CREATE TABLE events (
    id BIGSERIAL,
    created_at TIMESTAMPTZ NOT NULL,
    event_type TEXT,
    payload JSONB
) PARTITION BY RANGE (created_at);
CREATE TABLE events_2026_q1 PARTITION OF events
FOR VALUES FROM ('2026-01-01') TO ('2026-04-01');
CREATE TABLE events_2026_q2 PARTITION OF events
FOR VALUES FROM ('2026-04-01') TO ('2026-07-01');
-- Queries automatically route to the right partition
SELECT * FROM events WHERE created_at = '2026-02-15'; -- only scans events_2026_q1
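Partition pruning amounts to a bounds check against each partition's range. A minimal sketch, with a hypothetical routing table mirroring the two partitions above:

```python
from datetime import date

PARTITIONS = {  # name → [from, to) bounds, as in the DDL above
    "events_2026_q1": (date(2026, 1, 1), date(2026, 4, 1)),
    "events_2026_q2": (date(2026, 4, 1), date(2026, 7, 1)),
}

def prune(created_at):
    """Partitions whose range can contain the value — the rest are skipped."""
    return [n for n, (lo, hi) in PARTITIONS.items() if lo <= created_at < hi]

assert prune(date(2026, 2, 15)) == ["events_2026_q1"]
```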
Partition strategies
| Strategy | How it works | Best for |
|---|---|---|
| Range | Partition by value ranges (dates, IDs) | Time-series data, log tables |
| List | Partition by discrete values | Multi-tenant (by tenant_id), by region |
| Hash | Partition by hash of column | Even distribution when no natural range |
When to partition
- Table exceeds hundreds of millions of rows
- Queries always filter on the partition key (partition pruning works)
- You need to drop old data cheaply (drop partition instead of DELETE)
- Maintenance operations (VACUUM, REINDEX) on the full table are too slow
When NOT to partition
- Table has fewer than ~10M rows (index is sufficient)
- Queries don't filter on the partition key (all partitions scanned = worse than no partitioning)
- You'd need to query across all partitions frequently
Connection Pooling
Opening a database connection is expensive (TCP handshake, TLS, authentication, process creation in PostgreSQL). Connection pooling reuses existing connections.
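The core idea can be sketched as a fixed-size queue of reusable connections. This is a toy, not a real pooler; `fake_connect` stands in for an expensive connect call.

```python
import queue

class Pool:
    """Toy connection pool: `size` connections are created once and reused."""
    def __init__(self, connect, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(connect())

    def acquire(self):
        return self._idle.get()   # blocks if all connections are checked out

    def release(self, conn):
        self._idle.put(conn)

opened = 0
def fake_connect():               # stands in for TCP + TLS + auth
    global opened
    opened += 1
    return f"conn-{opened}"

pool = Pool(fake_connect, size=2)
for _ in range(100):              # 100 acquire/release cycles...
    c = pool.acquire()
    pool.release(c)
assert opened == 2                # ...but only 2 real connections ever opened
```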
PgBouncer (PostgreSQL)
; pgbouncer.ini
[databases]
mydb = host=127.0.0.1 dbname=mydb
[pgbouncer]
listen_port = 6432
pool_mode = transaction ; release connection after each transaction
max_client_conn = 1000 ; clients can open 1000 connections
default_pool_size = 20 ; only 20 actual PostgreSQL connections
| Pool mode | Connection released | Use case |
|---|---|---|
| session | When client disconnects | Legacy apps, LISTEN/NOTIFY |
| transaction | After each transaction commits | Most applications (recommended) |
| statement | After each statement | Only for simple autocommit workloads |
Application-level pooling
Most ORMs and drivers support pooling:
# SQLAlchemy connection pool
engine = create_engine(
    "postgresql://user:pass@localhost/db",
    pool_size=20,        # steady-state connections
    max_overflow=10,     # burst connections above pool_size
    pool_timeout=30,     # wait before raising error
    pool_recycle=1800,   # recycle connections after 30 min
)
Serverless gotcha
Lambda/Cloud Functions can create thousands of connections. Solutions:
- RDS Proxy (AWS) — managed connection pooler for RDS/Aurora
- PgBouncer on ECS — self-hosted pooler
- Neon / Supabase — built-in connection pooling
- Prisma Data Proxy — application-level pooling for serverless
Caching Strategies
Cache-Aside (Lazy Loading)
def get_user(user_id):
    # 1. Check cache
    cached = redis.get(f"user:{user_id}")
    if cached:
        return json.loads(cached)

    # 2. Cache miss → query database
    user = db.query("SELECT * FROM users WHERE id = %s", user_id)

    # 3. Populate cache
    redis.setex(f"user:{user_id}", 3600, json.dumps(user))  # TTL 1 hour
    return user
Pros: Only caches what's actually read. Cache failure doesn't break the app. Cons: First request is always a miss. Stale data until TTL expires.
Write-Through
def update_user(user_id, data):
    # 1. Write to database
    db.execute("UPDATE users SET ... WHERE id = %s", user_id)
    # 2. Immediately update cache
    redis.setex(f"user:{user_id}", 3600, json.dumps(data))
Pros: Cache is always up to date. Cons: Writes are slower (two writes). Cache may hold data that's never read.
Write-Behind (Write-Back)
Write to cache first, asynchronously write to database later. Maximum write performance but risk of data loss if cache crashes before flush.
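A toy write-behind cache, as an illustrative sketch: an in-memory dict is the "cache", a background thread performs the deferred durable write, and the loss window is exactly the gap before the flush.

```python
import queue
import threading

class WriteBehindCache:
    """Toy write-behind: writes land in the cache immediately and are
    flushed to the 'database' later by a background worker. Writes still
    queued when the process dies are lost — the risk noted above."""
    def __init__(self, db):
        self.cache = {}
        self.db = db
        self._q = queue.Queue()
        threading.Thread(target=self._flush, daemon=True).start()

    def set(self, key, value):
        self.cache[key] = value      # fast path: memory only
        self._q.put((key, value))    # defer the durable write

    def _flush(self):
        while True:
            key, value = self._q.get()
            self.db[key] = value     # slow, durable write happens later
            self._q.task_done()

db = {}
wb = WriteBehindCache(db)
wb.set("user:42", "alice")
assert wb.cache["user:42"] == "alice"  # visible immediately
wb._q.join()                           # wait for the async flush
assert db["user:42"] == "alice"        # now durable
```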
Cache Invalidation Patterns
| Pattern | How | Trade-off |
|---|---|---|
| TTL | Expire after fixed time | Simple but stale for up to TTL duration |
| Event-driven | Invalidate on write events (CDC, pub/sub) | Fresh, but complex to implement |
| Version tag | Append version to cache key | No stale reads, but requires version tracking |
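The version-tag pattern from the table can be sketched in a few lines (hypothetical key scheme; real systems often keep the version counter itself in the cache):

```python
# Version-tag invalidation: bump a per-entity version on every write;
# readers build cache keys that include the version, so stale entries
# are simply never read again (they age out of the cache).
versions = {}
cache = {}

def write_user(user_id, data):
    versions[user_id] = versions.get(user_id, 0) + 1
    # ...durable write of `data` to the database would go here...

def cache_key(user_id):
    return f"user:{user_id}:v{versions.get(user_id, 0)}"

write_user(42, {"name": "alice"})
k1 = cache_key(42)
cache[k1] = {"name": "alice"}
write_user(42, {"name": "bob"})   # version bump invalidates k1
assert cache_key(42) != k1        # readers now look up a fresh key
```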
"There are only two hard things in Computer Science: cache invalidation and naming things." — Phil Karlton
Query Optimization Techniques
Avoid SELECT *
-- ❌ Fetches all columns, can't use covering index
SELECT * FROM orders WHERE customer_id = 42;
-- ✅ Fetch only needed columns
SELECT id, total, status FROM orders WHERE customer_id = 42;
Use EXISTS instead of COUNT for existence checks
-- ❌ Counts all matching rows
SELECT COUNT(*) > 0 FROM orders WHERE customer_id = 42 AND status = 'pending';
-- ✅ Stops at first match
SELECT EXISTS(SELECT 1 FROM orders WHERE customer_id = 42 AND status = 'pending');
Batch operations
# ❌ N+1 queries
for user_id in user_ids:
    db.query("SELECT * FROM users WHERE id = %s", user_id)

# ✅ Single query
db.query("SELECT * FROM users WHERE id = ANY(%s)", user_ids)
Use CTEs for readability, subqueries for performance
In PostgreSQL < 12, CTEs are optimization fences: the planner can't push predicates into them. In PostgreSQL 12+, non-recursive CTEs referenced only once are inlined by default. Either way, verify with the query plan.
Pagination: keyset vs OFFSET
-- ❌ OFFSET: database reads and discards N rows
SELECT * FROM orders ORDER BY id LIMIT 20 OFFSET 10000; -- reads 10,020 rows
-- ✅ Keyset: seek directly to position
SELECT * FROM orders WHERE id > 10000 ORDER BY id LIMIT 20; -- reads 20 rows
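The difference can be mimicked over an in-memory list (illustration only): OFFSET reads and discards rows up to the offset, while keyset seeks past the last key seen.

```python
ids = list(range(1, 101))  # stand-in for orders ordered by id

def offset_page(offset, limit=20):
    # The database still had to *read* `offset` rows before discarding them.
    return ids[offset:offset + limit]

def keyset_page(last_id, limit=20):
    # Seek directly past last_id, then read only `limit` rows.
    return [i for i in ids if i > last_id][:limit]

assert offset_page(40) == keyset_page(40)  # same page, very different cost
```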
Replication Topologies
Primary-Replica (Read Replicas)
 Writes                      Reads
   │               ┌──────────────┐
   ▼               │              ▼
┌─────────┐        │       ┌──────────────┐
│ Primary │────────┘       │  Replica 1   │
│         │────┐           └──────────────┘
└─────────┘    │           ┌──────────────┐
               └──────────►│  Replica 2   │
                           └──────────────┘
Sync replication: Primary waits for replica ACK → zero data loss, higher latency. Async replication: Primary doesn't wait → lower latency, possible data loss on failover.
Multi-Primary
Both nodes accept writes. Requires conflict resolution (last-writer-wins, CRDTs, application-level merge).
Used by: Galera Cluster (MySQL), BDR (PostgreSQL), CockroachDB, DynamoDB Global Tables.
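Last-writer-wins, the simplest of those strategies, can be sketched in a few lines. Each replica's version of a key carries a timestamp; ties and clock skew are the well-known weakness.

```python
def lww_merge(a, b):
    """a, b: (value, timestamp) versions of one key from two primaries.
    The later write wins; the earlier one is silently discarded."""
    return a if a[1] >= b[1] else b

# Node A wrote at t=100, node B at t=200 → B's write survives, A's is lost.
assert lww_merge(("alice", 100), ("bob", 200)) == ("bob", 200)
```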
Vacuum & Maintenance (PostgreSQL)
PostgreSQL's MVCC creates dead tuples on UPDATE/DELETE. VACUUM reclaims space.
-- Check bloat
SELECT schemaname, tablename, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
Autovacuum tuning
-- For a high-write table, make autovacuum more aggressive
ALTER TABLE events SET (
autovacuum_vacuum_scale_factor = 0.01, -- trigger at 1% dead tuples (default 20%)
autovacuum_vacuum_cost_delay = 2 -- throttle less (default was 20ms before PostgreSQL 12, 2ms since)
);
pg_stat_statements — find slow queries
-- Top 10 queries by total time
SELECT query, calls, mean_exec_time, total_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
Performance Checklist
| Area | Check | Tool |
|---|---|---|
| Indexes | All WHERE/JOIN columns indexed | EXPLAIN ANALYZE, pg_stat_user_indexes |
| Query plans | No unexpected Seq Scans | EXPLAIN (ANALYZE, BUFFERS) |
| N+1 queries | ORMs not generating per-row queries | Application logs, pg_stat_statements |
| Connection pool | Pool size matches workload | PgBouncer stats, connection count monitoring |
| Cache hit ratio | > 99 % for OLTP | SELECT ... FROM pg_stat_database |
| Bloat | Dead tuples under control | pg_stat_user_tables, pgstattuple |
| Lock contention | No long-running transactions blocking others | pg_stat_activity, pg_locks |
| Memory | shared_buffers ≈ 25 % RAM, work_mem tuned | PostgreSQL config |
Resources
- Use The Index, Luke — The definitive SQL indexing guide
- PostgreSQL EXPLAIN Visualizer — Paste your EXPLAIN output for visual analysis
- pgMustard — AI-powered query plan analysis
- The Art of PostgreSQL — Advanced PostgreSQL techniques
- High Performance MySQL — 4th edition (2022)
- Database Reliability Engineering — O'Reilly DRE book