How Database Indexes Actually Speed Up Reads (And When They Don’t)
Indexes are the most misunderstood performance tool in databases. Everyone knows they help reads. Fewer people know when they do nothing or even make things worse.
This article is a clean mental model for how indexes actually work and how to reason about them.
The Core Idea
An index is an ordered data structure that helps the database avoid scanning the entire table. Most relational databases use B+ trees, which make range scans efficient.
If your query can use an index, the database reads only the relevant part of the tree instead of scanning rows linearly.
B+ Tree in One Paragraph
- Keys are sorted.
- Internal nodes guide you to leaf nodes.
- Leaves store sorted key ranges and point to row locations.
This means:
- Point lookups are $O(\log n)$.
- Range scans are sequential across leaves.
Selectivity: The Hidden Rule
Indexes only help if they reduce the number of rows scanned.
If a column is low-selectivity (like a boolean), the query planner may ignore the index because scanning many rows is not worth the overhead.
When Indexes Hurt
- High write volume: every insert or update must update the index.
- Low selectivity: scanning the index can cost more than a table scan.
- Wrong order: multi-column indexes only help if the query matches the prefix.
Example:
Index on (country, city) helps:
- WHERE country = ‘IN’
- WHERE country = ‘IN’ AND city = ‘Delhi’
But not:
- WHERE city = ‘Delhi’
Covering Indexes (Fastest Case)
If all columns in the query exist inside the index, the database can respond without touching the base table. These are covering indexes, and they are often the fastest path.
A Simple Index Tuning Flow
- Identify slow queries.
- Check
EXPLAINplan. - Add or adjust indexes based on selectivity.
- Re-check write overhead.
Final Thought
Indexes are not magic. They are a trade-off: faster reads, slower writes. The key is to measure where your system hurts and tune accordingly.
Related Articles
Implementation and Performance Comparison of Sequential and Parallel Merge Sort Does Parallel Merge Sort Really Win? Implementing and Comparing from Scratch
A from-scratch comparison of sequential vs parallel merge sort, and why parallel doesn't always win.
Understanding the Tradeoff Between Reads and Writes in Databases and Why You Can’t Optimize Both at the Same Time
A clear explanation of the read/write tradeoff in databases and its impact on performance decisions.
Chess.com’s Authentication Flow — What’s Missing and How to Fix It
Exploring Chess.com's authentication system: what happens when email verification is missing, the security vulnerabilities it creates, and how to build a stronger authentication flow