How to Optimize SQL Queries in High-Traffic Applications



This content originally appeared on DEV Community and was authored by Gaurav

(A practical, step‑by‑step guide that works across MySQL, PostgreSQL, SQL Server, and Oracle)

1. Why “SQL‑only” Optimisation Matters

High‑traffic apps often hit the database thousands to millions of times per second. Even a 1 ms improvement per query can save seconds of CPU time, reduce latency, and keep your infrastructure costs down. The biggest gains come from making the database do less work, not just from adding more hardware.

2. The Optimization Lifecycle

Phase | Goal | Typical Tools / Artefacts
A. Baseline | Capture realistic workloads, identify hot queries. | Slow-query log, pg_stat_statements, Query Store (SQL Server), AWR (Oracle), application logs.
B. Diagnose | Understand why a query is slow. | EXPLAIN / EXPLAIN ANALYZE, visual explain plans, SHOW PROFILE, DMVs.
C. Refactor | Rewrite / index / restructure to eliminate bottlenecks. | DDL changes, query rewrites, materialized views, partitioning.
D. Validate | Verify that performance improved without regressions. | Same metrics as the baseline, regression test suite, load-testing tools (JMeter, k6).
E. Guard | Prevent regressions in the future. | CI checks, automated plan-stability tests, query-cost thresholds, monitoring alerts.
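
For Phase A, the baseline can usually be captured with built-in instrumentation. A minimal sketch for PostgreSQL and MySQL (the 100 ms threshold is illustrative):

-- PostgreSQL: enable per-statement statistics
-- (also requires shared_preload_libraries = 'pg_stat_statements' and a restart)
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- MySQL: log every statement slower than 100 ms
SET GLOBAL slow_query_log  = 1;
SET GLOBAL long_query_time = 0.1;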

3. Foundations – “First‑Principles” Checks

# | Checklist | Why it matters
1 | Use the right data types – avoid VARCHAR(255) for a 2-digit code; prefer CHAR(2) or SMALLINT. | Smaller rows → more rows fit in memory → fewer page reads.
2 | Never SELECT * in production – enumerate columns. | Reduces I/O, avoids unnecessary row-locking and cache pressure.
3 | Keep transactions short – commit as soon as possible. | Reduces lock contention and deadlock probability.
4 | Prefer set-based operations over row-by-row loops (cursors, WHILE). | Database engines are built for set processing; loops cause massive overhead.
5 | Avoid functions on indexed columns in WHERE/JOIN (e.g., WHERE DATE(col) = ...). | The index cannot be used → full scan. Use computed/derived columns or range predicates instead (see the sketch after this checklist).
6 | Parameterise queries (prepared statements) – never string concatenation. | Enables plan reuse, prevents SQL injection, reduces parsing overhead.
7 | Watch out for implicit conversions – WHERE int_col = '123' forces a scan. | The implicit cast on the column side defeats the index.
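
As a sketch for items 5 and 7 (table and column names assumed for illustration), rewrite non-sargable predicates so the existing index on the column can be used:

-- Before: function on the indexed column plus an implicit string-to-int cast
SELECT * FROM orders
WHERE  DATE(created_at) = '2025-06-01'
  AND  customer_id = '123';

-- After: half-open date range and matching types – both predicates are index-friendly
SELECT id, amount FROM orders
WHERE  created_at >= '2025-06-01'
  AND  created_at <  '2025-06-02'
  AND  customer_id = 123;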

4. Indexing – The Most Powerful Lever

4.1. Types of Indexes (quick reference)

DBMS | B-Tree | Hash | GiST / SP-GiST | BRIN (or equivalent) | Columnstore | Full-Text
MySQL | ✅ | ✅ (Memory engine) | — | — | ✅ (MariaDB ColumnStore) | ✅ (InnoDB, 5.6+)
PostgreSQL | ✅ | ✅ | ✅ | ✅ | ✅ (cstore_fdw / Citus columnar) | ✅ (GIN + tsvector)
SQL Server | ✅ | ✅ (Memory-Optimized) | ✅ (spatial) | ✅ (approx., via columnstore segment metadata) | ✅ (Columnstore) | ✅
Oracle | ✅ | ✅ (In-Memory) | ✅ (Spatial) | ✅ (approx., via zone maps) | ✅ (Hybrid Columnar) | ✅

4.2. Index Design Process

  1. Identify the hot predicates – the columns that appear in WHERE, JOIN ON, ORDER BY, GROUP BY.
  2. Check cardinality – high cardinality (many distinct values) = good for B‑Tree; low cardinality may benefit from bitmap (PostgreSQL BRIN/GIN, Oracle bitmap).
  3. Covering (INCLUDE) columns – add non‑key columns to an index so the query can be satisfied entirely from the index (SQL Server INCLUDE, PostgreSQL 11+ INCLUDE, MySQL covering via a composite index that contains every column the query reads).
  4. Composite index ordering – order columns by filtering power first, then by sorting/grouping.
    • Example: WHERE status = ? AND created_at > ? ORDER BY created_at DESC → index (status, created_at DESC) (see the sketch after this list).
  5. Avoid over‑indexing – every extra index adds write overhead to INSERT/UPDATE/DELETE. For OLTP workloads, keep only the indexes that demonstrably pay for themselves in reads.
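
As referenced in steps 3–4, a sketch of a covering composite index for that predicate (PostgreSQL 11+ INCLUDE syntax; table and column names assumed):

CREATE INDEX idx_orders_status_created
    ON orders (status, created_at DESC)
    INCLUDE (amount);   -- non-key column so the query below can be answered from the index alone

-- Filter, ordering, and selected columns are all served by the index
SELECT created_at, amount
FROM   orders
WHERE  status = 'paid'
  AND  created_at > now() - interval '7 days'
ORDER  BY created_at DESC;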

4.3. Practical Index Recipes

Scenario | Recommended Index (MySQL/PostgreSQL) | Why
Exact lookup on user_id (PK) | PRIMARY KEY (user_id) | Already optimal.
Range + filter: WHERE country = ? AND ts >= ? | (country, ts) | Equality filter on country first, then the range on ts.
Sorting: WHERE status = ? ORDER BY created_at DESC LIMIT 20 | (status, created_at DESC) | The index serves both the filter and the ordering; LIMIT stops early.
Join: ON o.customer_id = c.id AND c.region = ? | customers (region, id) | Region filter first, then the join column.
Full-text search on title | FULLTEXT(title) (MySQL) / GIN(to_tsvector('english', title)) (PostgreSQL) | Enables an inverted index.
Very large table (> 100 M rows), queries on recent data | BRIN on the timestamp column (ts) | BRIN stores min/max per block range → fast for "newest N rows".
Frequent aggregates: GROUP BY product_id | B-Tree index on product_id (PostgreSQL) or clustering on product_id (SQL Server) | An ordered scan lets the planner use streaming (group) aggregation instead of sorting or hashing the whole table.
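
Two of the recipes above spelled out for PostgreSQL, as a sketch (index and table names hypothetical):

-- Full-text search on title (inverted index)
CREATE INDEX idx_articles_title_fts
    ON articles USING gin (to_tsvector('english', title));

SELECT id, title
FROM   articles
WHERE  to_tsvector('english', title) @@ plainto_tsquery('english', 'query optimization');

-- "Recent data" queries on a very large, append-mostly table
CREATE INDEX idx_events_ts_brin
    ON events USING brin (ts);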

4.4. Maintaining Index Health

Action | How to do it
Detect unused indexes | MySQL sys.schema_unused_indexes (or performance_schema.table_io_waits_summary_by_index_usage); PostgreSQL pg_stat_user_indexes; SQL Server sys.dm_db_index_usage_stats.
Rebuild / reorganise | MySQL OPTIMIZE TABLE; PostgreSQL REINDEX or VACUUM (FULL, ANALYZE); SQL Server ALTER INDEX REBUILD; Oracle ALTER INDEX REBUILD.
Update statistics | MySQL ANALYZE TABLE; PostgreSQL ANALYZE; SQL Server UPDATE STATISTICS; Oracle DBMS_STATS.GATHER_TABLE_STATS.
Monitor index bloat | PostgreSQL pgstattuple; MySQL SHOW TABLE STATUS; SQL Server sys.dm_db_index_physical_stats.
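
A sketch of the "detect unused indexes" row for PostgreSQL (interpret with care: a recent statistics reset, or rarely used but critical indexes, can both mislead):

-- Indexes never used since the statistics were last reset, largest first
SELECT schemaname,
       relname      AS table_name,
       indexrelname AS index_name,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM   pg_stat_user_indexes
WHERE  idx_scan = 0
ORDER  BY pg_relation_size(indexrelid) DESC;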

5. Query‑Plan Analysis – Reading EXPLAIN

5.1. Common Plan Nodes & Their Cost

Node | Typical Meaning | Red Flag / Advice
Seq Scan / Table Scan | Full table read. | On large tables this usually means a missing or unusable index.
Index Scan | Reads index entries (may still fetch table rows). | OK if selective.
Index Only Scan (PostgreSQL) | The index satisfies the query completely. | Ideal!
Bitmap Index Scan → Bitmap Heap Scan | Combines several indexes; may be slower than one good index. | Consider a composite index.
Nested Loop Join | For each row from the outer side, probe the inner side. | Good for a small outer side; otherwise consider a Hash or Merge join.
Hash Join | Builds a hash table on the inner side. | Works well for large, unsorted inputs.
Merge Join | Both sides must be sorted (often via an index). | Efficient if both inputs are already ordered.
Sort | Explicit sort operation. | Try to push the ordering into an index.
Aggregate | Grouping/aggregation. | If GROUP BY matches the index order → GroupAggregate (faster).
Subquery / CTE materialisation | Temporary result set. | May be avoidable with a rewrite.

5.2. Example: MySQL EXPLAIN

EXPLAIN FORMAT=JSON
SELECT  o.id, o.amount, c.name
FROM    orders o
JOIN    customers c ON c.id = o.customer_id
WHERE   o.status = 'paid'
  AND   o.created_at >= '2025-01-01'
ORDER BY o.created_at DESC
LIMIT 50;

Result (simplified):

{
  "query_block": {
    "select_id": 1,
    "cost_info": {"query_cost":"319.50"},
    "table": {
      "table_name":"orders",
      "access_type":"range",
      "possible_keys":["idx_status_created"],
      "key":"idx_status_created",
      "used_key_parts":["status","created_at"],
      "rows_examined_per_scan": 120,
      "filtered": 85.0,
      "attached_condition":"(`orders`.`status` = 'paid')"
    },
    "inner join":{...}
}

Interpretation

  • range on idx_status_created → good (uses composite index).
  • rows_examined_per_scan low → index selective.
  • No separate sort node → ordering satisfied by index (created_at DESC).

If you see type: ALL (full scan) or a separate sort step, you likely need a better index or query rewrite.
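
For reference, the composite index assumed by this plan could be created as follows (a sketch; MySQL 8.0 can scan an ascending index backward to satisfy the DESC ordering, or the key part can be declared DESC explicitly):

-- Matches the plan above
CREATE INDEX idx_status_created ON orders (status, created_at);
-- or, MySQL 8.0+: CREATE INDEX idx_status_created ON orders (status, created_at DESC);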

5.3. PostgreSQL EXPLAIN ANALYZE

EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)
SELECT *
FROM   orders
WHERE  status = 'paid'
  AND  created_at >= CURRENT_DATE
ORDER  BY created_at DESC
LIMIT  100;

Typical output snippet:

{
  "Plan": {
    "Node Type": "Index Scan Backward",
    "Relation Name": "orders",
    "Index Name": "idx_status_created",
    "Index Cond": "(status = 'paid'::text)",
    "Filter": "(created_at >= now())",
    "Rows Removed by Filter": 0,
    "Buffers": {"Shared Hit Blocks": 75, "Shared Read Blocks": 0},
    "Actual Total Time": 0.78,
    "Actual Loops": 1
  }
}

Key take‑aways

  • Index Scan Backward = using index for descending order → no sort needed.
  • Buffers shows almost all data served from cache → good.

If you see Seq Scan instead, add a composite index or rewrite the predicate.
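
If 'paid' is the dominant filter value, one option in PostgreSQL is a partial index that also matches the descending scan (a sketch; the index name is hypothetical):

CREATE INDEX idx_orders_paid_created
    ON orders (created_at DESC)
    WHERE status = 'paid';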

6. Advanced Physical Design

6.1. Partitioning (Horizontal Partitioning Within One Database)

DBMS | Partitioning Styles | When to use
MySQL 8+ | Range, List, Hash, Key, Composite | 10 M+ rows, time-series, multi-tenant data.
PostgreSQL 11+ | Declarative range/list/hash partitions, sub-partitioning | Same as MySQL, plus very large tables (> 500 M rows).
SQL Server | Partitioned tables & indexes (range); can be on any column | OLTP with massive data volumes that need partition pruning.
Oracle | Range/List/Hash partitions; interval partitioning | Enterprise data warehouses, multi-year history.

Best practice: Partition on a column that appears in most range predicates (e.g., created_at). Keep the partition key immutable (no updates).

-- MySQL example (the partitioning column must be part of every unique key,
-- including the primary key)
CREATE TABLE orders (
  id BIGINT NOT NULL,
  status ENUM('new','paid','canceled') NOT NULL,
  created_at DATETIME NOT NULL,
  ...,
  PRIMARY KEY (id, created_at)
) PARTITION BY RANGE (YEAR(created_at)) (
  PARTITION p2023 VALUES LESS THAN (2024),
  PARTITION p2024 VALUES LESS THAN (2025),
  PARTITION pmax  VALUES LESS THAN MAXVALUE
);
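
For comparison, a PostgreSQL declarative-partitioning sketch of the same idea (PostgreSQL 11+; here too the primary key must contain the partition column, and the column list is abbreviated):

-- PostgreSQL example (sketch)
CREATE TABLE orders (
  id         BIGINT      NOT NULL,
  status     TEXT        NOT NULL,
  created_at TIMESTAMPTZ NOT NULL,
  PRIMARY KEY (id, created_at)
) PARTITION BY RANGE (created_at);

CREATE TABLE orders_2024 PARTITION OF orders
  FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');
CREATE TABLE orders_2025 PARTITION OF orders
  FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');
CREATE TABLE orders_default PARTITION OF orders DEFAULT;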

6.2. Sharding (Application‑Level Horizontal Scaling)

Technique | When to consider
Key-based sharding (e.g., user_id % N) | Write rates beyond what a single primary can sustain.
Directory-based sharding (lookup table) | Uneven data distribution, need for flexibility.
Consistent hashing (e.g., via Vitess, Citus, CockroachDB) | Need to add/remove shards with minimal rebalancing.
Hybrid (partition + shard) | Multi-tenant SaaS with per-tenant isolation.

Implementation tip: Keep the shard key in every table that needs to be joined (e.g., tenant_id). This allows joins to stay within a single shard.
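
A minimal sketch of that tip, assuming tenant_id is the shard key (table layout hypothetical): because tenant_id is part of every key and every join predicate, the whole query can be routed to a single shard.

CREATE TABLE customers (
  tenant_id BIGINT NOT NULL,
  id        BIGINT NOT NULL,
  region    TEXT   NOT NULL,
  PRIMARY KEY (tenant_id, id)
);

CREATE TABLE orders (
  tenant_id   BIGINT  NOT NULL,
  id          BIGINT  NOT NULL,
  customer_id BIGINT  NOT NULL,
  amount      NUMERIC NOT NULL,
  PRIMARY KEY (tenant_id, id)
);

SELECT o.id, o.amount, c.region
FROM   orders o
JOIN   customers c
  ON   c.tenant_id = o.tenant_id
 AND   c.id        = o.customer_id
WHERE  o.tenant_id = 42;   -- shard key in the predicate → single-shard query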

6.3. Materialised Views & Summary Tables

  • When: Pre‑aggregate heavy reports (GROUP BY on large fact tables).
  • How: Use DB‑native materialised view (PostgreSQL CREATE MATERIALIZED VIEW, Oracle MATERIALIZED VIEW, SQL Server Indexed View).
  • Refresh: REFRESH FAST (incremental) or schedule nightly REFRESH COMPLETE.
  -- Oracle example (FAST refresh requires materialized view logs on orders)
  CREATE MATERIALIZED VIEW daily_sales
  BUILD IMMEDIATE
  REFRESH FAST ON COMMIT
  AS
  SELECT product_id, TRUNC(order_ts) AS day,
         SUM(amount) AS total_sales,
         COUNT(*)    AS row_count
  FROM orders
  GROUP BY product_id, TRUNC(order_ts);
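
PostgreSQL has no ON COMMIT fast refresh; a comparable sketch uses manual or scheduled refreshes (REFRESH ... CONCURRENTLY needs a unique index on the view):

CREATE MATERIALIZED VIEW daily_sales AS
SELECT product_id,
       date_trunc('day', order_ts) AS day,
       SUM(amount)                 AS total_sales
FROM   orders
GROUP  BY product_id, date_trunc('day', order_ts);

CREATE UNIQUE INDEX ON daily_sales (product_id, day);

-- Run from cron / pg_cron; readers are not blocked during the refresh
REFRESH MATERIALIZED VIEW CONCURRENTLY daily_sales;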

6.4. Columnstore / OLAP Optimisations

  • SQL Server: CREATE CLUSTERED COLUMNSTORE INDEX on large fact tables → massive compression + vectorised scans.
  • PostgreSQL: Use cstore_fdw or timescaledb hypertables for time‑series.
  • MySQL: COLUMNSTORE engine (MariaDB) or InnoDB with ROW_FORMAT=COMPRESSED.

Rule of thumb: Use row‑store for OLTP (high‑frequency point reads/writes). Use column‑store for analytical queries that scan many rows but only a few columns.

7. Application‑Side Tactics

Tactic | Why it matters | Example
Connection pooling | Reduces handshake overhead, keeps sockets warm. | HikariCP (Java), PgBouncer (PostgreSQL), ProxySQL (MySQL).
Prepared-statement caching | Avoids re-parsing and re-planning. | PDO::ATTR_EMULATE_PREPARES = false (PHP).
Batch inserts/updates | Fewer round-trips, better transaction granularity. | INSERT INTO t (a,b) VALUES (?,?),(?,?),(?,?).
Read-through / write-behind cache | Offloads hot reads to Redis / Memcached. | Cache the product catalog for 5 minutes, invalidate on write.
Avoid N+1 queries | Reduces the number of round-trips. | Use JOIN or IN (...) instead of fetching children per parent.
Pagination strategy | OFFSET gets slower with deep pages. | Use keyset pagination (WHERE id > last_seen_id ORDER BY id).
Retry with exponential back-off | Handles transient deadlocks or connection spikes gracefully. | retry(attempt => 1..5, backoff => 2^attempt * 100ms).
Circuit breaker | Prevents cascading failures when the DB is overloaded. | Hystrix / resilience4j.
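
As a sketch of the keyset-pagination row above (column names assumed), each page is driven by the last key seen rather than OFFSET, so the cost stays constant regardless of page depth:

-- First page
SELECT id, title
FROM   posts
WHERE  status = 'published'
ORDER  BY id
LIMIT  20;

-- Next page: seed the predicate with the last id from the previous page
SELECT id, title
FROM   posts
WHERE  status = 'published'
  AND  id > :last_seen_id
ORDER  BY id
LIMIT  20;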

8. Monitoring & Alerting

Metric | Typical Threshold (OLTP) | Tool
Avg query latency | < 5 ms (core reads) | Prometheus + Grafana (pg_stat_statements mean time)
95th-percentile latency | < 20 ms | Datadog, New Relic
Cache hit ratio (buffer pool) | > 95 % | SHOW ENGINE INNODB STATUS (MySQL), pg_buffercache (PostgreSQL)
Lock wait time | < 10 ms | performance_schema.events_waits_summary_by_thread_by_event_name
Deadlock count | 0 (or < 1 per hour) | SHOW ENGINE INNODB STATUS
Replication lag | < 1 s | SHOW SLAVE STATUS (MySQL), pg_stat_replication
Disk I/O | < 80 % of provisioned bandwidth | iostat, CloudWatch
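
To find what is actually driving those latency metrics, a common starting query against pg_stat_statements (the columns are total_exec_time / mean_exec_time from PostgreSQL 13 onward; older versions use total_time / mean_time):

SELECT queryid,
       calls,
       round(total_exec_time::numeric, 1) AS total_ms,
       round(mean_exec_time::numeric, 2)  AS mean_ms,
       left(query, 80)                    AS query_snippet
FROM   pg_stat_statements
ORDER  BY total_exec_time DESC
LIMIT  20;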

Alert example (Prometheus)

- alert: HighQueryLatency
  expr: avg_over_time(pg_stat_statements_mean_time[5m]) > 0.02
  for: 2m
  labels:
    severity: critical
  annotations:
    summary: "Average PostgreSQL query latency > 20 ms"
    description: "Investigate slow queries; check execution plans."

9. End‑to‑End Optimization Checklist

✅ | Item | How to Verify
1 | All hot queries have covering indexes (or at least an index on the filter columns). | EXPLAIN shows Index Scan / Index Only Scan.
2 | No SELECT * in production code. | Code review / static analysis.
3 | Statistics are up-to-date (ANALYZE run after bulk loads). | SELECT relname, n_live_tup FROM pg_stat_user_tables; compare with pg_class.reltuples.

