SQL Anti‑Patterns and How to Fix Them

Practical patterns to avoid, why they hurt, and concrete fixes you can apply today.

What an SQL Anti‑Pattern Is and Why It Matters

An SQL anti‑pattern is a common way of writing queries or designing schemas that seems to work initially but causes performance, maintainability, or correctness problems as data and load grow. These habits are often introduced for speed of delivery or because the developer lacks visibility into the optimizer and runtime costs.

Impact: slow dashboards, unpredictable query plans, excessive I/O, and brittle schemas that complicate analytics and feature development.

Top SQL Anti‑Patterns and How to Fix Them

N+1 Queries

Problem: issuing a query per row in application code (e.g., fetching orders inside a loop over customers). This multiplies round trips and kills latency.

Fix: fetch related data in a single query using joins or set-based aggregation.

-- Bad: N+1
SELECT id, name FROM customers;
-- for each customer: SELECT * FROM orders WHERE customer_id = ?;

-- Good: single query
SELECT c.id, c.name, COUNT(o.id) AS order_count
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id
GROUP BY c.id, c.name;

SELECT * in Production

Problem: selecting all columns returns unnecessary data, prevents index-only scans, and breaks when schema changes.

Fix: explicitly list required columns and prefer narrow projections; use covering indexes for frequent queries.

Functions on Indexed Columns

Problem: wrapping indexed columns in functions (e.g., LOWER(email)) prevents index usage and forces full scans.

Fix: store normalized values in a computed column or create a functional index when supported by the engine.

Overuse of DISTINCT and GROUP BY

Problem: using DISTINCT or grouping to mask duplicate data hides root causes and adds sorting/aggregation cost.

Fix: fix data model or deduplicate upstream; use window functions to inspect duplicates before removing them.

Scalar Subqueries in SELECT

Problem: scalar subqueries executed per row can be expensive and unpredictable.

Fix: rewrite as joins or precompute values in CTEs.

-- Bad
SELECT c.id,
  (SELECT COUNT(*) FROM orders o WHERE o.customer_id = c.id) AS order_count
FROM customers c;

-- Better
WITH order_counts AS (
  SELECT customer_id, COUNT(*) AS order_count
  FROM orders GROUP BY customer_id
)
SELECT c.id, COALESCE(oc.order_count,0) AS order_count
FROM customers c
LEFT JOIN order_counts oc ON oc.customer_id = c.id;

Entity‑Attribute‑Value (EAV) for Structured Data

Problem: storing attributes as key/value rows makes queries complex, prevents type safety, and scales poorly for analytics.

Fix: prefer normalized columns or JSON/JSONB for truly sparse, semi‑structured data; index JSON paths when needed.

Storing CSV or Lists in a Column

Problem: storing comma‑separated IDs or tags in a single column prevents joins, indexing, and correct referential integrity.

Fix: use join tables (many‑to‑many) or array/JSON types with proper indexes when the DB supports them.

Long Transactions and Locks

Problem: keeping transactions open while waiting for user input or doing heavy processing causes lock contention and deadlocks.

Fix: keep transactions short, move non‑transactional work outside the transaction, and use optimistic concurrency where appropriate.

Using Cursors for Row‑by‑Row Work

Problem: cursors often process rows one at a time and are orders of magnitude slower than set operations.

Fix: rewrite logic as set-based SQL or use bulk operations.

Missing Parameterization and Dynamic SQL Risks

Problem: concatenating user input into SQL invites SQL injection and prevents plan reuse.

Fix: use parameterized queries or prepared statements and validate inputs.

Triggers for Business Logic

Problem: embedding complex business rules in triggers hides behavior, complicates debugging, and can cause unexpected side effects.

Fix: move business logic to application code or well‑documented stored procedures; keep triggers minimal and idempotent.

Quick Comparison Table

Anti‑Pattern	Why it hurts	Fix
N+1 queries	Many round trips; high latency	Use joins/aggregates
SELECT *	Unnecessary I/O; breaks indexes	Project only needed columns
Functions on indexed columns	Prevents index use	Create functional index or normalize
EAV / CSV in column	Poor analytics; no FK enforcement	Normalized tables or JSON with indexes
Long transactions / cursors	Locking; slow throughput	Short transactions; set operations

Diagnosing Anti‑Patterns: Tools and Workflow

Start with symptoms: slow response, high I/O, or unpredictable latency. Then follow a repeatable workflow: capture the slow query, run an EXPLAIN or EXPLAIN ANALYZE, inspect index usage and row estimates, and check wait/lock statistics.

Practical Steps

Collect slow queries and correlate with application traces or dashboard slowdowns.
Run the optimizer plan and compare estimated vs actual row counts.
Look for full table scans, nested loops over large sets, and missing index seeks.
Test fixes on a staging dataset and measure I/O, CPU, and latency before deploying.

When to Use Benchmarks

Microbenchmarks help compare join orders, index choices, and materialized views. For dashboard queries, simulate realistic filters and concurrency to avoid surprises in production.

Refactor Checklist: Safe Steps to Improve SQL

Refactoring SQL in production requires caution. Use this checklist to reduce risk and make improvements incremental.

Checklist

Measure first — capture baseline metrics for latency, I/O, and CPU.
Small, reversible changes — add an index, test, then remove if it regresses.
Use feature flags or staged rollouts for query changes that affect user flows.
Document intent — why a query was changed and what tradeoffs were accepted.
Automate tests — include representative query performance checks in CI for critical reports.

When to Denormalize

Denormalize only when read performance is critical and you can maintain correctness via controlled ETL or materialized views. Denormalization trades write complexity for read speed.

Key Takeaways

Anti‑patterns are common because they often solve short‑term problems. The long‑term solution is to prefer set‑based operations, explicit schemas, and measurable changes. When in doubt, profile the query and let the optimizer guide you — but don't treat it as infallible.

Most important: adopt a repeatable diagnostic workflow (capture → explain → test → deploy) and keep transactions short.