Denormalization
The deliberate opposite of normalization — and when it's the right call.
Denormalization is the deliberate introduction of redundancy into a schema to improve query performance, at the cost of extra storage and update complexity.
If normalization is "store every fact in exactly one place," denormalization is "duplicate facts strategically so reads are faster."
The trade-off
✓ What you gain
- Fewer joins → faster queries
- Simpler SQL
- Better parallelism in distributed systems
- Predictable query performance
✗ What you give up
- More storage (data duplicated)
- Update complexity (multiple places to change)
- Risk of inconsistency
- Harder data integrity enforcement
A concrete example
Normalized — two tables, requires a join
Denormalized — one table, no join
When to denormalize
✓ Heavy read, light write workloads
Data warehouses, BI dashboards, reporting databases. Reads outnumber writes 100:1 or more. Optimizing for reads pays off.
✓ Complex joins are the bottleneck
If 80% of your queries join the same 5 tables, denormalizing that hot path can yield 10x speedups.
✓ Cloud data warehouses (Snowflake, BigQuery, Redshift)
These are optimized for wide, denormalized tables. Star schemas are the default — they're deliberately denormalized at the dimension level.
✓ Reporting tables / aggregate tables
Pre-computed summaries (daily sales by region) save expensive aggregations at query time.
When NOT to denormalize
✗ OLTP systems with frequent updates
If customer name changes daily and you've duplicated it across millions of order rows, every update is a nightmare.
✗ When data integrity is critical
Banking, healthcare, regulated industries where consistency must be enforced by the schema.
✗ Limited storage
Rarely a real constraint in 2026, but if you're on-prem with fixed disk budgets, redundancy costs add up.
✗ Simple data with few joins anyway
If your queries don't have many joins, there's nothing to gain by denormalizing.
Common denormalization patterns
Pre-joined columns
Most common. Copy frequently-needed dimension columns into the fact table (e.g., copy CategoryName into the Sales table).
Aggregate tables
Pre-compute and store daily/weekly/monthly summaries to avoid recalculating them.
Materialized views
Database-managed denormalization — the system maintains a denormalized view, refreshing it on schedule.
Wide flat tables
Common in analytics — one giant table with hundreds of columns instead of joining 20 narrow tables.
Denormalization → optimize for reads (OLAP).
Star schema → controlled denormalization for analytics.
Choose based on your read/write ratio.