Data Modeling

The architecture of every database, data warehouse, and analytics platform you'll ever work with.

What is data modeling?

Data modeling is the process of designing how data will be structured, stored, and related inside a database or data warehouse. It's the blueprint that decides whether your queries take 10 ms or 10 minutes.

Good data modeling answers questions like:

How do we structure tables so reports run fast?
How do we track changes to a customer's address over time?
What's the right normalization level for this workload?
How do we model a many-to-many relationship efficiently?
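
As one illustration of the second question, a common approach is a Type 2 slowly changing dimension: instead of overwriting the old value, each change closes the current row and inserts a new version. The sketch below is a minimal example using Python's built-in sqlite3 module. The column names (CustomerID, ValidFrom, ValidTo, IsCurrent), the helper change_city, and the sample values are assumptions made for illustration, and City stands in for the full address.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE CustomerDim (
    CustomerSK INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key, one per version
    CustomerID TEXT,     -- business key from the source system (hypothetical)
    City       TEXT,
    ValidFrom  TEXT,     -- ISO dates, so string order matches date order
    ValidTo    TEXT,     -- NULL means the row is still current
    IsCurrent  INTEGER
)
""")


def change_city(customer_id, new_city, change_date):
    """Close the current row (if any), then insert a new current version."""
    conn.execute(
        "UPDATE CustomerDim SET ValidTo = ?, IsCurrent = 0 "
        "WHERE CustomerID = ? AND IsCurrent = 1",
        (change_date, customer_id),
    )
    conn.execute(
        "INSERT INTO CustomerDim (CustomerID, City, ValidFrom, ValidTo, IsCurrent) "
        "VALUES (?, ?, ?, NULL, 1)",
        (customer_id, new_city, change_date),
    )


# Made-up history: the customer starts in London, then moves to Berlin.
change_city("C-100", "London", "2023-01-01")
change_city("C-100", "Berlin", "2024-06-01")

history = conn.execute(
    "SELECT City, ValidFrom, ValidTo, IsCurrent FROM CustomerDim "
    "WHERE CustomerID = ? ORDER BY ValidFrom",
    ("C-100",),
).fetchall()
print(history)
```

Both versions of the customer remain queryable, so an old order can still be reported against the city that was current when it was placed.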

Why data modeling matters in interviews

If you're interviewing for any role touching data — data engineer, analyst, backend engineer, ML engineer — you will be asked about:

Fact vs Dimension tables (asked in ~90% of data engineering interviews)
Normalization — when and why
Star vs Snowflake schemas — trade-offs
Slowly Changing Dimensions — Type 1, 2, 3
OLTP vs OLAP — design implications

This track gives you sharp, interview-ready answers for each — with diagrams and concrete examples.

The two halves of this track

📚 Technical Concepts

The fundamentals every interview tests. Fact vs Dimension tables, the first three normal forms, schema patterns, SCDs, surrogate keys, denormalization, OLTP vs OLAP. Start here.

🎯 Applied Scenarios

Real design problems: modeling changing addresses, designing an e-commerce schema, handling many-to-many relationships, partitioning huge fact tables, dealing with late-arriving data.

Sample model: the e-commerce star schema

Here's the kind of structure you'll learn to design. A central fact table for orders, surrounded by dimension tables describing the context:

                      ┌──────────────────┐
                      │ DateDimension    │
                      │ DateSK (PK)      │
                      │ Year, Quarter    │
                      │ Month, Day       │
                      └────────┬─────────┘
                               │
┌──────────────────┐    ┌──────▼──────────┐    ┌──────────────────┐
│ CustomerDim      │    │ OrderFacts      │    │ ProductDim       │
│ CustomerSK (PK)  │◄───│ CustomerSK (FK) │    │ ProductSK (PK)   │
│ Name, Email      │    │ ProductSK (FK)  │───►│ Name, Category   │
│ Segment, City    │    │ DateSK (FK)     │    │ Brand, Price     │
└──────────────────┘    │ Quantity        │    └──────────────────┘
                        │ Amount, Tax     │
                        │ ShippingCost    │
                        └─────────────────┘
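
To make the diagram concrete, here is a minimal sketch of the same star schema built in an in-memory SQLite database with Python's built-in sqlite3 module. Table and column names follow the diagram. The column types, the YYYYMMDD format for DateSK, and the sample rows are assumptions made for illustration, not part of the original model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default

# Three dimension tables plus one fact table that references each of them.
conn.executescript("""
CREATE TABLE DateDimension (
    DateSK INTEGER PRIMARY KEY,
    Year INTEGER, Quarter INTEGER, Month INTEGER, Day INTEGER
);
CREATE TABLE CustomerDim (
    CustomerSK INTEGER PRIMARY KEY,
    Name TEXT, Email TEXT, Segment TEXT, City TEXT
);
CREATE TABLE ProductDim (
    ProductSK INTEGER PRIMARY KEY,
    Name TEXT, Category TEXT, Brand TEXT, Price REAL
);
CREATE TABLE OrderFacts (
    CustomerSK   INTEGER REFERENCES CustomerDim(CustomerSK),
    ProductSK    INTEGER REFERENCES ProductDim(ProductSK),
    DateSK       INTEGER REFERENCES DateDimension(DateSK),
    Quantity     INTEGER,
    Amount       REAL,
    Tax          REAL,
    ShippingCost REAL
);
""")

# Made-up sample rows; dimensions are loaded before the facts that reference them.
conn.execute("INSERT INTO DateDimension VALUES (20240115, 2024, 1, 1, 15)")
conn.executemany("INSERT INTO CustomerDim VALUES (?, ?, ?, ?, ?)", [
    (1, "Ada", "ada@example.com", "Retail", "London"),
    (2, "Lin", "lin@example.com", "Business", "Austin"),
])
conn.executemany("INSERT INTO ProductDim VALUES (?, ?, ?, ?, ?)", [
    (10, "Laptop", "Electronics", "Acme", 900.0),
    (11, "Desk", "Furniture", "Oakline", 250.0),
])
conn.executemany("INSERT INTO OrderFacts VALUES (?, ?, ?, ?, ?, ?, ?)", [
    (1, 10, 20240115, 1, 900.0, 72.0, 15.0),
    (2, 11, 20240115, 2, 500.0, 40.0, 30.0),
    (2, 10, 20240115, 1, 900.0, 72.0, 15.0),
])

# Typical star-schema query: join the fact table to one dimension and aggregate.
revenue_by_category = conn.execute("""
    SELECT p.Category, SUM(f.Amount) AS Revenue
    FROM OrderFacts AS f
    JOIN ProductDim AS p ON p.ProductSK = f.ProductSK
    GROUP BY p.Category
    ORDER BY p.Category
""").fetchall()
print(revenue_by_category)  # [('Electronics', 1800.0), ('Furniture', 500.0)]
```

The fact table holds only keys and numeric measures, so a report is one join per dimension it needs, followed by an aggregate.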

By the end of this track, you'll be able to design schemas like this from scratch, explain every choice, and answer the trade-off questions that follow.

How to use this track

Read the technical concepts in order — they build on each other. Then work through the applied scenarios; they're modeled on actual interview questions. Practice the SQL side in our SQL Practice section where you can run real queries.
