E-commerce Star Schema
A complete design for an e-commerce analytics warehouse — fact + dimensions, end to end.
This is a common open-ended design question. Here's a complete star schema covering orders, customers, products, and dates — the kind of design that powers BI dashboards at any modern e-commerce company.
The big picture
The fact table
The dimensions
CustomerDimension (Type 2 SCD)
ProductDimension (Type 2 SCD)
DateDimension
LocationDimension
Key design decisions
1. Separate location dimension for reusability
Both customer address and shipping address need location data. A reusable LocationDim avoids duplication in CustomerDim.
2. Product dimension tracks historical changes
Type 2 SCD on products means we can answer "what was this product's price in January 2023?" — useful for revenue attribution.
3. Multiple date references in the fact
OrderDateSK and ShipDateSK both point to DateDim. This is a role-playing dimension — same table, different roles in the fact.
4. Degenerate dimensions in the fact
OrderID stays in OrderFacts even though there's no OrderDim. It's a degenerate dimension — an attribute that exists only in the fact, not in any separate dimension. Useful for grouping fact rows back into their original business transactions.
5. Selective denormalization
CategoryName and BrandName are duplicated in ProductDim. This sacrifices some normalization purity for fewer joins on common queries. Star schemas explicitly trade storage for read speed.
6. Surrogate keys everywhere
Every dimension uses a surrogate key. This lets Type 2 SCDs work, makes joins fast, and decouples the warehouse from operational system ID changes.
Sample query: monthly revenue by category
What to mention in the interview
- Identify grain first — what does one row in the fact represent? (Here: one order line item.)
- Choose surrogate keys — for SCD support and join performance.
- Pick SCD types — Type 2 for Customer and Product (history matters), Type 1 for things like email corrections.
- Handle role-playing dimensions — OrderDate and ShipDate both reference DateDim.
- Explain measures vs dimensions — measures are additive numbers; foreign keys point to descriptive context.
- Be ready for "how would you partition this?" — by OrderDate, monthly or yearly.