🛠 Learn ERWIN

Star vs Snowflake Schema

Two ways to organize a data warehouse — performance vs storage.

Both schemas have a central fact table connected to dimensions. The difference is whether those dimensions are denormalized (star) or normalized (snowflake).

Star schema

A central fact table connects directly to denormalized dimension tables. Each dimension is a single table containing all related attributes.

Star Schema DateDim │ │ CustomerDim ────► SalesFact ◄──── ProductDim │ │ StoreDim Each dimension is one denormalized table holding all attributes. ProductDim has columns: ProductID, Name, Category, Brand, Manufacturer, etc.

Snowflake schema

Dimensions are normalized into multiple related tables. Where a star schema has one ProductDim, a snowflake has ProductDim → CategoryDim, ProductDim → BrandDim, etc.

Snowflake Schema DateDimCountryDimCategoryDim │ │ │ │ │ │ CityDim ─► CustomerDim ─► SalesFact ◄─ ProductDim ─► BrandDim │ │ StoreDim Each dimension is split into normalized sub-tables.

Side-by-side comparison

AspectStar SchemaSnowflake Schema
StructureSimple, flat dimensionsNormalized, hierarchical dimensions
NormalizationDenormalizedNormalized (3NF or higher)
StorageMore (data is redundant)Less (no duplication)
Query performanceFaster (fewer joins)Slower (more joins)
MaintenanceEasier — single dim to updateHarder — cascading updates
Query simplicitySimpler — fewer tablesMore complex SQL
Data integrityLower (denormalized risks)Higher (enforced relationships)
Use caseOLAP, BI dashboardsStorage-constrained warehouses

Concrete example

Star — product dimension as one table

ProductDim ProductID │ ProductName │ CategoryID │ CategoryName │ BrandID │ BrandName 1 │ Laptop X │ 10 │ Electronics │ B1 │ TechCo 2 │ Phone Y │ 10 │ Electronics │ B2 │ MobileCo 3 │ Chair Z │ 20 │ Furniture │ B3 │ HomeCo "Electronics" repeats for every electronic product. Fast to query, more storage.

Snowflake — product, category, brand as separate tables

ProductDim CategoryDim BrandDim ProductID │ Name │ CatID │ BrandID CatID │ CatName BrandID │ BrandName 1 │ Laptop X │ 10B1 10 │ Electronics B1 │ TechCo 2 │ Phone Y │ 10B2 20 │ Furniture B2 │ MobileCo 3 │ Chair Z │ 20B3 B3 │ HomeCo "Electronics" stored once. Less storage, but every query needs 2-3 more joins.

Query complexity comparison

Star — one join

SELECT p.CategoryName,
       SUM(s.Amount)
FROM SalesFact s
JOIN ProductDim p
   ON s.ProductID = p.ProductID
GROUP BY p.CategoryName;

Snowflake — two joins

SELECT c.CatName,
       SUM(s.Amount)
FROM SalesFact s
JOIN ProductDim p ON s.ProductID = p.ProductID
JOIN CategoryDim c ON p.CatID = c.CatID
GROUP BY c.CatName;

When to pick which

Pick Star when…

Pick Snowflake when…

The 90% answer For most modern cloud data warehouses (Snowflake, BigQuery, Redshift), star schemas win. Storage is cheap; query speed matters more. Use snowflake only when you have a specific reason to normalize a particular dimension.