🛠 Learn ERWIN

Many-to-Many with Bridges

When one product belongs to multiple categories — and how to model it cleanly.

The interview question "How would you handle a many-to-many relationship in a star schema?"

The scenario

A laptop sold on your e-commerce site might belong to multiple categories at once: Electronics AND Computers AND Gaming Gear. A single product → many categories. And a single category → many products. That's a many-to-many relationship.

Star schemas assume a fact joins to one row per dimension. So how do you fit a many-to-many?

The wrong approach: duplicate fact rows

One naive approach is to insert one fact row per category:

OrderFacts (BAD — duplicated rows) OrderID │ ProductID │ Category │ Amount 1P001 │ Electronics │ 1200 1P001 │ Computers │ 1200 1P001 │ Gaming │ 1200 Problem: SUM(Amount) is now $3,600 for one $1,200 order. Total revenue gets triple-counted.

The right approach: bridge table

A bridge table (also called a junction table or many-to-many table) sits between the two dimensions and stores the relationships.

ProductDim ProductCategoryBridge CategoryDim ├── ProductSK ├── ProductSK (FK) ├── CategorySK ├── ProductName ├── CategorySK (FK) ├── CategoryName ├── BrandID ├── IsPrimary └── ... └── ... └── Weight (optional split factor)

Then the fact joins through the bridge:

OrderFacts ProductDim ├── OrderFactSK ├── ProductSK ├── ProductSK (FK) ─────────────────► ├── ProductName ├── CustomerSK (FK) └── ... ├── Quantity └── OrderAmount ProductCategoryBridge ├── ProductSK ├── CategorySK └── IsPrimary CategoryDim ├── CategorySK └── CategoryName

The bridge table contents

ProductCategoryBridge ProductSK │ CategorySK │ IsPrimary P001CAT001Y ← Electronics (primary) P001CAT005N ← Computers P001CAT012N ← Gaming Gear P002CAT003Y ← Phones (primary)

Querying through the bridge

"Sales by primary category"

Use the IsPrimary flag to avoid double-counting:

SQL
SELECT cat.CategoryName, SUM(o.OrderAmount) AS TotalSales FROM OrderFacts o JOIN ProductCategoryBridge b ON o.ProductSK = b.ProductSK JOIN CategoryDim cat ON b.CategorySK = cat.CategorySK WHERE b.IsPrimary = 'Y' GROUP BY cat.CategoryName;

"Sales counted in every applicable category"

Skip the IsPrimary filter when you intentionally want products to appear in multiple categories (e.g. for filter/search analytics):

SQL
SELECT cat.CategoryName, SUM(o.OrderAmount) AS TotalSales FROM OrderFacts o JOIN ProductCategoryBridge b ON o.ProductSK = b.ProductSK JOIN CategoryDim cat ON b.CategorySK = cat.CategorySK GROUP BY cat.CategoryName; -- Caveat: this triple-counts the $1,200 order across 3 categories. -- Acceptable if the question is "how prominent is each category?"

Advanced: weighted bridges

For some scenarios, you might want to split a measure across the relationships. Add a Weight column to the bridge:

ProductSK │ CategorySK │ Weight P001CAT0010.60 ← 60% allocated to Electronics P001CAT0050.30 ← 30% to Computers P001CAT0120.10 ← 10% to Gaming Gear
SQL — weighted attribution
SELECT cat.CategoryName, SUM(o.OrderAmount * b.Weight) AS AttributedSales FROM OrderFacts o JOIN ProductCategoryBridge b ON o.ProductSK = b.ProductSK JOIN CategoryDim cat ON b.CategorySK = cat.CategorySK GROUP BY cat.CategoryName;

Other common bridge use cases

Trade-offs to mention

The interview answer "I'd use a bridge table — a junction table with foreign keys to both dimensions. Add an IsPrimary flag or weight column to handle attribution and avoid double-counting. The fact joins to the first dimension, then through the bridge to the second."