3. Physical Data Model
Turn the logical design into a real database — exact data types, indexes, tablespaces, the works.
What is a Physical Data Model?
The Physical Data Model (PDM) is the database-specific implementation of your logical model. It contains:
- Tables (instead of entities) with database-friendly names
- Columns with vendor-specific data types:
VARCHAR(50),NUMERIC(18,4),TIMESTAMP WITH TIME ZONE - Indexes — clustered, non-clustered, covering, partial
- Constraints — primary key, foreign key, unique, check, default
- Storage — tablespaces, partitions, file groups
- Triggers, sequences, and views
The audience is DBAs and the database engine itself. This is what your forward-engineering will turn into CREATE TABLE statements.
Setting up the physical layer
Pick a target database
Choose from the dropdown. Common options:
- Microsoft SQL Server 2019/2022
- Oracle 19c/21c
- PostgreSQL 14/15/16
- MySQL 8
- IBM Db2
- Snowflake — for data warehouses
- Amazon Redshift — for AWS warehouses
Pick the one matching your actual deployment target. For this walkthrough, choose PostgreSQL 15.
ERWIN now knows the data types and SQL dialect to use for generation.
Switch to Physical view
The canvas updates. Entities are now tables. Attributes are now columns. The names follow your physical naming convention (which is "same as logical with spaces removed" by default).
You might see "Customer ID" become "CustomerID" or "CUSTOMER_ID" depending on the default naming standard. We'll customize this in the next step.
Set up a naming standard
You'll get a dialog with rules. Common conventions:
- Tables: snake_case singular ("customer") or plural ("customers")
- Columns: snake_case ("customer_id", "first_name")
- Indexes: idx_table_columns ("idx_orders_customer_id")
- Foreign keys: fk_child_parent ("fk_orders_customer")
- Primary keys: pk_table ("pk_customer")
Apply the standard:
All your tables and columns are renamed at once, consistently.
Refine column data types
The auto-mapped types are usually reasonable but rarely perfect. Open each table and verify:
customer table
- customer_id →
BIGINT(use BIGINT not INT for IDs — INT overflows at ~2.1B rows) - first_name →
VARCHAR(50) - last_name →
VARCHAR(50) - email →
VARCHAR(254)(RFC max email length) - phone →
VARCHAR(20) - created_date →
TIMESTAMP WITH TIME ZONE(always with TZ)
order table
- order_id →
BIGINT - customer_id →
BIGINT NOT NULL - shipping_address_id →
BIGINT NOT NULL - order_date →
TIMESTAMP WITH TIME ZONE NOT NULL - order_status →
VARCHAR(20) NOT NULL - total_amount →
NUMERIC(12,2) NOT NULL(cents matter)
FLOAT or REAL for money — they have rounding errors that compound over many transactions. Always use NUMERIC / DECIMAL with explicit precision and scale.
Add indexes
Indexes speed up reads but slow down writes. Add them strategically.
For each table, right-click → Indexes → New. Recommended indexes for our e-commerce model:
| Table | Index | Columns | Why |
|---|---|---|---|
| customer | idx_customer_email | Login lookups | |
| order | idx_order_customer_id | customer_id | "Show my orders" |
| order | idx_order_date | order_date DESC | Recent orders queries |
| order_line | idx_order_line_order | order_id | "Show this order's items" |
| order_line | idx_order_line_product | product_id | "Where did this product sell?" |
| product | idx_product_sku | sku UNIQUE | SKU lookup is constant |
| product | idx_product_category | category_id, active | Browse by category |
For each, set the Index Type (B-tree is default; PostgreSQL also has Hash, GIN, BRIN — use B-tree unless you have a specific reason).
Set defaults and check constraints
For each column where appropriate, set:
- created_date DEFAULT
CURRENT_TIMESTAMP - active on Product, DEFAULT
TRUE - order_status CHECK
order_status IN ('Pending','Shipped','Delivered','Cancelled') - quantity on Order Line, CHECK
quantity > 0 - unit_price on Order Line, CHECK
unit_price >= 0
In Property Editor for each column → Default and Validation fields.
Configure surrogate-key generation
Your primary key columns need to auto-generate values. PostgreSQL uses GENERATED BY DEFAULT AS IDENTITY (or sequences):
For each PK column:
- Open column properties
- Set Identity = Yes
- Or attach a sequence (Postgres way): right-click table → Sequences → New
SQL Server uses IDENTITY(1,1), MySQL uses AUTO_INCREMENT, Oracle uses sequences. ERWIN handles the dialect for you — you just say "Identity" and it generates the right syntax.
Set up tablespaces / file groups (optional)
For larger databases, you'll split tables across storage tiers:
Create tablespaces:
- ts_data — for table data
- ts_index — for indexes (often on faster storage)
- ts_archive — for old partitions on cheap storage
Then assign each table/index to a tablespace in its properties. Skip this for development; do it for production-grade models.
Validate the physical model
The physical-mode validator catches:
- Tables without indexes on foreign keys (FK lookups will be slow)
- Tables without any indexes at all (full scans on every query)
- Columns with type-precision mismatches
- Naming convention violations
- Missing tablespace assignments (if you use them)
Fix anything critical, save as ecommerce_physical.erwin.
Logical vs Physical — side by side
Logical (Customer entity)
Physical (customer table)
What's next: generate the SQL
Your physical model now has everything needed to create a real database. In the next lesson, we'll use forward engineering to turn this model into a .sql file ready to run against PostgreSQL.