🏛 Data Modeling

Best Practices

The conventions and patterns that separate junior modelers from senior ones.

Naming conventions

The single biggest source of pain in real-world data models is inconsistent naming. Pick a standard, document it, and enforce it via ERWIN's naming standards.

Entity / table names

Column names

Constraint names

Named constraints are dramatically easier to debug — error messages reference the name, not an auto-generated number.

Use surrogate keys for all dimensions

Don't use natural keys (email, SSN, product code) as primary keys. Use an auto-generated integer/BIGINT instead. Detailed reasoning here.

The shorter rule Every table gets a numeric surrogate primary key named {table}_id. Natural keys become regular columns with UNIQUE constraints if needed.

Audit columns on every table

Add these four columns to every business-data table:

In ERWIN, define these as a Domain or template so adding them is one click. Future-you (when debugging "why did this record change yesterday?") will thank you.

Soft deletes vs hard deletes

For data you might need to recover or audit, use soft deletes:

Hard deletes are fine for ephemeral data (logs, sessions, caches) where recovery is meaningless.

Use ERWIN's domains religiously

Don't type VARCHAR(50) in 30 places. Define a domain once, reference it everywhere. Common domains:

DomainDefinitionUsed for
idBIGINT IDENTITYAll primary keys
fk_idBIGINT NOT NULLAll foreign keys
name_shortVARCHAR(50)First name, last name, etc.
name_longVARCHAR(255)Full names, titles
emailVARCHAR(254)RFC max email length
phoneVARCHAR(20)International phone
moneyNUMERIC(12,2)Prices, amounts
large_moneyNUMERIC(18,2)Account balances
percentageNUMERIC(5,2)0.00 to 100.00
flagBOOLEAN NOT NULL DEFAULT FALSEis_* and has_* columns
timestamp_auditTIMESTAMPTZ DEFAULT NOW()created_at, updated_at
status_codeVARCHAR(20)Status enums

Subject areas — break up big models

A single diagram with 100+ entities is unreadable. Use subject area diagrams:

Common subject areas in an enterprise model:

An entity can appear in multiple subject areas — useful for "boundary" entities like customer or product.

Common patterns to know

Lookup tables

Don't hardcode status values as strings. Create a lookup table:

order_status_lookup ├ PK status_code VARCHAR(20) ├ description VARCHAR(100) ├ is_terminal BOOLEAN └ display_order INT order ├ ... └ FK status_code → order_status_lookup

Audit log pattern

Generic audit trail for any record change:

audit_log ├ PK audit_id BIGINT IDENTITY ├ table_name VARCHAR(50) ├ record_id BIGINT ├ action VARCHAR(10) -- INSERT/UPDATE/DELETE ├ old_values JSONB -- pre-change snapshot ├ new_values JSONB -- post-change snapshot ├ changed_by VARCHAR(50) └ changed_at TIMESTAMPTZ

Polymorphic associations (use carefully)

When a child can belong to one of several parent types:

comment ├ PK comment_id BIGINT IDENTITY ├ commentable_type VARCHAR(50) -- 'post', 'product', 'order' ├ commentable_id BIGINT -- FK to one of those tables ├ author_id BIGINT FK → user └ body TEXT

ERWIN can't draw proper FKs for this — the relationship is enforced in application code. Convenient for some patterns, but you lose database-level referential integrity. Use sparingly.

Pitfalls to avoid

❌ Storing comma-separated values

Never. tags VARCHAR(200) = "red,large,sale" is wrong. Always normalize to a related table (product_tag with one row per tag).

❌ Magic numbers as status

Don't store order_status INT = 3. Use a status code string ("Shipped") or a lookup table. Future-you will not remember what 3 means.

❌ Single-table inheritance abuse

"Let's put customers, vendors, and employees in one 'person' table with a type column." This works for 2 weeks then becomes a nightmare. Use separate tables; share an abstract logical entity in the LDM if needed.

❌ Mixing concerns in one column

Don't put a date and a flag in the same column. Don't put currency and amount in one VARCHAR. Each column = one fact.

❌ Forgetting indexes on FK columns

Every FK should have an index on it. ERWIN doesn't always auto-create this in some database dialects. Check after generating SQL.

❌ Not documenting

The "what does this table mean" definitions take 30 seconds to write and save hours of confusion later. Always fill them in.

Version control your models

Save ERWIN models as XML for Git compatibility:

The senior modeler mindset Junior modelers think about "what data does this represent?" Senior modelers think about "who will query this in 3 years and how will they understand it?" Naming, definitions, and consistency aren't bureaucracy — they're the gift you give your future colleagues.