Texas A&M UniversityWork In Progress

Star schema best practices, conformed dimensions, relationship rules, and why bidirectional filters are dangerous.

Data Modeling with AggieDB

A well-structured data model is the difference between a report that loads in 2 seconds and one that times out. AggieDB uses the star schema pattern — a central fact table surrounded by dimension tables — which is the most performant layout for Power BI's VertiPaq and DirectLake engines.

Audience
Report builders ready to move beyond single-table reports
Key Rule
Facts in the center, dimensions around the edges, relationships always one-way

Star Schema Anatomy

              ┌──────────┐
              │   Date   │
              │ dimension│
              └────┬─────┘
                   │ 1:many
┌──────────┐  ┌───┴──────┐  ┌──────────┐
│  Person  │──│   Fact   │──│  Org     │
│ dimension│  │  Table   │  │ dimension│
└──────────┘  └──────────┘  └──────────┘
  • Fact tables contain the things you measure (storage bytes, user counts, event timestamps)
  • Dimension tables contain the things you filter and group by (departments, dates, people)
  • Relationships always point from dimension to fact (one-to-many)

The Three Conformed Dimensions

AggieDB defines three dimensions shared across all fact tables:

DimensionKey ColumnWhat It Contains
DateDate_Key (INT, YYYYMMDD)Calendar + fiscal dates, day of week, quarter, month names
PersonUser_Principal_NameDisplay name, department, job title, manager, campus
OrganizationOrg_IdDepartment hierarchy, cost center, division

Conformed means consistent. If the Date table says FY 2026 starts September 1, that's true in every model. If Person[Department] says "Engineering," that means the same thing whether you're looking at email storage or Teams usage.

Relationship Rules

Always One-Way (Single Direction)

✓ Person ──(1:many)──> EXO_Storage     (filter flows Person → EXO_Storage)
✗ Person <──(both)──> EXO_Storage      (bidirectional — avoid)

Why: Bidirectional filters create ambiguity when multiple fact tables share a dimension. The engine doesn't know which path to take, leading to wrong results or performance degradation.

One-to-Many Only

RelationshipAllowed?Why
Dimension (1) → Fact (many)YesStandard star schema
Fact (many) → Fact (many)NoUse a shared dimension instead
Dimension (1) → Dimension (1)RareOnly for snowflake hierarchies

No Circular References

If Table A relates to Table B, and Table B relates to Table C, Table C cannot relate back to Table A. Power BI will flag this as an error.

When to Denormalize

Sometimes breaking the rules is the right call:

ScenarioNormalize (Star Schema)Denormalize (Flat Table)
50+ million rows, multiple dimensionsYesNo — too large to flatten
< 100K rows, simple analysisOptionalYes — simpler model
Multiple fact tables sharing dimensionsYes — conformed dimensionsNo — duplication
One-off analysis, not sharedOptionalYes — faster to build

Common Modeling Mistakes

Mistake: Using a Single Flat Table

Problem: Everything in one table — dates, users, metrics all in the same rows. Works for Excel, breaks in Power BI at scale.

Fix: Extract dimensions into separate tables. Keep only keys and measures in the fact table.

Mistake: Bidirectional Filters Everywhere

Problem: "My slicer doesn't work" → developer enables bidirectional → slicer works but other visuals break or show wrong numbers.

Fix: If a slicer needs to filter across two fact tables, they should share a conformed dimension. Use CROSSFILTER in DAX for specific measures, not model-level bidirectional.

Mistake: Calculated Columns Instead of Measures

Problem: A column like Storage_GB calculated as Storage_Bytes / 1073741824 in the table. Increases model size, can't be filtered dynamically.

Fix: Create a measure instead. Measures compute on demand and respond to filter context.

Mistake: Too Many Columns

Problem: Loading 50 columns from a table when the report uses 8. DirectLake loads all columns in referenced tables.

Fix: In the semantic model, hide unused columns. Better yet, create a view in the Silver/Gold tier that only includes the columns you need.

Relationship Syntax (TMDL)

When working code-first, relationships are defined in relationships.tmdl:

relationship Date_to_EXO_Storage
    fromColumn: Date.Date_Key
    toColumn: EXO_Storage.Date_Key
    crossFilteringBehavior: oneDirection

Key fields:

  • crossFilteringBehavior: oneDirection — always set this (default is bothDirections which we avoid)
  • fromColumn — the "one" side (dimension)
  • toColumn — the "many" side (fact)

Naming conventions, schema design, conformed dimensions

Common measures for AggieDB models