Data Modeling with AggieDB
A well-structured data model is the difference between a report that loads in 2 seconds and one that times out. AggieDB uses the star schema pattern — a central fact table surrounded by dimension tables — which is the most performant layout for Power BI's VertiPaq and DirectLake engines.
Star Schema Anatomy
┌──────────┐
│ Date │
│ dimension│
└────┬─────┘
│ 1:many
┌──────────┐ ┌───┴──────┐ ┌──────────┐
│ Person │──│ Fact │──│ Org │
│ dimension│ │ Table │ │ dimension│
└──────────┘ └──────────┘ └──────────┘
- Fact tables contain the things you measure (storage bytes, user counts, event timestamps)
- Dimension tables contain the things you filter and group by (departments, dates, people)
- Relationships always point from dimension to fact (one-to-many)
The Three Conformed Dimensions
AggieDB defines three dimensions shared across all fact tables:
| Dimension | Key Column | What It Contains |
|---|---|---|
| Date | Date_Key (INT, YYYYMMDD) | Calendar + fiscal dates, day of week, quarter, month names |
| Person | User_Principal_Name | Display name, department, job title, manager, campus |
| Organization | Org_Id | Department hierarchy, cost center, division |
Conformed means consistent. If the Date table says FY 2026 starts September 1, that's true in every model. If Person[Department] says "Engineering," that means the same thing whether you're looking at email storage or Teams usage.
Relationship Rules
Always One-Way (Single Direction)
✓ Person ──(1:many)──> EXO_Storage (filter flows Person → EXO_Storage)
✗ Person <──(both)──> EXO_Storage (bidirectional — avoid)
Why: Bidirectional filters create ambiguity when multiple fact tables share a dimension. The engine doesn't know which path to take, leading to wrong results or performance degradation.
One-to-Many Only
| Relationship | Allowed? | Why |
|---|---|---|
| Dimension (1) → Fact (many) | Yes | Standard star schema |
| Fact (many) → Fact (many) | No | Use a shared dimension instead |
| Dimension (1) → Dimension (1) | Rare | Only for snowflake hierarchies |
No Circular References
If Table A relates to Table B, and Table B relates to Table C, Table C cannot relate back to Table A. Power BI will flag this as an error.
When to Denormalize
Sometimes breaking the rules is the right call:
| Scenario | Normalize (Star Schema) | Denormalize (Flat Table) |
|---|---|---|
| 50+ million rows, multiple dimensions | Yes | No — too large to flatten |
| < 100K rows, simple analysis | Optional | Yes — simpler model |
| Multiple fact tables sharing dimensions | Yes — conformed dimensions | No — duplication |
| One-off analysis, not shared | Optional | Yes — faster to build |
Common Modeling Mistakes
Mistake: Using a Single Flat Table
Problem: Everything in one table — dates, users, metrics all in the same rows. Works for Excel, breaks in Power BI at scale.
Fix: Extract dimensions into separate tables. Keep only keys and measures in the fact table.
Mistake: Bidirectional Filters Everywhere
Problem: "My slicer doesn't work" → developer enables bidirectional → slicer works but other visuals break or show wrong numbers.
Fix: If a slicer needs to filter across two fact tables, they should share a conformed dimension. Use CROSSFILTER in DAX for specific measures, not model-level bidirectional.
Mistake: Calculated Columns Instead of Measures
Problem: A column like Storage_GB calculated as Storage_Bytes / 1073741824 in the table. Increases model size, can't be filtered dynamically.
Fix: Create a measure instead. Measures compute on demand and respond to filter context.
Mistake: Too Many Columns
Problem: Loading 50 columns from a table when the report uses 8. DirectLake loads all columns in referenced tables.
Fix: In the semantic model, hide unused columns. Better yet, create a view in the Silver/Gold tier that only includes the columns you need.
Relationship Syntax (TMDL)
When working code-first, relationships are defined in relationships.tmdl:
relationship Date_to_EXO_Storage
fromColumn: Date.Date_Key
toColumn: EXO_Storage.Date_Key
crossFilteringBehavior: oneDirection
Key fields:
crossFilteringBehavior: oneDirection— always set this (default isbothDirectionswhich we avoid)fromColumn— the "one" side (dimension)toColumn— the "many" side (fact)
Related Documentation
Naming conventions, schema design, conformed dimensions
Common measures for AggieDB models