Marts for the Semantic Layer
The dbt Semantic Layer changes some fundamental principles of how you organize your project. Without the Semantic Layer, you have to build the most useful combinations of your building-block components into wide, denormalized marts yourself. With the Semantic Layer, MetricFlow can dynamically denormalize every possible combination of the components we've encoded. We're therefore better served by bringing more normalized models through from the logical layer into the Semantic Layer to maximize flexibility. This section assumes familiarity with the best practices laid out in the How we build our metrics guide, so check that out first for a more hands-on introduction to the Semantic Layer.
Semantic Layer: Files and folders
- There are two major factors that alter our recommendations for the Semantic Layer:
- There is more YAML in the form of semantic models and metrics.
- We may use a staging model directly if it forms a complete normalized component, and it will not have a mart at all.
- This combination means models at both the staging and marts layers may participate in the Semantic Layer and use more powerful, expansive YAML configuration.
- Given this, for projects using the Semantic Layer we recommend a YAML-file-per-model approach, as below.
models
├── marts
│   ├── customers.sql
│   ├── customers.yml
│   ├── orders.sql
│   └── orders.yml
└── staging
    ├── __sources.yml
    ├── stg_customers.sql
    ├── stg_customers.yml
    ├── stg_locations.sql
    ├── stg_locations.yml
    ├── stg_order_items.sql
    ├── stg_order_items.yml
    ├── stg_orders.sql
    ├── stg_orders.yml
    ├── stg_products.sql
    ├── stg_products.yml
    ├── stg_supplies.sql
    └── stg_supplies.yml
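To make the YAML-file-per-model idea concrete, here is a minimal sketch of what a file such as stg_locations.yml might contain when a staging model is served to the Semantic Layer directly. The column names (location_id, location_name) and the semantic model name are assumptions for illustration, not taken from a real project, and the exact keys accepted can vary by dbt version.

```yaml
# models/staging/stg_locations.yml (hypothetical contents)
models:
  - name: stg_locations
    description: One row per location.
    columns:
      - name: location_id        # assumed column name
        description: Primary key of the location.

semantic_models:
  - name: locations              # assumed name
    description: |
      Location dimension served straight from staging; there is no mart.
    model: ref('stg_locations')
    entities:
      - name: location_id
        type: primary
    dimensions:
      - name: location_name      # assumed column name
        type: categorical
```

Keeping the model properties and the semantic model in one file per model means everything describing stg_locations lives next to stg_locations.sql.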
Semantic Layer: Where and why?
Directory structure: Add your semantic models to models/semantic_models with directories corresponding to the models/marts files. This type of organization makes it easier to search and find what you can join. It also supports better maintenance and reduces repeated code.
models/marts/sem_orders.yml
semantic_models:
  - name: orders
    defaults:
      agg_time_dimension: order_date
    description: |
      Order fact table. This table's grain is one row per order.
    model: ref('fct_orders')
    entities:
      - name: order_id
        type: primary
      - name: customer_id
        type: foreign
    dimensions:
      - name: order_date
        type: time
        type_params:
          time_granularity: day
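The semantic model above declares entities and a time dimension but no measures. As a rough sketch of how a measure and a metric could be layered on top of it, the fragment below adds a row-count measure and a simple metric. The names order_count and orders_placed are hypothetical, and the exact keys may differ between dbt versions, so treat this as an illustration of the shape and not a definitive spec.

```yaml
semantic_models:
  - name: orders
    defaults:
      agg_time_dimension: order_date
    model: ref('fct_orders')
    entities:
      - name: order_id
        type: primary
    dimensions:
      - name: order_date
        type: time
        type_params:
          time_granularity: day
    measures:
      # Hypothetical measure: sums 1 per row, i.e. counts orders.
      - name: order_count
        expr: 1
        agg: sum

metrics:
  # Hypothetical simple metric that exposes the measure above.
  - name: orders_placed
    label: Orders placed
    type: simple
    type_params:
      measure: order_count
```

Because the measure aggregates against order_date by default, the metric can be queried at any grain that time dimension supports.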
Naming convention
- Semantic model names: Use the sem_ prefix for semantic model names, such as sem_cloud_user_account_activity. This follows the same pattern as other naming conventions like fct_ for fact tables and dim_ for dimension tables.
- Entity names: Don't use prefixes for entity names within the semantic model. This keeps the names clear and focused on their specific purpose.
This guidance helps you make sure your dbt project is organized, maintainable, and scalable, allowing you to take full advantage of the capabilities offered by the dbt Semantic Layer.
When to make a mart
- If we can go directly to staging models and it's better to serve normalized models to the Semantic Layer, then when, where, and why would we make a mart?
- We have models that have measures but no time dimension to aggregate against. The details of this are laid out in the Semantic Layer guide but in short, we need a time dimension to aggregate against in MetricFlow. Dimensional tables that
- We want to materialize our model in various ways.
- We want to version our model.
- We have various related models that make more sense as one wider component.
- We have similar models across multiple data sources that make more sense unioned together.
- We have models in our project that we need time to refactor but want to serve up to the Semantic Layer quickly.
- Any of the above and more are great reasons to build a mart. Analytics engineering is about creativity and problem solving, so these are not prescriptive rules; there are many reasons to build marts in any project. The most important takeaway is that you don't have to if you're using the Semantic Layer.
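Two of the reasons listed above, controlling materialization and versioning a model, are expressed in a mart's YAML file. The fragment below is a minimal sketch assuming a mart named orders with two versions; the version numbers and the choice of a table materialization are illustrative assumptions.

```yaml
# models/marts/orders.yml (hypothetical contents)
models:
  - name: orders
    config:
      materialized: table   # assumed choice; view or incremental are alternatives
    latest_version: 2
    versions:
      - v: 1
      - v: 2
```

With versions declared this way, dbt by default looks for one SQL file per version, such as orders_v1.sql and orders_v2.sql, so the single orders.sql in the tree shown earlier would need to be split or mapped accordingly.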