dbt Transformation Patterns
Build production-ready data pipelines with dbt best practices
Master dbt (data build tool) for analytics engineering with model organization, testing, documentation, and incremental strategies. Use when building data transformations, creating data models, or implementing analytics engineering best practices.
Example Interaction
User prompt: Help me structure a new dbt project for e-commerce analytics with customer and order data
Agent response: Complete project structure with proper folder organization, naming conventions, and layer separation
Quick Start
Get up and running in minutes

Install:

claude-code skill install dbt-transformation-patterns

First trigger:

@dbt-transformation-patterns help

Commands
| Command | Description | Required Args |
|---|---|---|
| @dbt-transformation-patterns setting-up-dbt-project-structure | Organize models into staging, intermediate, and marts layers following medallion architecture | None |
| @dbt-transformation-patterns creating-incremental-models | Build efficient incremental models for processing large datasets with proper merge strategies | None |
| @dbt-transformation-patterns implementing-data-quality-tests | Set up comprehensive testing for data models including custom business logic tests | None |
Typical Use Cases
Setting up dbt project structure
Organize models into staging, intermediate, and marts layers following medallion architecture
Creating incremental models
Build efficient incremental models for processing large datasets with proper merge strategies
Implementing data quality tests
Set up comprehensive testing for data models including custom business logic tests
Overview
dbt Transformation Patterns
Production-ready patterns for dbt (data build tool) including model organization, testing strategies, documentation, and incremental processing.
When to Use This Skill
- Building data transformation pipelines with dbt
- Organizing models into staging, intermediate, and marts layers
- Implementing data quality tests
- Creating incremental models for large datasets
- Documenting data models and lineage
- Setting up dbt project structure
Core Concepts
1. Model Layers (Medallion Architecture)
```text
sources/        Raw data definitions
    ↓
staging/        1:1 with source, light cleaning
    ↓
intermediate/   Business logic, joins, aggregations
    ↓
marts/          Final analytics tables
```
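In code, each layer selects only from the layer beneath it, via `source()` at the raw edge and `ref()` everywhere else. A minimal sketch of the chain (the model and source names here are hypothetical, chosen only to illustrate the flow):

```sql
-- models/staging/stg_app__orders.sql (hypothetical names)
select * from {{ source('app', 'orders') }}

-- models/intermediate/int_orders_enriched.sql
select * from {{ ref('stg_app__orders') }}

-- models/marts/fct_orders.sql
select * from {{ ref('int_orders_enriched') }}
```

dbt builds its DAG from these `ref()`/`source()` calls, so layer ordering is enforced by convention and review, not by configuration.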
2. Naming Conventions
| Layer | Prefix | Example |
|---|---|---|
| Staging | stg_ | stg_stripe__payments |
| Intermediate | int_ | int_payments_pivoted |
| Marts | dim_, fct_ | dim_customers, fct_orders |
Quick Start
```yaml
# dbt_project.yml
name: "analytics"
version: "1.0.0"
profile: "analytics"

model-paths: ["models"]
analysis-paths: ["analyses"]
test-paths: ["tests"]
seed-paths: ["seeds"]
macro-paths: ["macros"]

vars:
  start_date: "2020-01-01"

models:
  analytics:
    staging:
      +materialized: view
      +schema: staging
    intermediate:
      +materialized: ephemeral
    marts:
      +materialized: table
      +schema: analytics
```
```text
# Project structure
models/
├── staging/
│   ├── stripe/
│   │   ├── _stripe__sources.yml
│   │   ├── _stripe__models.yml
│   │   ├── stg_stripe__customers.sql
│   │   └── stg_stripe__payments.sql
│   └── shopify/
│       ├── _shopify__sources.yml
│       └── stg_shopify__orders.sql
├── intermediate/
│   └── finance/
│       └── int_payments_pivoted_to_customer.sql
└── marts/
    ├── core/
    │   ├── _core__models.yml
    │   ├── dim_customers.sql
    │   └── fct_orders.sql
    └── finance/
        └── fct_revenue.sql
```
Patterns
Pattern 1: Source Definitions
```yaml
# models/staging/stripe/_stripe__sources.yml
version: 2

sources:
  - name: stripe
    description: Raw Stripe data loaded via Fivetran
    database: raw
    schema: stripe
    loader: fivetran
    loaded_at_field: _fivetran_synced
    freshness:
      warn_after: { count: 12, period: hour }
      error_after: { count: 24, period: hour }
    tables:
      - name: customers
        description: Stripe customer records
        columns:
          - name: id
            description: Primary key
            tests:
              - unique
              - not_null
          - name: email
            description: Customer email
          - name: created
            description: Account creation timestamp

      - name: payments
        description: Stripe payment transactions
        columns:
          - name: id
            tests:
              - unique
              - not_null
          - name: customer_id
            tests:
              - not_null
              - relationships:
                  to: source('stripe', 'customers')
                  field: id
```
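With `loaded_at_field` and `freshness` configured as above, stale sources can be caught from the CLI before any models are built. A sketch of the commands (they assume a working dbt project and profile):

```shell
# Check freshness for all configured sources
dbt source freshness

# Check only the stripe source
dbt source freshness --select source:stripe
```

Warn and error thresholds come from the `warn_after`/`error_after` settings in the sources file.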
Pattern 2: Staging Models
```sql
-- models/staging/stripe/stg_stripe__customers.sql
with source as (
    select * from {{ source('stripe', 'customers') }}
),

renamed as (
    select
        -- ids
        id as customer_id,

        -- strings
        lower(email) as email,
        name as customer_name,

        -- timestamps
        created as created_at,

        -- metadata
        _fivetran_synced as _loaded_at

    from source
)

select * from renamed
```
```sql
-- models/staging/stripe/stg_stripe__payments.sql
{{
    config(
        materialized='incremental',
        unique_key='payment_id',
        on_schema_change='append_new_columns'
    )
}}

with source as (
    select * from {{ source('stripe', 'payments') }}

    {% if is_incremental() %}
    where _fivetran_synced > (select max(_loaded_at) from {{ this }})
    {% endif %}
),

renamed as (
    select
        -- ids
        id as payment_id,
        customer_id,
        invoice_id,

        -- amounts (convert cents to dollars)
        amount / 100.0 as amount,
        amount_refunded / 100.0 as amount_refunded,

        -- status
        status as payment_status,

        -- timestamps
        created as created_at,

        -- metadata
        _fivetran_synced as _loaded_at

    from source
)

select * from renamed
```
Pattern 3: Intermediate Models
```sql
-- models/intermediate/finance/int_payments_pivoted_to_customer.sql
with payments as (
    select * from {{ ref('stg_stripe__payments') }}
),

customers as (
    select * from {{ ref('stg_stripe__customers') }}
),

payment_summary as (
    select
        customer_id,
        count(*) as total_payments,
        count(case when payment_status = 'succeeded' then 1 end) as successful_payments,
        sum(case when payment_status = 'succeeded' then amount else 0 end) as total_amount_paid,
        min(created_at) as first_payment_at,
        max(created_at) as last_payment_at
    from payments
    group by customer_id
)

select
    customers.customer_id,
    customers.email,
    customers.created_at as customer_created_at,
    coalesce(payment_summary.total_payments, 0) as total_payments,
    coalesce(payment_summary.successful_payments, 0) as successful_payments,
    coalesce(payment_summary.total_amount_paid, 0) as lifetime_value,
    payment_summary.first_payment_at,
    payment_summary.last_payment_at

from customers
left join payment_summary using (customer_id)
```
Pattern 4: Mart Models (Dimensions and Facts)
```sql
-- models/marts/core/dim_customers.sql
{{
    config(
        materialized='table',
        unique_key='customer_id'
    )
}}

with customers as (
    select * from {{ ref('int_payments_pivoted_to_customer') }}
),

orders as (
    select * from {{ ref('stg_shopify__orders') }}
),

order_summary as (
    select
        customer_id,
        count(*) as total_orders,
        sum(total_price) as total_order_value,
        min(created_at) as first_order_at,
        max(created_at) as last_order_at
    from orders
    group by customer_id
),

final as (
    select
        -- surrogate key
        {{ dbt_utils.generate_surrogate_key(['customers.customer_id']) }} as customer_key,

        -- natural key
        customers.customer_id,

        -- attributes
        customers.email,
        customers.customer_created_at,

        -- payment metrics
        customers.total_payments,
        customers.successful_payments,
        customers.lifetime_value,
        customers.first_payment_at,
        customers.last_payment_at,

        -- order metrics
        coalesce(order_summary.total_orders, 0) as total_orders,
        coalesce(order_summary.total_order_value, 0) as total_order_value,
        order_summary.first_order_at,
        order_summary.last_order_at,

        -- calculated fields
        case
            when customers.lifetime_value >= 1000 then 'high'
            when customers.lifetime_value >= 100 then 'medium'
            else 'low'
        end as customer_tier,

        -- timestamps
        current_timestamp as _loaded_at

    from customers
    left join order_summary using (customer_id)
)

select * from final
```
```sql
-- models/marts/core/fct_orders.sql
{{
    config(
        materialized='incremental',
        unique_key='order_id',
        incremental_strategy='merge'
    )
}}

with orders as (
    select * from {{ ref('stg_shopify__orders') }}

    {% if is_incremental() %}
    where updated_at > (select max(updated_at) from {{ this }})
    {% endif %}
),

customers as (
    select * from {{ ref('dim_customers') }}
),

final as (
    select
        -- keys
        orders.order_id,
        customers.customer_key,
        orders.customer_id,

        -- dimensions
        orders.order_status,
        orders.fulfillment_status,
        orders.payment_status,

        -- measures
        orders.subtotal,
        orders.tax,
        orders.shipping,
        orders.total_price,
        orders.total_discount,
        orders.item_count,

        -- timestamps
        orders.created_at,
        orders.updated_at,
        orders.fulfilled_at,

        -- metadata
        current_timestamp as _loaded_at

    from orders
    left join customers on orders.customer_id = customers.customer_id
)

select * from final
```
Pattern 5: Testing and Documentation
```yaml
# models/marts/core/_core__models.yml
version: 2

models:
  - name: dim_customers
    description: Customer dimension with payment and order metrics
    columns:
      - name: customer_key
        description: Surrogate key for the customer dimension
        tests:
          - unique
          - not_null

      - name: customer_id
        description: Natural key from source system
        tests:
          - unique
          - not_null

      - name: email
        description: Customer email address
        tests:
          - not_null

      - name: customer_tier
        description: Customer value tier based on lifetime value
        tests:
          - accepted_values:
              values: ["high", "medium", "low"]

      - name: lifetime_value
        description: Total amount paid by customer
        tests:
          - dbt_utils.expression_is_true:
              expression: ">= 0"

  - name: fct_orders
    description: Order fact table with all order transactions
    tests:
      - dbt_utils.recency:
          datepart: day
          field: created_at
          interval: 1
    columns:
      - name: order_id
        tests:
          - unique
          - not_null
      - name: customer_key
        tests:
          - not_null
          - relationships:
              to: ref('dim_customers')
              field: customer_key
```
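The use cases above also mention custom business-logic tests. In dbt these are singular tests: plain SQL files under `tests/` that fail if they return any rows. A sketch with a hypothetical invariant (the file name and rule are illustrative, not from this project):

```sql
-- tests/assert_refunds_not_exceed_payments.sql (hypothetical singular test)
-- A singular test fails when this query returns one or more rows,
-- so we select the violations: payments refunded for more than was paid.
select
    payment_id,
    amount,
    amount_refunded
from {{ ref('stg_stripe__payments') }}
where amount_refunded > amount
```

`dbt test` picks these up automatically alongside the generic tests defined in YAML.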
Pattern 6: Macros and DRY Code
```sql
-- macros/cents_to_dollars.sql
{% macro cents_to_dollars(column_name, precision=2) %}
    round({{ column_name }} / 100.0, {{ precision }})
{% endmacro %}

-- macros/generate_schema_name.sql
{% macro generate_schema_name(custom_schema_name, node) %}
    {%- set default_schema = target.schema -%}
    {%- if custom_schema_name is none -%}
        {{ default_schema }}
    {%- else -%}
        {{ default_schema }}_{{ custom_schema_name }}
    {%- endif -%}
{% endmacro %}

-- macros/limit_data_in_dev.sql
{% macro limit_data_in_dev(column_name, days=3) %}
    {% if target.name == 'dev' %}
    where {{ column_name }} >= dateadd(day, -{{ days }}, current_date)
    {% endif %}
{% endmacro %}

-- Usage in a model
select * from {{ ref('stg_orders') }}
{{ limit_data_in_dev('created_at') }}
```
Pattern 7: Incremental Strategies
```sql
-- Delete+insert (default on some adapters when a unique_key is set)
{{
    config(
        materialized='incremental',
        unique_key='id',
        incremental_strategy='delete+insert'
    )
}}

-- Merge (best for late-arriving data)
{{
    config(
        materialized='incremental',
        unique_key='id',
        incremental_strategy='merge',
        merge_update_columns=['status', 'amount', 'updated_at']
    )
}}

-- Insert overwrite (partition-based, e.g. BigQuery or Spark)
{{
    config(
        materialized='incremental',
        incremental_strategy='insert_overwrite',
        partition_by={
            "field": "created_date",
            "data_type": "date",
            "granularity": "day"
        }
    )
}}

select
    *,
    date(created_at) as created_date
from {{ ref('stg_events') }}

{% if is_incremental() %}
where created_date >= dateadd(day, -3, current_date)
{% endif %}
```
dbt Commands
```shell
# Development
dbt run                          # Run all models
dbt run --select staging         # Run staging models only
dbt run --select +fct_orders     # Run fct_orders and its upstream
dbt run --select fct_orders+     # Run fct_orders and its downstream
dbt run --full-refresh           # Rebuild incremental models

# Testing
dbt test                         # Run all tests
dbt test --select stg_stripe     # Test specific models
dbt build                        # Run + test in DAG order

# Documentation
dbt docs generate                # Generate docs
dbt docs serve                   # Serve docs locally

# Debugging
dbt compile                      # Compile SQL without running
dbt debug                        # Test connection
dbt ls --select tag:critical     # List models by tag
```
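In CI, dbt's state-based selection can restrict a run to models that changed relative to a previous run's artifacts. A sketch, assuming production artifacts (including `manifest.json`) were saved to a hypothetical `prod-artifacts/` directory:

```shell
# Slim CI: build only modified models and everything downstream of them,
# comparing against the production manifest
dbt build --select state:modified+ --state prod-artifacts/
```

This keeps CI runs proportional to the size of the change rather than the size of the project.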
Best Practices
Do’s
- Use staging layer - Clean data once, use everywhere
- Test aggressively - Not null, unique, relationships
- Document everything - Column descriptions, model descriptions
- Use incremental - For tables > 1M rows
- Version control - dbt project in Git
Don’ts
- Don’t skip staging - Raw → mart is tech debt
- Don’t hardcode dates - Use {{ var('start_date') }}
- Don’t repeat logic - Extract to macros
- Don’t test in prod - Use a dev target
- Don’t ignore freshness - Monitor source data
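The `{{ var('start_date') }}` advice pairs with the `vars:` block shown in dbt_project.yml; a sketch of how a model consumes the variable instead of a hardcoded date:

```sql
-- Filter on a project-level variable rather than a literal date,
-- so the window can be changed in one place (dbt_project.yml or --vars)
select *
from {{ ref('stg_stripe__payments') }}
where created_at >= '{{ var("start_date") }}'
```

The value can also be overridden per run, e.g. `dbt run --vars '{start_date: "2023-01-01"}'`.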
Information
- Author: wshobson
- Updated: 2026-01-30
- Category: productivity-tools