

Case Study
Building Trust in Data Migrations with DBT: The Power of audit_helper
During a major dbt refactoring, we faced challenges in ensuring data accuracy—and found a reliable solution in the audit_helper library.
Author – Ankit Devani
Overview
When migrating data to a new warehouse, optimizing dbt models, or cleaning up legacy transformations, even small changes can lead to unexpected discrepancies.
Traditional validation often depends on manual SQL, spreadsheets, or repetitive testing. These methods are slow, tedious, and prone to human error.
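In practice, the manual approach often boils down to hand-written diff queries like the sketch below (the schema and model names here are illustrative):

```sql
-- Rows present in the old model but missing or changed in the new one;
-- swap the operands to find rows that exist only in the new model.
select * from analytics.original_model
except
select * from analytics.refactored_model;
```

Writing and reviewing one of these per model, per direction, per iteration is exactly the toil that adds up.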
That’s where audit_helper comes in.
What Is audit_helper?
Developed by dbt Labs, audit_helper is an open-source dbt package designed to compare two datasets—typically your original and updated models—on a row-by-row and column-by-column basis.

Key Capabilities
Row-by-Row Comparison
Detects added, removed, modified, and unchanged records with precision.
Column-Level Discrepancy Detection
Provides insight into exactly which columns have changed, allowing for pinpointed debugging and deeper trust in the results.
Summary Statistics
High-level differences are clearly summarized, making it easy to assess impact quickly.
dbt-Native Integration
Since it’s designed to work directly with dbt, implementation is simple and non-disruptive to existing workflows.
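For column-level checks, the package also ships a compare_all_columns macro. A minimal sketch (model names and primary key are illustrative, and exact macro names can vary between package versions—check the README for the release you install):

```sql
-- models/audit/audit_by_column.sql
-- Summarizes, per column, how many rows match between the two relations.
{{ audit_helper.compare_all_columns(
    a_relation=ref('original_model'),
    b_relation=ref('refactored_model'),
    primary_key='id'
) }}
```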
Real-World Impact
See how audit_helper transformed our data validation process:
60%
Faster UT and UAT Cycles
By automating the comparison process, we drastically reduced time spent on Unit Testing (UT) and User Acceptance Testing (UAT).
80%
Enhanced Trust in Data
The detailed discrepancy reports gave us confidence that no issues slipped through the cracks.
20%
Less Manual Effort
Seamless integration into our dbt environment meant no complicated setup or external tools, cutting the manual effort needed for validation.
How to Use audit_helper in Your Workflow

Step 1: Install the Package
In your packages.yml, add:
packages:
  - package: dbt-labs/audit_helper
    version: 0.9.0
Then run:
dbt deps

Step 2: Set Up an Audit Model
Create a model in models/audit/ named audit_by_row.sql:
{{ audit_helper.compare_and_classify_relation_rows(
    a_relation=ref('original_model'),
    b_relation=ref('refactored_model'),
    primary_key_columns=['id']
) }}

Step 3: Run the Audit
dbt run --select audit_by_row
You can also tag your audit models to isolate them from your main runs.
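One way to do that is with dbt's standard tag config and node selection (the tag name audit is arbitrary):

```sql
-- At the top of your audit model, e.g. models/audit/audit_by_row.sql
{{ config(tags=['audit']) }}
```

Then `dbt run --select tag:audit` runs only the audit models, while `dbt build --exclude tag:audit` keeps them out of your main builds.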
A Word of Caution
While audit_helper is incredibly useful, keep a few points in mind:
- It’s not designed for historical logging—each audit run replaces the previous result.
- Best used in development or staging environments due to its table replacement behavior.
Quantifying the Impact
In one of our recent projects, we refactored 200+ dbt models with complex logic. What previously required a team and weeks of effort now took just three days with audit_helper.
[Comparison chart: Traditional Approach vs. With audit_helper]
This allowed the team to focus on delivering new value instead of spending time chasing bugs or manually verifying data.
Final Thoughts
By automating data validation with audit_helper, we streamlined complex model checks into a fast, reliable process—catching issues early and eliminating manual guesswork.
This approach not only delivered unmatched speed and accuracy, but also built trust into every row—giving our team full transparency into model changes and the assurance that nothing broke along the way.
When clients know their data is validated and production-ready, it transforms deliverables from “good enough” to confidently dependable—elevating both the work and the partnership.