AI code review
Buster reviews every PR, runs data diffs to validate changes, and investigates downstream impact. It consistently catches data-specific bugs that other AI code reviewers miss.
PR opened by Nate Sanders
GitHub Pull Request
invoice-project-join-fix
buster
bot
reviewed 1 minute ago
Needs Review: Invoices may “disappear” from project totals
This PR changes how intermediate_finance_materials_invoice assigns each invoice line to a project. It now matches using invoice.po_number (trimmed) instead of the existing project_id link.
What I Found (Data Diff + Lineage Check)
Overall: more invoice lines get a project (+6,461 rows, ~+44%)
Wins: 7,071 rows that previously had no project now get one via po_number
Risk: 610 rows lose a project because some po_number values aren’t usable ids (they’re free-text like names and addresses), so the new match fails
Downstream Impact
Those 610 lines will stop counting toward project spend in downstream models like:
fct_project_spend (facts: invoice lines rolled up to project)
mart_project_costs_daily (daily costs by project)
mart_budget_vs_actual (budget vs actual by project)
This will result in some projects looking artificially cheaper (or “missing invoices”) even though the underlying invoices still exist.
Suggested Fix
Use a COALESCE to fall back to materials_order.project_id when po_number doesn’t match. This restores 100% coverage (0 missing values) while preserving the new matches.
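The fallback Buster suggests could be sketched like this. The snippet below is a minimal, self-contained illustration using Python’s built-in sqlite3; the table and column names (invoice, project, materials_order) are hypothetical stand-ins for the models mentioned above, not the actual dbt code.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE invoice (invoice_id INTEGER, po_number TEXT, order_id INTEGER);
    CREATE TABLE project (project_id INTEGER, po_number TEXT);
    CREATE TABLE materials_order (order_id INTEGER, project_id INTEGER);

    -- One line has a usable PO number; the other has free text in po_number
    INSERT INTO invoice VALUES (1, 'PO-1001', 10), (2, '123 Main St', 11);
    INSERT INTO project VALUES (501, 'PO-1001'), (502, 'PO-2002');
    INSERT INTO materials_order VALUES (10, 501), (11, 502);
""")

rows = con.execute("""
    SELECT
        i.invoice_id,
        -- Prefer the trimmed po_number match; fall back to the existing
        -- project_id link when po_number is not a usable id
        COALESCE(p.project_id, mo.project_id) AS project_id
    FROM invoice i
    LEFT JOIN project p ON TRIM(i.po_number) = p.po_number
    LEFT JOIN materials_order mo ON i.order_id = mo.order_id
    ORDER BY i.invoice_id
""").fetchall()
print(rows)  # [(1, 501), (2, 502)] -- both lines keep a project
```

The free-text line (invoice 2) misses the po_number join but is rescued by the existing order-to-project link, which is exactly the 610-row risk the review flags.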
Alert & issue triage
Buster investigates job failures, identifies root causes, and responds accordingly - opening PRs with fixes, sending Slack alerts, or intelligently filtering out false positives.
Airflow DAG failed
Slack
#data-alerts
8 members
Airflow Alert
APP
3 min ago
🚨 DAG Failed: analytics_daily
Execution Time: 2025-12-14T04:23:17+00:00
Log URL: http://airflow-qos-k8s-r1.example.com/dags/01831
Error Reason: None
1 reply
Buster
APP
1 min ago
What happened:
Test `unique_fct_orders_order_id` failed. 342 duplicate `order_id` values were found in `fct_orders`.
What I found:
Duplicates started 2025-12-14 at 6:00pm
Root cause: Refund adjustments from Stripe (Fivetran) are now syncing as separate rows instead of updating the original transaction
Each refunded order now has 2 rows: original + refund adjustment
Downstream impact:
`fct_orders`: Order count inflated ~2.3%
`mart_daily_sales`: Revenue double-counted for refunded orders
“Quarterly Executive Report” (Hex): GMV is now overstated
I opened a PR with a quick fix:
Updated `stg_orders` to deduplicate by `order_id`, keeping the most recent record and aggregating refund amounts.
PR #612: Handle refund adjustment rows in stg_orders
https://github.com/acme-analytics/dbt/pull/612
View Run
Open in Buster
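The deduplication described in PR #612 could be sketched as follows. This is a simplified, hypothetical Python version of the logic (the real fix lives in a dbt model); the row shapes and timestamps are illustrative only.

```python
from datetime import datetime

# Hypothetical raw rows: a refunded order now syncs as two rows
# (original + refund adjustment) instead of an in-place update.
raw_orders = [
    {"order_id": "A1", "amount": 100.0, "refund_amount": 0.0,
     "synced_at": datetime(2025, 12, 14, 6, 0)},
    {"order_id": "A1", "amount": 100.0, "refund_amount": 25.0,
     "synced_at": datetime(2025, 12, 14, 18, 0)},
    {"order_id": "B2", "amount": 40.0, "refund_amount": 0.0,
     "synced_at": datetime(2025, 12, 14, 7, 0)},
]

def dedupe_orders(rows):
    """Keep the most recent row per order_id, aggregating refund amounts."""
    by_id = {}
    for row in sorted(rows, key=lambda r: r["synced_at"]):
        prev_refund = by_id.get(row["order_id"], {}).get("refund_amount", 0.0)
        merged = dict(row)  # latest row wins for all non-refund fields
        merged["refund_amount"] = prev_refund + row["refund_amount"]
        by_id[row["order_id"]] = merged
    return list(by_id.values())

orders = dedupe_orders(raw_orders)
print(len(orders))  # 2 -- one row per order_id, refunds rolled up
```

Collapsing to one row per `order_id` removes the ~2.3% order-count inflation, and carrying the summed refund forward keeps revenue from being double-counted downstream.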
Upstream change detection
Buster monitors source table schemas, JSON columns, and upstream repositories for changes. When something shifts, it investigates downstream impact and opens a PR to integrate or fix impacted models.
Upstream schema change detected
Slack
#data-alerts
8 members
Buster
APP
1 min ago
🚨 Upstream Schema Change Detected
What changed:
Column `user_type` was renamed to `account_type` in the source database. Detected in Fivetran sync at 2025-03-14T06:15:22+00:00.
What I found:
Traced the change back to PR #1847 in the `core-api` repo: "Refactor user model naming conventions"
`user_type` is referenced in 4 downstream models:
`stg_users` — direct reference, will break
`int_users_enriched` — joins on `user_type`
`fct_user_signups` — filters by `user_type`
`mart_user_segments` — aggregates by `user_type`
Impact if not addressed:
The next dbt run will fail with “column `user_type` does not exist” in `stg_users`
Downstream models (`int_users_enriched`, `fct_user_signups`, and `mart_user_segments`) will be blocked from building
I opened a PR with a quick fix:
Updated `stg_users` to select `account_type as user_type`, maintaining backward compatibility. No changes needed downstream.
PR #681: Hot fix: user_type column rename
https://github.com/acme-analytics/dbt/pull/681
View Run
Open in Buster
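The backward-compatible alias in PR #681 amounts to re-exposing the renamed column under its old name at the staging layer. Here is a rough, self-contained sketch of that pattern using Python’s built-in sqlite3; `raw_users` is a hypothetical source table standing in for the Fivetran-synced data.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# The upstream source now ships `account_type`; downstream models
# still expect `user_type` (names taken from the alert above).
con.execute("CREATE TABLE raw_users (user_id INTEGER, account_type TEXT)")
con.execute("INSERT INTO raw_users VALUES (1, 'free'), (2, 'pro')")

# A staging-layer alias preserves the old contract, so models like
# int_users_enriched and fct_user_signups need no changes.
con.execute("""
    CREATE VIEW stg_users AS
    SELECT user_id, account_type AS user_type FROM raw_users
""")

rows = con.execute(
    "SELECT user_id, user_type FROM stg_users ORDER BY user_id"
).fetchall()
print(rows)  # [(1, 'free'), (2, 'pro')]
```

Because only the staging model touches the raw column name, the rename is absorbed in one place instead of rippling through all four downstream models.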
Build with Buster
Set up triggers for autonomous workflows
Pick from Buster’s prebuilt workflows or define your own. Create triggers for alerts, schedules, or webhooks - each with custom instructions that define how Buster responds.
Runs
run_jzdeqacokljk4ioyuxqjvmkzrf
daily-dbt-audit
Flagged for review
pr_checks.yml
Oct 21, 2025, 4:00 PM
4m, 5s
run_mnaxvqzjkbhs8fmobgxlqhjzrt
upstream-change-review
No issues detected
pr_checks.yml
Oct 21, 2025, 4:00 PM
4m, 5s
run_ynhwertghjkf67asdlkfjhqw
pr-review
No issues detected
pr_checks.yml
Oct 21, 2025, 4:00 PM
4m, 5s
run_cmgvazqbgrh443aoiuoqxjkjh
pr-review
No issues detected
pr_checks.yml
Oct 21, 2025, 4:00 PM
4m, 5s
run_bjwnxfqhlpdt2focvwefklkqz
pr-review
No issues detected
pr_checks.yml
Oct 21, 2025, 4:00 PM
4m, 5s
run_jfakdbaqhlpdt2focvwefklkqz
dbt-test-update
No issues detected
pr_checks.yml
Oct 21, 2025, 4:00 PM
4m, 5s
daily-dbt-audit
5:36 AM
18 monitors passed, 2 anomalies investigated
Ran 20 monitors across staging and mart layers and flagged 2 anomalies:
stg_orders.shipping_address null rate up 34% → traced to new digital_only order type, expected
stg_payments.payment_method new value “klarna” detected → needs mapping
Opened a PR to add “klarna” to the payment method mapping. I also adjusted null rate thresholds for the digital orders monitor so we don’t get false positives going forward.
staging
←
add-klarna-payment-method
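A payment-method mapping of the kind that PR adds might look like the sketch below. The mapping values and the fallback bucket are hypothetical; the point is that unmapped values surface visibly instead of disappearing from reports.

```python
# Hypothetical mapping from raw stg_payments.payment_method values
# to reporting categories, with the newly observed value added.
PAYMENT_METHOD_MAP = {
    "card": "Credit / Debit Card",
    "ach": "Bank Transfer",
    "paypal": "PayPal",
    "klarna": "Buy Now, Pay Later",  # newly detected value
}

def map_payment_method(raw: str) -> str:
    # Fall back to an explicit bucket so future unknown values show up
    # in dashboards rather than silently dropping out.
    return PAYMENT_METHOD_MAP.get(raw.lower(), "Unmapped")

print(map_payment_method("klarna"))     # Buy Now, Pay Later
print(map_payment_method("apple_pay"))  # Unmapped
```

Keeping an explicit "Unmapped" bucket is what lets a monitor like this one catch new values the next time an upstream provider is added.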
Purpose-built for data engineering
Buster has direct warehouse access, optimized data diff tooling, and its own DuckDB instance for statistical analysis. It has a deep understanding of dbt, Airflow, ETL pipelines, lineage, and more.
Enterprise-grade security
Buster is built with enterprise-grade security: state-of-the-art encryption, reliable infrastructure partners, and independently verified security controls.
SOC 2 Type II compliant
Buster has undergone a Service Organization Controls (SOC 2 Type II) audit and a third-party penetration test.
HIPAA compliant
Privacy & security measures to ensure that PHI is appropriately safeguarded.
Permissions & governance
Provision users, enforce permissions, & implement robust governance.
IP protection policy
Neither Buster nor our model partners train models on customer data.
Self-hosted deployment
Deploy in your own air-gapped environment.
Secure connections
SSL and pass-through OAuth available.
FAQs
How do I get started with Buster?
Getting started takes about 10 minutes. Check out the Quickstart guide in our docs to see how.
What kinds of tasks can Buster handle?
Buster especially excels at repetitive data engineering workflows. Any reactive task you might instruct a teammate to do, Buster can automate. You can see a few examples here.
How does Buster work with my existing tools?
Buster integrates directly with your tools across your stack. It has native integrations with tools like dbt Cloud, dbt Core, Airflow, Prefect, GitHub, Slack, major data warehouses, and more. You can see all of our integrations here.
Is Buster secure?
Yes. Buster is SOC 2 Type II compliant and is built with enterprise-grade security practices. You can read more about our security policies here.
How does Buster use my data?
We never train models on your data, and we do not permit third parties to train on your data. You can read more about our data use & privacy policies here.
Can I use my own keys?
Yes. You can bring your own OpenAI or Anthropic API keys.






