Data Integration · Data Products SaaS · Canada

3× Faster Data Loads: Rebuilding a Broken Analytics Pipeline

The situation

A growing data products company had operational data locked inside SQL Server with no reliable path to analytics. Stored procedures ran sequentially, taking hours to complete. Power BI reports were surfacing wrong numbers — missing columns and mismatched data types — and the sales and marketing team had stopped trusting the dashboards entirely.

What we built

We built end-to-end ETL pipelines that extract from SQL Server, transform with dbt and DuckDB, and write partitioned Parquet files to Azure Data Lake Storage. We authored parallelized Airflow DAGs for the stored procedure loads, cutting runtime by two-thirds. We fixed the data type and column gaps in the source procedures that were corrupting downstream reports. We also set up unit tests for the DAG suite with pytest and linting with flake8, both wired into GitHub Actions for continuous validation.
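To illustrate why parallelizing independent loads shortens total runtime, here is a minimal sketch in plain Python. It is not the production DAG code: it swaps Airflow for a standard-library thread pool, the procedure names are hypothetical, and each stored procedure call is simulated with a short sleep. In Airflow, the equivalent would be tasks with no dependencies between them, which the scheduler is free to run at the same time.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical procedure names; the real ones are not given in this case study.
PROCEDURES = ["usp_load_orders", "usp_load_customers", "usp_load_products"]


def run_procedure(name: str, seconds: float = 0.2) -> str:
    """Stand-in for executing one stored procedure (sleeps instead of calling SQL Server)."""
    time.sleep(seconds)
    return name


def run_sequential(names):
    """Run each load one after another and return (completed names, elapsed seconds)."""
    start = time.perf_counter()
    done = [run_procedure(n) for n in names]
    return done, time.perf_counter() - start


def run_parallel(names, max_workers=3):
    """Run independent loads concurrently; results keep the input order."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        done = list(pool.map(run_procedure, names))
    return done, time.perf_counter() - start


if __name__ == "__main__":
    _, seq = run_sequential(PROCEDURES)
    _, par = run_parallel(PROCEDURES)
    print(f"sequential: {seq:.2f}s, parallel: {par:.2f}s")
```

With three independent loads of similar duration, the parallel run should finish in roughly the time of a single load. Real gains depend on how many procedures are truly independent and on how much concurrent work the database can handle.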

The outcome

Pipeline runtime fell by two-thirds, making loads 3× faster. Sales and marketing reclaimed their dashboards. The data team went from firefighting broken loads to shipping new pipelines, and the infrastructure now has tests, automation, and documentation any engineer on the team can pick up.
