Unit testing SQL: the what, the why, and the how

Dec 3, 2023

A core draw of dbt is that it allows data practitioners to bring software engineering best practices into data workflows. One such practice is unit testing – testing small, functional pieces of code (hence the term “unit” testing) while isolating or mocking its inputs to ensure the desired behavior is always achieved. Unit testing makes code more resilient to changes and allows software engineers to develop on codebases more confidently.

Although basic column-level tests are ubiquitous in data transformation tooling such as dbt, unit testing is not as popular. That said, there is enormous utility in unit testing SQL transformations to make them less error-prone and easier to maintain over long periods of time.

A primer

If you aren’t familiar with unit testing, read on; otherwise, feel free to skip to the next section.

Unit testing involves three components: a function you want to test, the inputs to the function, and the outputs of the function given a set of inputs. Anything with a set of inputs and corresponding outputs can be unit tested: a function in a programming language, an Airflow task, a SQL query, or a dbt model. Even spreadsheets.

Here’s a simple unit test in Python:

# The function I want to test
def add_integers(a: int, b: int) -> int:
    return a + b

# A set of inputs I want to test against
test_case_inputs = [2, 4]

# The output that I am expecting from the function, given the above test case
test_case_output = 6

# The test
def test_two_positive_integers(a: int, b: int, result: int) -> int:
    test_result = add_integers(a=a, b=b)

    # Assert that the two are equal, otherwise the test fails
    assert result == test_result, f"Expected {result}, got {test_result}"

Once the business logic is defined for a function, you have an expectation of what the output looks like for a set of inputs. Unit testing involves defining inputs (i.e. test cases), passing them into a function, and comparing the function’s output with the expected output. If the two outputs match, the test passes; otherwise, it fails.

Unit testing SQL

At a high level, writing a SQL query to calculate a metric typically involves the following steps.

Validate business logic for the metric with stakeholder(s)
Write a SQL query that encodes the business logic on source data
QA and validate output. This might involve validating the data visually, computing summary statistics, and finally pulling in relevant stakeholders or domain experts if necessary.

Although not all SQL needs to be unit tested, in some cases, unit-testing SQL can help speed up development and improve the long-term reliability of a downstream data product.

Consider an accrual revenue metric. Accrual revenue applies revenue matching principles to record revenue from transactions in the period that they were earned in rather than in the period in which the payment was received. Doing any sort of revenue recognition in SQL is notoriously difficult to get right and involves a complex set of business rules.

A large reason for the complexity in calculating this metric correctly is because of the number of business rules that apply. Anecdotally speaking, what tends to happen is this:

Write SQL
Validate results manually; observe that the calculation is correct for only 3 out of 9 cases.
Iterate on SQL to cover the remaining cases; validate and observe the calculation is now correct for 5 out of 9 cases.
Iterate on SQL to cover the remaining cases; validate and observe the calculation has now broken for the cases validated previously, but oddly works for a different set of cases.
Spend a lot of time to get it right and finally ship the SQL query in a data product only to later find out the calculation broke because a case that didn’t exist in the data at the time did not get tested for.

How does unit testing help here?

Unit testing enables quicker feedback loops. The feedback loop above is lengthy: test cases have to be manually validated when changes to the query are made. This involves a lot of context-switching, tracking down data in source systems, and manual reconciliation of data. There may also be performance-related issues when using large volumes of source data to run tests. The validation step often gets done poorly or overlooked for these reasons. However, by defining named unit tests that test for highly specific cases, validation can be automated and incorporated into the development workflow.
Unit testing helps cover a larger set of test cases. Testing against source data is inherently limited because cases that haven’t occurred in source data do not get tested for. It is easier to ship unit-tested SQL with confidence because unit tests require you to test against mocked inputs rather than source data.
Unit testing helps document business logic. A second-order effect of unit testing SQL is that it forces you to think about the different ways in which the business logic could fail to be captured accurately in a SQL transformation. Unit-tested SQL forces the business logic to live beyond just a Google Doc and into a state that can be reliably tested against.
Improved long-term maintainability. When a SQL query or dbt model inevitably undergoes changes, unit tests ensure existing business logic encoded in the query does not break.

Implementing unit tests in SQL

A SQL query is simply a function with a set of inputs (i.e. rows of data) and outputs (i.e. query results), and like any other function, it can be unit tested.

Image 2: Unit testing the above SQL transformation

Inputs

The purpose of a unit test is to test the logic of a function in isolation, so it is critical to ensure its inputs are controlled.

A misstep would be to use source data as an input to your test. Why is this a bad idea? Good unit tests isolate the logic of the function being tested from external dependencies. When unit testing SQL, references to source data in SQL queries are replaced with references to mock data. Doing this ensures that a test failure is directly indicative of a failure in the function’s logic (rather than a failure in some part of the system external to the function).

Kent Beck’s Test Desiderata documents a number of desirable properties for unit tests and I highly recommend reading through these.

Outputs

Outputs are used to define the results expected from a SQL query given a set of mocked inputs. If the actual query results match expected results defined, the test passes.

Unit testing SQL: interactive notebook

The following Jupyter Notebook hosted on Google Colab shows a basic implementation of unit testing a SQL transformation.

Closing thoughts

As far as I can tell, there is currently a lack of tooling to support unit testing in SQL, which is why I relied on a lightweight, homegrown implementation of a unit testing framework with Python and DuckDB in the interactive notebook linked above. This implementation only scratches the surface of what’s possible with a fully-featured SQL unit testing framework, which naturally belongs in a data modeling or transformation layer such as dbt.

dbt Core 1.8 – set to release in Spring 2024 – ships with unit testing capability. This is an exciting development because it makes SQL unit testing accessible to a broader range of data practitioners. It’s also great to see a vibrant public discussion on its use cases and technical implementation (see roadmap here, and discussions here and here).

References

Interactive notebook
dbt Core 1.8 roadmap
Unit testing in dbt (discussion #1, #2)
Kent Beck - Desirable Unit Tests