Behind the Error Message: Solving row_count package bug in dbt

Tatum Brannan
Red Pill Analytics
Published in
4 min readSep 28, 2023

--

Photo by Ashley Batz on Unsplash

Exploring dbt Test Features

dbt provides a variety of intrinsic testing options. Among these, you’ll find fundamental tests like unique, not_null, accepted_values, and relationships.

Additionally, dbt offers external packages, and this blog post centers on one of them: dbt_utils. Inside this package, there’s a valuable test for comparing row counts.

To put it simply, this test evaluates two models by comparing their respective row counts to determine if they align. This testing process holds significance for validation, ensuring data integrity, and pinpointing potential project issues.

The Issue

In a dbt project within Big Query, a problem occurs when utilizing the dbt_utils.equal_rowcount package.

This problem surfaces when one of the tables being compared is empty, as the test is not equipped to handle such scenarios within Big Query.

The error message indicates that “none” is not recognized as the expected ‘integer’ type. In typical cases with other applications, querying the row count of an empty table yields a result of 0 (an integer).

However, the equal_rowcount function returns “none,” implying a potential Python-related issue within the function.

Resolving the Issue

We will develop a custom test to replace the dbt_utils rowcount function. This new test will consistently return 0 rows instead of null, effectively preventing the error from occurring.

Here’s the step-by-step process:

  • Generate a custom test function.
  • Integrate this custom test into the YAML configuration.
  • Execute dbt to observe the bug vanishing instantly — like magic!

This custom test, housed in the test section of the dbt project, functions similarly to the equal rowcount dbt package but ensures error-free operation.

The Code

{% test equal_rowcount_bigquery (model,column_name, compare_model) %}

with rpt as (
SELECT count(*) as test_model_count
FROM {{model}}
), valid as (
select count(*) as compare_to_model_count
from {{compare_model}}
)
select *
from rpt,valid
where rpt.test_model_count != valid.compare_to_model_count

{% endtest %}

This jinja snippet is created in a new test model within the dbt project. The test compares the row count of two different models.

Within the yaml of one of the two models, (in this case we are comparing compare_to_model to test_model) a rowcount test was added.

- name: test_model
tests:
- equal_rowcount_bigquery:
compare_model: ref('compare_to_model')

This test involves selecting the count of the test_model and then comparing that count to the rowcount of the empty compare_to_model.

What Comes Next?

Now that the test has been put into action and can be evaluated, one of two outcomes will occur: it either fails or passes.

Failures

When the test doesn’t pass, there are additional actions to take.

Upon test failure, it will generate compiled SQL code, which gets stored in the target file within your dbt project. This compiled code should then be copied and pasted into Big Query for further analysis to pinpoint the cause of the failure.

The failure signals a mismatch in row counts between the models. The SQL output will reveal the row counts in each model and offer guidance on resolving the model discrepancies.

Sql query with bug:

testing failure, sql code from target

Sql query without bug:

When the tests are executed with the recommended solution, they will no longer trigger the “none is not of type integer” error. Instead, they will accurately report 0 rows instead of null, effectively sidestepping the frustrating bug.

Passes

If the new custom generic test successfully passes, it signifies that the source table matches the target, which is always a favorable outcome.

In this scenario, there won’t be any SQL code generated in the target due to the successful test outcome. However, if there were, it would resemble the following:

Conclusion

The challenge involving the dbt_utils.equal_rowcount package has been successfully addressed. The issue caused by empty tables has been resolved by implementing a custom test that consistently returns 0 rows, eliminating the error.

Is your organization adapting to remain competitive? In the Architect Room, we design a roadmap for the future of your digital organization, while still capitalizing on current technology investments. Our knowledge spans a variety of offerings across all of the major public cloud providers. Visit Red Pill Analytics to learn more.

--

--