
Split out test-iasworld-data workflow from test-dbt-models so that data integrity tests will fail properly #671

Merged

Conversation

Contributor

@jeancochrane commented Dec 5, 2024

There is currently a bug in our weekly scheduled data integrity tests: Failing tests do not cause the workflow to fail, so we never get notified of them. (See here for an example workflow run that should have failed.) This behavior makes sense in context: The test-dbt-models workflow was originally intended to run iasWorld tests exclusively, and since we expect some iasWorld tests to fail, the workflow deliberately does not fail when tests do. That doesn't work for data integrity tests, however, since in that case the workflow should fail if any test fails.

I thought about modifying the test-dbt-models workflow and its underlying run_iasworld_data_tests.py script to add a new flag that exits with a nonzero status if a test fails, but I realized this would bloat run_iasworld_data_tests.py with a feature it doesn't need. What's more, it doesn't make sense to use the run_iasworld_data_tests.py script to run data integrity tests in the first place, since iasWorld tests and data integrity tests have been categorically different since #638. In my opinion, this mismatch is the root cause of the bug, and it will continue causing problems until we disentangle the two testing workflows.

To fix the problem at the root, this PR splits test-dbt-models into two workflows with distinct purposes:

  • test-iasworld-data, which runs iasWorld tests via workflow dispatch using the run_iasworld_data_tests.py script
  • test-dbt-models, which runs any dbt tests via workflow dispatch or runs data integrity tests on a schedule using direct dbt test calls

As a result, we can configure test-dbt-models to fail if any tests fail, while test-iasworld-data can maintain its current behavior by not failing when tests fail.
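For a rough sense of what the scheduled data integrity configuration can look like, here is a hedged sketch (not the exact workflow added in this PR; the cron schedule, selector tag, working directory, and setup steps are placeholders). The key point is that calling `dbt test` directly lets a failing test's nonzero exit code fail the step and therefore the workflow:

```yaml
# Sketch only: schedule, selector, and paths are illustrative placeholders
name: test-dbt-models

on:
  workflow_dispatch:
  schedule:
    - cron: "0 6 * * 1"  # weekly data integrity run (illustrative)

jobs:
  test-dbt-models:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # (dbt and adapter setup steps omitted for brevity)

      - name: Run data integrity tests
        # `dbt test` exits nonzero when any test fails, which fails this
        # step and the workflow run, so scheduled failures get surfaced
        run: dbt test --select "tag:data_integrity"
        working-directory: dbt
```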

Examples of the two workflows running as expected:

Note

Since this PR changes the purpose of the test-dbt-models workflow, we need to update service-spark-iasworld so that it dispatches the new test-iasworld-data workflow instead. See this companion PR: ccao-data/service-spark-iasworld#25

Comment on lines +49 to +51
# Default to no select option, which will fall back to the default
# behavior of the underlying Python script
SELECT_OPTION=""
@jeancochrane (author):

This block is different from the old implementation of test-dbt-models: Now that this workflow is dedicated to running iasWorld tests, we can fall back to the default behavior of the underlying run_iasworld_data_tests.py script, which means we can avoid passing in the --select or --selector options unless the caller specifies one of them.
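For illustration, the kind of conditional this enables inside the workflow's run step might look like the following sketch (the input name and script path here are hypothetical, not copied from the workflow):

```bash
# Build the select option only when the caller provides one; otherwise
# leave it empty so the Python script's own defaults apply.
# (`inputs.select_target` and the script path are illustrative names.)
SELECT_OPTION=""
if [[ -n "${{ inputs.select_target }}" ]]; then
  SELECT_OPTION="--select ${{ inputs.select_target }}"
fi

python scripts/run_iasworld_data_tests.py $SELECT_OPTION
```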

Comment on lines +3 to +8
# This workflow is not configured to run on PRs, because PRs have their own CI
# process defined by the `build-and-test-dbt` workflow. Instead, the Data
# Department's Spark data extraction process dispatches it via API once per
# day after finishing the daily ingest of iasWorld data. For more detail, see:
#
# https://github.com/ccao-data/service-spark-iasworld
@jeancochrane (author):

This comment is slightly different from the old test-dbt-models version, to reflect the new conditions under which this workflow runs.
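For reference, dispatching a workflow like this via the API from an external process (e.g. after the daily iasWorld ingest finishes) can look something like the sketch below; the workflow file name, repository placeholder, and ref are assumptions on my part:

```bash
# Trigger the workflow_dispatch event for the new workflow.
# <owner>/<repo> and the workflow file name are placeholders.
gh workflow run test_iasworld_data.yaml \
  --repo <owner>/<repo> \
  --ref master
```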

Comment on lines +136 to +138
subject: "iasWorld tests errored for workflow run: ${{ github.run_id }}"
body: |
iasWorld tests raised an error for workflow ${{ github.run_id }}, run on ${{ env.TIMESTAMP }} UTC
@jeancochrane (author):

This subject and body are now slightly different from the old implementation ("iasWorld tests" instead of "dbt tests") to make it clear that the message specifically relates to iasWorld tests.

@@ -0,0 +1,141 @@
name: test-iasworld-data
@jeancochrane (author):

This workflow should be nearly identical to the old implementation of test-dbt-models, but GitHub can't pick up on the diff because we're keeping test-dbt-models and adapting it for a slightly more specific purpose. I'll call out every change I made in comments below.

@jeancochrane changed the title from "Split out test-iasworld-data workflow from test-dbt-models so that data integrity tests will fail" to "Split out test-iasworld-data workflow from test-dbt-models so that data integrity tests will fail properly" on Dec 5, 2024
@jeancochrane marked this pull request as ready for review December 5, 2024 23:44
@jeancochrane requested a review from a team as a code owner December 5, 2024 23:44
@jeancochrane requested a review from dfsnow December 5, 2024 23:44
@dfsnow (Member) left a comment:


Alright @jeancochrane, this looks good to me. I think the separation here makes sense. Let's get this and the Spark PR merged today.

One tiny blocking issue: we need to update the README to reflect the new workflows and when they're run.

.github/workflows/test_dbt_models.yaml (review comment outdated and resolved)
$DEFER_OPTION
working-directory: ${{ env.PROJECT_DIR }}
shell: bash
env:
@dfsnow (Member):

question (non-blocking): Why pass these in as env vars rather than just interpolating them directly into the bash?

@jeancochrane (author) commented on Dec 6, 2024:

Good question! There's no functional reason; I just felt it was clearer in this particular case. For USER, GIT_SHA, GIT_REF, and GIT_AUTHOR, the script expects these environment variables to be set, so we need to convert the GitHub variables to env vars anyway. For SELECT_OPTION, CACHE_HIT, and SKIP_ARTIFACTS_OPTION, the GitHub variable references felt a bit long and esoteric, and they read more easily as env vars. This is definitely a weak preference on my part; I'm open to changing any of these if you think a GitHub variable would be clearer.
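To illustrate the trade-off being discussed, here is a hypothetical fragment; the step name, variable names, and expressions are placeholders rather than the workflow's actual contents:

```yaml
- name: Run iasWorld data tests
  run: python scripts/run_iasworld_data_tests.py $SELECT_OPTION
  shell: bash
  env:
    # The script reads these directly from the environment, so they
    # have to be exported as env vars regardless
    GIT_SHA: ${{ github.sha }}
    GIT_REF: ${{ github.ref_name }}
    # This one could instead be interpolated inline, e.g.
    #   run: python scripts/run_iasworld_data_tests.py ${{ steps.select.outputs.option }}
    # but the short env var name keeps the command easier to read
    SELECT_OPTION: ${{ steps.select.outputs.option }}
```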

@jeancochrane merged commit 1e65774 into master on Dec 6, 2024
7 checks passed
@jeancochrane deleted the jeancochrane/split-out-iasworld-data-test-workflow branch December 6, 2024 21:40