
Add Performance Regression Tests #3054

Closed · nathaniel-may opened this issue Feb 5, 2021 · 2 comments · Fixed by #3602
Labels: 1.0.0 (Issues related to the 1.0.0 release of dbt), enhancement (New feature or request), performance, repo ci/cd (Testing and continuous integration for dbt-core + adapter plugins)


Describe the feature

Add testing to detect whether key performance metrics are within an acceptable threshold.

Additional context

While we work to improve dbt's startup time on large projects, regression testing will prevent future changes from slowing startup beyond an acceptable threshold. Testing the parse time of large projects is a good place to start, now that we have the dbt-timing-project and the dbt parse command.
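
For illustration, a rough manual spot check of parse time might look like this (a sketch; it assumes a large test project checkout and a dbt version that includes the dbt parse command):

```bash
# Rough manual spot check of parse time (a sketch; assumes a dbt version
# that includes the `dbt parse` command and a large test project checkout).
cd path/to/large-project

# Time a full parse of the project; `dbt parse` also writes per-stage
# timing details under target/ (perf_info.json).
time dbt parse
```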

Who will this benefit?

This will benefit all users, because it protects them from performance regressions in future releases, and dbt contributors, because regressions will be caught before merging and releasing.

@nathaniel-may added the enhancement, performance, and repo ci/cd labels on Feb 5, 2021
@jtcohen6 added the 1.0.0 label on Jun 28, 2021
@nathaniel-may (Contributor, Author) commented on Jul 14, 2021

Implementation details:

  • use the hyperfine benchmarking tool
  • define a tiny bash script in the dbt repo for each performance metric we want to track / compare (e.g. standard parsing, partial parsing, the experimental parser); a sketch of one such script follows this list
  • commit artificial, sanitized dbt projects that represent different performance characteristics (e.g. many doc blocks, huge projects, tiny projects, lots of macros, projects that replicate any new performance issues found in the field)
  • together these form a grid: each performance metric runs against each example project, and the results can be compared across dbt versions and development branches
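
As a concrete sketch of one such per-metric script (the file name, project path, and run counts here are assumptions, not decisions):

```bash
#!/usr/bin/env bash
# benchmark-parse.sh -- a sketch of one per-metric script (file name and
# project path are hypothetical). Benchmarks a full parse of one
# committed test project with hyperfine.
set -euo pipefail

# Hypothetical committed project representing one performance profile.
PROJECT_DIR="performance/projects/many-models"

# Clear target/ before each timed run so every run measures a full parse
# rather than a partial one.
hyperfine \
  --warmup 1 \
  --runs 10 \
  --prepare "rm -rf ${PROJECT_DIR}/target" \
  --export-json parse-results.json \
  "cd ${PROJECT_DIR} && dbt parse"
```

A partial-parsing variant of the same script could instead warm the cache in the --prepare step, so the timed runs measure a re-parse.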

Decisions to make:

  • How should we make this runnable on demand? A local script, GitHub Actions, etc.?
  • When does this run in CI? Nightly, on every PR, on tagged branches? These tests may take a long time to run as the metrics and example projects grow.
  • What is our definition of a regression? Hyperfine looks at the variance across multiple runs and reports a reasonable error margin for the performance numbers. We want to catch performance creep, but we don't want to be mashing rerun to get these tests to pass. (One possible shape for this check is sketched after this list.)
  • How do we want failures to be reported? An internal Slack message, something in GitHub Actions?
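
One possible shape for that regression check, using hyperfine's JSON export (a sketch: the 5% threshold, file names, and use of jq are assumptions, not decisions):

```bash
#!/usr/bin/env bash
# compare-results.sh -- a sketch of a regression check (file names and the
# 5% threshold are assumptions). Compares hyperfine JSON exports from a
# baseline release and a candidate branch.
set -euo pipefail

BASELINE_MEAN=$(jq '.results[0].mean' baseline.json)
CANDIDATE_MEAN=$(jq '.results[0].mean' candidate.json)

# Fail if the candidate's mean runtime is more than 5% slower than the
# baseline. A real check might also weigh hyperfine's reported stddev so
# normal run-to-run noise doesn't fail the build.
if awk -v b="$BASELINE_MEAN" -v c="$CANDIDATE_MEAN" \
       'BEGIN { exit !(c > b * 1.05) }'; then
  echo "regression: candidate mean ${CANDIDATE_MEAN}s vs baseline ${BASELINE_MEAN}s" >&2
  exit 1
fi
```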

Future work:

  • If we store these results, we can look at trends over time.
  • We could publish our findings publicly. See the rustc performance metrics.
  • The first iteration probably won't test anything that requires a connection to a warehouse. Testing those commands (compile, run, build, etc.) should be added in the future.

@jtcohen6 (Contributor) commented on Aug 2, 2021

@nathaniel-may @leahwicz In advance of the cycle turnover on Wednesday, could we:

  • Summarize progress on this issue thus far
  • Open a new issue to track spillover into the next cycle
  • Estimate that new issue and add it to [Tracking] Q3 Cycle 2 #3467
