Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sort nodes in graph object by memory impact when building subdag #7271

Closed
wants to merge 2 commits into from

Conversation

ttusing
Copy link
Contributor

@ttusing ttusing commented Apr 4, 2023

Partially resolves #7195

Description

This implements the node sorting suggested in #7195. This is contained partially in #7194 and a separate PR is opened here in case the other changes in #7194 are not ready to be reviewed. This also includes a performance check for this change alone with the other new performance enhancements merged into the 1.5 beta (#7191).

Performance tests:

Current 1.5 beta:

dbt=1.5.0-b5
(env)  % dbt build --no-version-check -s mymodel
19:05:38  Running with dbt=1.5.0-b5
19:05:41  Found 306 models, 3346 tests... 
19:06:10  
19:06:16  
19:06:16  Running 1 on-run-start hook
19:06:16  1 of 1 START hook: on-run-start.0 ...................................... [RUN]

48 seconds

This branch:

(env)  % dbt build --no-version-check -s mymodel
19:03:39  Running with dbt=1.5.0-b5
19:03:42  Found 306 models, 3346 tests...
19:03:42  
19:03:48  
19:03:48  Running 1 on-run-start hook
19:03:48  1 of 1 START hook: on-run-start.0 ...................................... [RUN]

9 seconds

Because of the huge performance gains, I believe many teams would benefit from getting this change into DBT as soon as possible.

Checklist

@cla-bot cla-bot bot added the cla:yes label Apr 4, 2023
@ttusing ttusing changed the title Ct 2316/sort nodes Sort nodes in graph object by memory impact when building subdag Apr 4, 2023
@ttusing ttusing marked this pull request as ready for review April 4, 2023 19:55
@ttusing ttusing requested a review from a team April 4, 2023 19:55
@ttusing ttusing requested a review from a team as a code owner April 4, 2023 19:55
@ttusing ttusing requested review from iknox-fa and Fleid April 4, 2023 19:55
@stu-k stu-k requested a review from peterallenwebb April 4, 2023 19:56
@ttusing
Copy link
Contributor Author

ttusing commented Apr 4, 2023

@iknox-fa @peterallenwebb I spun out this PR in case some of the changes in #7194 require more in-depth review. I also re-ran performance checks, this sorting change alone improves at least my DBT project for developing a single model by 5x compared to 1.5.0-b5 and it seems highly unlikely to have any unintended impacts.

@ttusing
Copy link
Contributor Author

ttusing commented Apr 7, 2023

Closing since #7194 was merged

@ttusing ttusing closed this Apr 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CT-2316] [Feature] Improve subset_graph selection by removing trivial nodes and improving removal order
1 participant