Generate separate queries instead of OR filters #1035

Merged: 2 commits, Jan 28, 2025

Swatinem
Contributor

In today's episode of "the DB is slow for unexplainable reasons", we found out that querying `compare_commitcomparison` by `base_commit_id` **OR** `compare_commit_id` is horribly slow. It is using a `Seq Scan` in that case for reasons unknown.

Splitting that `OR` filter up into two separate queries makes the DB use the appropriate indices, and the queries become instant. We can do this because the `OR` filter selects the union of the rows matched by each condition, which is equivalent to running both queries separately.

God only knows why, but here we are.
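
As a rough illustration of why the split is safe, here is a minimal sketch using Python's built-in `sqlite3` rather than the project's actual Django models and Postgres database. The table layout, index names, sample rows, and the helper names `make_demo_db`, `ids_with_or`, and `ids_with_split` are all assumptions made for this example. It only shows that the two separate queries together select the same rows as the `OR` filter; it does not reproduce the Postgres planner behavior.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny in-memory stand-in for compare_commitcomparison (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE compare_commitcomparison ("
        "id INTEGER PRIMARY KEY, base_commit_id INTEGER, compare_commit_id INTEGER)"
    )
    # One index per column, so each single-column query can use its own index.
    conn.execute("CREATE INDEX idx_base ON compare_commitcomparison (base_commit_id)")
    conn.execute("CREATE INDEX idx_compare ON compare_commitcomparison (compare_commit_id)")
    conn.executemany(
        "INSERT INTO compare_commitcomparison VALUES (?, ?, ?)",
        [(1, 10, 20), (2, 20, 30), (3, 30, 40), (4, 10, 10)],
    )
    return conn


def ids_with_or(conn: sqlite3.Connection, commit_id: int) -> set[int]:
    """Single query with an OR filter across both columns."""
    rows = conn.execute(
        "SELECT id FROM compare_commitcomparison "
        "WHERE base_commit_id = ? OR compare_commit_id = ?",
        (commit_id, commit_id),
    )
    return {row[0] for row in rows}


def ids_with_split(conn: sqlite3.Connection, commit_id: int) -> set[int]:
    """Two separate single-column queries, combined as a set union."""
    by_base = conn.execute(
        "SELECT id FROM compare_commitcomparison WHERE base_commit_id = ?",
        (commit_id,),
    )
    by_compare = conn.execute(
        "SELECT id FROM compare_commitcomparison WHERE compare_commit_id = ?",
        (commit_id,),
    )
    return {row[0] for row in by_base} | {row[0] for row in by_compare}
```

A row that matches both conditions shows up in both result sets. For deletions that overlap is harmless, since the second query simply no longer finds a row the first one already removed.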

For the case where we were deleting models with files in chunks, the actual `DELETE` query was chained onto the existing input queryset.
As a result, the query had the form `WHERE original_filter AND id IN (original_filter...)`.
Now only the `IN` part should remain.
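
A simplified sketch of that chunked-delete shape, again with plain `sqlite3` instead of the Django ORM. The function name `delete_in_chunks`, the `storage_path` column, and the parameters are hypothetical, and the chunk's ids are materialized into a list here instead of being passed as a subquery. The point is only that the `DELETE` filters on `id IN (...)` alone, without repeating the original filter.

```python
import sqlite3


def delete_in_chunks(conn, table, where_sql, params, chunk_size=2):
    """Delete rows matching `where_sql` chunk by chunk; return the file paths handled.

    `table` and `where_sql` are interpolated into the SQL text, so they must
    come from trusted code, never from user input.
    """
    handled_paths = []
    while True:
        # The original filter is applied only when selecting the next chunk.
        rows = conn.execute(
            f"SELECT id, storage_path FROM {table} WHERE {where_sql} LIMIT ?",
            (*params, chunk_size),
        ).fetchall()
        if not rows:
            break
        ids = [row[0] for row in rows]
        # Stand-in for deleting the files from storage (hypothetical step).
        handled_paths.extend(row[1] for row in rows)
        # The DELETE itself filters on the ids alone.
        placeholders = ",".join("?" * len(ids))
        conn.execute(f"DELETE FROM {table} WHERE id IN ({placeholders})", ids)
    return handled_paths
```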

Also adds a bunch of tracing annotations to better see each phase, in particular these file deletions.
@Swatinem Swatinem requested a review from a team January 27, 2025 15:37
@Swatinem Swatinem self-assigned this Jan 27, 2025
codecov-notifications bot commented Jan 27, 2025

Codecov Report

Attention: Patch coverage is 97.36842% with 1 line in your changes missing coverage. Please review.

✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| services/cleanup/models.py | 87.50% | 1 Missing ⚠️ |

Base automatically changed from swatinem/more-simple-deletes to main January 28, 2025 09:24
@Swatinem Swatinem added this pull request to the merge queue Jan 28, 2025
Merged via the queue into main with commit 8f00a29 Jan 28, 2025
20 checks passed
@Swatinem Swatinem deleted the swatinem/separate-or branch January 28, 2025 09:51