[bugfix] Optimize testcase queries over time period #3256

Merged
vkarak merged 2 commits into reframe-hpc:develop from bugfix/optimize-testcase-queries
Sep 6, 2024

Conversation

vkarak
Contributor

@vkarak vkarak commented Sep 3, 2024

This is a follow-up fix to #3227 and builds on top of #3253.

This PR eliminates the JOIN between the testcases and sessions tables, as this is very detrimental to performance for sessions that run many test cases. The problem with the current implementation is that the JOIN replicates the whole session json_blob for every testcase record, bloating memory consumption and hurting performance significantly.

The approach taken in this PR is to replace the single JOIN query with 2+1 SELECT queries. First, we get the distinct session UUIDs that correspond to the test cases matching the selection criteria. Then we use those session UUIDs to extract their blobs from the sessions table, decoding them and indexing them in memory. Finally, we get the testcase UUIDs matching the selection criteria, and from these we can quickly pick the JSON data of each selected test case.
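
For illustration, the three steps could be sketched roughly as follows. This is not the code from this PR: the table layout (`sessions(uuid, json_blob)`, `testcases(session_uuid, case_index, completion_time)`), the `case_index` column, the `'testcases'` key inside the blob, and the function name `fetch_testcases` are all assumptions made for the example, and the selection criterion is reduced to a simple time window.

```python
import json
import sqlite3


def fetch_testcases(conn, ts_start, ts_end):
    """Return decoded test cases completed in [ts_start, ts_end).

    Hypothetical schema, assumed for this sketch only:
      sessions(uuid, json_blob)
      testcases(session_uuid, case_index, completion_time)
    """
    cond = 'completion_time >= ? AND completion_time < ?'
    args = (ts_start, ts_end)

    # Query 1: distinct sessions that contain at least one matching test case
    rows = conn.execute(
        f'SELECT DISTINCT session_uuid FROM testcases WHERE {cond}', args
    ).fetchall()
    session_uuids = [r[0] for r in rows]
    if not session_uuids:
        return []

    # Query 2: fetch each session blob exactly once, decode it and
    # index it in memory by session UUID
    placeholders = ','.join('?' * len(session_uuids))
    sessions = {}
    for uuid, blob in conn.execute(
        f'SELECT uuid, json_blob FROM sessions WHERE uuid IN ({placeholders})',
        session_uuids
    ):
        sessions[uuid] = json.loads(blob)

    # Query 3: fetch only the small testcase rows and pick the JSON data
    # of each selected test case from the already decoded sessions
    result = []
    for session_uuid, index in conn.execute(
        f'SELECT session_uuid, case_index FROM testcases WHERE {cond} '
        f'ORDER BY completion_time', args
    ):
        result.append(sessions[session_uuid]['testcases'][index])

    return result
```

The point of the split is that each session blob crosses the DB boundary and is decoded once, regardless of how many of its test cases match, whereas a JOIN would return one copy of the blob per matching testcase row.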

I tested this implementation on a DB of 163 sessions and 38K test cases on two systems, and we get 6-10x performance improvements with the --performance-compare / --performance-report options (from 20s to 2s and from 27s down to 4s). The DB query is no longer the limiting performance factor; the JSON decoding and the aggregation of the results are.

@vkarak vkarak added prio: important bugfix reporting Issues related to reporting and processing the test results labels Sep 3, 2024
@vkarak vkarak added this to the ReFrame 4.7 milestone Sep 3, 2024
@vkarak vkarak requested review from ekouts and teojgo September 3, 2024 16:04
@vkarak vkarak self-assigned this Sep 3, 2024
@vkarak vkarak changed the title [bugfix] Optimize testcase queries by time period [bugfix] Optimize testcase queries over time period Sep 3, 2024
@vkarak vkarak force-pushed the bugfix/optimize-testcase-queries branch from 1b40352 to e23a5e4 Compare September 5, 2024 13:26
@vkarak vkarak merged commit fa40822 into reframe-hpc:develop Sep 6, 2024
23 checks passed
@vkarak vkarak deleted the bugfix/optimize-testcase-queries branch September 6, 2024 08:08
Labels
bugfix prio: important reporting Issues related to reporting and processing the test results
Projects
Status: Done
2 participants