-
-
Notifications
You must be signed in to change notification settings - Fork 593
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf: Handle Trivy scan results asynchronously #3571
base: master
Are you sure you want to change the base?
Conversation
Use CompletableFutures with a FixedThreadPool Executor to asynchronously handle each scan result from Trivy Signed-off-by: robert-blackman <robert.blackman@monster.com>
Difference in classloaders seems to cause this when the common pool (used by parallelStream) tries to load a class for the first time Signed-off-by: robert-blackman <robert.blackman@monster.com>
Signed-off-by: robert-blackman <robert.blackman@monster.com>
Coverage summary from CodacySee diff coverage on Codacy
Coverage variation details
Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: Diff coverage details
Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: See your quality gate settings Change summary preferencesYou may notice some variations in coverage metrics with the latest Coverage engine update. For more details, visit the documentation |
Thanks for the PR @robert-blackman! Very much appreciate folks looking into unreleased code, it's the best opportunity to tweak and fix things! How is the performance holding up when uploading many BOMs? Processing records in a separate thread pool can potentially have the opposite effect. For example, with a worker pool thread of 20, DT could potentially be executing the |
Good point @nscuro! Maybe a better approach (perhaps a naive one 😂) would be to initialise a new executor with each |
In that case we may end up creating more contention as threads compete for database connections. One I believe throwing concurrency at the problem might not solve it without creating new issues. Instead, we may want to look at making the code more efficient. One example would be here: dependency-track/src/main/java/org/dependencytrack/tasks/scanners/TrivyAnalysisTask.java Line 447 in 35c7f86
Where we fetch a
dependency-track/src/main/java/org/dependencytrack/tasks/scanners/TrivyAnalysisTask.java Line 319 in 35c7f86
There's also logic here that checks whether a vulnerability exists, and creates it if necessary: dependency-track/src/main/java/org/dependencytrack/tasks/scanners/TrivyAnalysisTask.java Lines 450 to 456 in 35c7f86
This is executed for every Component<->Vulnerability pair. So if the same vulnerability is reported for multiple components, we're doing redundant work. So, to summarize, there are multiple opportunities here to structure the code in such a way that:
|
Thanks for the feedback @nscuro 🙂 My original approach really revolved around making the smallest change possible in order to meet the release window, but it did seem that the code in general could do with some optimisation. I'll have a go at implementing some form of local cache to handle the proposed optimisations, not sure if you had an idea for how that would best be implemented? A simple map would do the trick, but I'm not sure if something more robust would be warranted, for example caching vulnerabilities across scans 🤔 I do think (from our usage at least) that concurrency might still be necessary. For example, we are seeing some scans drop from 10-12 minutes down to 1:30-2:30 minutes. There are likely some threadpools out there that are better suited to this use case where fairness can be taken into account, did you have any thoughts on how concurrency might be best achieved here? New plan:
|
Use CompletableFutures with a FixedThreadPool Executor to asynchronously handle each scan result from Trivy
Description
This change allows the
TrivyAnalysisTask
to handle scan results asynchronously by borrowing the existingCompletableFutures
/CountDownLatch
pattern being used in theSnykAnalysisTask
. TheFixedThreadPool
can be configured usingTRIVY_THREAD_POOL_SIZE
(default: 10).Addressed Issue
fixes #3570
Additional Details
SCANNER_TRIVY_IGNORE_UNFIXED
is no longer checked for each vulnerability, as this is expensive and seemingly is not necessary. Instead it is checked once per analysis task.Checklist
[ ] This PR fixes a defect, and I have provided tests to verify that the fix is effective[ ] This PR implements an enhancement, and I have provided tests to verify that it works as intended[ ] This PR introduces changes to the database model, and I have added corresponding update logic[ ] This PR introduces new or alters existing behavior, and I have updated the documentation accordingly