-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
eds: decrease computational complexity of updates #11442
Conversation
…impl. Signed-off-by: Phil Genera <pgenera@google.com>
Signed-off-by: Phil Genera <pgenera@google.com>
Signed-off-by: Phil Genera <pgenera@google.com>
Signed-off-by: Phil Genera <pgenera@google.com>
I'm happy to put this behind a runtime guard if it seems prudent. |
IMO it's OK to not have a runtime guard for this, but I would raise the regression risk to medium at least. @snowp can you also take a look at this? |
Signed-off-by: Phil Genera <pgenera@google.com>
This should be rebased on #11505. |
Signed-off-by: Phil Genera <pgenera@google.com>
Signed-off-by: Phil Genera <pgenera@google.com>
Signed-off-by: Phil Genera <pgenera@google.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add to the PR description a comparison of your speed test before/after your n^2 fix?
You'd have to patch the speed-test only into a different client.
Signed-off-by: Phil Genera <pgenera@google.com>
Signed-off-by: Phil Genera <pgenera@google.com>
Signed-off-by: Phil Genera <pgenera@google.com>
Done, its still ~3.2x improvement over the baseline. Results are linked from the description; I can be more explicit than that if you'd like. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
up to you if you want to think about larger variable names.
This does need a @envoyproxy/senior-maintainers approval.
and thanks for doing this! |
After a bit of thought I noticed the performance of priorityAndLocalWeighted is about ~30% slower. Notably those tests don't exercise any of the n^2 logic (eg, with one iteration none of those .erase() calls happen), but I'm still surprised that the visitor-predicate-pattern is measurably slower than iterating in situ. Even with this mysterious observation, I think a 320% improvement in what I think is the common case is worth a 30% worse performance on the first iteration. |
Quick check: are you comparing optimized runs? Without optimization, inlining, and collapsing of dead logic I could see this being a significant performance degradation. |
I see in your comment you did use If the absolute per-itereration perf penalty is not too great it might be fine to just explain that, maybe as a comment in the code for posterity. |
Signed-off-by: Phil Genera <pgenera@google.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, thanks! Praying to the gods of clang it goes through!
@envoyproxy/senior-maintainers
Signed-off-by: Phil Genera <pgenera@google.com>
I do not think we have been smiled upon :D. I'll be out the rest of this week, but will poke my head in late tonight in case there's something easy to do. |
also merge master to hopefully pick up a fix that was made to the http2 integration test for tsan. |
/wait |
Signed-off-by: Phil Genera <pgenera@google.com>
Signed-off-by: Phil Genera <pgenera@google.com>
Done and done. And it appears to have helped! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@envoyproxy/senior-maintainers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM modulo a nit.
/wait
Signed-off-by: Phil Genera <pgenera@google.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
Looking through the CI failures:
Both of these have high-flakiness warnings when I look at them in azure. They all (of course) pass locally and with RBE. |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
} | ||
|
||
skip_expensive_benchmarks = skip_switch.getValue(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add some big nice WARNING when this flag is enabled in order to increase the chances of someone noticing the difference between envoy_cc_benchmarks and tests for those benchmarks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in #12121
Makes BaseDynamicClusterImpl::updateDynamicHostList O(n) rather than O(n^2) Instead of calling .erase() on list iterators as we find them, we swap with the end of the list and erase after iterating over the list. This shows a ~3x improvement in execution time in the included benchmark test. Risk Level: Medium. No reordering happens to the endpoint list. Not runtime guarded. Testing: New benchmark, existing unit tests pass (and cover the affected function). Docs Changes: N/A Release Notes: N/A Relates to envoyproxy#2874 envoyproxy#11362 Signed-off-by: Phil Genera <pgenera@google.com> Signed-off-by: scheler <santosh.cheler@appdynamics.com>
Commit Message: Makes BaseDynamicClusterImpl::updateDynamicHostList O(n) rather than O(n^2)
Additional Description: Instead of calling .erase() on list iterators as we find them, we swap with the end of the list and erase after iterating over the list. This shows a ~3x improvement in execution time in the included benchmark test.
Risk Level: Medium. No reordering happens to the endpoint list. Not runtime guarded.
Testing: New benchmark, existing unit tests pass (and cover the affected function).
Docs Changes: N/A
Release Notes: N/A
#2874 #11362