-
Notifications
You must be signed in to change notification settings - Fork 24.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize warning header de-duplication #37725
Optimize warning header de-duplication #37725
Conversation
Now that warning headers no longer contain a timestamp of when the warning was generated, we no longer need to extract the warning value from the warning to determine whether or not the warning value is duplicated. Instead, we can compare strings directly. Further, when de-duplicating warning headers, are constantly rebuilding sets. Instead of doing that, we can carry about the set with us and rebuild it if we find a new warning value. This commit applies both of these optimizations.
Pinging @elastic/es-core-infra |
This change drops the performance gap from 10x to 6x on my machine (we have gone now from 40x to 6x). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. I left a couple of minor comments.
@@ -751,4 +770,40 @@ public AbstractRunnable unwrap() { | |||
return in; | |||
} | |||
} | |||
|
|||
private static Collector<String, Set<String>, Set<String>> LINKED_HASH_SET_COLLECTOR = new LinkedHashSetCollector<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be declared final
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed 6e1ecec.
@@ -232,7 +232,7 @@ void deprecated(final Set<ThreadContext> threadContexts, final String message, f | |||
while (iterator.hasNext()) { | |||
try { | |||
final ThreadContext next = iterator.next(); | |||
next.addResponseHeader("Warning", warningHeaderValue, DeprecationLogger::extractWarningValueFromWarningHeader); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder whether we need DeprecationLogger::extractWarningValueFromWarningHeader
at all now? It seems to be used in some tests and an assert in DeprecationLogger
but I think we can get rid of it there as well?
Then we can also get rid of ThreadContext#addResponseHeader(final String key, final String value, final Function<String, String> uniqueValue)
and just have #addResponseHeader(final String key, final String value)
?
In order to keep this PR small this could be done in a follow-up though as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I was going to remove the dead code in a follow-up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perfect. Then I'll approve now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this! LGTM
…ead-de-duplication * elastic/master: (24 commits) [TEST] Mute MlMappingsUpgradeIT testMappingsUpgrade Streamline skip_unavailable handling (elastic#37672) Only bootstrap and elect node in current voting configuration (elastic#37712) Ensure either success or failure path for SearchOperationListener is called (elastic#37467) Target only specific index in update settings test Add a note how to benchmark Elasticsearch Don't use Groovy's `withDefault` (elastic#37726) Adapt SyncedFlushService (elastic#37691) Mute FilterAggregatorTests#testRandom Switch mapping/aggregations over to java time (elastic#36363) [ML] Update ML results mappings on process start (elastic#37706) Modify removal_of_types.asciidoc (elastic#37648) Fix edge case in PutMappingRequestTests (elastic#37665) Use new bulk API endpoint in the docs (elastic#37698) Expose sequence number and primary terms in search responses (elastic#37639) Remove LicenseServiceClusterNotRecoveredTests (elastic#37528) Migrate SpecificMasterNodesIT to Zen2 (elastic#37532) Fix MetaStateFormat tests Use plain text instead of latexmath Fix a typo in a warning message in TestFixturesPlugin (elastic#37631) ...
* Writes a list of strings | ||
*/ | ||
public void writeStringList(List<String> list) throws IOException { | ||
public void writeStringCollection(final Collection<String> list) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I factor this change out to #37768.
* master: Liberalize StreamOutput#writeStringList (elastic#37768) Add PersistentTasksClusterService::unassignPersistentTask method (elastic#37576) Tests: disable testRandomGeoCollectionQuery on tiny polygons (elastic#37579) Use ILM for Watcher history deletion (elastic#37443) Make sure PutMappingRequest accepts content types other than JSON. (elastic#37720) Retry ILM steps that fail due to SnapshotInProgressException (elastic#37624) Use disassociate in preference to deassociate (elastic#37704) Delete Redundant RoutingServiceTests (elastic#37750) Always return metadata version if metadata is requested (elastic#37674)
@elasticmachine run gradle build tests 1 |
@elasticmachine run elasticsearch-ci/1 |
…ead-de-duplication * elastic/master: Use explicit version for build-tools in example plugin integ tests (elastic#37792) Change `rational` to `saturation` in script_score (elastic#37766) Deprecate types in get field mapping API (elastic#37667) Add ability to listen to group of affix settings (elastic#37679) Ensure changes requests return the latest mapping version (elastic#37633) Make Minio Setup more Reliable (elastic#37747)
@elasticmachine run elasticsearch-ci/1 |
Now that warning headers no longer contain a timestamp of when the warning was generated, we no longer need to extract the warning value from the warning to determine whether or not the warning value is duplicated. Instead, we can compare strings directly. Further, when de-duplicating warning headers, are constantly rebuilding sets. Instead of doing that, we can carry about the set with us and rebuild it if we find a new warning value. This commit applies both of these optimizations.
* elastic/master: Optimize warning header de-duplication (elastic#37725) Bubble exceptions up in ClusterApplierService (elastic#37729) SQL: Improve handling of invalid args for PERCENTILE/PERCENTILE_RANK (elastic#37803) Remove unused ThreadBarrier class (elastic#37666) Add built-in user and role for code plugin (elastic#37030) Consolidate testclusters tests into a single project (elastic#37362) Fix docs for MappingUpdatedAction SQL: Introduce SQL DATE data type (elastic#37693) disabling bwc test while backporting elastic#37639 Mute ClusterDisruptionIT testAckedIndexing Set acking timeout to 0 on dynamic mapping update (elastic#31140) Remove index audit output type (elastic#37707) Mute FollowerFailOverIT testReadRequestsReturnsLatestMappingVersion [ML] Increase close job timeout and lower the max number (elastic#37770) Remove Custom Listeners from SnapshotsService (elastic#37629) Use m_m_nodes from Zen1 master for Zen2 bootstrap (elastic#37701) Fix index filtering in follow info api. (elastic#37752) Use project dependency instead of substitutions for distributions (elastic#37730) Update authenticate to allow unknown fields (elastic#37713) Deprecate HLRC EmptyResponse used by security (elastic#37540)
Now that warning headers no longer contain a timestamp of when the warning was generated, we no longer need to extract the warning value from the warning to determine whether or not the warning value is duplicated. Instead, we can compare strings directly.
Further, when de-duplicating warning headers, are constantly rebuilding sets. Instead of doing that, we can carry about the set with us and rebuild it if we find a new warning value.
This commit applies both of these optimizations.
Relates #35754
Relates #37530
Relates #37597
Relates #37622