Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize warning header de-duplication #37725

Merged

Conversation

jasontedor
Copy link
Member

@jasontedor jasontedor commented Jan 22, 2019

Now that warning headers no longer contain a timestamp of when the warning was generated, we no longer need to extract the warning value from the warning to determine whether or not the warning value is duplicated. Instead, we can compare strings directly.

Further, when de-duplicating warning headers, are constantly rebuilding sets. Instead of doing that, we can carry about the set with us and rebuild it if we find a new warning value.

This commit applies both of these optimizations.

Relates #35754
Relates #37530
Relates #37597
Relates #37622

Now that warning headers no longer contain a timestamp of when the
warning was generated, we no longer need to extract the warning value
from the warning to determine whether or not the warning value is
duplicated. Instead, we can compare strings directly.

Further, when de-duplicating warning headers, are constantly rebuilding
sets. Instead of doing that, we can carry about the set with us and
rebuild it if we find a new warning value.

This commit applies both of these optimizations.
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra

@jasontedor
Copy link
Member Author

This change drops the performance gap from 10x to 6x on my machine (we have gone now from 40x to 6x).

Copy link
Member

@rjernst rjernst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@danielmitterdorfer danielmitterdorfer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I left a couple of minor comments.

@@ -751,4 +770,40 @@ public AbstractRunnable unwrap() {
return in;
}
}

private static Collector<String, Set<String>, Set<String>> LINKED_HASH_SET_COLLECTOR = new LinkedHashSetCollector<>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be declared final?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed 6e1ecec.

@@ -232,7 +232,7 @@ void deprecated(final Set<ThreadContext> threadContexts, final String message, f
while (iterator.hasNext()) {
try {
final ThreadContext next = iterator.next();
next.addResponseHeader("Warning", warningHeaderValue, DeprecationLogger::extractWarningValueFromWarningHeader);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether we need DeprecationLogger::extractWarningValueFromWarningHeader at all now? It seems to be used in some tests and an assert in DeprecationLogger but I think we can get rid of it there as well?

Then we can also get rid of ThreadContext#addResponseHeader(final String key, final String value, final Function<String, String> uniqueValue) and just have #addResponseHeader(final String key, final String value)?

In order to keep this PR small this could be done in a follow-up though as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was going to remove the dead code in a follow-up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect. Then I'll approve now.

Copy link
Member

@danielmitterdorfer danielmitterdorfer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this! LGTM

…ead-de-duplication

* elastic/master: (24 commits)
  [TEST] Mute MlMappingsUpgradeIT testMappingsUpgrade
  Streamline skip_unavailable handling (elastic#37672)
  Only bootstrap and elect node in current voting configuration (elastic#37712)
  Ensure either success or failure path for SearchOperationListener is called (elastic#37467)
  Target only specific index in update settings test
  Add a note how to benchmark Elasticsearch
  Don't use Groovy's `withDefault` (elastic#37726)
  Adapt SyncedFlushService (elastic#37691)
  Mute FilterAggregatorTests#testRandom
  Switch mapping/aggregations over to java time (elastic#36363)
  [ML] Update ML results mappings on process start (elastic#37706)
  Modify removal_of_types.asciidoc (elastic#37648)
  Fix edge case in PutMappingRequestTests (elastic#37665)
  Use new bulk API endpoint in the docs (elastic#37698)
  Expose sequence number and primary terms in search responses (elastic#37639)
  Remove LicenseServiceClusterNotRecoveredTests (elastic#37528)
  Migrate SpecificMasterNodesIT to Zen2 (elastic#37532)
  Fix MetaStateFormat tests
  Use plain text instead of latexmath
  Fix a typo in a warning message in TestFixturesPlugin (elastic#37631)
  ...
* Writes a list of strings
*/
public void writeStringList(List<String> list) throws IOException {
public void writeStringCollection(final Collection<String> list) throws IOException {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I factor this change out to #37768.

* master:
  Liberalize StreamOutput#writeStringList (elastic#37768)
  Add PersistentTasksClusterService::unassignPersistentTask method (elastic#37576)
  Tests: disable testRandomGeoCollectionQuery on tiny polygons (elastic#37579)
  Use ILM for Watcher history deletion (elastic#37443)
  Make sure PutMappingRequest accepts content types other than JSON. (elastic#37720)
  Retry ILM steps that fail due to SnapshotInProgressException (elastic#37624)
  Use disassociate in preference to deassociate (elastic#37704)
  Delete Redundant RoutingServiceTests (elastic#37750)
  Always return metadata version if metadata is requested (elastic#37674)
@jasontedor
Copy link
Member Author

@elasticmachine run gradle build tests 1

@jasontedor
Copy link
Member Author

@elasticmachine run elasticsearch-ci/1

…ead-de-duplication

* elastic/master:
  Use explicit version for build-tools in example plugin integ tests (elastic#37792)
  Change `rational` to `saturation` in script_score (elastic#37766)
  Deprecate types in get field mapping API (elastic#37667)
  Add ability to listen to group of affix settings (elastic#37679)
  Ensure changes requests return the latest mapping version (elastic#37633)
  Make Minio Setup more Reliable (elastic#37747)
@jasontedor
Copy link
Member Author

@elasticmachine run elasticsearch-ci/1

@jasontedor jasontedor merged commit 7517e3a into elastic:master Jan 24, 2019
jasontedor added a commit that referenced this pull request Jan 24, 2019
Now that warning headers no longer contain a timestamp of when the
warning was generated, we no longer need to extract the warning value
from the warning to determine whether or not the warning value is
duplicated. Instead, we can compare strings directly.

Further, when de-duplicating warning headers, are constantly rebuilding
sets. Instead of doing that, we can carry about the set with us and
rebuild it if we find a new warning value.

This commit applies both of these optimizations.
@jasontedor jasontedor deleted the optimize-warning-head-de-duplication branch January 24, 2019 14:44
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Jan 24, 2019
* elastic/master:
  Optimize warning header de-duplication (elastic#37725)
  Bubble exceptions up in ClusterApplierService (elastic#37729)
  SQL: Improve handling of invalid args for PERCENTILE/PERCENTILE_RANK (elastic#37803)
  Remove unused ThreadBarrier class (elastic#37666)
  Add built-in user and role for code plugin (elastic#37030)
  Consolidate testclusters tests into a single project (elastic#37362)
  Fix docs for MappingUpdatedAction
  SQL: Introduce SQL DATE data type (elastic#37693)
  disabling bwc test while backporting elastic#37639
  Mute ClusterDisruptionIT testAckedIndexing
  Set acking timeout to 0 on dynamic mapping update (elastic#31140)
  Remove index audit output type (elastic#37707)
  Mute FollowerFailOverIT testReadRequestsReturnsLatestMappingVersion
  [ML] Increase close job timeout and lower the max number (elastic#37770)
  Remove Custom Listeners from SnapshotsService (elastic#37629)
  Use m_m_nodes from Zen1 master for Zen2 bootstrap (elastic#37701)
  Fix index filtering in follow info api. (elastic#37752)
  Use project dependency instead of substitutions for distributions (elastic#37730)
  Update authenticate to allow unknown fields (elastic#37713)
  Deprecate HLRC EmptyResponse used by security (elastic#37540)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants