Preserve response headers on cluster update task #31421
            latch.countDown();
        }

    });

    assertFalse(threadPool.getThreadContext().isSystemContext());
    assertEquals(expectedHeaders, threadPool.getThreadContext().getHeaders());
    assertEquals(Collections.emptyMap(), threadPool.getThreadContext().getResponseHeaders());
    }

    latch.await();
Can we add a short timeout here? I was just running the test without the fix, and after the test itself throws an AssertionError it doesn't finish but simply hangs.
We are using this pattern all over the place (at least in the distributed systems tests). There are people in the team that prefer it this way rather than having a random timeout on the latch. The important thing is that the assertion trips and that the test fails.
The reason we don't use timeouts is that when one fires you get no information other than the timeout itself. This way, the suite times out and you get a thread dump, which (sometimes) helps to see deadlocks and where things are stuck.
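To make the trade-off above concrete, here is a minimal sketch of the timed variant. The class and method names (`LatchAwaitSketch`, `awaitWithTimeout`) are hypothetical and not taken from the PR; only `CountDownLatch.await(long, TimeUnit)` is the real JDK call. It illustrates that when the callback fails before `countDown()`, the timed await reports nothing beyond a `false` return value.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical demo class; not part of the Elasticsearch test suite. */
public class LatchAwaitSketch {

    /**
     * Runs the callback on a worker thread and waits for it with a timeout.
     * Returns true if the latch was counted down in time, false otherwise.
     */
    static boolean awaitWithTimeout(Runnable callback, long timeoutMillis) {
        CountDownLatch latch = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            try {
                callback.run();
                latch.countDown(); // skipped when the callback throws
            } catch (AssertionError e) {
                // Swallowed only to keep this demo quiet; a real test
                // framework would report the failure.
            }
        });
        worker.start();
        try {
            // The boolean is all the caller learns: no stack trace and no
            // hint about where the worker got stuck.
            return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // Callback succeeds, so the latch is counted down.
        System.out.println(awaitWithTimeout(() -> { }, 1000)); // prints "true"
        // Callback fails before countDown(), so the await times out.
        System.out.println(awaitWithTimeout(() -> {
            throw new AssertionError("assertion tripped");
        }, 100)); // prints "false"
    }
}
```

With an untimed `latch.await()`, the second call would block until the suite-level timeout, which is the point at which a thread dump becomes available.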
> There are people in the team that prefer it this way rather than having a random timeout on the latch

Then I guess I'm fine with it. It just made failing tests hang locally for me for quite a while (I guess there are some hard timeouts after all).
thanks for the clarification, Boaz.
#31241 changed the cluster state update tasks to run under system context. The context wrapping did not preserve response headers, though. This led to a test failure on 6.x (#31408), as the deprecation warnings were no longer carried back to the caller when creating an index. This commit changes the restorable context supplier to preserve response headers.
LGTM2 (for what it's worth)
* 6.x:
  [DOCS] Omit shard failures assertion for incompatible responses (#31430)
  [DOCS] Move licensing APIs to docs (#31445)
  backport of: add is-write-index flag to aliases (#30942) (#31412)
  backport of: Add rollover-creation-date setting to rolled over index (#31144) (#31413)
  [Docs] Extend Homebrew installation instructions (#28902)
  [Docs] Mention ip_range datatypes on ip type page (#31416)
  Multiplexing token filter (#31208)
  Fix use of time zone in date_histogram rewrite (#31407)
  Revert "Mute DefaultShardsIT#testDefaultShards test"
  [DOCS] Fixes code snippet testing for machine learning (#31189)
  Security: fix joining cluster with production license (#31341)
  [DOCS] Updated version in Info API example
  [DOCS] Moves the info API to docs (#31121)
  Revert "Increasing skip version for failing test on 6.x"
  Preserve response headers on cluster update task (#31421)
  [DOCS] Add code snippet testing for more ML APIs (#31404)
  Docs: Advice for reindexing many indices (#31279)
* master:
  [DOCS] Omit shard failures assertion for incompatible responses (#31430)
  [DOCS] Move licensing APIs to docs (#31445)
  Add Delete Snapshot High Level REST API
  Remove QueryCachingPolicy#ALWAYS_CACHE (#31451)
  [Docs] Extend Homebrew installation instructions (#28902)
  Choose JVM options ergonomically
  [Docs] Mention ip_range datatypes on ip type page (#31416)
  Multiplexing token filter (#31208)
  Fix use of time zone in date_histogram rewrite (#31407)
  Core: Remove index name resolver from base TransportAction (#31002)
  [DOCS] Fixes code snippet testing for machine learning (#31189)
  [DOCS] Removed and params from MLT. Closes #28128 (#31370)
  Security: fix joining cluster with production license (#31341)
  Unify http channels and exception handling (#31379)
  [DOCS] Moves the info API to docs (#31121)
  Preserve response headers on cluster update task (#31421)
  [DOCS] Add code snippet testing for more ML APIs (#31404)
  Do not preallocate bytes for channel buffer (#31400)
  Docs: Advice for reindexing many indices (#31279)
  Mute HttpExporterTests#testHttpExporterShutdown test Tracked by #31433
  Docs: Add note about removing prepareExecute from the java client (#31401)
  Make release notes ignore the `>test-failure` label. (#31309)
#31241 changed the cluster state update tasks to run under system context. The context wrapping did not preserve response headers, though. This led to a test failure on 6.x (#31408), as the deprecation warnings were no longer carried back to the caller when creating an index. This cannot be fixed as easily, because these response headers are generated in the execute method on the cluster state update thread (which could potentially be shared by multiple tasks).

The general situation with the response headers looks a bit messy. For example, #23950 made sure to preserve the response headers when creating an index and updating settings for an index. The subsequent #25961 seems to have undone that for index creation (removing the wrapping of the action listener), but it looks like that did not break any tests?
In order to make #31241 less breaking and err on the side of caution, I've changed the restorable context supplier to preserve response headers.
Closes #31408
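The behaviour described above can be sketched with a toy model. Everything in this block is an assumption made for illustration: `ToyThreadContext`, `stashAndMarkAsSystemContext`, and `runUpdateTask` are hypothetical names, and the class is a simplified stand-in, not Elasticsearch's `ThreadContext` implementation. It shows a restorable context supplier that, when asked to preserve response headers, merges the headers produced under the system context back into the caller's context on restore.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

/** Toy model of a per-thread context; hypothetical, not Elasticsearch code. */
public class ToyThreadContext {
    private Map<String, String> headers = new HashMap<>();
    private Map<String, List<String>> responseHeaders = new HashMap<>();
    private boolean systemContext = false;

    public void putHeader(String key, String value) { headers.put(key, value); }

    public void addResponseHeader(String key, String value) {
        responseHeaders.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    public boolean isSystemContext() { return systemContext; }
    public Map<String, String> getHeaders() { return Collections.unmodifiableMap(headers); }
    public Map<String, List<String>> getResponseHeaders() {
        return Collections.unmodifiableMap(responseHeaders);
    }

    /** Replaces the current context with an empty system context. */
    public void stashAndMarkAsSystemContext() {
        headers = new HashMap<>();
        responseHeaders = new HashMap<>();
        systemContext = true;
    }

    /**
     * Captures the current context. Calling get() on the result switches back
     * to the captured context and returns a handle that undoes the switch.
     */
    public Supplier<Runnable> newRestorableContext(boolean preserveResponseHeaders) {
        final Map<String, String> capturedHeaders = headers;
        final Map<String, List<String>> capturedResponseHeaders = responseHeaders;
        final boolean capturedSystemContext = systemContext;
        return () -> {
            // Remember what is active right now so the switch can be undone.
            final Map<String, String> activeHeaders = headers;
            final Map<String, List<String>> activeResponseHeaders = responseHeaders;
            final boolean activeSystemContext = systemContext;

            Map<String, List<String>> restored = new HashMap<>();
            capturedResponseHeaders.forEach((k, v) -> restored.put(k, new ArrayList<>(v)));
            if (preserveResponseHeaders) {
                // Carry over headers produced since the context was captured.
                activeResponseHeaders.forEach(
                    (k, v) -> restored.computeIfAbsent(k, x -> new ArrayList<>()).addAll(v));
            }
            headers = capturedHeaders;
            responseHeaders = restored;
            systemContext = capturedSystemContext;
            return () -> {
                headers = activeHeaders;
                responseHeaders = activeResponseHeaders;
                systemContext = activeSystemContext;
            };
        };
    }

    /** Simulates a task that emits a warning under system context, then restores the caller. */
    static ToyThreadContext runUpdateTask(boolean preserveResponseHeaders) {
        ToyThreadContext context = new ToyThreadContext();
        context.putHeader("user", "alice");
        Supplier<Runnable> restorable = context.newRestorableContext(preserveResponseHeaders);
        context.stashAndMarkAsSystemContext();
        context.addResponseHeader("Warning", "deprecated setting used");
        restorable.get(); // back in the caller's context
        return context;
    }

    public static void main(String[] args) {
        System.out.println(runUpdateTask(true).getResponseHeaders());  // {Warning=[deprecated setting used]}
        System.out.println(runUpdateTask(false).getResponseHeaders()); // {}
    }
}
```

In this sketch the request headers and the non-system flag are always restored as captured; only the response headers depend on the `preserveResponseHeaders` flag, which mirrors the distinction the description draws.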