Preserve response headers on cluster update task #31421

Merged (1 commit) on Jun 19, 2018

Contributor

@ywelsch ywelsch commented Jun 19, 2018

#31241 changed the cluster state update tasks to run under system context. The context wrapping did not preserve response headers, though. This led to a test failure on 6.x (#31408), because the deprecation warnings were no longer carried back to the caller when creating an index. This cannot be fixed as easily, because these response headers are generated in the execute method on the cluster state update thread (which could potentially be shared by multiple tasks).

The general situation with the response headers looks a bit messy. For example, #23950 made sure to preserve the response headers when creating an index and when updating settings for an index. The subsequent #25961 seems to have undone that for index creation (by removing the wrapping of the action listener), but it looks like that did not break any tests?

In order to make #31241 less breaking and err on the side of caution, I've changed the restorable context supplier to preserve response headers.

Closes #31408
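
To illustrate the idea described above, here is a minimal, self-contained sketch of a restorable context that can optionally carry over response headers added while a task ran under a stashed system context. It is not the Elasticsearch `ThreadContext` implementation and not the code of this PR: `SketchThreadContext`, `stashAndMarkAsSystem`, and `RestorableContextDemo` are hypothetical names invented for this example, and the merge strategy (append the headers found at restore time to the captured ones) is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical, simplified stand-in for a thread context (assumption, not the real API).
final class SketchThreadContext {
    /** Handle that puts the previous context back when closed. */
    interface StoredContext extends AutoCloseable {
        @Override
        void close();
    }

    private Map<String, String> headers = new HashMap<>();
    private Map<String, List<String>> responseHeaders = new HashMap<>();
    private boolean systemContext = false;

    void putHeader(String key, String value) { headers.put(key, value); }

    void addResponseHeader(String key, String value) {
        responseHeaders.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    Map<String, String> getHeaders() { return Collections.unmodifiableMap(headers); }

    Map<String, List<String>> getResponseHeaders() { return Collections.unmodifiableMap(responseHeaders); }

    boolean isSystemContext() { return systemContext; }

    private static Map<String, List<String>> copy(Map<String, List<String>> source) {
        Map<String, List<String>> out = new HashMap<>();
        source.forEach((k, v) -> out.put(k, new ArrayList<>(v)));
        return out;
    }

    /** Clears the context and marks it as system; closing the handle undoes this. */
    StoredContext stashAndMarkAsSystem() {
        Map<String, String> oldHeaders = headers;
        Map<String, List<String>> oldResponseHeaders = responseHeaders;
        boolean oldSystem = systemContext;
        headers = new HashMap<>();
        responseHeaders = new HashMap<>();
        systemContext = true;
        return () -> {
            headers = oldHeaders;
            responseHeaders = oldResponseHeaders;
            systemContext = oldSystem;
        };
    }

    /**
     * Captures the current context. Calling the supplier later restores it; with
     * preserveResponseHeaders, response headers present at restore time are merged in.
     */
    Supplier<StoredContext> newRestorableContext(boolean preserveResponseHeaders) {
        Map<String, String> capturedHeaders = new HashMap<>(headers);
        Map<String, List<String>> capturedResponseHeaders = copy(responseHeaders);
        boolean capturedSystem = systemContext;
        return () -> {
            // Remember what is active right now so that close() can put it back.
            Map<String, String> previousHeaders = headers;
            Map<String, List<String>> previousResponseHeaders = responseHeaders;
            boolean previousSystem = systemContext;
            Map<String, List<String>> restored = copy(capturedResponseHeaders);
            if (preserveResponseHeaders) {
                previousResponseHeaders.forEach(
                    (k, v) -> restored.computeIfAbsent(k, x -> new ArrayList<>()).addAll(v));
            }
            headers = new HashMap<>(capturedHeaders);
            responseHeaders = restored;
            systemContext = capturedSystem;
            return () -> {
                headers = previousHeaders;
                responseHeaders = previousResponseHeaders;
                systemContext = previousSystem;
            };
        };
    }
}

final class RestorableContextDemo {
    /** Returns the response headers a listener would see after the task ran. */
    static Map<String, List<String>> run(boolean preserveResponseHeaders) {
        SketchThreadContext ctx = new SketchThreadContext();
        ctx.putHeader("caller", "user");
        Supplier<SketchThreadContext.StoredContext> restore = ctx.newRestorableContext(preserveResponseHeaders);
        try (SketchThreadContext.StoredContext stashed = ctx.stashAndMarkAsSystem()) {
            // The "task" runs under system context and emits a deprecation warning.
            ctx.addResponseHeader("Warning", "deprecated setting");
            try (SketchThreadContext.StoredContext restored = restore.get()) {
                // The listener runs here, back in the caller's context.
                return new HashMap<>(ctx.getResponseHeaders());
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // prints {}
        System.out.println(run(true));  // prints {Warning=[deprecated setting]}
    }
}
```

With `preserveResponseHeaders` set to `false`, the warning added during the task is dropped when the caller's context is restored, which matches the symptom described above; with `true`, it is carried back to the caller.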

@ywelsch ywelsch added >enhancement v7.0.0 :Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. v6.4.0 labels Jun 19, 2018
@ywelsch ywelsch requested review from bleskes and tvernum June 19, 2018 08:28
Member

@cbuescher cbuescher left a comment

@ywelsch since this is what I also came up with in #31431, I'm all in favour. The test looks good too, so we don't have to rely on otherwise only remotely related IT tests to catch this. I only left one small suggestion about adding a test timeout on error. Let me know what you think.

latch.countDown();
}

});

assertFalse(threadPool.getThreadContext().isSystemContext());
assertEquals(expectedHeaders, threadPool.getThreadContext().getHeaders());
assertEquals(Collections.emptyMap(), threadPool.getThreadContext().getResponseHeaders());
}

latch.await();
Member

Can we add a short timeout here? I was just running the test without the fix, and after the test itself throws an AssertionError it doesn't finish but simply hangs.

Contributor Author

We are using this pattern all over the place (at least in the distributed systems tests). There are people in the team that prefer it this way rather than having a random timeout on the latch. The important thing is that the assertion trips and that the test fails.

Contributor

The reason we don't use timeouts is that when one fires, you don't get any info other than the timeout itself. This way, the suite times out and you get a thread dump, which (sometimes) helps to see deadlocks and where things are stuck.

Member

> There are people in the team that prefer it this way rather than having a random timeout on the latch

Then I guess I'm fine with it. It just made failing tests hang locally for me for quite a while (I guess there are some hard timeouts after all).

Contributor Author

thanks for the clarification, Boaz.
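
For readers following the latch discussion in this thread, here is a small sketch of the trade-off. It is not the test code from this PR: `LatchAwaitSketch` and `awaitCallback` are hypothetical names, and the sketch uses the timed `await` only so that the demo itself terminates. The callback counts the latch down only when it completes normally, so a failing assertion inside it leaves the latch untouched.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration only (assumption, not part of the PR).
final class LatchAwaitSketch {
    /**
     * Runs the callback on a background thread and waits for it.
     * Returns true if the callback finished and counted the latch down in time.
     */
    static boolean awaitCallback(Runnable callback, long timeoutMillis) {
        CountDownLatch latch = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            callback.run();    // an AssertionError thrown here skips the countDown below
            latch.countDown();
        });
        // Keep the demo output quiet; a real test framework would report this error.
        worker.setUncaughtExceptionHandler((thread, error) -> { });
        worker.setDaemon(true);
        worker.start();
        try {
            // A plain latch.await() would block here indefinitely if the callback failed,
            // until some outer (suite-level) timeout steps in.
            return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(awaitCallback(() -> { }, 2000));                                  // prints true
        System.out.println(awaitCallback(() -> { throw new AssertionError("boom"); }, 200)); // prints false
    }
}
```

The timed variant returns quickly but reports only that the wait expired. The untimed variant keeps the test blocked until a suite-level timeout fires, which is the point at which, per the comment above, a thread dump becomes available.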

@ywelsch ywelsch merged commit 40c4bd5 into elastic:master Jun 19, 2018
ywelsch added a commit that referenced this pull request Jun 19, 2018
#31241 changed the cluster state update tasks to run under system context. The context wrapping
did not preserve response headers, though. This has led to a test failure on 6.x #31408, as the
deprecation warnings were not carried back anymore to the caller when creating an index. This
commit changes the restorable context supplier to preserve response headers.
Contributor

@bleskes bleskes left a comment

LGTM2 (for what it's worth)

dnhatn added a commit that referenced this pull request Jun 20, 2018
* 6.x:
  [DOCS] Omit shard failures assertion for incompatible responses  (#31430)
  [DOCS] Move licensing APIs to docs (#31445)
  backport of: add is-write-index flag to aliases (#30942) (#31412)
  backport of: Add rollover-creation-date setting to rolled over index (#31144) (#31413)
  [Docs] Extend Homebrew installation instructions (#28902)
  [Docs] Mention ip_range datatypes on ip type page (#31416)
  Multiplexing token filter (#31208)
  Fix use of time zone in date_histogram rewrite (#31407)
  Revert "Mute DefaultShardsIT#testDefaultShards test"
  [DOCS] Fixes code snippet testing for machine learning (#31189)
  Security: fix joining cluster with production license (#31341)
  [DOCS] Updated version in Info API example
  [DOCS] Moves the info API to docs (#31121)
  Revert "Increasing skip version for failing test on 6.x"
  Preserve response headers on cluster update task (#31421)
  [DOCS] Add code snippet testing for more ML APIs (#31404)
  Docs: Advice for reindexing many indices (#31279)
dnhatn added a commit that referenced this pull request Jun 20, 2018
* master:
  [DOCS] Omit shard failures assertion for incompatible responses  (#31430)
  [DOCS] Move licensing APIs to docs (#31445)
  Add Delete Snapshot High Level REST API
  Remove QueryCachingPolicy#ALWAYS_CACHE (#31451)
  [Docs] Extend Homebrew installation instructions (#28902)
  Choose JVM options ergonomically
  [Docs] Mention ip_range datatypes on ip type page (#31416)
  Multiplexing token filter (#31208)
  Fix use of time zone in date_histogram rewrite (#31407)
  Core: Remove index name resolver from base TransportAction (#31002)
  [DOCS] Fixes code snippet testing for machine learning (#31189)
  [DOCS] Removed  and  params from MLT. Closes #28128 (#31370)
  Security: fix joining cluster with production license (#31341)
  Unify http channels and exception handling (#31379)
  [DOCS] Moves the info API to docs (#31121)
  Preserve response headers on cluster update task (#31421)
  [DOCS] Add code snippet testing for more ML APIs (#31404)
  Do not preallocate bytes for channel buffer (#31400)
  Docs: Advice for reindexing many indices (#31279)
  Mute HttpExporterTests#testHttpExporterShutdown test Tracked by #31433
  Docs: Add note about removing prepareExecute from the java client (#31401)
  Make release notes ignore the `>test-failure` label. (#31309)
Labels
:Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. >enhancement v6.4.0 v7.0.0-beta1
Development

Successfully merging this pull request may close these issues.

CI: DefaultShardsIT testDefaultShards fails with NPE in 6.x
4 participants