-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure to generate identical NoOp for the same failure #33141
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
We generate NoOps in two places: in InternalEngine and in TransportShardBulkAction after a write operation failed. However, we generate slightly different messages for the same failure: 1. InternalEngine uses Exception#getFailure to generate a message without the class name. > newOp [NoOp{seqNo=1, primaryTerm=1, reason='Contexts are mandatory in > context enabled completion field [suggest_context]'}] 2. TransportShardBulkAction uses Exception#toString to generate a message with the class name. > NoOp{seqNo=1, primaryTerm=1, > reason='java.lang.IllegalArgumentException: Contexts are mandatory in > context enabled completion field [suggest_context]'} If a write operation fails while a replica is recovering, that replica will possibly receive two different NoOps: one from InternalEngine and one from TransportShardBulkAction. Two different NoOps will trip TranslogWriter#assertNoSeqNumberConflic assertion. This commit makes sure that we generate the same No-Ops for the same failure. Closes elastic#32986
dnhatn
added
>bug
:Distributed Indexing/Recovery
Anything around constructing a new shard, either from a local or a remote source.
v7.0.0
v6.5.0
v6.4.1
labels
Aug 24, 2018
Pinging @elastic/es-distributed |
s1monw
approved these changes
Aug 25, 2018
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks @s1monw. |
dnhatn
added a commit
that referenced
this pull request
Aug 28, 2018
We generate slightly different NoOps in InternalEngine and TransportShardBulkAction for the same failure. 1. InternalEngine uses Exception#getFailure to generate a message without the class name: newOp [NoOp{seqNo=1, primaryTerm=1, reason='Contexts are mandatory in context enabled completion field [suggest_context]'}]. 2. TransportShardBulkAction uses Exception#toString to generate a message with the class name: NoOp{seqNo=1, primaryTerm=1, reason='java.lang.IllegalArgumentException: Contexts are mandatory in context enabled completion field [suggest_context]'}. If a write operation fails while a replica is recovering, that replica will possibly receive two different NoOps: one from recovery and one from replication. These two different NoOps will trip TranslogWriter#assertNoSeqNumberConflict assertion. This commit ensures that we generate the same Noop for the same failure. Closes #32986
dnhatn
added a commit
that referenced
this pull request
Aug 28, 2018
We generate slightly different NoOps in InternalEngine and TransportShardBulkAction for the same failure. 1. InternalEngine uses Exception#getFailure to generate a message without the class name: newOp [NoOp{seqNo=1, primaryTerm=1, reason='Contexts are mandatory in context enabled completion field [suggest_context]'}]. 2. TransportShardBulkAction uses Exception#toString to generate a message with the class name: NoOp{seqNo=1, primaryTerm=1, reason='java.lang.IllegalArgumentException: Contexts are mandatory in context enabled completion field [suggest_context]'}. If a write operation fails while a replica is recovering, that replica will possibly receive two different NoOps: one from recovery and one from replication. These two different NoOps will trip TranslogWriter#assertNoSeqNumberConflict assertion. This commit ensures that we generate the same Noop for the same failure. Closes #32986
dnhatn
added a commit
that referenced
this pull request
Aug 28, 2018
* master: [Rollup] Better error message when trying to set non-rollup index (#32965) HLRC: Use Optional in validation logic (#33104) Remove unused User class from protocol (#33137) ingest: Introduce the dissect processor (#32884) [Docs] Add link to es-kotlin-wrapper-client (#32618) [Docs] Remove repeating words (#33087) Minor spelling and grammar fix (#32931) Remove support for deprecated params._agg/_aggs for scripted metric aggregations (#32979) Watcher: Simplify finding next date in cron schedule (#33015) Run Third party audit with forbidden APIs CLI (part3/3) (#33052) Fix plugin build test on Windows (#33078) HLRC+MINOR: Remove Unused Private Method (#33165) Remove old unused test script files (#32970) Build analysis-icu client JAR (#33184) Ensure to generate identical NoOp for the same failure (#33141) ShardSearchFailure#readFrom to set index and shardId (#33161)
dnhatn
added a commit
that referenced
this pull request
Aug 28, 2018
* 6.x: [Rollup] Better error message when trying to set non-rollup index (#32965) Remove unused User class from protocol (#33137) [DOCS] Adds link to 6.3.0 release highlights Test: fix token bwc tests due to bad backport Ensure to generate identical NoOp for the same failure (#33141) [Docs] Add link to es-kotlin-wrapper-client (#32618) [Docs] Remove repeating words (#33087) Minor spelling and grammar fix (#32931) Run Third party audit with forbidden APIs CLI (part3/3) (#33052) Fix plugin build test on Windows (#33078) Watcher: Simplify finding next date in cron schedule (#33015) Remove old unused test script files (#32970) Build analysis-icu client JAR (#33184) Switch remaining tests to new style Requests (#33109) Use internal connection manager when fetching remote node info Switch remaining x-pack tests to new style Requests (#33108) Switch remaining ml tests to new style Requests (#33107) Token API supports the client_credentials grant (#33106)
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
>bug
:Distributed Indexing/Recovery
Anything around constructing a new shard, either from a local or a remote source.
v6.4.1
v6.5.0
v7.0.0-beta1
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
We generate slightly different NoOps in InternalEngine and TransportShardBulkAction for the same failure.
If a write operation fails while a replica is recovering, that replica will possibly receive two different NoOps: one from recovery and one from replication. These two different NoOps will trip
TranslogWriter#assertNoSeqNumberConflict
assertion.This commit makes sure that we generate the same No-Ops for the same failure.
Closes #32986