add RemoveCorruptedShardDataCommand #32281
Merged
vladimirdolzhenko
merged 93 commits into
elastic:master
from
vladimirdolzhenko:fix/31389_2
Sep 19, 2018
843f977
drop `index.shard.check_on_startup: fix`
4f01609
Merge remote-tracking branch 'remotes/origin/master' into fix/31389_1
a8f1488
add RemoveCorruptedSegmentsCommand; merge elasticsearch-translog and …
5f6b084
fix test with ClusterAllocationExplanation
1fc72e9
fix test with ClusterAllocationExplanation
153e4f2
create corrupted marker on `check_on_startup: true`; split testIndexC…
2964fef
Merge remote-tracking branch 'remotes/origin/master' into fix/31389_1
c71e306
create manually corruption marker (but don't corrupt index files) to …
a7668d6
checkstyle fix
6ee74a0
merge into ResolveShardCorruptionCommand
ee955b0
check is _state folder exist before reading state
918ce41
merge two commands into a single remove-corrupted-segments
97fa399
Merge remote-tracking branch 'remotes/origin/master' into fix/31389_1
ebef6d2
Merge remote-tracking branch 'remotes/origin/fix/31389_1' into fix/31…
5cddefb
fixes after merge with remote-tracking branch 'remotes/origin/fix/313…
fd407bb
move corruptIndex to CorruptionUtils
4bc9c95
reworked resolveShardPath
b29aa9a
split testShardLock; testCorruptedBothIndexAndTranslog is added
9ceeaf4
simplified test
addb03f
test code cleanup
0f29f0f
test code cleanup
e6c6d70
checkstyle
c155b36
addressed unit test comments
85b7eef
keep `fix` for 6.x branch
7f292e3
drop unused class
43ae3a1
remove-corrupted-data subcommand instead of remove-corrupted-segments
087d558
remove-corrupted-data subcommand instead of remove-corrupted-segments…
ad819ec
dropped `index.shard.check_on_startup: fix` - it has to go with anoth…
75fcafa
amendment on a CLI tool name
cf6837f
a bit of clean up + show translog file names in sorted order instead …
260a5f4
keep node lock on shard shamanizing; fix allocate empty primary; inst…
3de84e2
fix node lock scope
073d29f
renamed to RemoveCorruptedShardDataCommand
03bbc5f
added test for multi-node layout for a single env
3231803
added `fix` deprecation log message + test
64c29db
dropped `dry-run`
d1805d6
keep elasticsearch-translog for 6.x
c2b5b8a
added `fix` deprecation log message + test
14e6175
adjusted `fix` deprecation log message
fee8a5b
dropped `fix` to avoid deprecation warnings
e1808d6
Merge remote-tracking branch 'remotes/origin/fix/31389_1' into fix/31…
5b5d516
set 755 to elasticsearch-shard, elasticsearch-translog
5cee2b9
skip files added by Lucene's ExtrasFS
b11670c
skip files added by Lucene's ExtrasFS
e38238a
skip files added by Lucene's ExtrasFS
ad62da0
Merge remote-tracking branch 'remotes/origin/master' into fix/31389_1
6f6ca5a
Merge remote-tracking branch 'remotes/origin/master' into fix/31389_1
6763cf9
Merge remote-tracking branch 'remotes/origin/master' into fix/31389_1
7f1f6f3
Merge branch 'fix/31389_1' into fix/31389_2
5083e83
Merge remote-tracking branch 'remotes/origin/master' into fix/31389_1
2a9dbeb
resolved conflicts on Merge remote-tracking branch 'remotes/origin/ma…
d165a6c
Merge branch 'fix/31389_1' into fix/31389_2
f985de4
resolve conflict after Merge branch 'fix/31389_1' into fix/31389_2
aa16487
Merge remote-tracking branch 'remotes/origin/master' into fix/31389_1
f74c058
Merge remote-tracking branch 'remotes/origin/fix/31389_1' into fix/31…
28c6a5a
checkstyle
24bc3d4
added comment on the reason to keep index lock
2d2dd2b
dropped left-over
e196e9e
addressed documentation review comments (links, clean up)
4d89496
removed misleading comments
5bdb069
clean up; inlining of resolveShardPath; text adjustments
5349c72
extracted lock logic from NodeEnvironment ctor into NodeLock; reused …
4286800
reworked resolve shard path
01be5af
added Lucene.SOFT_DELETES_FIELD to IndexWriter
af64fd4
polish a bit NodeLock
d26fbfb
Merge remote-tracking branch 'remotes/origin/master' into fix/31389_1
f8fd76a
Merge remote-tracking branch 'remotes/origin/fix/31389_1' into fix/31…
47fa3fa
Merge branch 'remote/origin/master' into fix/31389_2
3a4916a
checkstyle
9f3a7fb
dropped testCheckOnStartupDeprecatedValue due to wrong merge with master
abcff3c
fix NodeEnvironment.NodeLock
91dc295
Merge remote-tracking branch 'remotes/origin/master' into fix/31389_2
33f3a45
improved message on delete marker
c796417
minor test code style change
4181988
fix test
a1593e8
fix test
185adc9
Merge remote-tracking branch 'remotes/origin/master' into fix/31389_2
f5cf90a
move shard-tool doc next to other docs
8de0ae5
fix [float] Removing a corrupted data files header
5b29ad0
Merge remote-tracking branch 'remotes/origin/master' into fix/31716_2
8242bbb
after merge fixes
418c922
Tweaks to docs
DaveCTurner 674d1ba
dropped unrelated checkIndexOnStartup = fix setting
53b404a
nodeEnv code style clean up
24ffdd1
do not expose node lock; code style adjustment; text comment adjustment
2a3f58d
Merge remote-tracking branch 'remotes/origin/master' into fix/31389_2
e1eb32f
tiny doc amendment
ee1f6a2
NodeEnvironment.NodeLock can skip node path if it is required
1df4685
Merge remote-tracking branch 'remotes/origin/master' into fix/31389_2
844adaf
after merge fix
8210f3b
after merge fix
dab5125
inline nodeLock
54c4030
add javadoc comment for pathFunction
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
#!/bin/bash | ||
|
||
ES_MAIN_CLASS=org.elasticsearch.index.shard.ShardToolCli \ | ||
"`dirname "$0"`"/elasticsearch-cli \ | ||
"$@" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
@echo off | ||
|
||
setlocal enabledelayedexpansion | ||
setlocal enableextensions | ||
|
||
set ES_MAIN_CLASS=org.elasticsearch.index.shard.ShardToolCli | ||
call "%~dp0elasticsearch-cli.bat" ^ | ||
%%* ^ | ||
|| exit /b 1 | ||
|
||
endlocal | ||
endlocal |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,107 @@ | ||
[[shard-tool]] | ||
== elasticsearch-shard | ||
|
||
In some cases the Lucene index or translog of a shard copy can become | ||
corrupted. The `elasticsearch-shard` command enables you to remove corrupted | ||
parts of the shard if a good copy of the shard cannot be recovered | ||
automatically or restored from backup. | ||
|
||
[WARNING] | ||
You will lose the corrupted data when you run `elasticsearch-shard`. This tool | ||
should only be used as a last resort if there is no way to recover from another | ||
copy of the shard or restore a snapshot. | ||
|
||
When Elasticsearch detects that a shard's data is corrupted, it fails that | ||
shard copy and refuses to use it. Under normal conditions, the shard is | ||
automatically recovered from another copy. If no good copy of the shard is | ||
available and you cannot restore from backup, you can use `elasticsearch-shard` | ||
to remove the corrupted data and restore access to any remaining data in | ||
unaffected segments. | ||
|
||
[WARNING] | ||
Stop Elasticsearch before running `elasticsearch-shard`. | ||
|
||
To remove corrupted shard data use the `remove-corrupted-data` subcommand. | ||
|
||
There are two ways to specify the path: | ||
|
||
* Specify the index name and shard name with the `--index` and `--shard-id` | ||
options. | ||
* Use the `--dir` option to specify the full path to the corrupted index or | ||
translog files. | ||
|
||
[float] | ||
=== Removing corrupted data | ||
|
||
`elasticsearch-shard` analyses the shard copy and provides an overview of the | ||
corruption found. To proceed you must then confirm that you want to remove the | ||
corrupted data. | ||
|
||
[WARNING] | ||
Back up your data before running `elasticsearch-shard`. This is a destructive | ||
operation that removes corrupted data from the shard. | ||
|
||
[source,txt] | ||
-------------------------------------------------- | ||
$ bin/elasticsearch-shard remove-corrupted-data --index twitter --shard-id 0 | ||
|
||
|
||
WARNING: Elasticsearch MUST be stopped before running this tool. | ||
|
||
Please make a complete backup of your index before using this tool. | ||
|
||
|
||
Opening Lucene index at /var/lib/elasticsearchdata/nodes/0/indices/P45vf_YQRhqjfwLMUvSqDw/0/index/ | ||
|
||
>> Lucene index is corrupted at /var/lib/elasticsearchdata/nodes/0/indices/P45vf_YQRhqjfwLMUvSqDw/0/index/ | ||
|
||
Opening translog at /var/lib/elasticsearchdata/nodes/0/indices/P45vf_YQRhqjfwLMUvSqDw/0/translog/ | ||
|
||
|
||
>> Translog is clean at /var/lib/elasticsearchdata/nodes/0/indices/P45vf_YQRhqjfwLMUvSqDw/0/translog/ | ||
|
||
|
||
Corrupted Lucene index segments found - 32 documents will be lost. | ||
|
||
WARNING: YOU WILL LOSE DATA. | ||
|
||
Continue and remove docs from the index ? Y | ||
|
||
WARNING: 1 broken segments (containing 32 documents) detected | ||
Took 0.056 sec total. | ||
Writing... | ||
OK | ||
Wrote new segments file "segments_c" | ||
Marking index with the new history uuid : 0pIBd9VTSOeMfzYT6p0AsA | ||
Changing allocation id V8QXk-QXSZinZMT-NvEq4w to tjm9Ve6uTBewVFAlfUMWjA | ||
|
||
You should run the following command to allocate this shard: | ||
|
||
POST /_cluster/reroute | ||
{ | ||
"commands" : [ | ||
{ | ||
"allocate_stale_primary" : { | ||
"index" : "index42", | ||
"shard" : 0, | ||
"node" : "II47uXW2QvqzHBnMcl2o_Q", | ||
"accept_data_loss" : false | ||
} | ||
} | ||
] | ||
} | ||
|
||
You must accept the possibility of data loss by changing parameter `accept_data_loss` to `true`. | ||
|
||
Deleted corrupt marker corrupted_FzTSBSuxT7i3Tls_TgwEag from /var/lib/elasticsearchdata/nodes/0/indices/P45vf_YQRhqjfwLMUvSqDw/0/index/ | ||
|
||
-------------------------------------------------- | ||
|
||
When you use `elasticsearch-shard` to drop the corrupted data, the shard's | ||
allocation ID changes. After restarting the node, you must use the | ||
<<cluster-reroute,cluster reroute API>> to tell Elasticsearch to use the new | ||
ID. The `elasticsearch-shard` command shows the request that | ||
you need to submit. | ||
|
||
You can also use the `-h` option to get a list of all options and parameters | ||
that the `elasticsearch-shard` tool supports. |
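The reroute request printed by the tool can also be assembled programmatically. The following is a minimal sketch, not part of this PR: the helper name `build_reroute_request` is hypothetical, and the index, shard, and node values are copied from the sample output above as placeholders to be replaced with the values the tool prints for your shard. It only builds and prints the JSON body; sending it to the cluster is left to whichever HTTP client you already use.

```python
import json


def build_reroute_request(index, shard, node, accept_data_loss=False):
    """Build the body for POST /_cluster/reroute with one allocate_stale_primary command.

    Hypothetical helper for illustration; the field names mirror the request
    shown in the elasticsearch-shard sample output.
    """
    return {
        "commands": [
            {
                "allocate_stale_primary": {
                    "index": index,
                    "shard": shard,
                    "node": node,
                    # Defaults to False, as in the tool's output, so that data
                    # loss has to be accepted explicitly by the caller.
                    "accept_data_loss": accept_data_loss,
                }
            }
        ]
    }


if __name__ == "__main__":
    # Placeholder values taken from the sample output; substitute your own.
    body = build_reroute_request(
        "index42", 0, "II47uXW2QvqzHBnMcl2o_Q", accept_data_loss=True
    )
    print("POST /_cluster/reroute")
    print(json.dumps(body, indent=2))
```

Keeping `accept_data_loss` as an explicit argument that defaults to `False` follows the behaviour of the tool itself, which prints the request with `false` and asks you to change it.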
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -92,6 +92,10 @@ The maximum duration for which translog files will be kept. Defaults to `12h`. | |
[[corrupt-translog-truncation]] | ||
=== What to do if the translog becomes corrupted? | ||
|
||
To avoid having to set up redirects & help steer people to the new tool, I'd keep this section heading and just xref the new one. |
||
[WARNING] | ||
This tool is deprecated and will be completely removed in 7.0. | ||
Use the <<shard-tool,elasticsearch-shard tool>> instead of this one. | ||
|
||
In some cases (a bad drive, user error) the translog on a shard copy can become | ||
corrupted. When this corruption is detected by Elasticsearch due to mismatching | ||
checksums, Elasticsearch will fail that shard copy and refuse to use that copy | ||
|
The removal of the `elasticsearch-translog` tool is a breaking change, so this cannot happen in 6.5. At the moment this PR is only tagged for 7.0, which is ok, but it cannot be backported as-is.