-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adjust CombinedDeletionPolicy for multiple commits #27456
Conversation
Today, we keep only the last index commit and use only it to calculate the minimum required translog generation. This may no longer be correct as we introduced a new deletion policy which keeps multiple index commits. This changes adjust the `CombinedDeletionPolicy` so that it can work correctly with a new index deletion policy. Relates to elastic#10708, elastic#27367
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
// However, there are commits that are not deleted just because they are being snapshotted (rather than being kept by the policy). | ||
// TODO: We need to distinguish those commits and skip them in calculating the minimum required translog generation. | ||
long minRequiredGen = Long.MAX_VALUE; | ||
for (IndexCommit indexCommit : commits) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can do some streaming java8 magic here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, but indexCommit.getUserData()
throws IOException.
Thanks @bleskes. |
Today, we keep only the last index commit and use only it to calculate the minimum required translog generation. This may no longer be correct as we introduced a new deletion policy which keeps multiple index commits. This change adjusts the CombinedDeletionPolicy so that it can work correctly with a new index deletion policy. Relates to #10708, #27367
* es/master: (38 commits) Backport wait_for_initialiazing_shards to cluster health API Carry over version map size to prevent excessive resizing (#27516) Fix scroll query with a sort that is a prefix of the index sort (#27498) Delete shard store files before restoring a snapshot (#27476) Replace `delimited_payload_filter` by `delimited_payload` (#26625) CURRENT should not be a -SNAPSHOT version if build.snapshot is false (#27512) Fix merging of _meta field (#27352) Remove unused method (#27508) unmuted test, this has been fixed by #27397 Consolidate version numbering semantics (#27397) Add wait_for_no_initializing_shards to cluster health API (#27489) [TEST] use routing partition size based on the max routing shards of the second split Adjust CombinedDeletionPolicy for multiple commits (#27456) Update composite-aggregation.asciidoc Deprecate `levenstein` in favor of `levenshtein` (#27409) Automatically prepare indices for splitting (#27451) Validate `op_type` for `_create` (#27483) Minor ShapeBuilder cleanup muted test Decouple nio constructs from the tcp transport (#27484) ...
* es/6.x: (30 commits) Add wait_for_no_initializing_shards to cluster health API (#27489) Carry over version map size to prevent excessive resizing (#27516) Fix scroll query with a sort that is a prefix of the index sort (#27498) Delete shard store files before restoring a snapshot (#27476) CURRENT should not be a -SNAPSHOT version if build.snapshot is false (#27512) Fix merging of _meta field (#27352) test: do not run percolator query builder bwc test against 5.x versions Remove unused method (#27508) Consolidate version numbering semantics (#27397) Adjust CombinedDeletionPolicy for multiple commits (#27456) Minor ShapeBuilder cleanup [GEO] Deprecate ShapeBuilders and decouple geojson parse logic Improve docs for split API in 6.1/6.x (#27504) test: use correct pre 6.0.0-alpha1 format Update composite-aggregation.asciidoc Deprecate `levenstein` in favor of `levenshtein` (#27409) Decouple nio constructs from the tcp transport (#27484) Bump version from 6.1 to 6.2 Fix whitespace in Security.java Tighten which classes can exit ...
The commit looks harmless, unfortunately it can break the engine flush scheduler and the translog rolling. Both `uncommittedOperations` and `uncommittedSizeInBytes` are currently calculated based on the minimum required generation for recovery rather than the translog generation of the last index commit. This is not correct if other index commits are reserved for snapshotting even though we are keeping the last index commit only. This reverts commit e95d18e.
The commit looks harmless, unfortunately it can break the engine flush scheduler and the translog rolling. Both `uncommittedOperations` and `uncommittedSizeInBytes` are currently calculated based on the minimum required generation for recovery rather than the translog generation of the last index commit. This is not correct if other index commits are reserved for snapshotting even though we are keeping the last index commit only. This reverts commit e95d18e.
@bleskes I have reverted this PR. This looks harmless, unfortunately it can break the engine flush scheduler and the translog rolling. Both I will include this to #27367. |
* master: Revert "Adjust CombinedDeletionPolicy for multiple commits (elastic#27456)" Transpose expected and actual, and remove duplicate info from message. (elastic#27515) [DOCS] Fixed broken link in breaking changes Backport wait_for_initialiazing_shards to cluster health API
* master: Revert "Adjust CombinedDeletionPolicy for multiple commits (elastic#27456)" Transpose expected and actual, and remove duplicate info from message. (elastic#27515) [DOCS] Fixed broken link in breaking changes Backport wait_for_initialiazing_shards to cluster health API
@dnhatn I wonder if we should wrap the CombinedDeletionPolicy with the SanpshotDeletionPolicy, rather than the other way around. Did you look into that? |
@bleskes We can not distinguish between kept commits and snapshotted commits with the The actual issue is a single index commit assumption. This assumption is perfect and totally correct, however it no longer be correct if we update the translog (this PR) and index deletion policy. I am addressing this and will include it to #27367. |
@bleskes, Yes. We can use a |
@bleskes Just to confirm that this works as expected 👍 |
Today we can not distinguish between index commits that are kept by the primary policy and those are kept for snapshotting with a SnapshotDeletionPolicy. Since we enclose a SnapshotDeletionPolicy in a CombinedDeletionPolicy, we also we can not distinguish between those with a CombinedDeletionPolicy. This can be a problem if we update the TranslogDeletionPolicy to keep the minimum translog generation of undeleted index commits as we may keep the translog of a snapshotting commit even though it is "deleted" by the primary policy. To solve this, we enclose a CombinedDeletionPolicy in a SnapshotDeletionPolicy and track if an index commit is deleted by the primary policy, then use that value to maintain translog rather than the actual deletion of an index commit. Relates elastic#27456 elastic#27367
* master: Skip shard refreshes if shard is `search idle` (#27500) Remove workaround in translog rest test (#27530) inner_hits: Return an empty _source for nested inner hit when filtering on a field that doesn't exist. percolator: Avoid TooManyClauses exception if number of terms / ranges is exactly equal to 1024 Dedup translog operations by reading in reverse (#27268) Ensure logging is configured for CLI commands Ensure `doc_stats` are changing even if refresh is disabled (#27505) Fix classes that can exit Revert "Adjust CombinedDeletionPolicy for multiple commits (#27456)" Transpose expected and actual, and remove duplicate info from message. (#27515) [DOCS] Fixed broken link in breaking changes
* 6.x: [DOCS] s/Spitting/Splitting in split index docs inner_hits: Return an empty _source for nested inner hit when filtering on a field that doesn't exist. percolator: Avoid TooManyClauses exception if number of terms / ranges is exactly equal to 1024 Dedup translog operations by reading in reverse (#27268) Ensure logging is configured for CLI commands Ensure `doc_stats` are changing even if refresh is disabled (#27505) Fix classes that can exit Revert "Adjust CombinedDeletionPolicy for multiple commits (#27456)" Transpose expected and actual, and remove duplicate info from message.
Today, we keep only the last index commit and use only it to calculate the minimum required translog generation. This may no longer be correct as we introduced a new deletion policy which keeps multiple index commits. This change adjusts the
CombinedDeletionPolicy
so that it can work correctly with a new index deletion policy.Relates to #10708, #27367