Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use local checkpoint to calculate min translog gen for recovery #51905

Merged
merged 12 commits into from
Feb 10, 2020

Conversation

dnhatn
Copy link
Member

@dnhatn dnhatn commented Feb 5, 2020

Today we use the translog_generation of the safe commit as the minimum required translog generation for recovery. This approach has a limitation, where we won't be able to clean up translog unless we flush. Reopening an already recovered engine will create a new empty translog, and we leave it there until we force flush.

This commit removes the translog_generation commit tag and uses the local checkpoint of the safe commit to calculate the minimum required translog generation for recovery instead.

Closes #49970

@dnhatn dnhatn added >enhancement :Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard. v8.0.0 v7.7.0 labels Feb 5, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (:Distributed/Engine)

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the uncommittedSizeInBytes and uncommittedOperations metrics are pretty useless today, as they are not a measure of how much data needs to be recovered/replayed after a crash. I would rather base these metrics on the local checkpoint of the safe commit in all cases, and completely disregard the last commit (which is irrelevant for recovery).

@dnhatn
Copy link
Member Author

dnhatn commented Feb 7, 2020

I think that the uncommittedSizeInBytes and uncommittedOperations metrics are pretty useless today, as they are not a measure of how much data needs to be recovered/replayed after a crash. I would rather base these metrics on the local checkpoint of the safe commit in all cases, and completely disregard the last commit (which is irrelevant for recovery).

++. I pushed c4bccea.

@dnhatn dnhatn requested a review from ywelsch February 7, 2020 14:24
@ywelsch
Copy link
Contributor

ywelsch commented Feb 7, 2020

There are relevant test failures here

@dnhatn
Copy link
Member Author

dnhatn commented Feb 7, 2020

@ywelsch Thanks for looking at the test failures. I've addressed them in 5ea2cae.

Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM. I am not familiar enough with all 6.x development that I can confidently say that this will work in both rolling and full restart upgrades, but my search for an issue did not reveal anything. So better wait for Yannick's approval too.

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dnhatn
Copy link
Member Author

dnhatn commented Feb 10, 2020

@ywelsch @henningandersen Thanks for reviewing.

@dnhatn dnhatn merged commit ebc4681 into elastic:master Feb 10, 2020
@dnhatn dnhatn deleted the seqno-tlog-policy branch February 10, 2020 13:26
dnhatn added a commit that referenced this pull request Feb 11, 2020
We roll a new translog generation and trim operations that are above the 
global checkpoint during primary-replica resync. If the initOperations
is empty, then the stale operation on the replica2 will be discarded as
it is the only operation in a translog file (since #51905 where we 
started using the local checkpoint to calculate the minimum required
translog generation for recovery). Otherwise, the stale op will be
retained along with initOperations but will be skipped in snapshots.

Relates #51905
Closes #52148
dnhatn added a commit that referenced this pull request Feb 12, 2020
Since #51905, we use the local checkpoint of the safe commit to 
calculate the number of uncommitted operations of a translog stats. If a
periodic flush triggered by afterWriteOperation completes before we sync
translog, then the last commit is not safe. We also need to sync
translog from Engine instead of the translog so that we can advance the
safe commit.

Relates #51905
Closes #52223
dnhatn added a commit that referenced this pull request Feb 12, 2020
Since #51905, we skip translog recovery if the local checkpoint of the 
safe commit equals to the global checkpoint. This change adjusts the
test not to create a new snapshot in that case.

Closes #52221
Relates #51905
@dnhatn dnhatn mentioned this pull request Feb 12, 2020
dnhatn added a commit that referenced this pull request Feb 12, 2020
We need to reduce the translog sync interval for indices with translog 
async setting so that we can have the safe commit in the assertBusy
interval. This is needed since #51905, where we use the local checkpoint
of the safe commit to calculate the number of uncommitted operations of
a translog stats.

Closes #52251
Relates #51905
dnhatn added a commit that referenced this pull request Feb 12, 2020
We need to reduce the translog sync interval for indices with translog
async setting so that we can have the safe commit in the assertBusy
interval. This is needed since #51905, where we use the local checkpoint
of the safe commit to calculate the number of uncommitted operations of
a translog stats.

Closes #52251
Relates #51905
dnhatn added a commit that referenced this pull request Feb 18, 2020
Asserts that no new operations are made into the translog since we re-opened the engine.

Relates #51905
Closes #52410
sbourke pushed a commit to sbourke/elasticsearch that referenced this pull request Feb 19, 2020
Asserts that no new operations are made into the translog since we re-opened the engine.

Relates elastic#51905
Closes elastic#52410
dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request Feb 26, 2020
…tic#51905)

Today we use the translog_generation of the safe commit as the minimum
required translog generation for recovery. This approach has a
limitation, where we won't be able to clean up translog unless we flush.
Reopening an already recovered engine will create a new empty translog,
and we leave it there until we force flush.

This commit removes the translog_generation commit tag and uses the
local checkpoint of the safe commit to calculate the minimum required
translog generation for recovery instead.

Closes elastic#49970
dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request Feb 26, 2020
Since elastic#51905, we skip translog recovery if the local checkpoint of the
safe commit equals to the global checkpoint. This change adjusts the
test not to create a new snapshot in that case.

Closes elastic#52221
Relates elastic#51905
dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request Feb 26, 2020
Since elastic#51905, we use the local checkpoint of the safe commit to
calculate the number of uncommitted operations of a translog stats. If a
periodic flush triggered by afterWriteOperation completes before we sync
translog, then the last commit is not safe. We also need to sync
translog from Engine instead of the translog so that we can advance the
safe commit.

Relates elastic#51905
Closes elastic#52223
dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request Feb 26, 2020
Asserts that no new operations are made into the translog since we re-opened the engine.

Relates elastic#51905
Closes elastic#52410
dnhatn added a commit that referenced this pull request Feb 26, 2020
Today we use the translog_generation of the safe commit as the minimum
required translog generation for recovery. This approach has a
limitation, where we won't be able to clean up translog unless we flush.
Reopening an already recovered engine will create a new empty translog,
and we leave it there until we force flush.

This commit removes the translog_generation commit tag and uses the
local checkpoint of the safe commit to calculate the minimum required
translog generation for recovery instead.

Closes #49970
dnhatn added a commit that referenced this pull request Feb 26, 2020
Since #51905, we skip translog recovery if the local checkpoint of the
safe commit equals to the global checkpoint. This change adjusts the
test not to create a new snapshot in that case.

Closes #52221
Relates #51905
dnhatn added a commit that referenced this pull request Feb 26, 2020
Since #51905, we use the local checkpoint of the safe commit to
calculate the number of uncommitted operations of a translog stats. If a
periodic flush triggered by afterWriteOperation completes before we sync
translog, then the last commit is not safe. We also need to sync
translog from Engine instead of the translog so that we can advance the
safe commit.

Relates #51905
Closes #52223
dnhatn added a commit that referenced this pull request Feb 26, 2020
Asserts that no new operations are made into the translog since we re-opened the engine.

Relates #51905
Closes #52410
@tlrx tlrx mentioned this pull request Mar 9, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard. >enhancement v7.7.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Too many translog files open, and reach ulimit setting(65535)
5 participants