[ML-DataFrame] make checkpointing more robust #44344
- do not let checkpointing fail if indices got deleted
- treat missing seqNoStats as just-created indices (checkpoint 0)
- log level: do not treat failed update checks as errors

fixes elastic#43992
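The first two points can be illustrated with a minimal sketch in plain Java. It is not the code from this PR: `CheckpointSketch`, `ShardInfo`, and `computeCheckpoints` are hypothetical names, and a `null` global checkpoint stands in for missing seqNoStats. Indices that were requested but report no shards (for example because they were deleted) are left out of the result instead of raising an exception.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

final class CheckpointSketch {

    /** Hypothetical stand-in for per-shard stats; globalCheckpoint is null when seqNoStats is missing. */
    static final class ShardInfo {
        final String index;
        final int shardId;
        final Long globalCheckpoint;

        ShardInfo(String index, int shardId, Long globalCheckpoint) {
            this.index = index;
            this.shardId = shardId;
            this.globalCheckpoint = globalCheckpoint;
        }
    }

    /**
     * Builds a map of index name to per-shard checkpoints (ordered by shard id).
     * Requested indices without any reported shard are skipped rather than treated as an error.
     */
    static Map<String, long[]> computeCheckpoints(Set<String> requestedIndices, List<ShardInfo> shards) {
        Map<String, TreeMap<Integer, Long>> perIndex = new TreeMap<>();
        for (ShardInfo shard : shards) {
            if (!requestedIndices.contains(shard.index)) {
                continue; // stats for an index we did not ask about
            }
            // Assumption taken from the PR description: missing seqNoStats means "just created", checkpoint 0.
            long checkpoint = shard.globalCheckpoint == null ? 0L : shard.globalCheckpoint;
            // Several copies of the same shard may report; keep the highest value seen.
            perIndex.computeIfAbsent(shard.index, k -> new TreeMap<>())
                    .merge(shard.shardId, checkpoint, Long::max);
        }
        Map<String, long[]> result = new TreeMap<>();
        perIndex.forEach((index, byShard) ->
                result.put(index, byShard.values().stream().mapToLong(Long::longValue).toArray()));
        return result;
    }
}
```

Calling `computeCheckpoints` with a requested set that contains a deleted index returns entries only for the indices that still report shards, so the caller can carry on with a smaller checkpoint.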
Pinging @elastic/ml-core
👍 We no longer throw NotFound exceptions on missing indices
👍 We handle indices going away between old checkpoints and new checkpoints
👍 We only expand to open indices
It may be good to log and audit when we detect that an index was removed. It should only be at info level, since the removal may be intentional, but it would be good to indicate that we no longer see a previously checkpointed index.
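A rough sketch of what that suggestion could look like, in plain Java with `java.util.logging`. The class `RemovedIndexAudit`, its methods, and the message text are hypothetical and not part of this PR; the checkpoint maps are assumed to map index names to per-shard checkpoints.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.logging.Logger;

final class RemovedIndexAudit {

    private static final Logger LOGGER = Logger.getLogger(RemovedIndexAudit.class.getName());

    /** Returns the indices present in the old checkpoint but absent from the new one. */
    static Set<String> removedIndices(Map<String, long[]> oldCheckpoint, Map<String, long[]> newCheckpoint) {
        Set<String> removed = new TreeSet<>(oldCheckpoint.keySet());
        removed.removeAll(newCheckpoint.keySet());
        return removed;
    }

    /** Logs each removed index at info level (removal may be intentional) and returns the set. */
    static Set<String> reportRemoved(String transformId,
                                     Map<String, long[]> oldCheckpoint,
                                     Map<String, long[]> newCheckpoint) {
        Set<String> removed = removedIndices(oldCheckpoint, newCheckpoint);
        for (String index : removed) {
            LOGGER.info("[" + transformId + "] previously checkpointed index [" + index + "] is no longer visible");
        }
        return removed;
    }
}
```

Logging at info rather than warning or error follows the reasoning in the comment above: the removal may be a deliberate action, so it is reported without being treated as a failure.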
LGTM2
Since elastic#44344, we use IndicesOptions.LENIENT_EXPAND_OPEN when deciding which indices to include in the checkpoint calculation. This change uses the same option when deciding which indices to search for data and which indices to get mappings from; otherwise, there is a potential mismatch between the checkpoint details and what is searched elsewhere.
make checkpointing more robust:
fixes #43992
Flagged as non-issue because this belongs to a new, unreleased feature; however, we see this as a showstopper for releasing continuous data frames.