Allow flush during translog replay
When replaying from the translog, flushes are currently not allowed in
stateful, for historic reasons. However, the way the translog is
handled in stateless reintroduces a similar concern: when we flush, we
assume that the newest translog is all that needs to be replayed after
a crash. Fix this by copying over the translog info if we flush
during translog replay.

Flushes during translog replay are primarily caused by inaccuracies
in the live version map causing it to go unsafe, which triggers a
refresh, which in turn converts to a flush in stateless. However, we
could also be running in a disk-constrained environment during
replay, so allowing flushes seems safer.
henningandersen committed Nov 7, 2024
1 parent 90600bf commit 48047b3
Showing 1 changed file with 12 additions and 2 deletions.
@@ -632,14 +632,22 @@ private void recoverFromTranslogInternal(
             // thread to continue with recovery, but if it doesn't do anything async then there's no need to fork, hence why we use a
             // SubscribableListener here
             final var flushListener = new SubscribableListener<FlushResult>();
-            flush(false, true, flushListener);
+            flushCompletedTranslogRecovery(flushListener);
             flushListener.addListener(l.delegateFailureAndWrap((ll, r) -> {
                 translog.trimUnreferencedReaders();
                 ll.onResponse(null);
             }), engineConfig.getThreadPool().generic(), null);
         });
     }

+    protected void flushCompletedTranslogRecovery(SubscribableListener<FlushResult> flushListener) {
+        flush(false, true, flushListener);
+    }
+
+    protected boolean pendingTranslogRecovery() {
+        return pendingTranslogRecovery.get();
+    }
+
     protected Translog.Snapshot newTranslogSnapshot(long fromSeqNo, long toSeqNo) throws IOException {
         return translog.newSnapshot(fromSeqNo, toSeqNo);
     }
@@ -2979,7 +2987,9 @@ protected Map<String, String> getCommitExtraUserData() {
         return Collections.emptyMap();
     }

-    final void ensureCanFlush() {
+    // todo: this protection is no longer needed, after ES now relies on sequence numbers for figuring out which translog to replay.
+    // in the past, ES would attach the latest translog generation to a commit, making it unsafe to flush during translog replay.
+    protected void ensureCanFlush() {
         // translog recovery happens after the engine is fully constructed.
         // If we are in this stage we have to prevent flushes from this
         // engine otherwise we might loose documents if the flush succeeds
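
The diff only opens up extension points (ensureCanFlush becomes protected, and pendingTranslogRecovery() and flushCompletedTranslogRecovery(...) are new); the stateless side that uses them is not part of this commit. As a rough illustration of how a subclass might combine these hooks, here is a minimal, self-contained sketch. SketchEngine, SketchStatelessEngine, the TRANSLOG_START_KEY key, and the idea of carrying the translog info through getCommitExtraUserData() are all assumptions made for this example, not the actual Elasticsearch or stateless implementation.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical, heavily simplified stand-in for the engine touched by this diff.
class SketchEngine {
    private final AtomicBoolean pendingTranslogRecovery = new AtomicBoolean(true);
    private Map<String, String> lastCommitUserData = Collections.emptyMap();

    protected boolean pendingTranslogRecovery() {
        return pendingTranslogRecovery.get();
    }

    protected Map<String, String> getCommitExtraUserData() {
        return Collections.emptyMap();
    }

    // Default behaviour: refuse to flush while translog replay is pending.
    protected void ensureCanFlush() {
        if (pendingTranslogRecovery()) {
            throw new IllegalStateException("flushes are disabled - pending translog recovery");
        }
    }

    // Hook for the flush issued once replay has finished.
    protected void flushCompletedTranslogRecovery() {
        flush();
    }

    final void flush() {
        ensureCanFlush();
        // A real engine would write a commit here; the sketch only records its user data.
        lastCommitUserData = getCommitExtraUserData();
    }

    final void completeTranslogRecovery() {
        pendingTranslogRecovery.set(false);
        flushCompletedTranslogRecovery();
    }

    final Map<String, String> lastCommitUserData() {
        return lastCommitUserData;
    }
}

// Hypothetical subclass: flushes are allowed during replay, but a commit written
// mid-replay keeps pointing at the translog info of the commit we recovered from.
class SketchStatelessEngine extends SketchEngine {
    static final String TRANSLOG_START_KEY = "sketch_translog_start"; // made-up key

    private final String recoveredTranslogStart;
    private final String newestTranslogStart;

    SketchStatelessEngine(String recoveredTranslogStart, String newestTranslogStart) {
        this.recoveredTranslogStart = recoveredTranslogStart;
        this.newestTranslogStart = newestTranslogStart;
    }

    @Override
    protected void ensureCanFlush() {
        // Intentionally empty: flushing during replay is permitted in this sketch.
    }

    @Override
    protected Map<String, String> getCommitExtraUserData() {
        // During replay, copy over the old translog info so a crash replays everything again.
        String start = pendingTranslogRecovery() ? recoveredTranslogStart : newestTranslogStart;
        return Collections.singletonMap(TRANSLOG_START_KEY, start);
    }
}
```

In this sketch, a flush issued while replay is still pending records the recovered translog position, so a crash at that point would replay the full range again; only the flush that follows completed recovery switches to the newest translog.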