-
Notifications
You must be signed in to change notification settings - Fork 457
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Do not wait running-xacts record from primary if checkoint doesn't contain oldestActiveXid #8484
base: main
Are you sure you want to change the base?
Conversation
3132 tests run: 3011 passed, 0 failed, 121 skipped (full report)Flaky tests (2)Postgres 16
Postgres 14
Code coverage* (full report)
* collected from Rust tests only The comment gets automatically updated with the latest test results
6e6b33c at 2024-07-28T12:52:52.069Z :recycle: |
…ntain oldestActiveXid
705438c
to
6e6b33c
Compare
else | ||
{ | ||
/* Assume that there are no active transactions except prepared */ | ||
PrescanPreparedTransactions(xids, nxids); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed with @knizhnik that he will try to make a repro test based on test_replica_start_with_too_many_unused_xids
to exercise the new path
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now the problem can not be reproduced with test: oldfstActiveXid is assigned when checkpoint record is replayed (online or shutdown). And checkpoint record is written by initdb. So when we just creates database, we already has checkpoint with valid oldestActiveXid.
So how this problem (with invalid oldestActiveXid
) can happen at all at prod?
Shutdown checkpoint contains oldestActiveXid=0
(just because there are no active transactions after shutdown.
Now pageserver contains the following logic:
// A shutdown checkpoint has `oldestActiveXid == InvalidTransactionid`,
// because at shutdown, all in-progress transactions will implicitly
// end. Postgres startup code knows that, and allows hot standby to start
// immediately from a shutdown checkpoint.
//
// In Neon, Postgres hot standby startup always behaves as if starting from
// an online checkpoint. It needs a valid `oldestActiveXid` value, so
// instead of overwriting self.checkpoint.oldestActiveXid with
// InvalidTransactionid from the checkpoint WAL record, update it to a
// proper value, knowing that there are no in-progress transactions at this
// point, except for prepared transactions.
//
// See also the neon code changes in the InitWalRecovery() function.
if xlog_checkpoint.oldestActiveXid == pg_constants::INVALID_TRANSACTION_ID
&& info == pg_constants::XLOG_CHECKPOINT_SHUTDOWN
{
let mut oldest_active_xid = self.checkpoint.nextXid.value as u32;
for xid in modification.tline.list_twophase_files(lsn, ctx).await? {
if (xid.wrapping_sub(oldest_active_xid) as i32) < 0 {
oldest_active_xid = xid;
}
}
self.checkpoint.oldestActiveXid = oldest_active_xid;
} else {
self.checkpoint.oldestActiveXid = xlog_checkpoint.oldestActiveXid;
}
But this code was added not so long time ago.
May be last time primary was active (and write shutdown checkpoint) before this change was deployed to prod.
In this case last written checkpoint will contain invalid (0) oldestActiveXid
.
In any case, if it happen with one client, it may also happen with other clients.
This is why I think it will be useful to add logic which can handle such situation.
@@ -366,23 +366,38 @@ RestoreRunningXactsFromClog(CheckPoint *checkpoint, TransactionId **xids, int *n | |||
if (!TransactionIdIsValid(checkpoint->oldestActiveXid)) | |||
{ | |||
elog(LOG, "cannot restore running-xacts from CLOG because oldestActiveXid is not set"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this message be under if (running_xacts_overflow_policy == OP_WAIT)
now? Otherwise, now we have a path to write this message, but then still try to restore running xacts (if (TransactionIdIsValid(checkpoint->oldestXid))
path)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I specially move the message before check to register I the go the fact that subxids overflow took place.
@@ -366,23 +366,38 @@ RestoreRunningXactsFromClog(CheckPoint *checkpoint, TransactionId **xids, int *n | |||
if (!TransactionIdIsValid(checkpoint->oldestActiveXid)) | |||
{ | |||
elog(LOG, "cannot restore running-xacts from CLOG because oldestActiveXid is not set"); | |||
goto fail; | |||
} | |||
if (running_xacts_overflow_policy == OP_WAIT) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, on the second thought, I don't understand why we check the running_xacts_overflow_policy
here at all. At this moment, we do not know yet whether there is an overflow or not.
Maybe that's valid, but then it's worth a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I didn't want to rename GUC name and variable. And didn't want to introduce yet another GUC.
So I used the same GUC running_xacts_overflow_policy
to handle situation when oldestActiveXid is invalid.
I have the same issue with this PR as with the Why would it OK when oldestActiveXid==0 ? It's just as dangerous in that case AFAICS. (also, no tests..) |
The motivation of this PR is the same as with original PR introducing The problem with overflow was caused by gaps in XID range due to XID alignment. Concerning test - as I wrote in comment to the issue, right now the problem can not be reproduced.Checkpoint WAL record is created by initdb, which us used to prepare new database. So |
WAL for affected databases is there, so should be easy to check what exactly went wrong |
Problem
If oldestActiveXid is not set in checkpoint and primary is not running, then replica failed to start.
See https://neondb.slack.com/archives/C07E5U1660Z/p1721804204809549
Changes
If oldestActiveXid is invalid and oldestXid (CLOG low boundary) is valid, then restore runningxacts from the beginning of CLOG. Otherwise just assume that only prepared transaction are active.
Checklist before requesting a review
Checklist before merging