Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: sync: harden chain sync #10756

Merged
merged 4 commits into from
Apr 26, 2023
Merged

feat: sync: harden chain sync #10756

merged 4 commits into from
Apr 26, 2023

Conversation

Stebalien
Copy link
Member

Proposed Changes

This makes two changes:

  1. Write headers in chain order, not reverse order. This shouldn't matter, but it may help if badger crashes.
  2. Force collectChain to collect the correct chain. This shouldn't matter, but we might as well (as long as it's safe).

WRT 1, I'm worried about the following:

  1. Attempt to sync a chain from epochs 3-7.
  2. Write tipsets 7, 6, 5.
  3. Badger realizes that the current vlog is "full", so it creates a new one.
  4. Write tipsets 4, 3.

My concern is that, while badger will write synchronously, it doesn't call fsync on the file itself, and the directory containing the file. That means, if the machine crashes, the new vlog might just disappear.

Of course, the right solution would be to fix badger to sync the file/directory metadata, but that's not likely to happen, ever.

Checklist

Before you mark the PR ready for review, please make sure that:

  • Commits have a clear commit message.
  • PR title is in the form of of <PR type>: <area>: <change being made>
    • example: fix: mempool: Introduce a cache for valid signatures
    • PR type: fix, feat, build, chore, ci, docs, perf, refactor, revert, style, test
    • area, e.g. api, chain, state, market, mempool, multisig, networking, paych, proving, sealing, wallet, deps
  • Tests exist for new functionality or change in behavior
  • CI is green

@Stebalien Stebalien requested a review from a team as a code owner April 25, 2023 18:31
Copy link
Contributor

@arajasek arajasek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Stebalien I think the 2 changes you describe in the PR description make sense and are good to go. I don't understand why we're no longer putting the maybeHead, though.

@Stebalien
Copy link
Member Author

I don't understand why we're no longer putting the maybeHead, though.

We should already be doing that here:

lotus/chain/sync.go

Lines 1191 to 1198 in 3f03e63

for i := len(headers) - 1; i >= 0; i++ {
ts := headers[i]
if err := syncer.store.PersistTipset(ctx, ts); err != nil {
err = xerrors.Errorf("failed to persist synced tipset to the chainstore: %w", err)
ss.Error(err)
return err
}
}

We check that headers[0] is the correct head here:

lotus/chain/sync.go

Lines 1184 to 1186 in 3f03e63

if !headers[0].Equals(ts) {
return xerrors.Errorf("collectChain synced %s, wanted to sync %s", headers[0].Cids(), ts.Cids())
}
.

@arajasek
Copy link
Contributor

@Stebalien I do not think you're right here -- this is PutTipsetting, which is tremendously unfortunately named, but is what actually:

  • expands tipsets where possible
  • actually updates the chain head (if necessary).

I think dropping that code will break everything, though I'm admittedly not 100% sure.

Put != Persist
@Stebalien
Copy link
Member Author

Oh! Yes, you're right. I saw P...TipSet and didn't even register that they were different functions.

@Stebalien Stebalien merged commit d2c3e84 into master Apr 26, 2023
@Stebalien Stebalien deleted the steb/paranoid-chain-sync branch April 26, 2023 21:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants