Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rewrite sync manager #4599

Merged
merged 26 commits into from
Nov 6, 2020
Merged

rewrite sync manager #4599

merged 26 commits into from
Nov 6, 2020

Conversation

vyzo
Copy link
Contributor

@vyzo vyzo commented Oct 26, 2020

Note: hasn't been yet tested in the live network.
It is currently being tested in the live network.

}

func isHeavier(a, b *types.TipSet) bool {
return a.ParentWeight().GreaterThan(b.ParentWeight())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should probably do some other heuristic that accounts for number of blocks in tispets with the same ParentWeight, but idk if that will really affect anything

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's think about this a bit, it's not clear if it is the right thing to do.

Copy link
Member

@whyrusleeping whyrusleeping left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM, But before merging we definitely want to have several people running it for a few days. We've fixed enough sync bugs in the past month that i'm sufficiently scared of touching this code.

@Kubuxu
Copy link
Contributor

Kubuxu commented Oct 27, 2020

It seems we should be forming tipsets in the pending queue. In essence from different peers we might be getting parts of what will become a tipset.
Currently heaviestTipset will just pop one of them and use it.

@Kubuxu
Copy link
Contributor

Kubuxu commented Oct 27, 2020

louts sync status is borked with this
as well as lotus sync wait.

Jakub Sztandera and others added 3 commits October 27, 2020 18:00
@vyzo
Copy link
Contributor Author

vyzo commented Oct 27, 2020

@whyrusleeping agreed - we definitely need to test it quite a bit before merging.

we don't care about order of checks!
@whyrusleeping
Copy link
Member

I'm running this now on a couple of my lotus nodes

@vyzo
Copy link
Contributor Author

vyzo commented Oct 28, 2020

Added a limit for the maximum number of active sync workers (set to 5), so that we don't spawn too many.

@vyzo
Copy link
Contributor Author

vyzo commented Oct 28, 2020

Improved the handling of the initial sync; I observed that during a from scratch sync we spawned multiple workers to work on sibling tipsets!

Copy link
Contributor

@schomatis schomatis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have some more documentation around the syncTargetBucket abstraction and its invariants please? (I know that this is inherited from the previous implementation but it has impact in the current changes.) There is something that I'm not understanding in the code about how do we detect a possible fork, it seems the RelatedToAny/sameChainAs can admit blocks from (potentially) different forks in the same bucket.

type syncTargetBucket struct {
tips []*types.TipSet
}

func (stb *syncTargetBucket) sameChainAs(ts *types.TipSet) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is asserting that the tipset has a direct parent/child connection with any (but not all) of the tipsets in this bucket. Later (*syncBucketSet).Insert() will add this tipset to the bucket. If the bucket contains a parent/child tipsets A <- B and then comes tipset C also child of A but unrelated to B we would accept it just the same leading to a bucket with two tipsets from potentially different chains:

A <- B
A <- C

Comment on lines +396 to +397
// it's not related to any active or pending sync; this could be a fork in which case we
// start a new worker to sync it, if it is *heavier* than any active or pending set;
Copy link
Contributor

@schomatis schomatis Nov 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment implies that after the RelatedToAny calls what's left here is a potential fork case but the fork could have been caught in RelatedToAny (see comment there).

@Kubuxu Kubuxu merged commit 5a34e5b into master Nov 6, 2020
@Kubuxu Kubuxu deleted the feat/sync-manager-redux branch November 6, 2020 20:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants