rewrite sync manager #4599
Conversation
cf5f577 to f055681, which incidentally fixes tests
func isHeavier(a, b *types.TipSet) bool {
	return a.ParentWeight().GreaterThan(b.ParentWeight())
}
we should probably do some other heuristic that accounts for the number of blocks in tipsets with the same ParentWeight, but idk if that will really affect anything
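For what it's worth, a hedged sketch of what such a tie-break could look like; it reuses the ParentWeight comparison from the diff above, while the fallback on block count (and the Blocks/Equals accessors it leans on) is the speculative part. Not code from this PR.

// Sketch only: same comparison as isHeavier above, but when both tipsets have
// the same ParentWeight, prefer the one with more blocks in it.
func isHeavierWithTieBreak(a, b *types.TipSet) bool {
	aw, bw := a.ParentWeight(), b.ParentWeight()
	if !aw.Equals(bw) {
		return aw.GreaterThan(bw)
	}
	// equal weight: fall back to the number of blocks in the tipset
	return len(a.Blocks()) > len(b.Blocks())
}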
let's think about this a bit, it's not clear that it's the right thing to do.
This LGTM, but before merging we definitely want to have several people running it for a few days. We've fixed enough sync bugs in the past month that I'm sufficiently scared of touching this code.
It seems we should be forming tipsets in the pending queue. In essence, from different peers we might be getting parts of what will become a tipset.
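As a rough illustration of that idea (not code from this PR: coalescePending and the grouping key are hypothetical; types.NewTipSet and types.NewTipSetKey are the existing lotus constructors, and the fmt package is assumed imported), block headers arriving from different peers could be grouped by height and parent set before anything is queued:

// coalescePending is a hypothetical helper: merge block headers received from
// different peers into tipsets, keyed by (height, parent set), so the pending
// queue holds fully formed tipsets instead of per-peer fragments.
// Deduplication of headers seen from multiple peers is omitted for brevity.
func coalescePending(blocks []*types.BlockHeader) ([]*types.TipSet, error) {
	groups := make(map[string][]*types.BlockHeader)
	for _, b := range blocks {
		key := fmt.Sprintf("%d-%s", b.Height, types.NewTipSetKey(b.Parents...))
		groups[key] = append(groups[key], b)
	}
	out := make([]*types.TipSet, 0, len(groups))
	for _, g := range groups {
		ts, err := types.NewTipSet(g) // rejects blocks that don't actually form a valid tipset
		if err != nil {
			return nil, err
		}
		out = append(out, ts)
	}
	return out, nil
}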
Signed-off-by: Jakub Sztandera <kubuxu@protocol.ai>
@whyrusleeping agreed - we definitely need to test it quite a bit before merging.
we don't care about order of checks!
I'm running this now on a couple of my lotus nodes.
Added a limit for the maximum number of active sync workers (set to 5), so that we don't spawn too many.
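The exact mechanism used in the PR isn't visible in this excerpt, but as a minimal sketch of the idea, a buffered channel works as a counting semaphore for the worker cap (maxSyncWorkers, workerSlots and spawnWorker are made-up names; imports of context and the lotus types package are assumed):

// At most maxSyncWorkers sync workers run concurrently; each worker holds one
// slot in the channel for its lifetime.
const maxSyncWorkers = 5

var workerSlots = make(chan struct{}, maxSyncWorkers)

func spawnWorker(ctx context.Context, ts *types.TipSet, doSync func(context.Context, *types.TipSet)) {
	select {
	case workerSlots <- struct{}{}: // acquire a slot
	case <-ctx.Done():
		return // shutting down, don't start new workers
	}
	go func() {
		defer func() { <-workerSlots }() // release the slot when the worker exits
		doSync(ctx, ts)
	}()
}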
Improved the handling of the initial sync; I observed that during a from-scratch sync we spawned multiple workers to work on sibling tipsets!
Signed-off-by: Jakub Sztandera <kubuxu@protocol.ai>
this is ridiculous; since when do linters apply spellchecking on comments???
Can we have some more documentation around the syncTargetBucket abstraction and its invariants please? (I know that this is inherited from the previous implementation, but it has an impact on the current changes.) There is something I'm not understanding in the code about how we detect a possible fork; it seems RelatedToAny/sameChainAs can admit blocks from (potentially) different forks in the same bucket.
type syncTargetBucket struct {
	tips []*types.TipSet
}

func (stb *syncTargetBucket) sameChainAs(ts *types.TipSet) bool {
This is asserting that the tipset has a direct parent/child connection with any (but not all) of the tipsets in this bucket. Later, (*syncBucketSet).Insert() will add this tipset to the bucket. If the bucket contains the parent/child tipsets A <- B, and then comes a tipset C that is also a child of A but unrelated to B, we would accept it just the same, leading to a bucket with two tipsets from potentially different chains:

A <- B
A <- C
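To make the concern concrete, here is how sameChainAs reads per the description above; its body is not shown in this excerpt, so treat this as an assumed reconstruction. It returns true as soon as the candidate equals, is a parent of, or is a child of any single tipset already in the bucket.

// Assumed reconstruction, not the diff itself: accept ts if it equals, parents,
// or is parented by ANY tipset already in the bucket.
func (stb *syncTargetBucket) sameChainAs(ts *types.TipSet) bool {
	for _, t := range stb.tips {
		if ts.Equals(t) {
			return true
		}
		if ts.Key() == t.Parents() { // t is a child of ts
			return true
		}
		if ts.Parents() == t.Key() { // ts is a child of t
			return true
		}
	}
	return false
}

With that reading, inserting C into a bucket holding {A, B} succeeds because C matches A via the parent/child check; B and C are never compared against each other, which is how two sibling (and potentially forked) tipsets can end up in one bucket.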
// it's not related to any active or pending sync; this could be a fork in which case we
// start a new worker to sync it, if it is *heavier* than any active or pending set;
This comment implies that after the RelatedToAny calls what's left here is a potential fork case, but the fork could have been caught in RelatedToAny (see comment there).
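For reference, RelatedToAny is presumably just a loop over the buckets delegating to sameChainAs (an assumption here, since neither its body nor the syncBucketSet fields appear in this excerpt), which is why a fork sharing a parent with an existing bucket would already be absorbed there and never reach this branch:

// Assumed shape of the bucket set and its membership check; not code from this diff.
type syncBucketSet struct {
	buckets []*syncTargetBucket
}

func (sbs *syncBucketSet) RelatedToAny(ts *types.TipSet) bool {
	for _, b := range sbs.buckets {
		if b.sameChainAs(ts) {
			return true // ts joins an existing bucket, so it is not treated as a new fork here
		}
	}
	return false
}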
Note: it is currently being tested in the live network.