-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add LoE to the ChainDB QSM test #1119
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test failure in getChain_addChain
makes sense: The test is making sure that for every chain bc
, adding its blocks to an empty model causes the selection to be bc
. With the LoE enabled and at Genesis (as is the case in the counterexamples), this of course fails when the chain is sufficiently long.
One way to fix this would be to discard these test cases, or even just disable the LoE completely for them.
The failure of alwaysPickPreferredChain
seems to be for a very similar reason.
ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/ChainDB/API.hs
Outdated
Show resolved
Hide resolved
ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/ChainDB/API.hs
Outdated
Show resolved
Hide resolved
ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/ChainDB/API.hs
Outdated
Show resolved
Hide resolved
ouroboros-consensus/test/storage-test/Test/Ouroboros/Storage/ChainDB/Model/Test.hs
Outdated
Show resolved
Hide resolved
Indeed, it was as simple as that. Fixed in 660b331. Could you check it, and in particular tell me if the comment of |
419866f
to
656c31d
Compare
80db2f9
to
2da9483
Compare
Now rebased on top of #1125. |
After 14 hours of tests, I find a counter-example to this PR. Reproducible on branch
|
Actually, upon inspecting this bug, it turns out that this failure is not LoE-related. The LoE is very much disabled and it looks like this has to do with GCing or streaming. I don't exactly know what this should do, but the behaviour of the model looks wrong to me. |
Yup, I can reproduce with the same seed on +source-repository-package
+ type: git
+ location: https://github.com/UnkindPartition/tasty
+ tag: dcbf32078133aa2b569774c417cc49a49f5c573b
+ subdir: quickcheck
+ --sha256: 0rwfcv0iq50ckzdgdssgkil6g9n76w1k6yxs0q4fdrkf5nlh4vv5 in |
41085a1
to
41d8a9d
Compare
e85da74
to
cd032b1
Compare
41d8a9d
to
a0a819d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really nice to finally get this
ouroboros-consensus/test/storage-test/Test/Ouroboros/Storage/ChainDB/Model.hs
Outdated
Show resolved
Hide resolved
ouroboros-consensus/test/storage-test/Test/Ouroboros/Storage/ChainDB/Model/Test.hs
Outdated
Show resolved
Hide resolved
ouroboros-consensus/test/storage-test/Test/Ouroboros/Storage/ChainDB/StateMachine.hs
Outdated
Show resolved
Hide resolved
ouroboros-consensus/test/storage-test/Test/Ouroboros/Storage/ChainDB/Model.hs
Outdated
Show resolved
Hide resolved
ouroboros-consensus/test/storage-test/Test/Ouroboros/Storage/ChainDB/Model.hs
Outdated
Show resolved
Hide resolved
ouroboros-consensus/test/storage-test/Test/Ouroboros/Storage/ChainDB/Model.hs
Outdated
Show resolved
Hide resolved
ouroboros-consensus/test/storage-test/Test/Ouroboros/Storage/ChainDB/Model.hs
Outdated
Show resolved
Hide resolved
ouroboros-consensus/test/storage-test/Test/Ouroboros/Storage/ChainDB/StateMachine.hs
Outdated
Show resolved
Hide resolved
5680c0b
to
53ea76c
Compare
This PR improves the code documentation of Genesis, and implements some fixes and optimizations. ### Documentation, refactoring, and testing The code documentation of GDD, CSJ, and LoE is better connected in cb5b69a. Additionally, cd032b1 brings small changes to the LoE, motivated by issues found while integrating the LoE into the ChainDB q-s-m tests (see #1119). 2f7152b enables CSJ in all of the uniform tests and tests run more tries now. Shrinking of peer schedules has been simplified in 627634f, and 6b44bfb adds an extra field to the `PointSchedule` type to specify the time at the end of a test, allowing to clean up a lot of schedules. ### Profiling and optimizations We consider our baseline, the fastest possible syncing, to be the scenario in which the syncing peer connects to a caught up peer and syncs using Praos. From the profiling of Genesis on mainnet data, the GDD governor and chain selection was revealed as one of the major sinks of CPU time when enabling Genesis and syncing from multiple peers (between 2 and 30 peers). Chain selection has been made idempotent again, to save on useless recomputations in 82376e6. Idempotency had been lost in #1015. 64f3da3 also calls to change selection less often, by avoiding calls when the LoE fragment doesn't change in a significant way. The overhead of GDD is eliminated in a592c65 by replacing a frequent call to `AnchoredFragment.takeWhileOldest` with a faster variant. With these changes, the only significant overhead of syncing from multiple peers comes from the BlockFetch logic, which is the topic of ongoing work. ### CSJ tests The testing code has been adapted in a969b37 to allow to specify multiple honest peers in CSJ tests. CSJ tests are the first time we care of running multiple honest peers to check that header are downloaded from only one of them. We also added a test in 1d7fc95 to check that headers are downloaded from at most one peer even in the presence of adversarial peers. ### GSM integration In b525959 Limit on Eagerness is enabled only while GSM is in syncing state, and 83b12ee does similarly for LoP. f69fd1f makes possible to toggle genesis components with environment variables. This was helpful to measure the overheads of the individual components.
cd0f027
to
15a9be3
Compare
This is useful for tests. Co-authored-by: Nicolas “Niols” Jeannerod <nicolas.jeannerod@tweag.io>
15a9be3
to
c0ecc23
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but we should wait for the synchronous review during a video call.
, maxClockSkew = maxClockSkew m | ||
, isOpen = True | ||
} | ||
addBlock cfg blk m = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would a pattern match on ignoreBlock
be more readable than an if
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I condensed this code a bit 👍
diff --git a/ouroboros-consensus/test/storage-test/Test/Ouroboros/Storage/ChainDB/Model.hs b/ouroboros-consensus/test/storage-test/Test/Ouroboros/Storage/ChainDB/Model.hs
index 2c8f5c7eef..05f7cabe74 100644
--- a/ouroboros-consensus/test/storage-test/Test/Ouroboros/Storage/ChainDB/Model.hs
+++ b/ouroboros-consensus/test/storage-test/Test/Ouroboros/Storage/ChainDB/Model.hs
@@ -402,14 +402,10 @@ addBlock :: forall blk. LedgerSupportsProtocol blk
-> blk
-> Model blk -> Model blk
addBlock cfg blk m =
- if ignoreBlock
- then m
- else
- chainSelection cfg $
- m
- { volatileDbBlocks =
- Map.insert (blockHash blk) blk (volatileDbBlocks m)
- }
+ | ignoreBlock = m
+ | otherwise = chainSelection cfg m {
+ volatileDbBlocks = Map.insert (blockHash blk) blk (volatileDbBlocks m)
+ }
where
secParam = configSecurityParam cfg
immBlockNo = immutableBlockNo secParam m
@@ -890,6 +909,10 @@ generator genBlock m@Model {..} = At <$> frequency | |||
, (if empty then 1 else 10, return GetMaxSlotNo) | |||
, (if empty then 1 else 10, genGetIsValid) | |||
|
|||
, (case loe of | |||
LoEDisabled -> (0, return $ UpdateLoE $ AF.Empty AF.AnchorGenesis) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't setting the frequency to 0 be equivalent to not generating this case at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we just use 0
here for convenience as we are inside of a list literal here; a similar pattern is used in a few other places; I slightly refactored this to at least not use a bogus value on the right.
diff --git a/ouroboros-consensus/test/storage-test/Test/Ouroboros/Storage/ChainDB/StateMachine.hs b/ouroboros-consensus/test/storage-test/Test/Ouroboros/Storage/ChainDB/StateMachine.hs
index 510ebac6df..184f018a4b 100644
--- a/ouroboros-consensus/test/storage-test/Test/Ouroboros/Storage/ChainDB/StateMachine.hs
+++ b/ouroboros-consensus/test/storage-test/Test/Ouroboros/Storage/ChainDB/StateMachine.hs
@@ -909,9 +909,10 @@ generator loe genBlock m@Model {..} = At <$> frequency
, (if empty then 1 else 10, return GetMaxSlotNo)
, (if empty then 1 else 10, genGetIsValid)
- , (case loe of
- LoEDisabled -> (0, return $ UpdateLoE $ AF.Empty AF.AnchorGenesis)
- LoEEnabled () -> (if empty then 1 else 10, UpdateLoE <$> genLoEFragment))
+ , let freq = case loe of
+ LoEDisabled -> 0
+ LoEEnabled () -> if empty then 1 else 10
+ in (freq, UpdateLoE <$> genLoEFragment)
-- Iterators
, (if empty then 1 else 10, uncurry Stream <$> genBounds)
@@ -0,0 +1,3 @@ | |||
### Breaking | |||
|
|||
- ChainDB: allow to trigger chain selection synchronously |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we update the PR description accordingly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... and maybe provide a rationale on how this change relates to adding LoE to the ChainDB tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, updated the PR description. The reason for adding this is just testing (also see the first commit message).
Co-authored-by: Nicolas “Niols” Jeannerod <nicolas.jeannerod@tweag.io>
c0ecc23
to
1357c1b
Compare
Closes #541
This PR adds a new instruction to the ChainDB q-s-m test, which updates the Limit on Eagerness (LoE) fragment and retriggers chain selection. This requires a model implementation of the LoE.
As a preparatory change, we add a way to trigger chain selection synchronously (ie such that we can wait for it to finish), which is only used in tests.