-
Notifications
You must be signed in to change notification settings - Fork 217
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change applyBlocks
to also accept BlockSummary
#3172
Conversation
* Construct `ChainEvents` from `NonEmpty Block`. * Test that `discoverFromBlockData` yields the same address discovery state on `List` and on `Summary`, at least for the mock type `WalletState`.
…to work with `BlockEvents` instead of `Block`.
`Sublist` = a sublist of a list, with an optimized case for the full list.
1d892d1
to
278b9a8
Compare
bors try |
tryBuild failed: |
-- The main purpose of this data type is optimization: | ||
-- When processing whole 'Block', we want to avoid copying | ||
-- and redecorating the entire list of transactions in that 'Block'; | ||
-- instead, we want to copy a pointer to this list. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you mean the intent: when All
then we process all content, then Some
then only those pointed with ix (being map fst content), right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, though the focus is on the producer here: fromEntireBlock
uses All
and avoids redecorating. An external blockchain source will return Some
with specific indices.
nullBlockEvents BlockEvents{transactions,delegations} | ||
= null transactions && null delegations | ||
|
||
-- | Merge two 'Sublist' assuming that they are sublists of the /same/ list. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, when sublists of the same list are considered then this is perfectly true. The question is can we outlaw calling this function with sublists of different lists? Maybe it would be worth adding some invariant avalilable for export here checking this? Possible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wondered the same thing, but I don't see how to do that in this context: We would need a run-time proof that the transaction data returned by an external blockchain data source indeed belongs to the same block; but that is out of reach and it's easier to postulate it using unsafeMkSublist
.
In other contexts, one could enforce uniqueness by using an ST-like existential type parameter:
wholeList :: forall a. [a] -> exists s. Sublist s a
filterSublist :: forall s a. (a -> Bool) -> Sublist s a -> Sublist s a
mergeSublist :: forall s a. Sublist s a -> Sublist s a -> Sublist s a
In other words, the type parameter s
would be a representing of the list that the sublist is a sublist of. But we'd be quickly walking into dependent types territory here, and this does not solve the issue in our context where the whole list s
is only known on a remote machine.
As Sublist
is a small gimmick used for optimization only, I opted to keep it simple and "unsafe" as it is.
mergeSublist :: Sublist a -> Sublist a -> Sublist a | ||
mergeSublist (All xs) _ = All xs -- result cannot be larger | ||
mergeSublist _ (All ys) = All ys | ||
mergeSublist (Some xs) (Some ys) = Some $ mergeOn fst const xs ys | ||
|
||
-- | Merge block events that belong to the same block. | ||
mergeSameBlock :: BlockEvents -> BlockEvents -> BlockEvents |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, here it is apparent that sublist are from the same list
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I'm not too happy about exporting mergeSublist
in the first place, but the export is necessary for testing.
, transactions = [] | ||
, delegations = [] | ||
} | ||
fromFiltered (fblock:_) = fblock |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we are sure fblock is NonEmpty, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm? fblock
is a single block. As for the list, I think it's indeed true that the argument to fromFiltered
will be nonempty when applied to the result of the call to applyBlockData
, but that is not obvious 🤔 — that is why I have opted to cover the case of []
regardless.
pure (filteredBlocks, (dw, Wallet u1 tip1 s1)) | ||
|
||
-- | Strict variant of 'mapAccumL'. | ||
mapAccumL' :: (s -> a -> (o,s)) -> s -> [a] -> ([o],s) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when I see something like this I always would like to make sure that even irrespective of bang patterns applied we have no thunk build up. I wonder if we should not start using nothunk
- some reading for inspiration https://well-typed.com/blog/2020/09/nothunks/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably. In other parts of our code, nothunks
is used, but not here yet. I do want to investigate the space usage of Wallet
and WalletState
more systematically at some point, but not right now — things are working out ok at the moment, so I figured that I could away with just making sure that I don't accidentally make things worse for the time being.
describe "Sublist" $ do | ||
it "merging is idempotent" $ | ||
property prop_idempotent | ||
it "merging has whole list as neutral element" $ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we have I=All
and A=Some
then mergeSublist I A = I
is I
neutral element really?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, you're right, the nomenclature "neutral element" is not correct. 🙈 The correct name is "absorbing element".
, (===) | ||
) | ||
|
||
import Data.Map.Strict as Map | ||
|
||
spec :: Spec | ||
spec = do | ||
describe "Sublist" $ do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also distinct the following properties:
- for any two sublists of the same list the result of merge is sublist that contains both sublists, is union sublist.
- merging is commutative
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok! There's not much that can go wrong in this code, but these laws are aesthetically pleasing. 😊
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was about to write that property 2 follows from property 1 because set union is commutative, but property 2 is slightly stronger due to the order of elements. 🤔
{------------------------------------------------------------------------------- | ||
Generators | ||
-------------------------------------------------------------------------------} | ||
genSublist :: [a] -> Gen (Sublist a) | ||
genSublist xs = unsafeMkSublist <$> sublistOf (zip [0..] xs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sublistOf
is from Quickcheck, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
discoverState (Summary discover summary) s | ||
=== discoverState (List blocks) s | ||
where | ||
-- TODO: Test sequence of blocks with more than one block. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, would be great to expand this property in this direction
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ooh, nice, that's an older comment. In another pull request, I had started setting up this ApplyBlocks
type for generating lists of blocks; I'll plug that in now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, the step should not be invasive and impact negatively the current block processing. I like Sublist abstraction and intention behind it that utilizes lazyness, and look inside the list contents selectively (if I understood intent correctly). Left some pointers/questions.
Thanks Pawel! 😊
which copies a pointer to the list of
deconstructs the list of transactions, decorates each list element with its index, and constructs a new list — this is way more expensive, and since the transaction processing is part of a tight loop, of relevance. |
Co-authored-by: paweljakubas <pawel.jakubas@iohk.io>
Started benchmarks against this branch -> https://buildkite.com/input-output-hk/cardano-wallet-nightly/builds/1289 |
Benchmarks look entirely neutral. 👍 |
bors merge |
3172: Change `applyBlocks` to also accept `BlockSummary` r=HeinrichApfelmus a=HeinrichApfelmus ### Issue number ADP-1422, ADP-1428 ### Overview [Light-mode][] (Epic ADP-1422) aims to make synchronisation to the blockchain faster by trusting an off-chain source of aggregated blockchain data. [light-mode]: https://input-output-hk.github.io/cardano-wallet/design/specs/light-mode In this pull request, we change the `applyBlocks` function to work with both a list of `Block` and a `BlockSummary`. Specifically, we introduce a function `applyBlockData` which performs the following steps: * `discoverFromBlockData`: discover addresses and and transaction data (`ChainEvents`) contained in the sequence of blocks * `applyBlockEventsToUTxO`: update the wallet UTxO from the given `ChainEvents`. ### Details * Property tests for `discoverFromBlockData` check that address discovery gives the same result for both a `List` of blocks and a `Summary`. * When processing a `List` of `Block`, we have to be a bit careful about performance, because we need to traverse ~ 32M transactions (as of Feb 2022). In this case, the `transactions` contained in `BlockEvents` are essentially a full copy of the `transactions` in `Block`. The best way to copy data is to copy a single reference to the data; for this purpose we introduce a `Sublist` type which avoids deconstructing and reconstructing the transaction list in the case where we do not filter transactions (`All`). * In order to preserve the current checkpointing mechanism, `applyBlocks` actually calls `applyBlockData` repeatedly instead of passing the full list of blocks. I will need to rethink the checkpointing mechanism in the future — the main issue is that in its current form, a consumer of `BlockSummary` cannot ask for a specific block height in order to set a suitable checkpoint there. In the future, the checkpointing logic will need to mediate between producer and consumer. ### Comments * I expect a slight regression in benchmarks, due to overhead involved in calling `applyBlockData` repeatedly. However, the introduction of `Sublist` should prevent a serious deterioration. Co-authored-by: Heinrich Apfelmus <heinrich.apfelmus@iohk.io>
Build failed: |
bors retry |
Build succeeded: |
Issue number
ADP-1422, ADP-1428
Overview
Light-mode (Epic ADP-1422) aims to make synchronisation to the blockchain faster by trusting an off-chain source of aggregated blockchain data.
In this pull request, we change the
applyBlocks
function to work with both a list ofBlock
and aBlockSummary
. Specifically, we introduce a functionapplyBlockData
which performs the following steps:discoverFromBlockData
: discover addresses and and transaction data (ChainEvents
) contained in the sequence of blocksapplyBlockEventsToUTxO
: update the wallet UTxO from the givenChainEvents
.Details
discoverFromBlockData
check that address discovery gives the same result for both aList
of blocks and aSummary
.List
ofBlock
, we have to be a bit careful about performance, because we need to traverse ~ 32M transactions (as of Feb 2022). In this case, thetransactions
contained inBlockEvents
are essentially a full copy of thetransactions
inBlock
. The best way to copy data is to copy a single reference to the data; for this purpose we introduce aSublist
type which avoids deconstructing and reconstructing the transaction list in the case where we do not filter transactions (All
).applyBlocks
actually callsapplyBlockData
repeatedly instead of passing the full list of blocks. I will need to rethink the checkpointing mechanism in the future — the main issue is that in its current form, a consumer ofBlockSummary
cannot ask for a specific block height in order to set a suitable checkpoint there. In the future, the checkpointing logic will need to mediate between producer and consumer.Comments
applyBlockData
repeatedly. However, the introduction ofSublist
should prevent a serious deterioration.