Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UTxO-HD targeting main #1267

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

UTxO-HD targeting main #1267

wants to merge 1 commit into from

Conversation

jasagredo
Copy link
Contributor

@jasagredo jasagredo commented Sep 26, 2024

Description

The changes from UTxO-HD span over ouroboros-consensus, ouroboros-consensus-diffusion and ouroboros-consensus-cardano. The core change is:

  • The UTxO set is extracted from the LedgerState in the form of LedgerTables.
  • These tables are stored in the LedgerDB, which can keep them in memory or on disk.
  • When performing an action that requires UTxOs, we have to ask the LedgerDB for those. This might perform IO.

Here I will explain how I would review this enormous PR. Instead of listing files I will describe concepts, and my suggestion is to go look at the mentioned files (or search for the concepts) then mark the file as viewed to offload it from the brain.

The ledger tables

  • The first step would be to understand the concept of LedgerTables, see Ouroboros.Consensus.Ledger.Tables.* modules. The LedgerTables are parametrized by l (in the end it will be by blk) and by mk (or MapKinds). MapKinds are just types parametrized by the Key and Value of l. These will be TxIn|TxOut for unitary blocks and CanonicalTxIn|HardForkTxOut for hard fork blocks.
  • LedgerTables are barbies-like, see Ouroboros.Consensus.Ledger.Tables.Combinators.
  • LedgerTables are (most commonly) empty (EmptyMK), a (possibly restricted) UTxO set (ValuesMK), a set of TxIns (KeysMK), a sequence of differences (DiffMK) or a combination of values + diffs (TrackingMK). The only non-obvious one is DiffMK which is a map of sequences of changes to a value (in the UTxO case values don't change, they are created and destroyed, so there will be at most 2 elements there). On top of that there is a DiffSeqMK which is a fingertree of differences. Only used in V1 (see below).
  • The LedgerState is itself parametrized by this same mk. The data instances will then make use of that mk to define tables associated with the block. So the byron ledger state ignores it, the shelley ledger state has a new field with the tables and the hard fork ledger state will propagate the mk through the telescope, therefore having an mk of the particular state in the Telescope.
  • The LedgerTables can live on their own, which for unitary blocks don't make a difference, but for the Cardano Block, we go from an mk passed to the Telescope (therefore tables at the tip of the Telescope) to CardanoLedgerTables, in which each value is a HardForkTxOut. This cost is non-trivial and we only want to pay it when applying a new block/transaction.
  • LedgerTables can be extracted and injected into the ledger state via (un)stowLedgerTables.
  • The ledger tables of the Extended ledger state are the same as the ones form the LedgerState.
  • A very important bit that maybe was not clear above is that the HardForkBlock has no canonical tables because our definitions are not compositional for the HF block, only the CardanoBlock has "hard fork tables". See the constraints of HasHardForkLedgerTables.

Applying and ticking (Ouroboros.Consensus.Ledger.Abstract/Basics)

When ticking a block, some differences might be created, and no values are needed. So the types go from l EmptyMK to Ticked1 l DiffMK. This is the case at least in two moments: when going from Byron to Shelley (all values are created here) and when going from Shelley to Allegra (avvm addresses are deleted). See the relevant functions: translateLedgerStateByronToShelley and translateLedgerStateShelleyToAllegra.

When applying a block, we get the inputs needed (getBlockKeySets then read those from the LedgerDB), tick the ledger state without tables (possibly creating diffs), apply those diffs on the values from the LedgerDB, then call the ledger rules. We then diff the input and output tables to get a set of differences from applying a block, to which we will prepend the ones from ticking. See applyBlockResult and the Shelley functions for applying blocks.

The story with transactions is pretty similar.

The LedgerDB versions (Ouroboros.Consensus.Storage.LedgerDB)

There are two flavors of the LedgerDB, each one having two implementations:

  • V1 (Ouroboros.Consensus.Storage.LedgerDB.V1): we keep a sequence of EmptyMK ledger states and dump the values into a BackingStore. We can get back values from the backing store at any ledger state, by opening a BackingStoreValueHandle and reading from it. The BackingStore consists of a "complete" UTxO set at some anchor and then a sequence of differences. To get values at a given point we have to read the anchor, then reapply the differences up to the desired point. This is "wasteful" if done in memory (why keep diffs and have to reapply them every time if we can just apply them in place?) but it is useful on the on-disk implementation which puts the "complete" UTxO set on the disk, offloading it from memory. There are two implementations:
    • OnDisk: It uses LMDB underneath. See the Ouroboros.Consensus.Storage.LedgerDB.V1.BackingStore.Impl.LMDB.* modules.
    • InMemory: Not intended for real use. As mentioned above it is wasteful. It serves as a reference impl for the OnDisk implementation.
  • V2 (Ouroboros.Consensus.Storage.LedgerDB.V2): We keep a sequence of StateRefs, which are EmptyMK ledger states together with a tables handle from which we can read values monadically. This is very similar to the previous LedgerDB, in which we kept a sequence of (complete) LedgerStates. There are two implementations:
    • InMemory
    • LSM: still a WIP

Evaluating forks

In order to evaluate forks, we created the concept of Forkers, where each LedgerDB implementation has their own concept. They are just an abstract interface that allows to query for values and push differences that eventually can be dumped back into the LedgerDB (only by ChainSelection, others use ReadOnlyForkers). Note that they allocate resources so there is some juggling with ResourceRegistries there.

Ledger queries (Ouroboros.Consensus.Ledger.Query)

Some queries will have to look at the UTxO set, in particular GetUtxoByAddress, GetUtxoWhole and GetUtxoByTxin. We categorize them by the means of QueryFootprint. We will process each one of them differently.

Other queries use QFNoTables, GetUtxoByTxIn uses QFLookupTables and will have to read a single value from the tables, and GetUtxoWhole and GetUtxoByAddress use QFTraverseTables as they will have to scan the whole UTxO set.

For the HardForkBlock there is another class Ouroboros.Consensus.HardFork.Combinator.Ledger.Query.BlockSupportsHFLedgerQuery which has faster implementations than projecting the tables into the particular tip of the Telescope, because we can usually judge whether we want the result without upgrading the TxOut to the latest era.

In essence, queries are now monadic. Queries that don't look at the UTxO set are artificially monadic (just a pure of the already existing logic).

The mempool

The mempool in essence will have to acquire (read only) forkers on the LedgerDB at the tip, then read values for the incoming transactions and apply them. The returned diffs are appended to the ones in the mempool, which keeps a TrackingMK with the current values and past diffs.

When revalidating transactions we cannot know if the UTxO set changed so we will have to re-read the values from the (new) forker.

The internal state is now a TMVar because we need to acquire >> read tables >> update where read tables is in IO and the others are in STM.

The snapshots

We now store snapshots in a new format:

  • V1-OnDisk: a copy of the lmdb database and a (Haskell-CBOR) serialization of the LedgerState.
  • V*-InMemory: a (Haskell-CBOR) serialization of the UTxO set and a (Haskell-CBOR) serialization of the LedgerState.

Note that for V2 we can take snapshots at any time of the immutable tip, but for v1 we have to take flush some differences from the BackingStore into the anchor to advance it to the immutable tip.

This is abstracted by either implementation in Ouroboros.Consensus.Storage.LedgerDB.V*...tryTakeSnapshot

The forging loop

The forging loop didn't change much. Each iteration runs with a resource registry (to allocate the forkers). Then we use the forker to provide values for the mempool snapshot acquisition, in case of a revalidation.

Changes in Byron/Shelley/Cardano

The changes here are mostly fulfilling everything that was described above, to make all the types match. There are some specific things which are interesting to look at because they might be non-trivial:

  • Translation functions (with the two examples I already mentioned)
  • The TxIn|TxOut data instances, the LedgerState data instance and the HasLedgerTables instances
  • applyBlock for shelley. The cardano one is just the HFC one, which injects the CardanoTables into the tip of the Telescope (here is where we do the costly step, but it usually won't be that costly because the UTxO set for a block is small).
  • The Cardano.Ledger module which defines the CardanoTxIn and CardanoTxOut.

Other changes

The rest of the changes are mainly just following GHC adjusting the types here and there. Most other code doesn't use tables so an abstract mk or EmptyMK is used to make the kind well-formed.

Copy link
Contributor Author

@jasagredo jasagredo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a pass over the non-testing libraries.

@jasagredo jasagredo marked this pull request as ready for review October 24, 2024 13:29
@jasagredo jasagredo changed the title WIP: UTxO-HD targeting main UTxO-HD targeting main Oct 24, 2024
Copy link
Contributor

@nfrisby nfrisby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the result of my first pass on the Ouroboros.Consensus.Ledger.Tables.* modules.

Copy link
Contributor

@nfrisby nfrisby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another round of comments. This is all of the *Hard* files, except for Query.hs.

@@ -753,9 +899,14 @@ translateLedgerStateBabbageToConwayWrapper =
-- we monkey-patch the governance state by ticking across the
-- era/epoch boundary using Babbage logic, and set the governance
-- state to the updated one /before/ translating.
--
-- NOTE we are ignoring the differences created by the ticking that
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe write "ignoring the DiffMK" instead of "ignoring the differences" to avoid alarm/confusion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is luckily gone in #1297, so rebasing will resolve this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rewrote it as Nick suggests, will deal with it when I do a rebase.

I'm leaving this thread open to see it again in the future.

type TelescopeWithTxList g f tx xs' xs =
Telescope g (Product (ListOfTxs tx xs') f) xs

matchPolyTxs' ::
Copy link
Contributor

@nfrisby nfrisby Oct 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implementation allocates a list of txs for each previous era.

I wonder if an approach similar to composeTxOutTranslations $ ipTranslateTxOut hardForkEraTranslation from Combinator/Ledger.hs might make the logic easier to follow and also avoid allocating extraneous lists.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps lets chat about this on a call, I'm not sure what you mean here.

-- translated to newer eras. This function fills that hole and allows us to
-- promote tables from one era into tables from the next era.
--
-- TODO(jdral): this is not optimal. If either 'translateTxInWith' or
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reifying a TODO about skipping traversals if the key/value translations are id

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might also consider combining the two traversals into a single traversal. And maybe also something about monotonicity.

Copy link
Contributor

@nfrisby nfrisby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All *Query*hs files, except:

  • ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/ChainDB/Impl/Query.hs
  • ouroboros-consensus/test/consensus-test/Test/Consensus/MiniProtocol/LocalStateQuery/Server.hs
  • ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Mempool/Query.hs

where
lcfg = configLedger cfg

answerBlockQueryTraverse
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some suspicious duplication with answerBlockQueryLookup right above.

Copy link
Contributor Author

@jasagredo jasagredo Nov 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, these are pretty similar but the footprint is concretized on each. I tried to unify them with extremely cryptic GHC errors. I will leave this open to revisit but maybe we have to live with this.

answerMonadicQueryVia ::
  forall m xs footprint result. ( MonadSTM m
  , All SingleEraBlock xs
  , HardForkHasLedgerTables xs
  , BlockSupportsHFLedgerQuery xs
  , CanHardFork xs
  )
  => (   forall result'. NP ExtLedgerCfg xs
      -> QueryIfCurrent xs footprint result'
      -> ReadOnlyForker' m (HardForkBlock xs)
      -> m result'
     )
  -> ExtLedgerCfg (HardForkBlock xs)
  -> BlockQuery (HardForkBlock xs) footprint result
  -> ReadOnlyForker' m (HardForkBlock xs)
  -> m result
answerMonadicQueryVia answerVia (ExtLedgerCfg cfg)
    qry
    forker = do
      st@(HardForkLedgerState hardForkState)  <- ledgerState <$> atomically (roforkerGetLedgerState forker)
      let ei   = State.epochInfoLedger lcfg hardForkState
          cfgs = hmap ExtLedgerCfg $ distribTopLevelConfig ei cfg
      case qry of
        QueryIfCurrent (queryIfCurrent :: QueryIfCurrent xs footprint result') ->
          answerVia
                cfgs
                queryIfCurrent
                forker
        -- We only call this with QFLookupTables or QFTraverseTables, so these
        -- two matches below are effectively dead.
        QueryAnytime queryAnytime (EraIndex era) ->
          pure $ interpretQueryAnytime
            lcfg
            queryAnytime
            (EraIndex era)
            hardForkState
        QueryHardFork queryHardFork ->
          pure $ interpretQueryHardFork
            lcfg
            queryHardFork
            st
    where
      lcfg = configLedger cfg
src\ouroboros-consensus\Ouroboros\Consensus\HardFork\Combinator\Ledger\Query.hs:262:17: error: [GHC-25897]
    • Couldn't match type ‘result1’
                     with ‘Either (MismatchEraInfo xs) result1’
      Expected: QueryIfCurrent xs footprint result
        Actual: QueryIfCurrent xs footprint result1
      ‘result1’ is a rigid type variable bound by
        a pattern with constructor:
          QueryIfCurrent :: forall (xs :: [*]) (footprint :: QueryFootprint)
                                   result.
Failed, 215 modules loaded.
                            QueryIfCurrent xs footprint result
                            -> BlockQuery
                                 (HardForkBlock xs) footprint (HardForkQueryResult xs result),
        in a case alternative
        at src\ouroboros-consensus\Ouroboros\Consensus\HardFork\Combinator\Ledger\Query.hs:259:9-78
    • In the second argument of ‘answerVia’, namely ‘queryIfCurrent’
      In the expression: answerVia cfgs queryIfCurrent forker
      In a case alternative:
          QueryIfCurrent (queryIfCurrent :: QueryIfCurrent xs footprint result')
            -> answerVia cfgs queryIfCurrent forker
    • Relevant bindings include
        queryIfCurrent :: QueryIfCurrent xs footprint result1
          (bound at src\ouroboros-consensus\Ouroboros\Consensus\HardFork\Combinator\Ledger\Query.hs:259:25)
        cfgs :: NP ExtLedgerCfg xs
          (bound at src\ouroboros-consensus\Ouroboros\Consensus\HardFork\Combinator\Ledger\Query.hs:257:11)
        hardForkState :: State.HardForkState (Flip LedgerState EmptyMK) xs
          (bound at src\ouroboros-consensus\Ouroboros\Consensus\HardFork\Combinator\Ledger\Query.hs:255:31)
        st :: LedgerState (HardForkBlock xs) EmptyMK
          (bound at src\ouroboros-consensus\Ouroboros\Consensus\HardFork\Combinator\Ledger\Query.hs:255:7)
        lcfg :: LedgerConfig (HardForkBlock xs)
          (bound at src\ouroboros-consensus\Ouroboros\Consensus\HardFork\Combinator\Ledger\Query.hs:278:7)
        forker :: ReadOnlyForker' m (HardForkBlock xs)
          (bound at src\ouroboros-consensus\Ouroboros\Consensus\HardFork\Combinator\Ledger\Query.hs:254:5)
        qry :: BlockQuery (HardForkBlock xs) footprint result
          (bound at src\ouroboros-consensus\Ouroboros\Consensus\HardFork\Combinator\Ledger\Query.hs:253:5)
        (Some bindings suppressed; use -fmax-relevant-binds=N or -fno-max-relevant-binds)
    |
262 |                 queryIfCurrent
    |                 ^^^^^^^^^^^^^^

Perhaps I just need to stare at it a bit more...

Copy link
Contributor

@nfrisby nfrisby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*LedgerDB* files, except I stopped when I got to the LMDB impl. I'll pick up there tomorrow.

Copy link
Contributor

@nfrisby nfrisby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My previous review was the *LedgerDB* files up to but excluding LMDB.

This review picks up there and stops before V2.

Copy link
Contributor

@nfrisby nfrisby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the LedgerDB*V2 files

) . mkFsPath . (:[])) dirs

-- | Testing only! Truncate all snapshots in the DB.
implIntTruncateSnapshots :: MonadThrow m => SomeHasFS m -> m ()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could share a control HOF with destroySnapshots.

-> HandleArgs
-> (LedgerDB m l blk, TestInternals m l blk)
implMkLedgerDb h bss = (LedgerDB {
getVolatileTip = getEnvSTM h implGetVolatileTip
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of these impl* functions share a worrying amount of code with the V1 impl.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but they use different types. Perhaps we should hide this in some typeclass? We will eventually delete V1 so I'm unsure how worth this is.

Copy link
Contributor

@nfrisby nfrisby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the LedgerDB files after V2, ie the tests.

, DbChangelog.Unit.tests
, DbChangelog.QuickCheck.tests
]
, SnapshotPolicy.tests
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which of these tests V2?


tests :: TestTree
tests = testGroup "DbChangelog"
[ testProperty "flushing" $ verboseShrinking $ withMaxSuccess samples $ conjoin
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest removing verboseShrinking

, dbChangelogStartsAt = slotNo
}

-- TODO: Shrinking might not be optimal. Shrinking finds the shortest prefix of the list of
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reifying a TODO

(toFlush, toKeep) = DbChangelog.splitForFlushing dblog
toFlushTip = maybe undefined DbChangelog.toFlushSlot toFlush
toKeepTip = DbChangelog.immutableTipSlot $ anchorlessChangelog toKeep
LedgerTables (SeqDiffMK toKeepDiffs) = DbChangelog.adcDiffs $ anchorlessChangelog toKeep
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alignment

dblog = resultingDbChangelog setup
dblog' = DbChangelog.onChangelog (DbChangelog.prune sp) dblog

-- | The prefixBackToAnchor function rolls back all volatile states.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stale haddock: There is no prefixBackToAnchor function.


type Key = String

data GenOperationsState = GenOperationsState {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please comment these fields, at least the pending and consumed ones.

@@ -753,9 +899,14 @@ translateLedgerStateBabbageToConwayWrapper =
-- we monkey-patch the governance state by ticking across the
-- era/epoch boundary using Babbage logic, and set the governance
-- state to the updated one /before/ translating.
--
-- NOTE we are ignoring the differences created by the ticking that
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is luckily gone in #1297, so rebasing will resolve this

Comment on lines +433 to +438
queryLedgerGetTraversingFilter idx@IZ = \case
GetUTxOByAddress addrs ->
filterGetUTxOByAddressOne addrs
GetUTxOWhole ->
const True
GetCBOR q' -> queryLedgerGetTraversingFilter idx q'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this could also be shared with the Cardano block and the two-era Shelley blocks.

-> LedgerTables (LedgerState (HardForkBlock xs)) mk
injectLedgerTables idx =
LedgerTables
. mapKeysMK injTxIn
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worthwhile to have a combined operation that maps both keys and values in one pass?

Makes sense, though I'm not sure Data.Map exposes anything to perform the mapping of both keys and values at the same time. So we'd probably end up defining

mapKeysValuesMK fkeys fvalues = Map.mapKeys fkeys . Map.map fvalues

The above is probably more susceptible to optimisations than mapKeysMK . mapMK, so that might be an argument in favour of mapKeysValuesMK

If so, is it also worthwhile to require the key mapping is monotone?

This would rely on injectCanonicalTxIn and distribCanonicalTxIn being monotonic. I suppose they are monotonic in practice, but if we want to use a monotonic key mapping here then we should make it a requirement on the HasCanonicalTxIn class

defaultQueryBatchSize :: QueryBatchSize -> Word64
defaultQueryBatchSize requestedQueryBatchSize = case requestedQueryBatchSize of
RequestedQueryBatchSize value -> value
DefaultQueryBatchSize -> 100_000
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that I know of


transaction =
ltzipWith3A
(rangeRead rqCount)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by more flexible?

V1InMem -> LedgerDB.LedgerDbFlavorArgsV1
(LedgerDB.V1.V1Args LedgerDB.V1.DisableFlushing LedgerDB.V1.DisableQuerySize LedgerDB.V1.InMemoryBackingStoreArgs)
V1LMDB -> LedgerDB.LedgerDbFlavorArgsV1
(LedgerDB.V1.V1Args LedgerDB.V1.DisableFlushing LedgerDB.V1.DisableQuerySize (LedgerDB.V1.LMDBBackingStoreArgs (BS.LiveLMDBFS (shfs (ChainDB.RelativeMountPoint "lmdb"))) defaultLMDBLimits Dict.Dict))
Copy link
Member

@amesgen amesgen Nov 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could see that one would want to adapt the LMDB live dir to a non-default subdir, especially when analyzing an existing ChainDB. (Doesn't necessarily have to be implemented here, can also create a follow-up issue with potential improvements/polishing.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is in fact for the chaindb of which we get blocks from. Not for the current live chaindb.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah no I'm wrong. It actually points to the LedgerDB in which we will perform operations.

Comment on lines +396 to +407
oldLedger <- IOLike.atomically $ LedgerDB.getVolatileTip initLedgerDB
frk <- LedgerDB.getForkerAtWellKnownPoint initLedgerDB registry VolatileTip
tbs <- LedgerDB.forkerReadTables frk (getBlockKeySets blk)
LedgerDB.forkerClose frk
case runExcept $ tickThenXApply ledgerCfg blk (oldLedger `withLedgerTables` tbs) of
Right newLedger -> do
when (blockSlot blk >= slotNo) $ storeLedgerState newLedger
when (blockSlot blk > slotNo) $ issueWarning blk
when ((unBlockNo $ blockNo blk) `mod` 1000 == 0) $ reportProgress blk
return (continue blk, newLedger)
LedgerDB.reapplyThenPushNOW internal blk
LedgerDB.tryFlush initLedgerDB
return (continue blk, ())
Copy link
Member

@amesgen amesgen Nov 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we have a duplicate reapply here, right? Could we avoid this, for example by creating a forker and pushing + committing the diff from newLedger via the forker?

Same point about the occurrences of reapplyThenPushNOW further below (maybe apart from reproMempoolForge)

Copy link
Contributor

@nfrisby nfrisby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is every *Mempool* file, except ouroboros-consensus/test/consensus-test/Test/Consensus/Mempool/StateMachine.hs (it's new, big, and involved, so I'll review it separately).

fullDiff :: DiffMK Token ()
fullDiff = DiffMK $ consumedDiff <> producedDiff

getPayloadKeySets tx = LedgerTables $ KeysMK $ consumed <> produced
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we pull the produced keys as well? So that we can check whether they already exist?

If so, that is not immediately obvious, so I think deserves a comment on the definition of the getTransactionKeySets method in SupportsMempool.hs and getBlockKeySets in Abstract.hs.


OK, maybe it's just a bug here. Because the upstream definition in Ledger doesn't obviously do it:

-- | The validity of any individual block depends only on a subset
-- of the UTxO stored in the ledger state. This function returns
-- the transaction inputs corresponding to the required UTxO for a
-- given Block.
--
-- This function will be used by the consensus layer to enable storing
-- the UTxO on disk. In particular, given a block, the consensus layer
-- will use 'neededTxInsForBlock' to retrieve the needed UTxO from disk
-- and present only those to the ledger.
neededTxInsForBlock ::
  forall h era.
  EraSegWits era =>
  Block h era ->
  Set (TxIn (EraCrypto era))
neededTxInsForBlock (Block' _ txsSeq _) = Set.filter isNotNewInput allTxIns
  where
    txBodies = map (^. bodyTxL) $ toList $ fromTxSeq txsSeq
    allTxIns = Set.unions $ map (^. allInputsTxBodyF) txBodies
    newTxIds = Set.fromList $ map txIdTxBody txBodies
    isNotNewInput (TxIn txID _) = txID `Set.notMember` newTxIds

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it's just a bug.

This bug may be indicative of a general weakness in our code. getTransactionKeySets and getBlockKeySets can declare more txins than strictly necessary for block/tx (re-)application, which won't make our test suites fail, but it does mean that there will be more disk traffic than strictly necessary. The reverse (fewer txins than necessary) will probably make our tests fail. Still, I'd suggest we put some tests in place that check whether getTransactionKeySets and getBlockKeySets declare precisely those txins that are required, no more and no less


-- Ad hoc values to replace default ChainDB configurations
, srnSnapshotPolicyArgs :: SnapshotPolicyArgs
, srnLdbFlavorArgs :: Complete LedgerDbFlavorArgs m -- TODO this will contain a fs?? it should probably not as the node doesn't know about those
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reifying TODO

Copy link
Contributor

@nfrisby nfrisby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rest of the files I had claimed. So I only need to review the new Mempool Statemachine tests now.

@@ -0,0 +1,3 @@
# Overview

TODO
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reifying a TODO

Copy link
Contributor

@nfrisby nfrisby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed the utxo-hd.md file

unstowLedgerTables :: l EmptyMK -> l ValuesMK
```

> ⚠️ It is very important to note that `EmptyMK` just means that _the ledger tables are empty_. This says nothing about whether there are values in the `NewEpochState`'s UTXO set. In the Consensus layer we take much care to ensure that the combination of `EmptyMK` having values in the internal UTXO set only happens at the Ledger layer boundary (via `stowLedgerTables`). Any other instance of `l EmptyMK` will mean that there are no values in the tables nor in the `NewEpochState`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we intend/hope for the Ledger codebase to ever add an mk to its types, so that this warning wouldn't be necessary? Perhaps an upstream Issue for that should be linked here?

Copy link
Contributor

@jorisdral jorisdral Nov 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not something I've considered much, though I'm not sure it would be worth it for the ledger to include it. It would be a pretty pervasive change, and it's arguably a consensus implementation detail that we manipulate the UTXO set inside the ledger state

I think we currently have a reasonably safe API that prevent misuse. It could possibly be improved using moar types, e.g.:

  • Introducing something like StowedMK / UnstowedMK instead of just EmptyMK
  • Be more restrictive about when it is okay to extract a NewEpochState from a consensus LedgerState. For example, it's only okay to extract a NewEpochState when the consensus ledger state has type LedgerState _ StowedMK

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably this should be StowedMK / EmptyMK instead. EmptyMK meaning that the Ledger must have an empty UTxO map inside, StowedMK means that there are values in the UTxO set inside.

@@ -195,7 +196,7 @@ oracularLedgerDB p =
}
, Extended.ledgerState = TB.TestLedger {
TB.lastAppliedPoint = p
, TB.payloadDependentState = ()
, TB.payloadDependentState = TB.EmptyPLDS
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I can tell, this benchmark was used to compare performance before and after changes made in #823. For comparative benchmarks, it does not matter much whether we use trivially empty tables or not, and it makes sense to use trivially empty tables here here since it's the smallest change to make here. If this benchmark were to be used in the future as an "absolute" measure of performance, then we should reconsider using actual tables. Though arguably we should use Cardano blocks instead of TestBlock if realistic measurements were to be the goal.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking along the same lines: maybe the goal of this benchmark should be specified somewhere. We can do that in a direct PR to main instead of in this PR

fullDiff :: DiffMK Token ()
fullDiff = DiffMK $ consumedDiff <> producedDiff

getPayloadKeySets tx = LedgerTables $ KeysMK $ consumed <> produced
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it's just a bug.

This bug may be indicative of a general weakness in our code. getTransactionKeySets and getBlockKeySets can declare more txins than strictly necessary for block/tx (re-)application, which won't make our test suites fail, but it does mean that there will be more disk traffic than strictly necessary. The reverse (fewer txins than necessary) will probably make our tests fail. Still, I'd suggest we put some tests in place that check whether getTransactionKeySets and getBlockKeySets declare precisely those txins that are required, no more and no less

@jasagredo jasagredo self-assigned this Nov 18, 2024
@@ -255,7 +256,7 @@ fromChain cfg initState chain =
anchorSnapshot NE.:| snapshots =
fmap (mkHeaderStateWithTime (configLedger cfg))
. NE.scanl
(flip (tickThenReapply (ExtLedgerCfg cfg)))
(\st blk -> applyDiffs st $ tickThenReapply (ExtLedgerCfg cfg) blk st)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it occurs rather often in testing code: What about having a combinator for \st -> applyDiffs st $ tickApplyLike st?

Comment on lines +112 to +116
instance (SimpleCrypto c, Typeable ext)
=> Arbitrary (LedgerState (SimpleBlock c ext) EmptyMK) where
arbitrary =
forgetLedgerTables
<$> arbitrary @(LedgerState (SimpleBlock c ext) ValuesMK)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this instance generate:

  • A ledger state with no ledger tables, without stowed UTXOs
  • A ledger state with no ledger tables, with stowed UTXOs

Right now it's generating the former. My guess is that the latter would be more interesting

Maybe this ambiguity is an argument in favour of having two types of EmptyMK: StowedEmptyMK and UnstowedEmptyMK, or something along those lines

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I replaced forgetLedgerTables with with stowLedgerTables, and the mock-test suite passes fine 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it is worth concretizing this more until we introduce StowedMK and UnstowedMK.

I would favor names that explain not what happened with the data but instead what data is there, so something like NothingMK and LedgerMK

| otherwise
-> futureCheckCandidate chainSelEnv validatedChainDiff >>= \case
Left chainDiff'
| Diff.rollbackExceedsSuffix chainDiff'
-> return InsufficientSuffix
-> cleanup validatedChainDiff >> return InsufficientSuffix
| otherwise
-> return $ ValidPrefix chainDiff'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we need to cleanup validatedChainDiff here, just as in the ValidPrefix case below?

(However, this code will be removed with #1269 anyways.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets ignore it for now, waiting for a rebase

Comment on lines 185 to 186
-- TODO(js_ldb): reenable
-- GetLedgerDB
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there already an issue for this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jorisdral and the comment below

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, unfortunately changing the tests to work with the new LedgerDB went under the radar :P

When designing the LedgerDB API, the getLedgerDB function was removed from the ChainDB API, because LedgerDB was no longer a pure datastructure. It was replaced by more concrete functions, amongst which: getCurrentLedger, getImmutableLedger, getPastLedger, getHeaderStateHistory, getReadOnlyForkerAtPoint, getLedgerTablesAtFor, getStatistics.

Options I can think of for how to restore testing:

  • Test (a subset of) the concrete functions by creating a command for each.
  • Don't test the concrete functions, but use TestInternals to output some pure information about the LedgerDB. Maybe we could reconstruct a pure sequence of ledger states from the concrete LedgerDB impls (reading tables from disk too), and compare those against the model?

And no, there is no issue for it, but there probably should be

Comment on lines 466 to 491
-- | When the model is asked for the ledger DB, it reconstructs it by applying
-- the blocks in the current chain, starting from the initial ledger state.
-- Before the introduction of UTxO HD, this approach resulted in a ledger DB
-- equivalent to the one maintained by the SUT. However, after UTxO HD, this is
-- no longer the case since the ledger DB can be altered as the result of taking
-- snapshots or opening the ledger DB (for instance when we process the
-- 'WipeVolatileDB' command). Taking snapshots or opening the ledger DB cause
-- the ledger DB to be flushed, which modifies its sequence of volatile and
-- immutable states.
--
-- The model does not have information about when the flushes occur and it
-- cannot infer that information in a reliable way since this depends on the low
-- level details of operations such as opening the ledger DB. Therefore, we
-- assume that the 'GetLedgerDB' command should return a flushed ledger DB, and
-- we use this function to implement such command both in the SUT and in the
-- model.
--
-- When we compare the SUT and model's ledger DBs, by flushing we are not
-- comparing the immutable parts of the SUT and model's ledger DBs. However,
-- this was already the case in before the introduction of UTxO HD: if the
-- current chain contained more than K blocks, then the ledger states before the
-- immutable tip were not compared by the 'GetLedgerDB' command.
-- flush ::
-- (LedgerSupportsProtocol blk)
-- => DbChangelog.DbChangelog' blk -> DbChangelog.DbChangelog' blk
-- flush = snd . DbChangelog.splitForFlushing
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this comment actually apply ATM given that the GetLedgerDB command is disabled?

Copy link
Contributor

@jorisdral jorisdral Nov 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does not apply currently, but it might if we decide to re-enable LedgerDB commands. Depending on which commands we include and how, we might still run into the problem that the V1.LedgerDB has a different set of states than the model has. However, I'd argue that the proper way to handle this discrepancy between the model and the real impl is to change the Eq instance for the result of flushing. We do a similar thing with IsValidResult

Comment on lines +112 to +116
instance (SimpleCrypto c, Typeable ext)
=> Arbitrary (LedgerState (SimpleBlock c ext) EmptyMK) where
arbitrary =
forgetLedgerTables
<$> arbitrary @(LedgerState (SimpleBlock c ext) ValuesMK)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I replaced forgetLedgerTables with with stowLedgerTables, and the mock-test suite passes fine 👍

Copy link
Member

@amesgen amesgen Nov 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes to this file are superfluous

main = runTests `race_` heartbeat
where
runTests = defaultMainWithTestEnv defaultTestEnvConfig tests
heartbeat = forever $ threadDelay (30 * 1_000_000) >> putChar '.' >> hFlush stdout
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is taking so long that we need this? (I suppose that it's for Hydra?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hydra was complaining because there was too much silence. Not sure if this will still be the case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This module seems to be LedgerDB.V1 specific, by the way, so maybe we should move it into there.

Comment on lines +197 to +211
{-------------------------------------------------------------------------------
From-to anti-diffs
-------------------------------------------------------------------------------}

fromAntiDiff :: Anti.Diff k v -> Diff k v
fromAntiDiff (Anti.Diff d) = Diff (Map.map (f . Anti.last) d)
where
f (Anti.Insert v) = Insert v
f Anti.Delete{} = Delete

toAntiDiff :: Diff k v -> Anti.Diff k v
toAntiDiff (Diff d) = Anti.Diff (Map.map f d)
where
f (Insert v) = Anti.singletonInsert v
f Delete = Anti.singletonDelete
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to my previous comment, but we can probably move these into DiffSeq as well, because these functions are only used in conjunction with a DiffSeq

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: 🏗 In progress
Development

Successfully merging this pull request may close these issues.

4 participants