Add epoch stakepool size indexer #850

Merged
merged 9 commits into from
Jan 17, 2023

Conversation

@eyeinsky eyeinsky commented Dec 5, 2022

PLT-171: marconi indexer: stake pool delegation of a particular epoch

The diff is easiest to review commit by commit, as each commit makes an isolated change.

Also added a longer commit message for the foldBlocks adaptation commit 440ff75.

Test with: cabal run marconi-test -- -p '/prop_epoch_stakepool_size/'

Run on testnet or mainnet:

CONFIG="..." # node config is required for folding over ledger states
SOCKET="..."
MAGIC='--mainnet' # or '--testnet-magic 2' (2 is preprod)
DB_PATH="..."
# rm -f "$DB_PATH"/* # might need to remove previous content on second run (glob must stay outside the quotes to expand)

cabal run marconi:exe:marconi -- \
      --disable-utxo --disable-datum --disable-script-tx \
      $MAGIC \
      --node-config-path "$CONFIG" \
      --socket-path "$SOCKET" \
      -d "$DB_PATH"

Depends on these PRs on cardano-node repo (same content, one is just for the 1.35 release branch):

Pre-submit checklist:

  • Branch
    • Tests are provided (if possible)
    • Commit sequence broadly makes sense
    • Key commits have useful messages
    • Formatting, PNG optimization, etc. are updated
    • Important changes are reflected in changelog.d of the affected packages
  • PR
    • Self-reviewed the diff
    • Useful pull request description
    • Reference the ADR in the PR and reference the PR in the ADR (if relevant)
    • Reviewer requested

@koslambrou koslambrou left a comment

My main issue with foldBlocks is that you can't resume. You always need to start from genesis whenever you stop and restart the indexer. I guess we want to work on this later?

Also, thanks for the tests :) Learned more about stake pool registration

eyeinsky commented Dec 6, 2022

Ran it on mainnet, preliminary info here https://input-output.atlassian.net/browse/PLT-171?focusedCommentId=136344

eyeinsky commented Dec 7, 2022

@raduom The comment on rollbackRingBuffer went outdated as it was removed, but the answer to why it was here in the first place is here #850 (comment)

eyeinsky commented Dec 7, 2022

> My main issue with foldBlocks is that you can't resume. You always need to start from genesis whenever you stop and restart the indexer. I guess we want to work on this later?

Here are some more elaborate ideas that could help with it:

  • library-ize cardano-node: make it a library that could be used to build an indexer. It would have an API that streams blocks and ledger states within the same process (= no double RAM use). I.e., cardano-node would be compiled into the indexer and expose an appropriate API.

    The node's library part would take care of networking, discovering and talking to other nodes, and getting in sync (essentially what the node running side by side with an indexer currently does). To address the security aspect mentioned in the other feature request ("The node is part of the trusted base of the system"), it shouldn't participate actively in the network; it would just be a passive listener. I think it would still need to have ledger state, as that is used to validate incoming blocks (transactions). If validation is not required, then perhaps ledger state could be done away with entirely.

  • Since cardano-node is already able to serialize ledger state somehow (because it can be restarted) then using cardano-node as a library would solve the restart issue.

Aside from the issue linked above, there has been a proposal by @erikd to pass pieces of ledger state over the chainsync protocol. But I think that having a protocol to pass info over a socket, in a situation where both a node and dbsync are typically running on the same machine, is just overhead. Just compile the node into the indexer, and the problems of double memory use, serializing and deserializing ledger state, coming up with a protocol for querying pieces of ledger state, etc. would all go away.

@koslambrou

You still have some conflicts

@eyeinsky

Rebased onto latest master and pushed.

These PRs in cardano-node still need to be merged:

The CI seems to fail for unrelated reasons there (as far as I can tell).

, base >=4.9 && <5
, base16-bytestring
, bytestring
, cardano-api
Collaborator

Aren't these IOG dependencies?

sqlite :: SQL.Connection -> S.Stream (S.Of Event) IO r -> S.Stream (S.Of Event) IO r
sqlite c source = do
  lift $ SQL.execute_ c
    "CREATE TABLE IF NOT EXISTS stakepool_delegation (poolId BLOB NOT NULL, lovelace INT NOT NULL, epochNo INT NOT NULL)"
Collaborator

I think we may need a primary key on this table, or some other indicator that makes a row unique besides the implicit SQLite rowid.
We don't drop the table; we create it if it doesn't exist and then start inserting into it, so it is possible to insert the same events multiple times, which I don't think is correct.
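
For illustration, here is a minimal sketch of one way to make repeated runs idempotent, using sqlite-simple against an in-memory database. The composite key `(epochNo, poolId)`, the `INSERT OR IGNORE` statement, and the helper name `countAfterDuplicateInsert` are assumptions made for this example; they are not taken from the PR, and the right uniqueness rule may differ.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.ByteString.Char8 as BS8
import Data.Int (Int64)
import qualified Database.SQLite.Simple as SQL

-- Same columns as the table in the PR, plus a composite primary key.
-- Assumption: one row per pool per epoch is the intended uniqueness rule.
createTable :: SQL.Query
createTable =
  "CREATE TABLE IF NOT EXISTS stakepool_delegation \
  \(poolId BLOB NOT NULL, lovelace INT NOT NULL, epochNo INT NOT NULL, \
  \PRIMARY KEY (epochNo, poolId))"

-- INSERT OR IGNORE silently skips rows that would violate the primary key.
insertRow :: SQL.Query
insertRow =
  "INSERT OR IGNORE INTO stakepool_delegation (poolId, lovelace, epochNo) \
  \VALUES (?, ?, ?)"

-- Hypothetical helper: insert the same event twice and count the rows.
countAfterDuplicateInsert :: IO Int
countAfterDuplicateInsert = SQL.withConnection ":memory:" $ \c -> do
  SQL.execute_ c createTable
  let row = (BS8.pack "pool-a", 1000000 :: Int64, 42 :: Int64)
  SQL.execute c insertRow row
  SQL.execute c insertRow row -- duplicate of the row above, ignored
  [SQL.Only n] <- SQL.query_ c "SELECT COUNT(*) FROM stakepool_delegation"
  pure n

main :: IO ()
main = countAfterDuplicateInsert >>= print
```

With a key like this, restarting the indexer from genesis would re-emit the same events without duplicating rows. Whether to ignore or overwrite on conflict (`INSERT OR REPLACE`) depends on whether a later run can legitimately produce a different value for the same epoch and pool.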

@eyeinsky eyeinsky merged commit 6990714 into main Jan 17, 2023
@eyeinsky eyeinsky deleted the ml/plt-171 branch January 17, 2023 16:02
koslambrou pushed a commit that referenced this pull request Apr 6, 2023
* Add suffixes to sqlite databases

* Add ledger state streaming, equivalent to foldBlocks defined in cardano-api

- Cardano.Streaming.Callbacks: Add primitive callback based versions
  of local chainsync to Cardano.Streaming. `blocksCallback` and
  `blocksCallbackPipelined` both take a callback when iterating over the
  received blocks from local chainsync connection

- Cardano.Streaming: versions built on the streaming package to stream
  blocks or ledger states. `foldLedgerState` does essentially what
  `foldBlocks` does in cardano-api:Cardano.Api.LedgerState, with the
  difference that it doesn't drop the connection once it has caught up
  with the node, but keeps listening.

* Add epoch stakepool size indexer

The `toEvents` and `getStakeMap` make up the core of the
implementation in Marconi.Index.EpochStakepoolSize.

* Refactor marconi test for reuse in epoch stakepool size indexer

* Add test for epoch stakepool size indexer

* Add changelog entry

* Disable test and add TODO

* Fix warnings

* Fix marconi.cabal