[FR] - Query utxo indexer service #4678

disassembler · 2022-11-29T17:54:03Z

Internal/External
Internal if an IOHK staff member.

Area
Other Any other topic (Delegation, Ranking, ...).

Describe the feature you'd like
Do not use ledger state for query utxo.

The utxo on-disk work will make querying utxo from ledger state really slow (it's already really slow and getting slower as utxo size increases). What we need is a separate service that talks to the node socket to keep track of all utxos (or maybe a subset of interesting ones that the user defines). What we want to do is provide a similar interface, e.g.

cardano-cli query utxo --testnet-magic 1 --address addr_test1vzpwq95z3xyum8vqndgdd9mdnmafh3djcxnc6jemlgdmswcve6tkw

But it should not use the ledger state queries to get this information. Essentially, this will need to do something like what foldBlocks or marconi does: index state transitions on the chain from genesis, and keep that index separate from the node itself.
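
As a rough illustration of the indexing idea above, here is a minimal in-memory sketch in Python. It is not how foldBlocks or marconi are implemented; the `Tx`, `TxOut` and `UtxoIndex` names are hypothetical, and rollbacks, multi-asset values and persistence are all ignored. It only shows the state transition: each transaction removes the outputs it spends and adds the outputs it creates, with a secondary map from address to unspent outputs.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TxOut:
    address: str
    lovelace: int


@dataclass
class Tx:
    tx_id: str
    inputs: list   # (tx_id, index) references to the outputs being spent
    outputs: list  # TxOut values created by this transaction


@dataclass
class UtxoIndex:
    # (tx_id, index) -> TxOut for every currently unspent output
    utxos: dict = field(default_factory=dict)
    # address -> set of (tx_id, index) references, so queries avoid a full scan
    by_address: dict = field(default_factory=lambda: defaultdict(set))

    def apply_tx(self, tx):
        # Spent outputs leave the index...
        for ref in tx.inputs:
            spent = self.utxos.pop(ref, None)
            if spent is not None:
                self.by_address[spent.address].discard(ref)
        # ...and newly created outputs enter it.
        for ix, out in enumerate(tx.outputs):
            ref = (tx.tx_id, ix)
            self.utxos[ref] = out
            self.by_address[out.address].add(ref)

    def apply_block(self, txs):
        for tx in txs:
            self.apply_tx(tx)

    def query(self, address):
        refs = sorted(self.by_address.get(address, ()))
        return {ref: self.utxos[ref] for ref in refs}
```

Feeding every block from genesis through `apply_block` leaves `query(address)` answering from the index alone, without touching the ledger state.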

Describe alternatives you've considered

Slow querying to disk
A boolean in the config to make the utxo remain in memory

Additional context / screenshots
Add any other context or screenshots about the feature request here.

From @dcoutts

@abailly-iohk yes, the only way to make it fast is to use an index. The original feature was a quick convenient hack that got totally out of hand. (My bad.) And yes, sqlite would be perfectly good as an implementation of such an index, and the right place to do that is in an indexing client, and not within the node process itself.

One of our general architectural principles for the node and its external clients is that we make all chain information available to clients, but we do not store information in the node (and provide client access) that the node does not itself need. The node is part of the trusted base of the system. It is complex and has strong security and performance requirements. It is the wrong place to build database features. External clients are the perfect place to build database features.

The right solution is a client that maintains an index and provides query access. There are then multiple choices: provide a minimal one that just replicates what the CLI provided before (via the node), or use some existing more general purpose indexer, or some combo to satisfy different use cases.


abailly-iohk · 2022-11-30T16:28:40Z

There already exist quite a few indexers out there, and I can recommend one from a former colleague of mine: https://github.com/CardanoSolutions/kupo
What would probably be the best route moving forward would be to define a common interface for this kind of query that both clients (cardano-cli) and servers (kupo, marconi, oura, blockfrost...) would agree on implementing. Then the cardano-cli could be pointed at various services, possibly with fallbacks.
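
To make the "common interface with fallbacks" idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: `UtxoQueryService`, `ServiceUnavailable` and the two stub services are illustration names only and do not correspond to any existing API in cardano-cli, kupo, marconi, oura or blockfrost.

```python
from typing import Protocol


class ServiceUnavailable(Exception):
    """Raised by a backend that cannot answer right now."""


class UtxoQueryService(Protocol):
    # The single operation every backend would agree to implement.
    def query_utxo(self, address: str) -> list: ...


def query_with_fallback(services, address):
    """Try each configured service in order and return the first answer."""
    errors = []
    for service in services:
        try:
            return service.query_utxo(address)
        except ServiceUnavailable as exc:
            errors.append(exc)
    raise ServiceUnavailable(f"all {len(errors)} configured services failed")


class OfflineService:
    """Stub backend that is always down."""

    def query_utxo(self, address):
        raise ServiceUnavailable("connection refused")


class StaticService:
    """Stub backend answering from a fixed dictionary."""

    def __init__(self, data):
        self.data = data

    def query_utxo(self, address):
        return self.data.get(address, [])
```

With this shape, the client only needs an ordered list of configured services; which indexer sits behind each one is invisible to it.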

xoriole · 2022-12-05T13:34:47Z

`cardano-ledger-index` service for Cardano

cardano-ledger-index service provides helpful queries for ledger state. It maintains extra indexes for faster querying of ledger state. Built on Haskell. Uses SQLite for persistence.

Reference: Marconi
https://github.com/input-output-hk/plutus-apps/tree/main/marconi

Support for queries:

query utxo by address
query utxo by stake-address

Future queries support:

query balance by address or stake address
query addresses by stake-address
query pool live stakes
query pool delegations.

How is it different from dbsync

cardano-db-sync indexes everything on the blockchain, and maintains all the past data. Finding current state requires multiple aggregations and is intensive.
cardano-ledger-index indexes and maintains only current state of blockchain. It will be faster to query current ledger state of address, utxos, delegations, pools etc.
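
A minimal sketch of what "current state only, persisted in SQLite" could look like, written in Python rather than Haskell for brevity. The table layout, column names and helper functions are assumptions for illustration and are not taken from cardano-ledger-index or Marconi. The point is that spent outputs are deleted rather than kept, so the two supported queries are plain indexed lookups with no aggregation over history.

```python
import sqlite3

# Hypothetical schema: one row per currently unspent output.
SCHEMA = """
CREATE TABLE IF NOT EXISTS utxo (
    tx_id         TEXT    NOT NULL,
    tx_ix         INTEGER NOT NULL,
    address       TEXT    NOT NULL,
    stake_address TEXT,
    lovelace      INTEGER NOT NULL,
    PRIMARY KEY (tx_id, tx_ix)
);
CREATE INDEX IF NOT EXISTS utxo_by_address ON utxo (address);
CREATE INDEX IF NOT EXISTS utxo_by_stake   ON utxo (stake_address);
"""


def open_index(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_output(conn, tx_id, tx_ix, address, stake_address, lovelace):
    conn.execute(
        "INSERT INTO utxo VALUES (?, ?, ?, ?, ?)",
        (tx_id, tx_ix, address, stake_address, lovelace),
    )


def spend_output(conn, tx_id, tx_ix):
    # Only the current state is kept, so a spent output is simply removed.
    conn.execute("DELETE FROM utxo WHERE tx_id = ? AND tx_ix = ?", (tx_id, tx_ix))


def utxos_by_address(conn, address):
    return conn.execute(
        "SELECT tx_id, tx_ix, lovelace FROM utxo "
        "WHERE address = ? ORDER BY tx_id, tx_ix",
        (address,),
    ).fetchall()


def utxos_by_stake_address(conn, stake_address):
    return conn.execute(
        "SELECT tx_id, tx_ix, lovelace FROM utxo "
        "WHERE stake_address = ? ORDER BY tx_id, tx_ix",
        (stake_address,),
    ).fetchall()
```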

disassembler · 2022-12-05T15:26:59Z

@abailly-iohk is our head of architecture. He'll be providing some feedback on how he wants this to be architected.

Jimbo4350 · 2022-12-05T18:06:38Z

I'd like to petition to keep the UTxO query for the case of local testnets. We have tests in cardano-node that utilize the UTxO query, and in this instance performance is not an issue since the UTxO set is small. The UTxO query is also useful for quick and dirty debugging, especially when integrating a new era. Not having to set up an additional indexer would save me and QA additional work.

abailly-iohk · 2022-12-05T18:19:18Z

The goal is to keep the UTxO query in the cardano-cli in order to not disrupt (too much) downstream users, but to move the work needed to execute the query outside of the node/consensus.
I think that for the simple (testnets) use case, the role of an indexer could be fulfilled by some stateless query tool able to read the node's Chain DB directly, much like what the existing db-analyzer command-line tool does.

abailly-iohk · 2022-12-05T18:36:26Z

@disassembler We would need some numbers on the current vs. forecasted performance of this (and possibly other) kind of queries, and a better understanding of what's the target/requirement from users' perspective. I think @dnadales or @jasagredo have already run or will run some benchmarks for the utxo-hd case.

abailly-iohk · 2022-12-06T19:09:52Z

Also, we would really want to have some kind of impact study, which implies talking to various node users to understand what would be an acceptable solution.

abailly-iohk · 2022-12-12T07:48:59Z

@disassembler Shouldn't we reclassify this issue as Spike or Experiment based on DQ's work?

dQuadrantDev · 2022-12-21T11:09:36Z

While stuck on ouroboros-network, we explored the alternative approach of integrating an rpc command into cardano-cli.

A non-production version of marconi-mamba with RPCs, together with an updated cardano-cli, seems doable within this week. By rpc we are referring to JSON-RPC, similar to what bitcoin and ethereum provide.

We'll update the marconi-mamba application to support rpc for balance and utxo queries on the indexed db.
We'll update cardano-cli with a new subcommand rpc, so that commands like:
cardano-cli rpc ada_getutxos or cardano-cli rpc ada_getbalance
will return the response via the rpc.
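
For readers unfamiliar with JSON-RPC, here is a minimal Python sketch of the JSON-RPC 2.0 message shapes such a subcommand would send and receive. Only the method names `ada_getutxos` and `ada_getbalance` come from this comment; the parameter layout (`{"address": ...}`) and the shape of the result are assumptions, since no schema has been specified here.

```python
import json
from itertools import count

# JSON-RPC 2.0 requests carry an id so responses can be matched to them.
_request_ids = count(1)


def make_request(method, params):
    """Serialize a JSON-RPC 2.0 request for the given method."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": next(_request_ids),
            "method": method,
            "params": params,
        }
    )


def parse_response(raw):
    """Return the result of a JSON-RPC 2.0 response, or raise on an error."""
    message = json.loads(raw)
    if "error" in message:
        error = message["error"]
        raise RuntimeError(f"RPC error {error['code']}: {error['message']}")
    return message["result"]
```

A call such as `make_request("ada_getutxos", {"address": "addr_test1..."})` produces the body to POST to the rpc server; the transport itself is left out of this sketch.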

Meanwhile, we can also explore the ouroboros-network for the proxying part where a meeting with ouroboros-network team would greatly help.

Marconi

Marconi has been updated in the dQuadrant repo so that marconi-mamba can be integrated to test the utxo query.
Some optimization and pretty-printing remain to be done on the utxo side.
Repo: dQuadrant/plutus-apps

mesudip · 2022-12-21T15:08:52Z

We were looking at ways to intercept and interpret cli-to-node communication and return the query result using a different service.

But it seems that setting up an intercepting service is harder than we thought. cardano-api doesn't expose enough of an interface for this purpose, so we looked at the ouroboros-network repo.

Can somebody point out how to set up a node-to-client service with the ouroboros-network library, but with different handlers for the protocol messages?

Also, all the query apis in cardano-api seem to close the connection to the node socket after the query. Is there a way to implement the query service with a persistent connection?

dQuadrantDev · 2022-12-26T11:48:14Z

@disassembler The non-production demo is available here.
https://github.com/dQuadrant/plutus-apps/blob/feature/rpc/Readme.md

mesudip · 2023-01-27T09:17:30Z

I've created a PR. #4810

gitmachtl · 2023-01-30T21:36:12Z

Can we have an output that replicates the

cardano-cli query utxo --testnet-magic 1 --address addr_test1vzpwq95z3xyum8vqndgdd9mdnmafh3djcxnc6jemlgdmswcve6tkw

one? By that I mean a plaintext output like the original one. Many tools on the cli rely on this kind of output, because of the still existing bug in jq when working with really large numbers. For that reason koios, for example, also sends the amounts as strings. The current output from cardano-cli query utxo, if you send it to an out-file, is json, but tools like jq (the most common on the cli) have a problem with those large numbers (lovelaces). So many tools just use the plaintext output that cardano-cli provides and read the values from there. It would be extremely awesome and helpful if there were a "cli compatible" output mode, so it can just replace the old command.
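
A short Python sketch of the underlying number problem, assuming the consuming tool parses JSON numbers as IEEE 754 doubles. Integers above 2^53 are then not all representable, which is why sending amounts as strings sidesteps the issue. The helper names and the `lovelace` field are hypothetical.

```python
import json

# Largest integer below which every integer is exactly representable as a double.
MAX_SAFE_INTEGER = 2**53 - 1


def loses_precision_as_double(n):
    """True if the integer n changes value when round-tripped through a double."""
    return int(float(n)) != n


def encode_amount_as_string(utxo):
    """Serialize a utxo record with its lovelace amount as a JSON string."""
    return json.dumps({**utxo, "lovelace": str(utxo["lovelace"])})
```

For instance, `loses_precision_as_double(2**53 + 1)` is true, while an amount encoded with `encode_amount_as_string` survives any JSON parser unchanged because it is never interpreted as a number.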

rdlrt · 2023-01-31T02:34:40Z

The addition from the PR seems a bit less than ideal to my eyes: essentially, the proposal uses cardano-cli as a curl substitute just for user-facing convenience. Calling out the small deficiencies that I see:

I find it strange to mention integration of generic RPC without any CIP standard to align methods and formats. Moreover, it seems this is intended to integrate initially via marconi (for instance, the earlier-mentioned Kupo provides very similar functionality). From history: work across different teams often ends up becoming a bottleneck across updates (supporting versions/forks/feature additions/modifications, etc.). If teams are not aligned and working against a standard, these incompatibilities become difficult to manage.
Solutions based on sqlite do not scale well (horizontally/vertically/clustering) and can be easily corrupted. While this is only intended to keep a 'live view', which is a substantially smaller subset of data, populating it could still take a while(?). Having said that, it could be a separate issue if going down the generic RPC lane.
There's often a lot of repetition in creating an HTTP layer (not just a working GET/POST interface) from scratch, which eventually means handing off basic security functionality to proxy layers (like nginx/haproxy/caddy). If that is not treated as standard practice, it would lead to insecure instances operating in public.
Likewise, it would also be essential for the client (CLI) to negotiate with these transport mechanisms in ways that are compatible with common security additions (at minimum negotiation over TLS, providing error mechanisms, etc.).

mesudip · 2023-02-06T06:12:29Z

@gitmachtl @rdlrt
Thank you for your feedback on the PR. I would like to clarify that this is not a final PR. It was a quick implementation based on what's already available, to show what can be done. I can understand your concerns regarding compatibility, scalability and security.

Based on the above feedback, I have listed the following points to be considered for the final implementation:

Prepare a CIP standard for the set of JSON-RPC methods that a cardano-indexing-solution can support. An implementation may support a subset of the methods.
Use marconi for connecting with cardano-node, but prepare an interface for backend storage that can be easily swapped. We can start with a backend that supports standard SQL.
In our cardano-cli RPC command PR, we have already included an option to add basic-auth. We will add Basic-Auth configuration to the rpc-server. I think we can leave the SSL/TLS part to the proxy layer.
Regarding the output format, we can start with JSON, as it's the most recognized format. Using plaintext output for building wrapper services is one of the current issues in cardano-cli and the ecosystem in general. I think we should try to standardize the output format of the CLI, maybe by adding a --format option for the output format as suggested in [FR] - CLI improvements for formatting and consistency #3213
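
As a small illustration of the Basic-Auth point, here is a Python sketch of how a client could build the HTTP `Authorization` header and how a server could check it. The function names are hypothetical and say nothing about how the PR implements it. Note that Basic auth only base64-encodes the credentials, so it relies on the proxy layer terminating TLS.

```python
import base64
import hmac


def basic_auth_header(username, password):
    """Build the HTTP Basic authentication header for the given credentials."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def check_basic_auth(headers, username, password):
    """Server-side check of an incoming request's headers."""
    expected = basic_auth_header(username, password)["Authorization"]
    supplied = headers.get("Authorization", "")
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(supplied.encode("utf-8"), expected.encode("utf-8"))
```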

Given the considerations outlined, do you believe that this is the right way to go forward with the proposed changes?

Regards,
- mesudip

github-actions · 2023-03-14T01:49:56Z

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 120 days.

disassembler added the type: enhancement An improvement on the existing functionality label Nov 29, 2022

disassembler changed the title ~~[FR] -~~ [FR] - Query utxo indexer service Nov 29, 2022

disassembler mentioned this issue Nov 30, 2022

[FR] - Provide a query UTxO by address query that does not depend on the node #4541

Closed

eyeinsky mentioned this issue Dec 7, 2022

Add epoch stakepool size indexer IntersectMBO/plutus-apps#850

Merged

9 tasks

dorin100 added type: internal feature Non user-facing functionality user type: internal Created by an IOG employee labels Jan 5, 2023

jasagredo mentioned this issue Jan 5, 2023

Bring query utxo by address command performance on par with current version IntersectMBO/ouroboros-consensus#205

Closed

github-actions bot added the Stale label Mar 14, 2023

jorisdral removed the Stale label Jul 18, 2023

This was referenced Aug 8, 2023

Add documentation for the general public IntersectMBO/ouroboros-consensus#252

Merged

UTxO-HD for-developers documentation IntersectMBO/ouroboros-consensus#250

Merged
