Pre-generate runtime metadata, remove from runtime binary? #10057
Comments
I did some tinkering today and it doesn't look like it's the polkadot node that is slow here. I ran 10 connections on 10 threads for 30min against a local, synced, polkadot node, querying
This is http and not ws (which is what polkadot.js uses), but it seems like the node is pretty snappy. So I'm not convinced we necessarily need to intervene on the substrate node side here, unless we want to make changes to offload the js client: things like doing the decoding on the node side, stripping out doc strings, compressing the payload (likely not helping though), or some other filtering of what portion of the metadata to return. |
@dvdplm I believe that the main point here is to split the metadata from the runtime itself, to lower the overall runtime code size, which affects other things beyond raw performance of retrieving the metadata (runtime upgrade transaction size, PoV blocks, etc). |
For me the main question is still, where to put the metadata? If this question is solved, the rest is easy. How complicated is it to fetch the metadata for example from ipfs @jacogr ? |
A Substrate node can serve IPFS content, right? So one possible solution is to have the native client serve the metadata over IPFS. |
And how does the client get the metadata? |
The runtime includes the metadata multihash, and the client just fetches it from the IPFS network, where it can hopefully be served by one of the nodes? Network operators can also pin the metadata to ensure availability. |
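To make the multihash idea concrete, here is a minimal sketch of how a client might check fetched metadata against a hash published by the runtime. It assumes a plain sha2-256 multihash (function code 0x12, digest length 0x20) computed over the raw metadata bytes; a real IPFS CID also depends on chunking and DAG encoding, so this is not a full IPFS verification. The function names `encode_multihash` and `verify_metadata` are hypothetical.

```python
import hashlib

# Multihash layout: <function code><digest length><digest>.
SHA2_256_CODE = 0x12  # multihash code for sha2-256
SHA2_256_LEN = 32     # digest length in bytes (0x20)


def encode_multihash(data: bytes) -> bytes:
    """Build a sha2-256 multihash for the given bytes (hypothetical helper)."""
    digest = hashlib.sha256(data).digest()
    return bytes([SHA2_256_CODE, SHA2_256_LEN]) + digest


def verify_metadata(expected_multihash: bytes, metadata: bytes) -> bool:
    """Return True only if the metadata hashes to the expected multihash."""
    if len(expected_multihash) != 2 + SHA2_256_LEN:
        return False
    code, length = expected_multihash[0], expected_multihash[1]
    if code != SHA2_256_CODE or length != SHA2_256_LEN:
        # Assumption: only sha2-256 multihashes are supported in this sketch.
        return False
    return hashlib.sha256(metadata).digest() == expected_multihash[2:]
```

With a check like this, it would not matter which node or gateway served the bytes, since anything that does not match the hash in the runtime is rejected.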
Not sure about the IPFS status in Substrate. |
@tomusdrw No doubt there are optimizations to be made here. The point I was trying to make is that on the user side of things, in polkadot.js and sidecar, fetching metadata is ~1000 times slower than getting the raw metadata over the RPC. That said there are other good reasons for working on this, as you point out. |
@bkchr As long as it has a hash, it can make an ipfs retrieval. Worst-case, if available on the node, it can also be stored "elsewhere" and returned via a different RPC call that gets it from "elsewhere". (Wherever that may be - file system, runtime, offchain DB, ipfs, etc.) The only issue is making sure it is available exactly at the point the upgrade is live, and available to all. @dvdplm Indeed, via the JS API it is much, much slower. The bottleneck from a JS API perspective is not the node. Well, we had some issues with historic metadata at some point, but for latest, it certainly is never an issue. (Also, the JS side is even slower if you are fetching a non-recent version of the metadata, since there are conversion steps in-between.) On my side I don't focus on performance at all; the prime directive with the time I have available is "getting it right". |
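The "store it elsewhere, look it up by hash" idea could be sketched roughly as follows. This is only an illustration: the `Source` callables stand in for whatever backends exist (file system, offchain DB, ipfs, a dedicated RPC call), `fetch_metadata` is a hypothetical name, and BLAKE2b with a 32-byte digest is an assumed choice of hash rather than something settled in this thread.

```python
import hashlib
from typing import Callable, Iterable, Optional

# A source takes the expected hash and returns the blob, or None if it has none.
Source = Callable[[bytes], Optional[bytes]]


def blake2_256(data: bytes) -> bytes:
    """BLAKE2b with a 32-byte digest (assumed hash for this sketch)."""
    return hashlib.blake2b(data, digest_size=32).digest()


def fetch_metadata(expected_hash: bytes, sources: Iterable[Source]) -> Optional[bytes]:
    """Try each source in order; accept the first blob whose hash matches."""
    for source in sources:
        blob = source(expected_hash)
        if blob is not None and blake2_256(blob) == expected_hash:
            return blob
    # No source had a valid copy, e.g. the metadata was not published in time.
    return None
```

Returning `None` when every source fails makes the availability concern explicit: the caller has to decide what to do if the metadata is not reachable at the moment the upgrade goes live.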
Okay. Then I would propose that we do the following:
|
I'm extremely opposed to adding the metadata on IPFS. The IPFS protocol is extremely complicated, and you can only access it in three ways:
The first solution is problematic. Users that want to access a Substrate chain would have to install two binaries: the Substrate node client and an IPFS client. The IPFS node is also not embeddable in web pages, which makes it a no-go for substrate-connect. The second solution goes against the idea of decentralization. If the gateway goes down or is blocked, IPFS becomes inaccessible. The third solution would add thousands of hours of development and create its own problems, and is generally not realistic. |
I thought there was already a lot of progress in Rust implementations of IPFS, would it really be thousands of hours? |
We're talking about replacing something simple that works well and without any issue with something super complicated and not ready for prime time. This is not a good trade-off. As a side topic, I also want to point out that Substrate chains already have a DHT with the exact same algorithm as IPFS does. It's used to store the authority discovery records, but storing other things on it is trivial. All the functions are there and are robust. |
Following on from #8615, the size of the metadata has increased (because of all the type information). Here are some benchmarks I did during the PR:
#8370 (comment)
#8615 (comment)
#8615 (comment)
There was some discussion in #8370 (comment) (/cc @tomusdrw, @apopiak, @bkchr, @dvdplm) about whether it would be worth pre-generating the metadata and storing it somewhere other than in the runtime binary. For the purposes of getting the PR done sooner rather than later, we decided we could handle the larger binary size in the short term.
One solution would be to generate the metadata during build time, which might involve:
std
std
and then invoke a generate_metadata function
I'm not intimately familiar with the existing build process, but I believe this workflow is not possible for the existing dev chain runtime where the wasm bytes are embedded directly into a constant for the std build.
One alternative might be to include the metadata only in the std build and have the node only allow the state_getMetadata query when running a native runtime, and then do some caching there. Although I think I heard that native runtimes might be deprecated in the future.
So the open questions here are:
Linking to #10056, which can be solved by this.
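As a rough illustration of the "do some caching there" alternative mentioned above, a node-side cache might generate the metadata once per runtime version and reuse it for later state_getMetadata queries. This is a sketch under assumptions: keying by the runtime's spec_version is my choice, and `MetadataCache` and its `generate` callback are hypothetical names, not existing Substrate APIs.

```python
from typing import Callable, Dict


class MetadataCache:
    """Caches generated metadata per runtime spec version (hypothetical)."""

    def __init__(self, generate: Callable[[int], bytes]) -> None:
        # `generate` stands in for whatever produces the metadata bytes,
        # e.g. a call into the native runtime.
        self._generate = generate
        self._by_version: Dict[int, bytes] = {}

    def get(self, spec_version: int) -> bytes:
        """Return cached metadata, generating it on first request only."""
        if spec_version not in self._by_version:
            self._by_version[spec_version] = self._generate(spec_version)
        return self._by_version[spec_version]
```

A runtime upgrade bumps the version, so the next query would miss the cache and trigger a fresh generation for the new runtime.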