http requests fail with timeout on large block (json.hpp issue?)
#677
Comments
This works fine in nodeos 3.1, so this is a regression.
Seems to be a deliberate but poorly documented change in #96.
I don't know why we would want to have limits on get_block. If the block is stored and requested, we want to send it.
My (limited) understanding is that if the node is used for block production, we may not want it failing to produce blocks because its main thread is busy serving long http requests. If the node is used mainly for serving data requests, then yes, these limits should probably be relaxed.
block production nodes should never be serving general http requests anyway. |
So just set a very high limit on the api node. Or do you want to avoid long-running requests?
In an ENF backlog refinement session, we discussed this issue at length but didn't come to a clear consensus on the best path forward. We did all agree that the current behavior is undesirable and that more comprehensive thought is required around api architecture and where serialization should be handled. For now, we will simply communicate a workaround. The workaround, with context: if a node operator would like to keep a lower maximum allowed http response time for endpoints that don't have this problem, one option is to use a proxy to direct GetBlock requests to a dedicated nodeos instance with a higher max-http-response-time, and all other requests to a nodeos instance with your preferred max-http-response-time.
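As an illustration of that workaround, here is a minimal reverse-proxy sketch. The ports, upstream names, and endpoint path are assumptions, not taken from this thread; adjust them to your deployment.

```nginx
# Hypothetical two-instance setup: one nodeos configured with a much higher
# max-http-response-time serves only get_block; a second instance with the
# preferred lower limit serves everything else.
upstream nodeos_get_block { server 127.0.0.1:8899; }   # high response-time limit
upstream nodeos_default   { server 127.0.0.1:8888; }   # normal limits

server {
    listen 80;

    # Route only the expensive endpoint to the dedicated instance.
    location = /v1/chain/get_block {
        proxy_pass http://nodeos_get_block;
    }

    # All other API requests keep the stricter response-time limit.
    location / {
        proxy_pass http://nodeos_default;
    }
}
```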
Time for the abi serializer to_variant vs. the conversion to json for EOS mainnet block 291953778 on my i9-12900K:
Well, I guess it shows that the issue, if there is one, is in the abi serializer's to_variant rather than in the json conversion.
Yeah, that is as expected. Although that to json is rather terrible as well. |
For reference, the same block without exploding it via abi_serializer (what a …):
Move most of the abi_serialization off the main thread:
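Conceptually, the change looks something like the sketch below. This is not the actual PR code; fetch_signed_block and expand_block_to_json are hypothetical placeholders. Only the cheap chainbase lookup stays on the main thread, while the expensive abi expansion and JSON conversion run on a worker.

```cpp
// Sketch of the idea only, assuming the block can be captured by shared_ptr and
// expanded on another thread; names below are placeholders, not nodeos APIs.
#include <cstdint>
#include <future>
#include <memory>
#include <string>

struct signed_block { /* opaque stand-in for the real block type */ };

std::shared_ptr<const signed_block> fetch_signed_block(uint32_t block_num); // cheap, main thread
std::string expand_block_to_json(const signed_block& b);                    // abi expansion + to_json, expensive

std::future<std::string> async_get_block_json(uint32_t block_num) {
    auto block = fetch_signed_block(block_num);        // stays on the main thread
    return std::async(std::launch::async, [block] {    // heavy serialization moved off the main thread
        return expand_block_to_json(*block);
    });
}
```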
Wow, did that speed up the abi serialization a lot? Above you reported that …
Yes. This uses an abi cache so that an abi is only pulled out of chainbase and an abi_serializer created once per account. In the existing implementation, that is done over and over again each time the account is referenced in the block.
OK, that makes more sense; it looks like a great speedup, over 4x faster. So the speedup is because of the cache, not because it is done on a different thread. I'll have to check out the cache you implemented. Is it an LRU cache? And does it need to support concurrent access from multiple threads?
I believe so, yes.
Nothing that complicated is used. The cache only lives for the life of the get_block call. Since the abi gathering which creates the cache takes less than 1 ms in this example, I think that is fine as a final solution.
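A minimal sketch of such a per-request cache; the placeholder types and the resolve_abi lookup are hypothetical, not the PR's actual code:

```cpp
#include <map>
#include <optional>
#include <string>

struct abi_serializer { /* stand-in for eosio::chain::abi_serializer */ };

// Hypothetical: pulls the ABI out of chainbase and builds a serializer (the expensive part).
// Accounts with no usable ABI simply map to an empty optional instead of throwing.
std::optional<abi_serializer> resolve_abi(const std::string& account);

// One cache per get_block request; it is thrown away when the request finishes,
// so it needs no eviction policy and no cross-thread synchronization.
using abi_cache = std::map<std::string, std::optional<abi_serializer>>;

const std::optional<abi_serializer>& get_abi(abi_cache& cache, const std::string& account) {
    auto it = cache.find(account);
    if (it == cache.end())
        it = cache.emplace(account, resolve_abi(account)).first;  // expensive path, once per account
    return it->second;                                            // repeat references hit the cache
}
```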
… issue with invalid abi should not throw exception.
Should we consider this closed by #696?
Yes, I think so, but maybe @matthewdarwin thinks otherwise? With your cache the request is significantly faster, so that should lessen the problem.
Yes, on my machine it cut the time for the test block in half.
Wow, that's great. So for that block, the total request processing time is about 8x faster than before your changes?
The … is now ~350us on my machine.
To get a good idea of the progress made in the PR, I think it would be very helpful if you provided the updated times for the following two measurements:
This must mean that the cache saves us a lot of duplicate abi_serializer work.
Current PR (34baed9) perf:
Nothing in the PR should make any difference in the to_json conversion. The exciting thing about this change is how much less time is spent on the main thread.
Great, so we are 5x faster on to_variant and 4x faster overall (including to_json), with almost all of it off the main thread. That is a great PR!
Move get_block abi serialization off the main thread
… in eosq block explorer. The message mentions json.hpp:63, which is in a string to_string() function. Maybe we have an issue where creating json output takes excessively long for large trees?