feat: add a time limit for processing transactions in produce_chunk #10310
Conversation
To produce a chunk the node fetches transactions from the transaction pool and adds them to the chunk. Adding a transaction to the chunk can be computationally expensive - the transaction is converted to a receipt, which burns gas. Because of that the number of transactions has to be limited. The node will fetch transactions until some limit is reached, and then decide that the chunk is ready and publish it.

Currently we have a gas limit for preparing transactions, but there is a big problem with it - it doesn't take into account discarding invalid transactions from the pool. Processing an invalid transaction doesn't burn gas, so the node could spend a lot of time discarding invalid transactions without ever reaching the limit. This would make chunk production very slow, and other validators might decide to just skip the chunk instead of waiting for it.

Let's add another limit to `prepare_transactions`: a time limit. When producing a chunk the node will be allowed to add new transactions to the chunk for up to X seconds. This will ensure that a chunk is produced quickly, even when there are many invalid transactions.

The limit is an `Option` because there might be cases where we don't want to use a time limit. For example there might be tests for `prepare_transactions` which exercise the gas limit; in such tests we wouldn't want the time limit to get in the way - it would only make the tests less reliable.
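The idea can be sketched in a self-contained way. The types and the function below are simplified stand-ins (the real `prepare_transactions` works on a pool iterator and `SignedTransaction`s), but they show why a gas limit alone is not enough: discarded invalid transactions burn no gas, so only the time limit bounds them.

```rust
use std::time::{Duration, Instant};

// Hypothetical simplified transaction type; stand-in for the real pool entries.
pub struct Tx {
    pub valid: bool,
    pub gas: u64,
}

/// Sketch of the proposed loop: drain transactions until the gas limit
/// or the optional time limit is reached. `None` means "no time limit".
pub fn prepare_transactions(
    pool: &mut Vec<Tx>,
    gas_limit: u64,
    time_limit: Option<Duration>,
) -> Vec<Tx> {
    let start = Instant::now();
    let time_limit_reached = || match time_limit {
        Some(limit) => start.elapsed() >= limit,
        None => false,
    };
    let mut total_gas_burnt = 0u64;
    let mut chunk_txs = Vec::new();
    while total_gas_burnt < gas_limit && !time_limit_reached() {
        let Some(tx) = pool.pop() else { break };
        if !tx.valid {
            continue; // discarding an invalid transaction burns no gas
        }
        total_gas_burnt += tx.gas;
        chunk_txs.push(tx);
    }
    chunk_txs
}
```

With `time_limit: None` the loop behaves exactly like the old gas-limited loop, which is why the `Option` keeps existing gas-limit tests deterministic.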
Add a config option that allows specifying a time limit on adding transactions during chunk production. This option limits the amount of time that a node will spend adding transactions to the produced chunk. It's good to make this configurable - the default value seems reasonable, but someone might want to tweak it in the future. There's no ideal value here: choosing a limit that's too small would lead to chunks with very few transactions, while choosing a value that's too big would lead to timeouts and skipped chunks.
Codecov Report

```
@@            Coverage Diff             @@
##           master    #10310      +/-  ##
==========================================
+ Coverage   68.19%    71.81%    +3.61%
==========================================
  Files         700       712       +12
  Lines      141495    143085     +1590
==========================================
+ Hits        96489    102750     +6261
+ Misses      40455     35632     -4823
- Partials     4551      4703      +152
==========================================
```
LGTM thanks!
cc @posvyatokum we may want to get it included in 1.37
```
@@ -1020,11 +1021,21 @@ impl RuntimeAdapter for KeyValueRuntime {
     transactions: &mut dyn PoolIterator,
     _chain_validate: &mut dyn FnMut(&SignedTransaction) -> bool,
     _current_protocol_version: ProtocolVersion,
     time_limit: Option<Duration>,
```
I think the time_limit should be ignored in the kv runtime. This runtime is only used in tests and there we don't want to have anything depend on the real wall clock time.
```
/// A node produces a chunk by adding transactions from the transaction pool until
/// some limit is reached. This time limit ensures that adding transactions won't take
/// longer than the specified duration, which helps to produce the chunk quickly.
pub produce_chunk_add_transactions_time_limit: Duration,
```
Not necessarily in this PR but it may be a good idea to make this config dynamic - so it is possible to update it without restarting the node. If you're interested you can search for MutableConfigValue or "expected_shutdown" which is an example of a dynamic field.
Ok I made it dynamic.
```
@@ -384,6 +389,7 @@ impl ClientConfig {
         "state_split_config",
     ),
     tx_routing_height_horizon: 4,
     produce_chunk_add_transactions_time_limit: Duration::from_millis(200),
```
I slightly dislike that the default is defined in two separate places. Is it done the same way for other fields?
It's a config for test purposes, and all other values are also specified by hand. For example, this config has:

```rust
header_sync_stall_ban_timeout: Duration::from_secs(30),
```

Even though the default value is different:

```rust
fn default_header_sync_stall_ban_timeout() -> Duration {
    Duration::from_secs(120)
}
```

I agree that it would be nicer to use the real default value instead of handwritten ones. I will modify it to use the default value function.

The only downside is that it will introduce a cyclical dependency (`Config` <-> `ClientConfig`). Some people dislike that, but that feels very purist; I think it's okay in this case.
Ah, the circular dependency would be pretty severe here. The functions with default config values are in the crate `nearcore`, while this test configuration is in the crate `near-chain-configs`. So it would make the configs crate depend on the entirety of `nearcore`.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we could move the default value functions to `near-chain-configs`?
Moved the default values for `ClientConfig` to the same module as `ClientConfig`; this makes it possible to use the default value in `ClientConfig::test` without any circular dependencies.
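The resulting layout can be sketched like this. This is an illustration of the structure, not the actual nearcore code; `ClientConfig` has many more fields, and the default-function name here follows the naming pattern shown above for `default_header_sync_stall_ban_timeout`:

```rust
use std::time::Duration;

// In client_config.rs (near-chain-configs): the default lives next to the
// struct, so both the config parser and ClientConfig::test can reuse it
// without depending on the nearcore crate.
pub fn default_produce_chunk_add_transactions_time_limit() -> Option<Duration> {
    Some(Duration::from_millis(200))
}

pub struct ClientConfig {
    pub produce_chunk_add_transactions_time_limit: Option<Duration>,
    // ... other fields omitted in this sketch
}

impl ClientConfig {
    pub fn test() -> Self {
        Self {
            // Reuse the real default instead of a handwritten copy.
            produce_chunk_add_transactions_time_limit:
                default_produce_chunk_add_transactions_time_limit(),
        }
    }
}
```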
```
@@ -711,7 +712,13 @@ impl RuntimeAdapter for NightshadeRuntime {
     pool_iterator: &mut dyn PoolIterator,
     chain_validate: &mut dyn FnMut(&SignedTransaction) -> bool,
     current_protocol_version: ProtocolVersion,
     time_limit: Option<Duration>,
```
Why is it an option here but not an option in the neard config?
It's an `Option` in the function because someone might want to use this function without a time limit (e.g. for tests). For the main config I didn't see any real reason not to use a time limit, so I didn't make it optional. I can make it an `Option` in the config as well, just to keep things consistent.
```
@@ -763,6 +770,7 @@ impl RuntimeAdapter for NightshadeRuntime {
     while total_gas_burnt < transactions_gas_limit
         && total_size < size_limit
         && transactions.len() < new_receipt_count_limit
         && !time_limit_reached()
```
That's a bit atypical, I was expecting something like all of the above, e.g. `std::time::Instant::now() < time_limit`, but yours is cool too.
The `time_limit` is an `Option`, so it can't be easily compared. It would have to be something like

```rust
time_limit.map(|t| start_time.elapsed() < t).unwrap_or(true)
```

which doesn't feel readable to me. I could also set the `time_limit` to +inf when the option is `None`, but that felt hacky. I think `time_limit_reached()` is the most readable option here.
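For illustration, the two styles can be written as free functions (a self-contained sketch, not the actual nearcore code). Both treat `None` as "no limit", so the loop keeps running when no time limit is configured:

```rust
use std::time::{Duration, Instant};

/// Equivalent of the PR's `time_limit_reached()` helper:
/// with `None` the limit is never reached.
fn time_limit_reached(time_limit: Option<Duration>, start: Instant) -> bool {
    time_limit.map_or(false, |t| start.elapsed() >= t)
}

/// The direct-comparison alternative discussed above. It is equivalent,
/// but the Option handling makes an inline loop condition noisier.
fn within_time_limit(time_limit: Option<Duration>, start: Instant) -> bool {
    time_limit.map_or(true, |t| start.elapsed() < t)
}
```

The loop condition `&& !time_limit_reached()` then reads as plain English, which is the readability argument made above.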
Some more general thoughts:
Is this a protocol change, since it somewhat changes expectations? (E.g. a chunk may be formed without using all gas capacity when reaching the time limit.) I don't think it needs a NEP since:
But I still just want to make sure.
I agree, I don't think a NEP is needed - it's not a protocol change. As you said, different clients can choose to use a different configuration or even implementation for choosing transactions from the pool, and that's totally fine.
Hmm, do we really want a warning? Reaching a limit is a normal thing under congestion. When NEAR is at capacity it should just gracefully handle it. Otherwise we will spam warnings on every block when the load gets high. There is no corresponding warning for reaching the other limits.
kv_runtime is used only for testing. It doesn't make sense to use a time limit there. Gas limit also isn't used at all.
The default values for `ClientConfig` were defined in the `nearcore` crate. Let's move them to the same module where `ClientConfig` is defined: `client_config.rs` in the `near-chain-configs` crate. This makes it possible to use the default values in other crates without importing the entire `nearcore` crate. Defaults should be declared in the same module as the config; it's cleaner this way.
When creating a test instance of `ClientConfig`, let's use the default value for `produce_chunk_add_transactions_time_limit` instead of hardcoding a value. This is possible now because the previous commit moved the default values to the same crate as `ClientConfig`.
…onfig Make this config value optional. This makes it possible to disable the time limit using `config.json`.
Make it possible to update this config option while the node is running. I manually verified that sending a SIGHUP signal to a running neard process makes it read the new value from `config.json` and start using it.
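The mechanics of a dynamically updatable config value can be sketched with a plain `Arc<RwLock<...>>`. This is a simplified stand-in; nearcore's actual `MutableConfigValue` wrapper mentioned in the review differs in details:

```rust
use std::sync::{Arc, RwLock};
use std::time::Duration;

/// Simplified stand-in for a dynamically updatable config value:
/// the client holds a clone, and the config reloader (e.g. the
/// SIGHUP handler) writes a new value without restarting the node.
#[derive(Clone)]
pub struct DynamicValue<T: Copy>(Arc<RwLock<T>>);

impl<T: Copy> DynamicValue<T> {
    pub fn new(value: T) -> Self {
        Self(Arc::new(RwLock::new(value)))
    }
    /// Read the current value (called on every chunk production).
    pub fn get(&self) -> T {
        *self.0.read().unwrap()
    }
    /// Replace the value (called after re-reading config.json).
    pub fn update(&self, value: T) {
        *self.0.write().unwrap() = value;
    }
}
```

On SIGHUP, neard re-reads `config.json` and the reload path would call the equivalent of `update`, so the next `produce_chunk` picks up the new limit.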
v2:
IMO it's better to log more than less. It's easy enough to grep it out if needed, and at least we'll have a quick signal when people start asking why their transactions are not being picked up.
I'd say let's add a generic warning when we fail to get all the transactions from the pool for whatever reason, and include that reason in the error message. Again, easy enough to get rid of it, and it provides a quick signal.
LGTM
In my understanding a warning is something that shouldn't really be happening, but isn't a critical error. Reaching the limit doesn't fit this definition - it's a normal event in the system. I'm worried that in the future NEAR will be used so heavily that all chunks will be congested and the warning will be printed on every single produced chunk. It's easy to grep out, but grepping things out is annoying; ideally the logs should be readable by humans. Maybe instead of a warning we could add some kind of metric. For example we could add two counters:
This would allow us to create a Grafana dashboard showing what percentage of chunks are congested during a given time range. Metrics feel like a better fit for observability than looking at warnings in the logs.
I agree, and it boils down to how we think about congestion - whether it is normal or abnormal. Either way of thinking can easily be argued for. My thinking is that congestion is a temporary and undesirable situation that may require intervention. As such it should be signalled in every possible way, and any extra info is desired. I agree that a warning on every produced chunk may be a bit of an overkill. I'm fine with either: only metrics, or metrics and warnings.
Yeah that sounds good to me.
An alternative set of metrics could be (total, congested) instead of (non-congested, congested). Those are equivalent and mutually derivable of course, but I'm guessing that total is already defined, and overall it seems more natural to work with it like that.
Alright, sounds good. In that case I will merge this PR and go implement the metrics. I don't like the idea of the warnings, so for now I'll just go with metrics only; we'll see how readable they turn out. (Although a commit implementing the warnings is available here, in case we decide to go through with it: jancionear@b31728c)
…sactions from the pool (#10422) When a chunk is produced, Client calls `prepare_transactions()`, which fetches transactions from the transaction pool and adds them to the chunk. `prepare_transactions()` adds transactions until some limit is hit: a limit on gas, time, number of transactions, chunk size, etc. Currently there's no way to know whether some limit was hit and which one it was. Let's add a way to access this information; it's useful to know what the bottleneck of chunk production is. This PR adds a new metric which counts how many of the produced chunks couldn't fit all transactions from the transaction pool due to hitting a limit. It has two labels: `shard_id` - the chunk's shard - and `limited_by` - which of the limits was hit in `prepare_transactions()`. The need for this metric was discussed in #10310 (comment). The hope is that it'll allow us to figure out what the bottleneck of chunk production is in scenarios of high chain congestion. Right now in cases where `produce_chunk` takes a lot of time we can only make theories about why that's the case, but there's no observability into what's going on inside. This metric could help with that. To test that the metric works I set up a mocknet network with a build from this branch and put a load on it using locust. I reduced `produce_chunk_add_transactions_time_limit` from 200ms to 2ms to easily trigger the metric. It worked, as can be observed on the grafana dashboard: https://nearinc.grafana.net/d/eada7f01-b2dc-4df8-8a9f-ec4ec411159e/jancio-mocknet-stats?orgId=1&from=1705135200000&to=1705135800000 ![Screenshot 2024-01-13 at 15 21 12](https://github.com/near/nearcore/assets/149345204/51ba6a04-c2ff-4773-a8d7-9cfbff8a7fa5)
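The shape of that metric can be sketched without any metrics crate. This is an illustration of the label structure only (the real implementation uses nearcore's Prometheus helpers, and the type name here is hypothetical):

```rust
use std::collections::HashMap;

/// Simplified stand-in for a labelled Prometheus counter:
/// one count per (shard_id, limited_by) label pair.
#[derive(Default)]
pub struct ProduceChunkLimitedCounter {
    counts: HashMap<(u64, &'static str), u64>,
}

impl ProduceChunkLimitedCounter {
    /// Called when prepare_transactions() stopped because a limit was hit.
    pub fn inc(&mut self, shard_id: u64, limited_by: &'static str) {
        *self.counts.entry((shard_id, limited_by)).or_insert(0) += 1;
    }
    /// Read the count for one label pair (a dashboard would sum over labels).
    pub fn get(&self, shard_id: u64, limited_by: &'static str) -> u64 {
        self.counts.get(&(shard_id, limited_by)).copied().unwrap_or(0)
    }
}
```

Dividing such a counter by the total number of produced chunks per shard gives the "percentage of congested chunks" view discussed earlier in the thread.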
To produce a chunk the node fetches transactions from the transaction pool and adds them to the chunk. Adding a transaction to the chunk can be computationally expensive - the transaction is converted to receipt, which burns gas. Because of that the number of transactions has to be limited. The node will fetch transactions until some kind of limit is reached, and then decide that the chunk is ready and publish it.
Currently we have a limit of gas that can be used for preparing transactions, but there is a big problem with it - it doesn't take into account discarding invalid transactions from the pool. Processing an invalid transaction doesn't burn gas, so the node could spend a lot of time discarding invalid transactions without ever reaching the limit. This would make chunk production very slow and the others might decide to just skip the chunk instead of waiting for it.
Let's add another limit to `prepare_transactions`: a time limit. When producing a chunk the node will be allowed to add new transactions to the chunk for up to X milliseconds. This will ensure that a chunk is produced quickly, even when there are many invalid transactions.

Choosing a good value for the limit is difficult, so a new config option is added to allow changing it on the fly: `produce_chunk_add_transactions_time_limit`.

The default value for the time limit is 200ms. The reasoning for choosing this value is as follows:
Assuming that we want to produce ~1 block per second, 200ms is the largest value that would support that. If my understanding of the flow is correct, a 200ms chunk production would look like this:
This gives us a 1s time to produce a block.
Looking at the metrics from mainnet, the P99 time to produce a chunk is ~60ms (and maximum is similar). 200ms is more than three times that, so it shouldn't break anything on the chain: https://nearinc.grafana.net/d/cd668436-f19a-4f3d-b4a5-a35ebd3701b0/chunk-production-stats?from=now-12h&to=now&refresh=1h&orgId=1
During the minting of NEAT, the P99 time to produce a chunk jumped to ~860ms. This is unacceptable, as this will cause the chain to produce less than one block per second. This limit would prevent that. https://nearinc.grafana.net/d/cd668436-f19a-4f3d-b4a5-a35ebd3701b0/chunk-production-stats?from=1701298800000&to=1701342000000&orgId=1
For a more detailed discussion, see the comments in #10278
Fixes: #10278