feat: add a time limit for processing transactions in produce_chunk #10310
Conversation
To produce a chunk the node fetches transactions from the transaction pool and adds them to the chunk. Adding a transaction to the chunk can be computationally expensive - the transaction is converted to a receipt, which burns gas. Because of that the number of transactions has to be limited. The node will fetch transactions until some limit is reached, and then decide that the chunk is ready and publish it.

Currently we have a gas limit for preparing transactions, but there is a big problem with it - it doesn't take into account discarding invalid transactions from the pool. Processing an invalid transaction doesn't burn gas, so the node could spend a lot of time discarding invalid transactions without ever reaching the limit. This would make chunk production very slow, and other validators might decide to just skip the chunk instead of waiting for it.

Let's add another limit to `prepare_transactions`: a time limit. When producing a chunk the node will be allowed to add new transactions to the chunk for up to X seconds. This will ensure that a chunk is produced quickly, even when there are many invalid transactions.

The limit is an `Option` because there might be cases where we don't want to use a time limit. For example there might be tests for `prepare_transactions` which exercise the gas limit; in such tests we wouldn't want the time limit to get in the way - it would only make the tests less reliable.
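The idea can be sketched in a self-contained way. The types and the function below are simplified stand-ins (the real `prepare_transactions` works on a pool iterator and `SignedTransaction`s), but they show why a gas limit alone is not enough: discarded invalid transactions burn no gas, so only the time limit bounds them.

```rust
use std::time::{Duration, Instant};

// Hypothetical simplified transaction type; stand-in for the real pool entries.
pub struct Tx {
    pub valid: bool,
    pub gas: u64,
}

/// Sketch of the proposed loop: drain transactions until the gas limit
/// or the optional time limit is reached. `None` means "no time limit".
pub fn prepare_transactions(
    pool: &mut Vec<Tx>,
    gas_limit: u64,
    time_limit: Option<Duration>,
) -> Vec<Tx> {
    let start = Instant::now();
    let time_limit_reached = || match time_limit {
        Some(limit) => start.elapsed() >= limit,
        None => false,
    };
    let mut total_gas_burnt = 0u64;
    let mut chunk_txs = Vec::new();
    while total_gas_burnt < gas_limit && !time_limit_reached() {
        let Some(tx) = pool.pop() else { break };
        if !tx.valid {
            continue; // discarding an invalid transaction burns no gas
        }
        total_gas_burnt += tx.gas;
        chunk_txs.push(tx);
    }
    chunk_txs
}
```

With `time_limit: None` the loop behaves exactly like the old gas-limited loop, which is why the `Option` keeps existing gas-limit tests deterministic.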
Add a config option that allows specifying a time limit on adding transactions during chunk production. This option limits the amount of time that a node will spend adding transactions to the produced chunk. It's good to make this configurable - the default value seems reasonable, but someone might want to tweak it in the future. There's no ideal value here: choosing a limit that's too small would lead to chunks with very few transactions, while choosing a value that's too big would lead to timeouts and skipped chunks.
Codecov Report

```
@@            Coverage Diff             @@
##           master    #10310      +/-  ##
==========================================
+ Coverage   68.19%    71.81%    +3.61%
==========================================
  Files         700       712       +12
  Lines      141495    143085     +1590
==========================================
+ Hits        96489    102750     +6261
+ Misses      40455     35632     -4823
- Partials     4551      4703      +152
==========================================
```
LGTM thanks!
cc @posvyatokum we may want to get it included in 1.37
```
@@ -1020,11 +1021,21 @@ impl RuntimeAdapter for KeyValueRuntime {
     transactions: &mut dyn PoolIterator,
     _chain_validate: &mut dyn FnMut(&SignedTransaction) -> bool,
     _current_protocol_version: ProtocolVersion,
     time_limit: Option<Duration>,
```
I think the time_limit should be ignored in the kv runtime. This runtime is only used in tests and there we don't want to have anything depend on the real wall clock time.
```
/// A node produces a chunk by adding transactions from the transaction pool until
/// some limit is reached. This time limit ensures that adding transactions won't take
/// longer than the specified duration, which helps to produce the chunk quickly.
pub produce_chunk_add_transactions_time_limit: Duration,
```
Not necessarily in this PR but it may be a good idea to make this config dynamic - so it is possible to update it without restarting the node. If you're interested you can search for MutableConfigValue or "expected_shutdown" which is an example of a dynamic field.
Ok I made it dynamic.
```
@@ -384,6 +389,7 @@ impl ClientConfig {
         "state_split_config",
     ),
     tx_routing_height_horizon: 4,
     produce_chunk_add_transactions_time_limit: Duration::from_millis(200),
```
I slightly dislike that the default is defined in two separate places. Is it done the same way for other fields?
It's a config for test purposes, and all other values are also specified by hand. For example, this config has:

```rust
header_sync_stall_ban_timeout: Duration::from_secs(30),
```

Even though the default value is different:

```rust
fn default_header_sync_stall_ban_timeout() -> Duration {
    Duration::from_secs(120)
}
```

I agree that it would be nicer to use the real default value instead of handwritten ones. I will modify it to use the default value function.

The only downside is that it will introduce a cyclical dependency (`Config` <-> `ClientConfig`). Some people dislike that, but that feels very purist; I think it's okay in this case.
Ah, the circular dependency would be pretty severe here. The functions with default config values are in the crate `nearcore`, while this test configuration is in the crate `near-chain-configs`. So it would make the configs crate depend on the entirety of `nearcore`.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we could move the default value functions to `near-chain-configs`?
Moved the default values for `ClientConfig` to the same module as `ClientConfig`; this makes it possible to use the default value in `ClientConfig::test` without any circular dependencies.
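The resulting layout can be sketched like this. This is an illustration of the structure, not the actual nearcore code; `ClientConfig` has many more fields, and the default-function name here follows the naming pattern shown above for `default_header_sync_stall_ban_timeout`:

```rust
use std::time::Duration;

// In client_config.rs (near-chain-configs): the default lives next to the
// struct, so both the config parser and ClientConfig::test can reuse it
// without depending on the nearcore crate.
pub fn default_produce_chunk_add_transactions_time_limit() -> Option<Duration> {
    Some(Duration::from_millis(200))
}

pub struct ClientConfig {
    pub produce_chunk_add_transactions_time_limit: Option<Duration>,
    // ... other fields omitted in this sketch
}

impl ClientConfig {
    pub fn test() -> Self {
        Self {
            // Reuse the real default instead of a handwritten copy.
            produce_chunk_add_transactions_time_limit:
                default_produce_chunk_add_transactions_time_limit(),
        }
    }
}
```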
```
@@ -711,7 +712,13 @@ impl RuntimeAdapter for NightshadeRuntime {
     pool_iterator: &mut dyn PoolIterator,
     chain_validate: &mut dyn FnMut(&SignedTransaction) -> bool,
     current_protocol_version: ProtocolVersion,
     time_limit: Option<Duration>,
```
Why is it an option here but not an option in the neard config?
It's an `Option` in the function because someone might want to use this function without a time limit (e.g. for tests). For the main config I didn't see any real reason not to use a time limit, so I didn't make it optional. I can make it an `Option` in the config as well, just to keep things consistent.
```
@@ -763,6 +770,7 @@ impl RuntimeAdapter for NightshadeRuntime {
     while total_gas_burnt < transactions_gas_limit
         && total_size < size_limit
         && transactions.len() < new_receipt_count_limit
         && !time_limit_reached()
```
That's a bit atypical, I was expecting something like all of the above, e.g. `std::time::Instant::now() < time_limit`, but yours is cool too.
The `time_limit` is an `Option`, so it can't be easily compared. It would have to be something like

```rust
time_limit.map(|t| start_time.elapsed() < t).unwrap_or(true)
```

which doesn't feel readable to me. I could also set the `time_limit` to +inf when the option is `None`, but that felt hacky. I think `time_limit_reached()` is the most readable option here.
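For illustration, the two styles can be written as free functions (a self-contained sketch, not the actual nearcore code). Both treat `None` as "no limit", so the loop keeps running when no time limit is configured:

```rust
use std::time::{Duration, Instant};

/// Equivalent of the PR's `time_limit_reached()` helper:
/// with `None` the limit is never reached.
fn time_limit_reached(time_limit: Option<Duration>, start: Instant) -> bool {
    time_limit.map_or(false, |t| start.elapsed() >= t)
}

/// The direct-comparison alternative discussed above. It is equivalent,
/// but the Option handling makes an inline loop condition noisier.
fn within_time_limit(time_limit: Option<Duration>, start: Instant) -> bool {
    time_limit.map_or(true, |t| start.elapsed() < t)
}
```

The loop condition `&& !time_limit_reached()` then reads as plain English, which is the readability argument made above.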
Some more general thoughts:
Is this a protocol change, since it somewhat changes expectations? (E.g. a chunk may be formed without using all gas capacity when reaching the time limit.) I don't think it needs a NEP since:
But I still just want to make sure.
I agree, I don't think a NEP is needed - it's not a protocol change. As you said, different clients can choose to use a different configuration or even implementation for choosing transactions from the pool, and that's totally fine.
Hmm, do we really want a warning? Reaching a limit is a normal thing under congestion. When NEAR is at capacity it should just gracefully handle it. Otherwise we will spam warnings on every block when the load gets high. There is no corresponding warning for reaching the other limits.
kv_runtime is used only for testing. It doesn't make sense to use a time limit there. Gas limit also isn't used at all.
The default values for `ClientConfig` were defined in the `nearcore` crate. Let's move them to the same module where `ClientConfig` is defined: `client_config.rs` in the `near-chain-configs` crate. This makes it possible to use the default values in other crates without importing the entire `nearcore` crate. Defaults should be declared in the same module as the config; it's cleaner this way.
When creating a test instance of `ClientConfig`, let's use the default value for `produce_chunk_add_transactions_time_limit` instead of hardcoding a value. This is possible now because the previous commit moved the default values to the same crate as `ClientConfig`.
…onfig Make this config value optional. This makes it possible to disable the time limit using `config.json`.
Make it possible to update this config option while the node is running. I manually verified that sending a SIGHUP signal to a running neard process makes it read the new value from `config.json` and start using it.
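The mechanics of a dynamically updatable config value can be sketched with a plain `Arc<RwLock<...>>`. This is a simplified stand-in; nearcore's actual `MutableConfigValue` wrapper mentioned in the review differs in details:

```rust
use std::sync::{Arc, RwLock};
use std::time::Duration;

/// Simplified stand-in for a dynamically updatable config value:
/// the client holds a clone, and the config reloader (e.g. the
/// SIGHUP handler) writes a new value without restarting the node.
#[derive(Clone)]
pub struct DynamicValue<T: Copy>(Arc<RwLock<T>>);

impl<T: Copy> DynamicValue<T> {
    pub fn new(value: T) -> Self {
        Self(Arc::new(RwLock::new(value)))
    }
    /// Read the current value (called on every chunk production).
    pub fn get(&self) -> T {
        *self.0.read().unwrap()
    }
    /// Replace the value (called after re-reading config.json).
    pub fn update(&self, value: T) {
        *self.0.write().unwrap() = value;
    }
}
```

On SIGHUP, neard re-reads `config.json` and the reload path would call the equivalent of `update`, so the next `produce_chunk` picks up the new limit.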
v2:
IMO it's better to log more than less. It's easy enough to grep it out if needed, and at least we'll have a quick signal when people start asking why their transactions are not being picked up.
I'd say let's add a generic warning when we fail to get all the transactions from the pool for whatever reason, and include that reason in the error message. Again, easy enough to get rid of it, and it provides a quick signal.
LGTM
In my understanding a warning is something that shouldn't really be happening, but isn't a critical error. Reaching the limit doesn't fit this definition - it's a normal event in the system. I'm worried that in the future NEAR will be used so heavily that all chunks will be congested and the warning will be printed on every single produced chunk. It's easy to grep out, but grepping things out is annoying; ideally the logs should be readable by humans. Maybe instead of a warning we could add some kind of metric. For example we could add two counters:
This would allow us to create a Grafana dashboard showing what percentage of chunks are congested during a given time range. Metrics feel like a better fit for observability than looking at warnings in the logs.
I agree, and it boils down to how we think about congestion - whether it is normal or abnormal. Either way of thinking can easily be argued for. My thinking is that congestion is a temporary and undesirable situation that may require intervention. As such it should be signalled in every possible way, and any extra info is desired. I agree that a warning on every produced chunk may be a bit of an overkill. I'm fine with either: only metrics, or metrics and warnings.
Yeah that sounds good to me.
An alternative set of metrics could be (total, congested) instead of (non-congested, congested). Those are equivalent and mutually derivable of course, but I'm guessing that total is already defined, and overall it seems more natural to work with it like that.
Alright, sounds good. In that case I will merge this PR and go implement the metrics. I don't like the idea of the warnings, so for now I'll just go with metrics only; we'll see how readable they turn out. (Although a commit implementing the warnings is available here, in case we decide to go through with it: jancionear@b31728c)
…sactions from the pool (#10422) When a chunk is produced, Client calls `prepare_transactions()`, which fetches transactions from the transaction pool and adds them to the chunk. `prepare_transactions()` adds transactions until some limit is hit: a limit on gas, time, number of transactions, chunk size, etc. Currently there's no way to know whether some limit was hit and which one it was. Let's add a way to access this information; it's useful to know what the bottleneck of chunk production is. This PR adds a new metric which counts how many of the produced chunks couldn't fit all transactions from the transaction pool due to hitting a limit. It has two labels: `shard_id` - the chunk's shard - and `limited_by` - which of the limits was hit in `prepare_transactions()`. The need for this metric was discussed in #10310 (comment). The hope is that it'll allow us to figure out what the bottleneck of chunk production is in scenarios of high chain congestion. Right now in cases where `produce_chunk` takes a lot of time we can only make theories about why that's the case, but there's no observability into what's going on inside. This metric could help with that. To test that the metric works I set up a mocknet network with a build from this branch and put a load on it using locust. I reduced `produce_chunk_add_transactions_time_limit` from 200ms to 2ms to easily trigger the metric. It worked, as can be observed on the grafana dashboard: https://nearinc.grafana.net/d/eada7f01-b2dc-4df8-8a9f-ec4ec411159e/jancio-mocknet-stats?orgId=1&from=1705135200000&to=1705135800000 ![Screenshot 2024-01-13 at 15 21 12](https://github.com/near/nearcore/assets/149345204/51ba6a04-c2ff-4773-a8d7-9cfbff8a7fa5)
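The shape of that metric can be sketched without any metrics crate. This is an illustration of the label structure only (the real implementation uses nearcore's Prometheus helpers, and the type name here is hypothetical):

```rust
use std::collections::HashMap;

/// Simplified stand-in for a labelled Prometheus counter:
/// one count per (shard_id, limited_by) label pair.
#[derive(Default)]
pub struct ProduceChunkLimitedCounter {
    counts: HashMap<(u64, &'static str), u64>,
}

impl ProduceChunkLimitedCounter {
    /// Called when prepare_transactions() stopped because a limit was hit.
    pub fn inc(&mut self, shard_id: u64, limited_by: &'static str) {
        *self.counts.entry((shard_id, limited_by)).or_insert(0) += 1;
    }
    /// Read the count for one label pair (a dashboard would sum over labels).
    pub fn get(&self, shard_id: u64, limited_by: &'static str) -> u64 {
        self.counts.get(&(shard_id, limited_by)).copied().unwrap_or(0)
    }
}
```

Dividing such a counter by the total number of produced chunks per shard gives the "percentage of congested chunks" view discussed earlier in the thread.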
To produce a chunk the node fetches transactions from the transaction pool and adds them to the chunk. Adding a transaction to the chunk can be computationally expensive - the transaction is converted to receipt, which burns gas. Because of that the number of transactions has to be limited. The node will fetch transactions until some kind of limit is reached, and then decide that the chunk is ready and publish it.
Currently we have a limit of gas that can be used for preparing transactions, but there is a big problem with it - it doesn't take into account discarding invalid transactions from the pool. Processing an invalid transaction doesn't burn gas, so the node could spend a lot of time discarding invalid transactions without ever reaching the limit. This would make chunk production very slow and the others might decide to just skip the chunk instead of waiting for it.
Let's add another limit to `prepare_transactions`: a time limit. When producing a chunk the node will be allowed to add new transactions to the chunk for up to X milliseconds. This will ensure that a chunk is produced quickly, even when there are many invalid transactions.

Choosing a good value for the limit is difficult, so a new config option is added to allow changing it on the fly: `produce_chunk_add_transactions_time_limit`.

The default value for the time limit is 200ms. The reasoning for choosing this value is as follows:
Assuming that we want to produce ~1 block per second, 200ms is the largest value that would support that. If my understanding of the flow is correct, a 200ms chunk production would look like this:
This gives us a 1s time to produce a block.
Looking at the metrics from mainnet, the P99 time to produce a chunk is ~60ms (and maximum is similar). 200ms is more than three times that, so it shouldn't break anything on the chain: https://nearinc.grafana.net/d/cd668436-f19a-4f3d-b4a5-a35ebd3701b0/chunk-production-stats?from=now-12h&to=now&refresh=1h&orgId=1
During the minting of NEAT, the P99 time to produce a chunk jumped to ~860ms. This is unacceptable, as this will cause the chain to produce less than one block per second. This limit would prevent that. https://nearinc.grafana.net/d/cd668436-f19a-4f3d-b4a5-a35ebd3701b0/chunk-production-stats?from=1701298800000&to=1701342000000&orgId=1
For a more detailed discussion, see the comments in #10278
Fixes: #10278