feat: add a time limit for processing transactions in produce_chunk #10310

Merged
8 commits merged into near:master on Dec 11, 2023

Conversation

jancionear
Contributor

@jancionear jancionear commented Dec 7, 2023

To produce a chunk, the node fetches transactions from the transaction pool and adds them to the chunk. Adding a transaction to the chunk can be computationally expensive - the transaction is converted to a receipt, which burns gas. Because of that, the number of transactions has to be limited. The node fetches transactions until some limit is reached, then decides that the chunk is ready and publishes it.

Currently we have a limit on the gas that can be used for preparing transactions, but there is a big problem with it: it doesn't take into account discarding invalid transactions from the pool. Processing an invalid transaction doesn't burn gas, so the node could spend a lot of time discarding invalid transactions without ever reaching the limit. This would make chunk production very slow, and the other nodes might decide to just skip the chunk instead of waiting for it.

Let's add another limit to prepare_transactions: a time limit. When producing a chunk the node will be allowed to add new transactions to the chunk for up to X milliseconds. This will ensure that a chunk is produced quickly, even when there are many invalid transactions.

Choosing a good value for the limit is difficult, so a new config option is added to allow changing it on the fly: produce_chunk_add_transactions_time_limit.

The default value for the time limit is 200ms. For a more detailed discussion of how this value was chosen, see the comments in #10278.

Fixes: #10278
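To make the mechanism concrete, here is a minimal, self-contained Rust sketch of the loop shape described above. It is illustrative only: the `Tx` type, the helper names, and the signature are made up for the example and are not the nearcore implementation (the real signature and loop condition appear in the diff excerpts later in this conversation).

```rust
use std::time::{Duration, Instant};

// Toy transaction type for the example; not a nearcore type.
struct Tx {
    gas: u64,
    valid: bool,
}

// Illustrative sketch of "add transactions until a limit is hit", now with an
// optional time budget in addition to the gas limit.
fn prepare_transactions(
    pool: &mut Vec<Tx>,
    gas_limit: u64,
    time_limit: Option<Duration>,
) -> Vec<Tx> {
    let start = Instant::now();
    let time_limit_reached = || time_limit.map_or(false, |limit| start.elapsed() >= limit);

    let mut picked = Vec::new();
    let mut gas_burnt = 0u64;
    while gas_burnt < gas_limit && !time_limit_reached() {
        let Some(tx) = pool.pop() else { break };
        if !tx.valid {
            // Discarding an invalid transaction burns no gas, so the gas limit
            // alone never triggers here; only the time limit bounds this work.
            continue;
        }
        gas_burnt += tx.gas;
        picked.push(tx);
    }
    picked
}
```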


The limit is an `Option` because there might be cases where we don't want to use a time limit.
For example, there might be tests for `prepare_transactions` which test the gas_limit. In such
tests we wouldn't want the time limit to get in the way; it would only make the tests less reliable.

Add a config option that allows specifying a time limit on adding transactions during
chunk production. This option limits the amount of time that a node will spend adding
transactions to the produced chunk.

It's good to make this configurable - the default value seems reasonable,
but someone might want to tweak it in the future. There's no ideal value here:
choosing a limit that's too small would lead to chunks with very few transactions,
while choosing a value that's too big would lead to timeouts and skipped chunks.
@jancionear jancionear requested a review from a team as a code owner December 7, 2023 17:25

codecov bot commented Dec 7, 2023

Codecov Report

Attention: 1 line in your changes is missing coverage. Please review.

Comparison is base (8985a96) 68.19% compared to head (01da6d4) 71.81%.
Report is 3 commits behind head on master.

Files Patch % Lines
nearcore/src/dyn_config.rs 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #10310      +/-   ##
==========================================
+ Coverage   68.19%   71.81%   +3.61%     
==========================================
  Files         700      712      +12     
  Lines      141495   143085    +1590     
  Branches   141495   143085    +1590     
==========================================
+ Hits        96489   102750    +6261     
+ Misses      40455    35632    -4823     
- Partials     4551     4703     +152     
Flag Coverage Δ
backward-compatibility 0.08% <0.00%> (?)
db-migration 0.08% <0.00%> (?)
genesis-check 1.26% <78.82%> (?)
integration-tests 36.20% <98.82%> (?)
linux 71.69% <98.82%> (+3.49%) ⬆️
linux-nightly 71.38% <98.82%> (?)
macos 54.48% <98.82%> (?)
pytests 1.49% <78.82%> (?)
sanity-checks 1.29% <78.82%> (?)
unittests 68.18% <98.82%> (-0.01%) ⬇️
upgradability 0.13% <0.00%> (?)


Contributor
@wacban wacban left a comment

LGTM thanks!

cc @posvyatokum we may want to get it included in 1.37

@@ -1020,11 +1021,21 @@ impl RuntimeAdapter for KeyValueRuntime {
transactions: &mut dyn PoolIterator,
_chain_validate: &mut dyn FnMut(&SignedTransaction) -> bool,
_current_protocol_version: ProtocolVersion,
time_limit: Option<Duration>,
Contributor

I think the time_limit should be ignored in the kv runtime. This runtime is only used in tests and there we don't want to have anything depend on the real wall clock time.

/// A node produces a chunk by adding transactions from the transaction pool until
/// some limit is reached. This time limit ensures that adding transactions won't take
/// longer than the specified duration, which helps to produce the chunk quickly.
pub produce_chunk_add_transactions_time_limit: Duration,
Contributor

Not necessarily in this PR but it may be a good idea to make this config dynamic - so it is possible to update it without restarting the node. If you're interested you can search for MutableConfigValue or "expected_shutdown" which is an example of a dynamic field.

Contributor Author

Ok I made it dynamic.
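For context, a hedged sketch of the general idea behind a dynamically updatable config value. This is not nearcore's actual MutableConfigValue API, just an illustration of the pattern: a shared cell that the SIGHUP-triggered config reload can overwrite while the chunk producer always reads the latest value.

```rust
use std::sync::{Arc, RwLock};
use std::time::Duration;

// Sketch of the general "mutable config value" pattern (not nearcore's exact API).
#[derive(Clone)]
struct MutableValue<T: Clone>(Arc<RwLock<T>>);

impl<T: Clone> MutableValue<T> {
    fn new(value: T) -> Self {
        Self(Arc::new(RwLock::new(value)))
    }
    // Read the current value; called by the chunk producer on every use,
    // so an update takes effect on the next produced chunk.
    fn get(&self) -> T {
        self.0.read().unwrap().clone()
    }
    // Overwrite the value; called by the dynamic-config reload path
    // (e.g. after SIGHUP re-reads config.json).
    fn update(&self, value: T) {
        *self.0.write().unwrap() = value;
    }
}

fn main() {
    let time_limit = MutableValue::new(Some(Duration::from_millis(200)));
    let handle_for_reload = time_limit.clone();
    handle_for_reload.update(None); // the operator disabled the limit in config.json
    assert_eq!(time_limit.get(), None);
}
```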

@@ -384,6 +389,7 @@ impl ClientConfig {
"state_split_config",
),
tx_routing_height_horizon: 4,
produce_chunk_add_transactions_time_limit: Duration::from_millis(200),
Contributor

I slightly dislike that the default is defined in two separate places. Is it done the same way for other fields?

Contributor Author

It's a config for test purposes and all other values are also specified by hand.
For example, this config has:

header_sync_stall_ban_timeout: Duration::from_secs(30),

Even though the default value is different:

fn default_header_sync_stall_ban_timeout() -> Duration {
    Duration::from_secs(120)
}

I agree that it would be nicer to use the real default value instead of handwritten ones. I will modify it to use the default value function.
The only downside is that it will introduce a cyclical dependency (Config <-> ClientConfig). Some people dislike that, but that feels very purist; I think it's okay in this case.

Contributor Author

Ah, the circular dependency would be pretty severe here.
The functions with default config values are in the crate nearcore, while this test configuration is in the crate near-chain-configs. So it would make the configs crate depend on the entirety of nearcore.

Contributor Author

I guess we could move the default value functions to near-chain-configs (?)

Contributor Author

Moved the default values for ClientConfig to the same module as ClientConfig. This makes it possible to use the default value in ClientConfig::test without any circular dependencies.
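A hedged sketch of the resulting layout (the function name is an assumption for illustration, not copied from the code): the default now lives next to ClientConfig in near-chain-configs, so ClientConfig::test and the config-file defaults can both reference it without a cyclic crate dependency.

```rust
use std::time::Duration;

// Sketch only: a default-value function kept in the same module as ClientConfig
// (near-chain-configs), usable from ClientConfig::test and from the file-based
// config's defaults without making near-chain-configs depend on nearcore.
pub fn default_produce_chunk_add_transactions_time_limit() -> Option<Duration> {
    Some(Duration::from_millis(200))
}
```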

@@ -711,7 +712,13 @@ impl RuntimeAdapter for NightshadeRuntime {
pool_iterator: &mut dyn PoolIterator,
chain_validate: &mut dyn FnMut(&SignedTransaction) -> bool,
current_protocol_version: ProtocolVersion,
time_limit: Option<Duration>,
Contributor

Why is it an option here but not an option in the neard config?

Contributor Author
@jancionear jancionear Dec 8, 2023

It's an Option in the function because someone might want to use this function without a time limit (e.g. for tests).
For the main config I didn't see any real reason not to use a time limit, so I didn't make it optional.
I can make it an Option in the config as well, just to keep things consistent.

@@ -763,6 +770,7 @@ impl RuntimeAdapter for NightshadeRuntime {
while total_gas_burnt < transactions_gas_limit
&& total_size < size_limit
&& transactions.len() < new_receipt_count_limit
&& !time_limit_reached()
Contributor
@wacban wacban Dec 8, 2023

That's a bit atypical; I was expecting something in the style of the conditions above, e.g. std::time::Instant::now() < time_limit, but yours is cool too.

Contributor Author
@jancionear jancionear Dec 8, 2023

The time_limit is an Option, so it can't be compared directly. It would have to be something like:

time_limit.map(|t| t >= start_time.elapsed()).unwrap_or(false)

Which doesn't feel readable to me.

I could also set the time_limit to +inf when the option is None, but this felt hacky.

I think time_limit_reached() is the most readable option here.
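For reference, a hedged sketch of the equivalent inline check; the important detail is that with no limit configured the check must default to "keep going":

```rust
use std::time::{Duration, Instant};

// Sketch: the inline equivalent of `!time_limit_reached()`. When `time_limit`
// is None the budget is unlimited, so the check defaults to true.
fn within_time_budget(start_time: Instant, time_limit: Option<Duration>) -> bool {
    time_limit.map_or(true, |limit| start_time.elapsed() < limit)
}
```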

@wacban
Contributor

wacban commented Dec 8, 2023

Some more general thoughts:

  • Would be good to print a warning if the time limit is reached (can be included here)
  • Would be good to add some dedicated metrics for prepare_transactions (better done separately)

@walnut-the-cat
Contributor

walnut-the-cat commented Dec 8, 2023

Is this a protocol change, since it somewhat changes expectations? (E.g. a chunk may be formed without using all of its gas capacity when the time limit is reached.)

I don't think it needs a NEP since:

  1. it doesn't have real visible user impact
  2. it doesn't require all validators to adopt this feature.

But I still just want to make sure.

@wacban
Contributor

wacban commented Dec 8, 2023

I agree, I don't think a NEP is needed; it's not a protocol change. As you said, different clients can choose to use a different configuration or even a different implementation for choosing transactions from the pool, and that's totally fine.

@jancionear
Contributor Author

jancionear commented Dec 8, 2023

Would be good to print a warning if the time limit is reached (can be included here)

Hmm, do we really want a warning? Reaching a limit is a normal thing under congestion. When NEAR is at capacity it should just gracefully handle it. Otherwise we will spam warnings on every block when the load gets high. There is no corresponding warning for reaching the gas_limit.

kv_runtime is used only for testing. It doesn't make sense
to use a time limit there. The gas limit also isn't used at all there.

The default values for `ClientConfig` were defined in the `nearcore` crate.
Let's move them to the same module where `ClientConfig` is defined, `client_config.rs`
in the `near-chain-configs` crate. This makes it possible to use the default values in other
crates without importing the entire `nearcore` crate. Defaults should be declared in the same
module as the config; it's cleaner this way.

When creating a test instance of ClientConfig, let's use the default value
for produce_chunk_add_transactions_time_limit instead of hardcoding a value.
This is possible now because the previous commit moved the default values
to the same crate as ClientConfig.

Make this config value optional.
This makes it possible to disable the time limit using config.json.

Make it possible to update this config option while the node is running.
I manually verified that sending a SIGHUP signal to a running neard process
reads the new value from config.json and starts using it.
@jancionear
Contributor Author

jancionear commented Dec 8, 2023

v2:

  • Don't use time_limit in kv_runtime, it's only for tests
  • Move the default values for ClientConfig to the same module where ClientConfig is defined, in the near-chain-configs crate. This makes it possible to use them in other crates without importing the whole nearcore crate
  • Use the default value for produce_chunk_add_transactions_time_limit in ClientConfig::test. No more specifying the same value in two places.
  • produce_chunk_add_transactions_time_limit is now an Option<Duration>. It's now possible to disable the time limit by setting it to null in config.json
  • Made the produce_chunk_add_transactions_time_limit config option mutable. I manually verified that sending a SIGHUP signal to a running neard process reloads the config and uses the new value from this moment on.

@wacban
Contributor

wacban commented Dec 11, 2023

Hmm, do we really want a warning? Reaching a limit is a normal thing under congestion. When NEAR is at capacity it should just gracefully handle it. Otherwise we will spam warnings on every block when the load gets high.

Imo it's better to log more than less. It's easy enough to grep it out if needed, and at least we'll have a quick signal when people start asking why their transactions are not being picked up.

There is no corresponding warning for reaching the gas_limit.

I'd say let's add a generic warning when we fail to get all the transactions from the pool for whatever reason, and include that reason in the error message. Again, it's easy enough to get rid of and it provides a quick signal.

Contributor
@wacban wacban left a comment

LGTM

@jancionear
Contributor Author

jancionear commented Dec 11, 2023

Imo it's better to log more than less. It's easy enough to grep it out if needed, and at least we'll have a quick signal when people start asking why their transactions are not being picked up.

In my understanding a warning is something that shouldn't really be happening, but isn't a critical error. Reaching the limit doesn't fit this definition, it's a normal event in the system. I'm worried that in the future NEAR will be used so heavily that all chunks will be congested and the warning will be printed on every single produced chunk. It's easy to grep out, but grepping things out is annoying. I think ideally the logs should be readable by humans.

Maybe instead of a warning we could add some kind of metric. For example we could add two counters:

  • The number of produced chunks which weren't congested (the limit wasn't reached)
  • The number of produced chunks that were congested (with a label saying which of the limits was reached)

This would allow us to create a Grafana dashboard showing what percentage of chunks is congested during a given time range. Metrics feel like a better fit for observability than looking at warnings in the logs.

@wacban
Contributor

wacban commented Dec 11, 2023

In my understanding a warning is something that shouldn't really be happening, but isn't a critical error. Reaching the limit doesn't fit this definition, it's a normal event in the system. I'm worried that in the future NEAR will be used so heavily that all chunks will be congested and the warning will be printed on every single produced chunk. It's easy to grep out, but grepping things out is annoying. I think ideally the logs should be readable by humans.

I agree, and it boils down to how we think about congestion - whether it is normal or abnormal. Either way of thinking can easily be argued for. My thinking is that congestion is a temporary and undesirable situation that may require intervention. As such it should be signalled in every possible way, and any extra info is desired.

I agree that a warning on every produced chunk may be a bit of overkill. I'm fine with either: only metrics, or metrics and warnings.

metrics

Yeah that sounds good to me.

  • please do it in a separate PR
  • please follow up and add a relevant panel to a relevant grafana dashboard - we need to make sure that the oncall knows about those new metrics, otherwise it's not very useful

An alternative set of metrics could be (total, congested) instead of (non-congested, congested). Those are equivalent and mutually derivable of course but I'm guessing that total is already defined and overall it seems more natural to work with it like that.

@jancionear
Contributor Author

Alright, sounds good. In that case I will merge this PR and go implement the metrics.

I don't like the idea of the warnings, so for now I'll just go with metrics only; we'll see how readable they turn out.

(although a commit implementing the warnings is available here, in case we decide to go through with it: jancionear@b31728c)

@jancionear jancionear added this pull request to the merge queue Dec 11, 2023
Merged via the queue into near:master with commit 9d0d7df Dec 11, 2023
19 checks passed
@jancionear jancionear deleted the limit-preparing-time-v2 branch December 11, 2023 14:42
github-merge-queue bot pushed a commit that referenced this pull request Jan 17, 2024
…sactions from the pool (#10422)

When a chunk is produced, Client calls `prepare_transactions()`, which
fetches transactions from the transaction pool and adds them to the
chunk. `prepare_transactions()` adds transactions until some limit is
hit: a limit on gas, time, number of transactions, chunk size, etc.

Currently there's no way to know whether some limit was hit, and which
limit it was. Let's add a way to access this information; it's useful to
know what the bottleneck of chunk production is.

This PR adds a new metric which counts how many of the produced chunks
couldn't fit all transactions from the transaction pool due to hitting a
limit. It has two labels: `shard_id` - chunk's shard and `limited_by` -
which of the limits was hit in `prepare_transactions()`.

The need for this metric was discussed in
#10310 (comment). The hope is that it'll allow us to figure out what the
bottleneck of chunk production is in scenarios of high chain congestion.
Right now, in cases where `produce_chunk` takes a lot of time, we can only
theorize about why that's the case, but there's no observability into
what's going on inside. This metric could help with that.
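For illustration, a hedged sketch of such a counter using the prometheus and once_cell crates. The metric and label names mirror the description above but are assumptions, not the exact identifiers used in nearcore.

```rust
use once_cell::sync::Lazy;
use prometheus::{register_int_counter_vec, IntCounterVec};

// Sketch only: counts produced chunks that could not take all pool transactions,
// labelled by the chunk's shard and by which limit was hit in prepare_transactions.
static CHUNK_TRANSACTIONS_LIMITED_BY: Lazy<IntCounterVec> = Lazy::new(|| {
    register_int_counter_vec!(
        "near_produce_chunk_transactions_limited_by_total",
        "Produced chunks that hit a limit while preparing transactions",
        &["shard_id", "limited_by"]
    )
    .unwrap()
});

// Called from chunk production when prepare_transactions stops early,
// e.g. record_limit_hit(0, "time_limit") or record_limit_hit(3, "gas_limit").
fn record_limit_hit(shard_id: u64, limited_by: &str) {
    let shard = shard_id.to_string();
    CHUNK_TRANSACTIONS_LIMITED_BY
        .with_label_values(&[shard.as_str(), limited_by])
        .inc();
}
```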

To test that the metric works I set up a mocknet network with a build
from this branch and put a load on it using locust.
I reduced `produce_chunk_add_transactions_time_limit` from 200ms to 2ms
to easily trigger the metric. It worked, as can be observed on the
grafana dashboard:
https://nearinc.grafana.net/d/eada7f01-b2dc-4df8-8a9f-ec4ec411159e/jancio-mocknet-stats?orgId=1&from=1705135200000&to=1705135800000
![Screenshot 2024-01-13 at 15 21 12](https://github.com/near/nearcore/assets/149345204/51ba6a04-c2ff-4773-a8d7-9cfbff8a7fa5)
Successfully merging this pull request may close these issues:

transaction pool iteration timeout (#10278)