Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

p2p: Deprecate TTFB, RESP_TIMEOUT, introduce rate limiting recommenda… #3767

Open
wants to merge 4 commits into
base: dev
Choose a base branch
from

Conversation

arnetheduck
Copy link
Contributor

…tions

As part of the discussions surrounding EIP-7594 (peerdas), it was highlighted that during sampling and/or data requests, the sampler does not have timing information for when a samplee will have data available. It is desireable to not introduce a deadline, since this artificially introduces latency for the typical scenario where data becomes available earlier than an agreed-upon deadline.

Similarly, when a client issues a request for blocks, it does often not know what rate limiting policy of the serving end and must either pessimistically rate limit itself or run the risk of getting disconnected for spamming the server - outcomes which lead to unnecessarily slow syncing as well as testnet mess with peer scoring and disconnection issues.

This PR solves both problems by:

  • removing the time-to-first-byte and response timeouts allowing requesters to optimistically queue requests - the timeouts have historically not been implemented fully in clients to this date
  • introducing a hard limit in the number of concurrent requests that a client may issue, per protocol
  • introducing a recommendation for rate limiting that allows optimal bandwidth usage without protocol changes or additional messaging roundtrips

On the server side, an "open" request does not consume significant resources while it's resting, meaning that allowing the server to manage resource allocation by slowing down data serving is safe, as long as concurrency is adequately limited.

On the client side, clients must be prepared to handle slow servers already and they can simply apply their existing strategy both to uncertainty and rate-limiting scenarios (how long before timeout, what to do in "slow peer" scenarios).

Token / leaky buckets are a classic option for rate limiting with desireable properties both for the case when we're sending requests to many clients concurrently (getting good burst performance) and when the requestees are busy (by keeping long-term resource usage in check and fairly serving clients)

…tions

As part of the discussions surrounding EIP-7594 (peerdas), it was
highlighted that during sampling and/or data requests, the sampler does
not have timing information for when a samplee will have data available.
It is desireable to not introduce a deadline, since this artificially
introduces latency for the typical scenario where data becomes available
earlier than an agreed-upon deadline.

Similarly, when a client issues a request for blocks, it does often not
know what rate limiting policy of the serving end and must either
pessimistically rate limit itself or run the risk of getting
disconnected for spamming the server - outcomes which lead to
unnecessarily slow syncing as well as testnet mess with peer scoring and
disconnection issues.

This PR solves both problems by:

* removing the time-to-first-byte and response timeouts allowing
requesters to optimistically queue requests - the timeouts have
historically not been implemented fully in clients to this date
* introducing a hard limit in the number of concurrent requests that a
client may issue, per protocol
* introducing a recommendation for rate limiting that allows optimal
bandwidth usage without protocol changes or additional messaging
roundtrips

On the server side, an "open" request does not consume significant
resources while it's resting, meaning that allowing the server to manage
resource allocation by slowing down data serving is safe, as long as
concurrency is adequately limited.

On the client side, clients must be prepared to handle slow servers
already and they can simply apply their existing strategy both to
uncertainty and rate-limiting scenarios (how long before timeout, what
to do in "slow peer" scenarios).

Token / leaky buckets are a classic option for rate limiting with
desireable properties both for the case when we're sending requests to
many clients concurrently (getting good burst performance) and when the
requestees are busy (by keeping long-term resource usage in check and
fairly serving clients)
@ppopth
Copy link
Member

ppopth commented May 21, 2024

As part of the discussions surrounding EIP-7594 (peerdas), it was highlighted that during sampling and/or data requests, the sampler does not have timing information for when a samplee will have data available. It is desireable to not introduce a deadline, since this artificially introduces latency for the typical scenario where data becomes available earlier than an agreed-upon deadline.

That is the reason why I proposed passive sampling at #3717. Would you mind checking it out?


If any of these timeouts fire, the requester SHOULD reset the stream and deem the req/resp operation to have failed.
If a timeout happens on the requesting side, they SHOULD reset the stream.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What kind of timeout can still happen? I think there is nothing left because you already removed TTFB_TIMEOUT and RESP_TIMEOUT. Or you left this line just in case we will want to re-enable this in the future?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as a consumer, you can implement whatever timeouts you want (in fact, you probably should) - this merely says what you should do to inform the server that you're no longer interested

@ppopth
Copy link
Member

ppopth commented May 21, 2024

Is there any rationale of having the timeouts in the first place? Having two constants, TTFB_TIMEOUT and RESP_TIMEOUT, rather than just a single constant like TIMEOUT quite surprised me.


Broadly, the requesting side does not know the capacity / limit of each server but can derive it from the rate of responses for the purpose of selecting the next peer for a request.

Because the server withholds the response until capacity is available, a client can optimistically send requests without risking running into negative scoring situations or sub-optimal rate polling.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a DoS vector here that the clients can fill out the server's buffer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the limit of open streams per protocol id replaces the previous mechanism (timeouts)

Co-authored-by: Pop Chunhapanya <haxx.pop@gmail.com>
@arnetheduck
Copy link
Contributor Author

Is there any rationale of having the timeouts in the first place?

to avoid too many concurrently open streams - it's a dubious mechanism at best to have in a spec - what timeout strategy is correct depends on how the consumer uses the request and/or the nature of the request

Having two constants, TTFB_TIMEOUT and RESP_TIMEOUT, rather than just a single constant like TIMEOUT quite surprised me.

ttfb is a way to get rid of stalling peers more quickly - imagine a request taking 10s to stream (because it's big and bandwidth is low) - ttfb allows closing the request more quickly if the responding end doesn't answer at all.

@arnetheduck
Copy link
Contributor Author

passive sampling

happy to when we get to that point in our implementation, though this PR is useful independently (given how TTFB is not actually used correctly today)

@nisdas
Copy link
Contributor

nisdas commented May 22, 2024

This PR would be pretty useful for us as we are seeking to revamp our rate limiting strategy. Currently with more topics where a large message size(blocks, blobs and soon columns) is being used it becomes messier in order to determine how to rate limit peers. Instead of trying to guess the other peer's rate limits it is much cleaner for us to simply wait(or terminate early if you do not want to wait). Every client has a different rate limiting strategy, rather than trying to cater to the most pessimistic one this allows each client to go at their own pace. Would be interested in other clients thoughts on this

@Nashatyrev
Copy link
Member

Great PR!
We may close that ancient PR #2690 when this one is merged

@Nashatyrev
Copy link
Member

Does it make sense to increase the number of concurrent requests to better utilize 'long fat' (high bandwidth + high latency) connections?

@arnetheduck
Copy link
Contributor Author

Does it make sense to increase the number of concurrent requests to better utilize 'long fat' (high bandwidth + high latency) connections?

eh, this is always a tricky one - my gut feeling is 90% of the benefit comes from the "second" in-flight request and anything more is likely to help only in the margins meaning that we'd see limited benefit and more complex rate limiting rulesets in clients if we allowed more than 2 but I haven't run the actual numbers (size of response vs bandwidth vs latency) - have you looked into it that way?

@Nashatyrev
Copy link
Member

have you looked into it that way?

I don't have any numbers at hand as well.
Anyway, I think we may start from 2 and later increase the number if there is the evidence that this could improve efficiency.
From the implementation perspective 2 or any other number >2 looks to me of the same complexity

@pawanjay176
Copy link
Contributor

pawanjay176 commented Sep 13, 2024

@arnetheduck Why not just allow a single stream per protocol at a time? It seems easier to implement.
Greater than 1 isn't significantly more complex but wanted to understand the intuition for 2

@arnetheduck
Copy link
Contributor Author

@arnetheduck Why not just allow a single stream per protocol at a time? It seems easier to implement.

right now, there is no limit so all clients implicitly already support this probably..

having 2 requests allows queueing one while the other is in transit which is a trivial way to improve performance and it's the least we can do in this direction, to allow it. most, if not all, implementations should already have some limit in place else they can be trivially dos:ed.

@@ -561,10 +560,9 @@ The request MUST be encoded according to the encoding strategy.
The requester MUST close the write side of the stream once it finishes writing the request message.
At this point, the stream will be half-closed.

The requester MUST wait a maximum of `TTFB_TIMEOUT` for the first response byte to arrive (time to first byte—or TTFB—timeout).
On that happening, the requester allows a further `RESP_TIMEOUT` for each subsequent `response_chunk` received.
The requester MUST NOT make more than two concurrent requests with the same ID.
Copy link
Member

@ppopth ppopth Sep 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about making the number 2 a config variable?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1
Embedded in the spec it looks like a fundamental constant which is not subject to change (kind a number of children in a binary tree 😄 )

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants