Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

connection: skip read activate call when reading from transport socket if the connection is read disabled #14043

Conversation

antoniovicente
Copy link
Contributor

Commit Message:
connection: skip read activate call when reading from transport socket if the connection is read disabled

Also, rename setReadBufferReady to setTransportSocketIsReadable to make its intended use more clear.

Additional Description:
Risk Level: low
Testing: unit
Docs Changes: n/a
Release Notes: n/a
Platform Specific Features: n/a

Also, rename setReadBufferReady to setTransportSocketIsReadable to make its intended use more clear.

Signed-off-by: Antonio Vicente <avd@google.com>
davinci26
davinci26 previously approved these changes Nov 17, 2020
Copy link
Member

@davinci26 davinci26 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good

Copy link
Contributor

@alyssawilk alyssawilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for tackling yet another subtle corner case here!

transport_wants_read_ = true;
// Only schedule a read activation if read is enabled. If read was disabled, read resumption will
// happen once read is re-enabled since transport_wants_read_ is now set to true.
if (read_disable_count_ == 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I see a test change to ensure that when read_disabled we don't activate here.
there's one other change in this PR, which is that in filterChainWantsData we no longer set transport_wants_read_ true. Is that behavior regression tested as well?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It stroke me odd as well so I dug a bit deeper into that and I think the current and the previous code are equivalent:

In the previous code we set transport_wants_read_ via calling setReadBufferReady in ReadDisable if:

(*) transport_wants_read_ is true already or buffer.length() > 0. This comes from the if-statement above. if (filterChainWantsData() && (read_buffer_.length() > 0 || transport_wants_read_))

Looking at the two cases individually:

  1. In the case that transport_wants_read_ is true already then it doesn't really matter that we do not set transport_wants_read_ to true.
  2. In the case that buffer.length() > 0 then the next call to ReadDisable(false) will trigger the read activation regardless of the value of transport_wants_read_ because of (*).

This means that even in the case where that we have a cycle of ReadDisable(true)/ReadEnable(false) (this used to leave connections at zombie state because the read activation was canceled) we will re-activate reads correctly.

+1 on testing though, the more the merrier.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're referring to the case where we check filterChainWantsData() in readDisable(false) below. Setting transport_wants_read_ when resuming from readDisable(false) does not matter because it is either already set when the method is called or onReadReady clears it before it is possible to drain the read buffer and call readDisable(false) from within that method.

Also, setting transport_wants_read_ in readDisable(false) was known to be redundant as seen in this comment thread: #13772 (comment)

Longer version:
After a read from the transport socket the connection read buffer can be in one of the following states:

  1. over high-watermark, which implies transport_wants_read_ == true and high watermark triggered
  2. at exactly the high-watermark, which implies transport_wants_read_ == true but high watermark not triggered
  3. not empty but below high-watermark, implies short read and transport_wants_read_ == false
  4. empty, implies transport_wants_read_ == false

transport_wants_read_ == true in the cases above since shouldDrainReadBuffer() == true triggers a call to setTransportSocketIsReadable() from the transport.

Resumption on readDisable(false) can happen if:
a. read_buffer_.highWatermarkTriggered() is true and read_disable_count_ == 1
b. read_buffer_ is not empty and read_disable_count_ == 0
c. read_buffer_ is empty and transport_wants_read_ == true

Additional considerations:
Bytes are only ever added or removed from the read buffer under the call stack of onReadReady

Resumption case (a) can only happen after transport socket read case (1), so transport_wants_read_ == true in this case. The next call to onReadReady will return early because read_disable_count_ >= 1, so transport_wants_read_ remains == true.

The next call to onReadReady after resumption case (b) will perform a read from the transport because read_disable_count_ == 0. onReadReady will set transport_wants_read_ to false before doing reads from the transport or processing the contents of the read buffer. Nothing reads the value of transport_wants_read_ between the time the read resumption is scheduled until onReadReady sets it to false.

transport_wants_read_ is already set to true in case (c)

While explaining this I noticed an edge case that I hadn't considered before: If the read buffer is exactly at the buffer_limit_, shouldDrainReadBuffer() will return true but highWatermarkTriggered() returns false. In fact, this edge case is an optimization for the common case where the read limit is configured to a multiple of 16KB, so transport reads end up exactly hitting the configured buffer limit and narrowly avoid the expensive calls to readDisable(true)/readDisable(false). I'ld love to get rid of this edge case, but in order to do that we need to reduce the cost of going through a readDisable(true)/readDisable(false) cycle.

@@ -369,7 +378,7 @@ void ConnectionImpl::readDisable(bool disable) {
// buffer's high watermark has triggered.
ASSERT(read_buffer_.length() > 0 || read_disable_count_ == 0);
dispatch_buffered_data_ = true;
setReadBufferReady();
ioHandle().activateFileEvents(Event::FileReadyType::Read);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think it's worth a comment on why this case is different?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rearranged and expanded comments.

Signed-off-by: Antonio Vicente <avd@google.com>
Copy link
Contributor

@alyssawilk alyssawilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome. Thanks for the detailed comments - I think they help explain a hairy and confusing area of code.

One typo nit and clang tidy, otherwise LGTM.

transport_wants_read_ = true;
// Only schedule a read activation if the connection is not read disabled to avoid spurious
// wakeups. When read disabled, the connection will not read from the transport, and limit
// dispatch to the current contents of the read buffer if it's high-watermark is triggered and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: it's -> its?

Signed-off-by: Antonio Vicente <avd@google.com>
Signed-off-by: Antonio Vicente <avd@google.com>
@antoniovicente
Copy link
Contributor Author

/retest

@repokitteh-read-only
Copy link

Retrying Azure Pipelines:
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #14043 (comment) was created by @antoniovicente.

see: more, trace.

@alyssawilk
Copy link
Contributor

CI still cranky :-/

…ip_read_activate

Signed-off-by: Antonio Vicente <avd@google.com>
Signed-off-by: Antonio Vicente <avd@google.com>
@antoniovicente
Copy link
Contributor Author

/retest

@repokitteh-read-only
Copy link

Retrying Azure Pipelines:
Check envoy-presubmit didn't fail.

🐱

Caused by: a #14043 (comment) was created by @antoniovicente.

see: more, trace.

@antoniovicente
Copy link
Contributor Author

/retest

@repokitteh-read-only
Copy link

Retrying Azure Pipelines:
Check envoy-presubmit didn't fail.

🐱

Caused by: a #14043 (comment) was created by @antoniovicente.

see: more, trace.

@antoniovicente
Copy link
Contributor Author

/retest

for coverage

@repokitteh-read-only
Copy link

Retrying Azure Pipelines:
Check envoy-presubmit didn't fail.

🐱

Caused by: a #14043 (comment) was created by @antoniovicente.

see: more, trace.

…ip_read_activate

Signed-off-by: Antonio Vicente <avd@google.com>
@antoniovicente
Copy link
Contributor Author

/retest

@repokitteh-read-only
Copy link

Retrying Azure Pipelines:
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #14043 (comment) was created by @antoniovicente.

see: more, trace.

@antoniovicente antoniovicente merged commit 745ca24 into envoyproxy:master Dec 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants