upstream: only prefetching if the upstream is healthy #12758

alyssawilk · 2020-08-20T18:17:06Z

Risk Level: low (only affects hidden prefetching)
Testing: new unit tests
Docs Changes: n/a
Release Notes: n/a

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>

ggreenway · 2020-08-21T18:10:11Z

source/common/conn_pool/conn_pool_base.cc

@@ -34,6 +34,10 @@ void ConnPoolImplBase::destructAllConnections() {
 }

 bool ConnPoolImplBase::shouldCreateNewConnection() const {
+  // If the host is unhealthy, don't make it do extra work.
+  if (host_->health() != Upstream::Host::Health::Healthy) {


Should this include Degraded?

cc @snowp

ggreenway: nice observation.

The comment on the Degraded enum definition suggests that degraded hosts should not participate in prefetching.

envoy/include/envoy/upstream/upstream.h

Line 143 in bd2b989

* Host is healthy, but degraded. It is able to serve traffic, but hosts that aren't degraded

That said, prefetching seems like a newer operation; #5202 introduced this comment long ago. I think that same-host prefetching may make sense for degraded hosts. Once implemented, lookahead prefetching for round-robin style load balancers may want to avoid degraded hosts, but it would do so naturally, since the lookahead would only examine the non-degraded list of hosts.

I think we may want to update the comments and include degraded hosts in prefetching.

I could go either way, but I intentionally did this for !Healthy rather than Unhealthy for a couple of reasons. Fetching the connection late could pick up more up-to-date health information if the health status changes, and if lookahead prefetching will avoid degraded hosts, I think prefetching to them may be overenthusiastic given they're likely to be avoided. Basically, I think that when hosts are not healthy the LB is less predictable, so I lean towards taking the latency hit over a CPU hit, but it's easy enough to go the other way if either of you cares.

I would suggest a TODO to consider ideal prefetch behavior for degraded endpoints. Limiting this version to healthy endpoints seems fine.

Sounds reasonable. Maybe add some of that explanation as a code comment?

I'd lean against a TODO as I only know of one user of degraded upstream and don't know if they need prefetching. I'll note it could be extended and comment why we don't by default. SG?

Good enough for me. @antoniovicente ?
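The behavior agreed on in this thread can be sketched roughly as follows. This is a minimal, self-contained illustration, not the code in conn_pool_base.cc: the `Health` enum, the `PoolState` struct, its fields, and the simplified ratio formula are all assumptions made for the example. The idea is that only a Healthy host gets connections created ahead of demand, while Degraded and Unhealthy hosts still get a connection when a stream is actually waiting.

```cpp
#include <cassert>

// Hypothetical stand-in for Upstream::Host::Health (assumption for this sketch).
enum class Health { Unhealthy, Degraded, Healthy };

// Hypothetical, simplified pool state; not the real ConnPoolImplBase members.
struct PoolState {
  Health host_health;
  int pending_streams;     // streams waiting for a connection
  int connecting_capacity; // streams that in-flight connections could serve
  float prefetch_ratio;    // 1.0 would mean no prefetching
};

bool shouldCreateNewConnection(const PoolState& s) {
  // If the host is not healthy (unhealthy or degraded), don't make it do
  // extra work: connect only when demand exceeds in-flight capacity.
  if (s.host_health != Health::Healthy) {
    return s.pending_streams > s.connecting_capacity;
  }
  // Healthy host: scale demand by the prefetch ratio (simplified formula,
  // an assumption of this sketch).
  return s.pending_streams * s.prefetch_ratio > s.connecting_capacity;
}
```

With one pending stream, capacity for one stream, and a ratio of 1.5, this sketch returns true for a Healthy host and false for a Degraded or Unhealthy one. A test for the degraded case would check exactly that difference.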

antoniovicente · 2020-08-21T19:10:50Z

source/common/conn_pool/conn_pool_base.cc

-      tryCreateNewConnections();
+      // All requests will be purged. newStream may create new ones, at which
+      // point prefetching will happen there.
+      ASSERT(!shouldCreateNewConnection());


Note to self: the ASSERT looks like a safe sanity check. Nothing is likely to go terribly wrong if the ASSERT fails.

antoniovicente · 2020-08-21T19:13:33Z

test/common/conn_pool/conn_pool_base_test.cc

+  pool_.destructAllConnections();
+}
+
+TEST_F(ConnPoolImplBaseTest, NoPrefetchIfUnhealthy) {


Would be good to have a test that covers the degraded case.

antoniovicente · 2020-08-21T19:15:55Z

test/common/conn_pool/conn_pool_base_test.cc

+  EXPECT_CALL(pool_, instantiateActiveClient).Times(1);
+  EXPECT_CALL(pool_, onPoolFailure).WillOnce(InvokeWithoutArgs([&]() -> void {
+    pool_.newStream(context_);
+  }));


What's the expected ordering of the two calls above? It seems to me that executing newStream from within onPoolFailure will trigger instantiateActiveClient. It may be worth adding a "testing::InSequence s;" to this test.
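The ordering in question can be illustrated without gmock by logging calls on a small fake. Everything below is hypothetical: `FakePool`, its call log, and the assumption that a new stream immediately creates a client are stand-ins for the mocked pool in the test, not Envoy's real classes.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical fake pool; the method names mirror the test's mocks, but
// nothing here is Envoy's real API.
struct FakePool {
  std::vector<std::string> calls;           // call log, in order of invocation
  std::function<void()> on_pool_failure_cb; // stands in for the WillOnce action

  void instantiateActiveClient() { calls.push_back("instantiateActiveClient"); }

  // Assumption: a new stream with no usable connection creates a client.
  void newStream() {
    calls.push_back("newStream");
    instantiateActiveClient();
  }

  void onPoolFailure() {
    calls.push_back("onPoolFailure");
    if (on_pool_failure_cb) {
      on_pool_failure_cb();
    }
  }
};

// Runs the scenario from the review comment and returns the call log.
std::vector<std::string> runFailureThenNewStream() {
  FakePool pool;
  pool.on_pool_failure_cb = [&pool]() { pool.newStream(); };
  pool.onPoolFailure();
  return pool.calls;
}
```

Under these assumptions the log shows onPoolFailure first and instantiateActiveClient last. In the real test, a `testing::InSequence` object would make gmock enforce that order, provided the EXPECT_CALLs are declared in the order the calls are expected to happen.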

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>

repokitteh-read-only · 2020-08-25T21:22:12Z

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to api/envoy/.
CC @envoyproxy/api-watchers: FYI only for changes made to api/envoy/.

🐱

Caused by: #12758 was synchronize by alyssawilk.

see: more, trace.

htuch · 2020-08-26T13:56:12Z

/lgtm api

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>

antoniovicente

Looks great, thanks for the improvement to the prefetch implementation.

antoniovicente · 2020-08-26T19:57:33Z

test/common/conn_pool/conn_pool_base_test.cc

@@ -119,5 +138,6 @@ TEST_F(ConnPoolImplBaseTest, NoPrefetchIfUnhealthy) {
  pool_.destructAllConnections();
 }

+


nit: extra newline.

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>

mattklein123

Nice!

upstream: only prefetching if the upstream is healthy

e42f6dc

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>

alyssawilk assigned antoniovicente Aug 20, 2020

ggreenway reviewed Aug 21, 2020

repokitteh-read-only bot added the api label Aug 21, 2020

antoniovicente reviewed Aug 21, 2020

reviewer comments

647b8ae

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>

repokitteh-read-only bot removed the api label Aug 26, 2020

alyssawilk added 2 commits August 26, 2020 15:45

comments

c6cad1f

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>

both

3b99ef5

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>

antoniovicente previously approved these changes Aug 26, 2020

hopefully final

5269125

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>

alyssawilk dismissed antoniovicente’s stale review via 5269125 August 27, 2020 13:03

revert broken assertion

6eb7f6e

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>

alyssawilk added the waiting label Aug 27, 2020

cleanup

c564fad

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>

repokitteh-read-only bot removed the waiting label Aug 27, 2020

antoniovicente approved these changes Aug 27, 2020

alyssawilk assigned mattklein123 Sep 2, 2020

mattklein123 approved these changes Sep 2, 2020

alyssawilk merged commit f4fd39d into envoyproxy:master Sep 2, 2020

alyssawilk mentioned this pull request Sep 17, 2020

Pre-establish upstream connections (connection prefetching) #2755

Closed

alyssawilk deleted the not_unhealthy branch December 10, 2020 19:16
