-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HTTP health checker: handle GOAWAY from HTTP2 upstreams #13599
Merged
mattklein123
merged 51 commits into
envoyproxy:master
from
mpuncel:mpuncel/http2-hc-goaway
Jan 12, 2021
Merged
Changes from 12 commits
Commits
Show all changes
51 commits
Select commit
Hold shift + click to select a range
ce00ea3
HTTP health checker: handle GOAWAY from HTTP2 upstreams
mpuncel 9f83533
fix format of version history to be compliant with check_format.py
mpuncel 736d0ca
fix spelling mistake in comment
mpuncel f3e9da2
fix incorrect testcomment
mpuncel b76c46e
Merge branch 'master' into mpuncel/http2-hc-goaway
mpuncel 01fa365
fix alphabetical order in version.rst
mpuncel cd06c28
Merge branch 'master' into mpuncel/http2-hc-goaway
mpuncel f899f0a
Add runtime disable option to go back to old GOAWAY behavior
mpuncel 8f21af4
follow runtime naming convention, make new flag default to true, remo…
mpuncel c23a9a5
document runtime guard in release notes
mpuncel 447d07c
group common mock expectations into helper methods
mpuncel 04458ab
rename received_no_error_goaway_ to reuse_connection_, and consolidat…
mpuncel 0d57547
Merge branch 'master' into mpuncel/http2-hc-goaway
mpuncel 15d4dcd
use runtimeFeatureEnabled() and don't close() twice after reset stream
mpuncel 00503eb
add some integration tests covering GOAWAY
mpuncel bbd6e08
fix incorrect comments
mpuncel bb0cc29
Merge branch 'master' into mpuncel/http2-hc-goaway
mpuncel 2b00126
address PR feedback
mpuncel 2352719
attempt, no goaway in fake upstream
mpuncel ed7a0f1
Merge branch 'master' into HEAD
mpuncel 38cc1bf
various fixes
mpuncel 0bd20e6
make integration test send actual GOAWAY frames
mpuncel 38c4fb8
undo change to workflow
mpuncel b4496a9
fix github workflow
mpuncel 6dd4400
remove commented line
mpuncel 41a330b
address PR feedback
mpuncel bbb5b8e
wip amend
mpuncel 8cd1560
Merge branch 'master' into mpuncel/http2-hc-goaway
mpuncel ec669f2
move protocolErrorForTest to the non-legacy http2 codec
mpuncel 455034a
fix compilation error in health_check_fuzz_test_utils
mpuncel e75b30a
fix compilation errors in health checker unit test
mpuncel df8f734
add protocolErrorForTest to the legacy http2 impl
mpuncel 56efbce
undo changes to nghttp2 user callback
mpuncel 2b0d930
Merge branch 'master' into mpuncel/http2-hc-goaway
mpuncel 9e339ee
address PR feedback
mpuncel 0345dba
add unit test for protocolErrorForTest() on the non-legacy codec impl
mpuncel 06652ed
fix protocolErrorForTest() unit test when legacy codec impl is used
mpuncel 322e564
remove some unintended newlines
mpuncel deb3893
add TODO comment to clean up fake upstream when legacy h2 codec is re…
mpuncel 3ff4d9f
Merge branch 'master' into mpuncel/http2-hc-goaway
mpuncel bcb7346
add goaway_error stat too health checker, wait on it in integration t…
mpuncel f74ed68
fix wording of goaway_error doc
mpuncel 1c287c1
commit stat increment line
mpuncel 49e689d
remove unused lambda capture variable
mpuncel a7cfa59
PR feedback: remove new stat, use real sleep in integration test
mpuncel d15bbf3
Merge branch 'master' into mpuncel/http2-hc-goaway
mpuncel 31c0641
drop changes accommodating legacy h2 codec, it was removed
mpuncel 5e8b981
Merge branch 'master' into mpuncel/http2-hc-goaway
mpuncel 6a3ad8c
switch to real time
mpuncel eacd1f4
Merge branch 'master' into mpuncel/http2-hc-goaway
mpuncel 64cc217
Merge branch 'master' into mpuncel/http2-hc-goaway
mpuncel File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -256,11 +256,14 @@ void HttpHealthCheckerImpl::HttpActiveHealthCheckSession::onInterval() { | |
parent_.transportSocketMatchMetadata().get()); | ||
client_.reset(parent_.createCodecClient(conn)); | ||
client_->addConnectionCallbacks(connection_callback_impl_); | ||
client_->setCodecConnectionCallbacks(http_connection_callback_impl_); | ||
expect_reset_ = false; | ||
reuse_connection_ = parent_.reuse_connection_; | ||
} | ||
|
||
Http::RequestEncoder* request_encoder = &client_->newStream(*this); | ||
request_encoder->getStream().addCallbacks(*this); | ||
request_in_flight_ = true; | ||
|
||
const auto request_headers = Http::createHeaderMap<Http::RequestHeaderMapImpl>( | ||
{{Http::Headers::get().Method, "GET"}, | ||
|
@@ -279,15 +282,51 @@ void HttpHealthCheckerImpl::HttpActiveHealthCheckSession::onInterval() { | |
|
||
void HttpHealthCheckerImpl::HttpActiveHealthCheckSession::onResetStream(Http::StreamResetReason, | ||
absl::string_view) { | ||
request_in_flight_ = false; | ||
ENVOY_CONN_LOG(debug, "connection/stream error health_flags={}", *client_, | ||
HostUtility::healthFlagsToString(*host_)); | ||
if (client_ && !reuse_connection_) { | ||
client_->close(); | ||
} | ||
|
||
if (expect_reset_) { | ||
return; | ||
} | ||
|
||
ENVOY_CONN_LOG(debug, "connection/stream error health_flags={}", *client_, | ||
HostUtility::healthFlagsToString(*host_)); | ||
handleFailure(envoy::data::core::v3::NETWORK); | ||
} | ||
|
||
void HttpHealthCheckerImpl::HttpActiveHealthCheckSession::onGoAway( | ||
Http::GoAwayErrorCode error_code) { | ||
ENVOY_CONN_LOG(debug, "connection going away health_flags={}", *client_, | ||
HostUtility::healthFlagsToString(*host_)); | ||
|
||
// Runtime guard around graceful handling of NO_ERROR GOAWAY handling. The old behavior is to | ||
// ignore GOAWAY completely. | ||
if (!parent_.runtime_.snapshot().featureEnabled( | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. runtimeFeatureEnabled() ? |
||
"envoy.reloadable_features.health_check.graceful_goaway_handling", 100UL)) { | ||
return; | ||
} | ||
|
||
if (request_in_flight_ && error_code == Http::GoAwayErrorCode::NoError) { | ||
// The server is starting a graceful shutdown. Allow the in flight request | ||
// to finish without treating this as a health check error, and then | ||
// reconnect. | ||
reuse_connection_ = false; | ||
return; | ||
} | ||
|
||
if (request_in_flight_) { | ||
// Record this as a failed health check. | ||
handleFailure(envoy::data::core::v3::NETWORK); | ||
} | ||
|
||
if (client_) { | ||
expect_reset_ = true; | ||
mpuncel marked this conversation as resolved.
Show resolved
Hide resolved
|
||
client_->close(); | ||
} | ||
} | ||
|
||
HttpHealthCheckerImpl::HttpActiveHealthCheckSession::HealthCheckResult | ||
HttpHealthCheckerImpl::HttpActiveHealthCheckSession::healthCheckResult() { | ||
uint64_t response_code = Http::Utility::getResponseStatus(*response_headers_); | ||
|
@@ -318,6 +357,8 @@ HttpHealthCheckerImpl::HttpActiveHealthCheckSession::healthCheckResult() { | |
} | ||
|
||
void HttpHealthCheckerImpl::HttpActiveHealthCheckSession::onResponseComplete() { | ||
request_in_flight_ = false; | ||
|
||
switch (healthCheckResult()) { | ||
case HealthCheckResult::Succeeded: | ||
handleSuccess(false); | ||
|
@@ -345,7 +386,7 @@ bool HttpHealthCheckerImpl::HttpActiveHealthCheckSession::shouldClose() const { | |
return false; | ||
} | ||
|
||
if (!parent_.reuse_connection_) { | ||
if (!reuse_connection_) { | ||
return true; | ||
} | ||
|
||
|
@@ -375,6 +416,7 @@ bool HttpHealthCheckerImpl::HttpActiveHealthCheckSession::shouldClose() const { | |
} | ||
|
||
void HttpHealthCheckerImpl::HttpActiveHealthCheckSession::onTimeout() { | ||
request_in_flight_ = false; | ||
if (client_) { | ||
host_->setActiveHealthFailureType(Host::ActiveHealthFailureType::TIMEOUT); | ||
ENVOY_CONN_LOG(debug, "connection/stream timeout health_flags={}", *client_, | ||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alyssawilk this is the one place it might be weird to just use a
reuse_connection_
bool, because in the old behavior it looks like it would not close the connection if the stream was reset? That means i'd have to change the behavior a bit, but it also might be a small bug that it wasn't closing the connection before?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll push the commit, we can always revert it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, closing if we can't reuse and the stream is reset seems fine, and it's behind the guard so even if someone gets surprised they can always revert the behavior
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some of this behavior change is not guarded. It looks like the same case came up when implementing GOAWAY handling in gRPC.
There is a slight behavior change if you have connection reuse disabled, and get a reset stream. Previously the connection would be reused! which seems wrong, but is an unguarded behavior change.
Reference PR: #11325