http conn man: Introduce preserve_upstream_date option #11077

Merged

Conversation

chradcliffe
Contributor

Signed-off-by: Craig Radcliffe <craig.radcliffe@broadcom.com>

Commit Message:
http conn man: Introduce preserve_upstream_date option

Additional Description:
The preserve_upstream_date option allows the HTTP Connection Manager to be configured to pass through the original date header from the upstream response rather than overwriting it. The default behaviour for the date response header remains the same as before -- the header value will be overwritten by Envoy.
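
The behaviour described above can be sketched as follows. This is a minimal illustration rather than Envoy's implementation: the `Headers` alias, the `applyDateHeader` function, and the lowercase `"date"` key are assumptions made for the example.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a response header map; keys are lowercase names.
using Headers = std::map<std::string, std::string>;

// Writes `now` into the "date" header unless preservation is enabled and the
// upstream response already carries a date. Illustrative only, not Envoy's API.
void applyDateHeader(Headers& headers, bool preserve_upstream_date,
                     const std::string& now) {
  assert(!now.empty());  // the caller must supply a formatted timestamp
  const bool has_upstream_date = headers.count("date") > 0;
  if (!preserve_upstream_date || !has_upstream_date) {
    headers["date"] = now;
  }
}
```

With the option off, the header is always overwritten, which matches the existing default. With the option on, the proxy only fills in a date when upstream did not send one.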

Risk Level: Low

Testing:

  • Unit testing
  • Manual testing

Docs Changes: N/A
Release Notes: Unsure whether this is required for a flag that defaults to current behaviour

Fixes #11030

Signed-off-by: Craig Radcliffe <craig.radcliffe@broadcom.com>
@repokitteh-read-only

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to api/.
CC @envoyproxy/api-watchers: FYI only for changes made to api/.

🐱

Caused by: #11077 was opened by chradcliffe.

see: more, trace.

@chradcliffe chradcliffe marked this pull request as ready for review May 6, 2020 00:00
@chradcliffe chradcliffe marked this pull request as draft May 6, 2020 01:09
@alyssawilk alyssawilk self-assigned this May 6, 2020
@alyssawilk
Contributor

Looks good overall - I'll do a full review pass once you've got CI sorted out :-)

Signed-off-by: Craig Radcliffe <craig.radcliffe@broadcom.com>
Member

@htuch htuch left a comment

/lgtm api

@repokitteh-read-only repokitteh-read-only bot removed the api label May 6, 2020
@chradcliffe chradcliffe marked this pull request as ready for review May 6, 2020 16:04
@@ -1631,7 +1631,9 @@ void ConnectionManagerImpl::ActiveStream::encodeHeaders(ActiveStreamEncoderFilte
 void ConnectionManagerImpl::ActiveStream::encodeHeadersInternal(ResponseHeaderMap& headers,
                                                                 bool end_stream) {
   // Base headers.
-  connection_manager_.config_.dateProvider().setDateHeader(headers);
+  if (!connection_manager_.config_.shouldPreserveUpstreamDate() || !headers.Date()) {
Contributor

This PR looks great.
My only caveat is that I believe this will result in also preserving the date header when we serve out of cache.
Is your intent to do this (since at the end of the day the source of that date is upstream) or is the intent to only overwrite date headers served live from upstream (preserve date headers when they're relevant, but let cache results have an accurate up-to-date date)? I'd lean towards the latter, since I think that's a more clear interpretation of the config field, at which point I think you have to differentiate between the two.

Contributor Author

I think the intention here would be to use a new date for serving cached content. Otherwise, we may end up with stale response dates that might be confusing to downstream components (e.g. the client).

Is there a convention around how to determine whether a cached response is being served?

Contributor

Yeah, I was looking into that, and there's not an obvious direct check.

You can infer from the rc details being "via_upstream" but that's a string compare and icky. You can infer from the stream info upstreamHost being set, which is probably the easiest workaround.
I'd also be fine with something added to stream info to indicate the source of the reply - cache / Envoy generated / proxied from upstream. @mattklein123 any thoughts here?

Member

Yeah I would think for the caching case we might want to introduce a stream info flag. This seems useful/required anyway for logging purposes in a production ready caching solution.

Contributor Author

I can give it a try in this PR -- the "stream info flag" in this case would be a new value in StreamInfo::ResponseFlag, right?

Contributor

You got it!
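
The "stream info flag" idea discussed above could look roughly like the sketch below. Everything here is hypothetical: `ResponseFlagSketch`, `ServedFromCacheSketch`, and `StreamInfoSketch` are illustrative names, and the flag values are not taken from Envoy's `StreamInfo::ResponseFlag`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bitmask, loosely modelled on the idea of a response-flag enum.
enum ResponseFlagSketch : uint32_t {
  NoFlags = 0x0,
  UpstreamTimeoutSketch = 0x1,  // placeholder for some pre-existing flag
  ServedFromCacheSketch = 0x2,  // the new "reply came from cache" value
};

// Hypothetical per-stream record that accumulates response flags.
class StreamInfoSketch {
public:
  void setResponseFlag(ResponseFlagSketch flag) {
    assert(flag != NoFlags);  // setting "no flags" would be a caller bug
    flags_ |= flag;
  }
  bool hasResponseFlag(ResponseFlagSketch flag) const {
    return (flags_ & flag) != 0;
  }

private:
  uint32_t flags_{0};
};
```

A caching filter would set the flag when it serves a stored entry, and later stages such as access logging or header encoding could branch on it without a string compare.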

Contributor

@toddmgreer toddmgreer May 6, 2020

Responses served from cache should have the same date header they had when they got inserted into cache. The date header is defined to represent the time the message was originated, regardless of whether the message happens to spend some time in a cache.

IOW, if I request something then request it again later, and Envoy caches the first response and later serves me that entry, then the two responses I receive should have the same date header. Caches rely on this, and will compute incorrect freshness values if date headers are wrong.
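
To illustrate why a rewritten date skews freshness, here is a simplified sketch based on the apparent-age rule from RFC 7234 section 4.2.3, `apparent_age = max(0, response_time - date_value)`. It assumes the response carries no age header and that freshness comes from a `max-age` value only. The struct and function names are made up for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// All times are seconds since an arbitrary epoch (illustrative values only).
struct ReceivedResponse {
  int64_t date_value;     // value of the date header as received
  int64_t response_time;  // local time when the downstream cache received it
  int64_t max_age;        // freshness lifetime from Cache-Control: max-age
};

// RFC 7234 section 4.2.3: apparent_age = max(0, response_time - date_value).
int64_t apparentAge(const ReceivedResponse& r) {
  return std::max<int64_t>(0, r.response_time - r.date_value);
}

// Simplified freshness check; ignores the age header and resident time.
bool looksFresh(const ReceivedResponse& r) {
  assert(r.max_age >= 0);
  return r.max_age > apparentAge(r);
}
```

For a response originated at t=1000 with `max-age=60` and received at t=1090, `looksFresh({1000, 1090, 60})` is false. If an intermediary rewrote the date to 1090, `looksFresh({1090, 1090, 60})` is true, so the downstream cache would treat a stale response as fresh.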

This applies whether we're talking about a response served by Envoy's CacheFilter, or a response served by a separate caching proxy upstream.

The canonical way to know that a response came from an HTTP cache is to check for the presence of an age header. If a response has an age header, it's from a cache and we shouldn't change its date header.
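
The age-header check described above can be sketched like this. It is an assumption-laden illustration, not Envoy code: `HeaderMapSketch`, `looksServedFromCache`, and `setDateUnlessCached` are hypothetical names, and header keys are assumed to be lowercase.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a response header map; keys are lowercase names.
using HeaderMapSketch = std::map<std::string, std::string>;

// A response that carries an age header is treated as coming from a cache.
bool looksServedFromCache(const HeaderMapSketch& headers) {
  return headers.count("age") > 0;
}

// Leaves an existing date untouched for cached responses; otherwise writes `now`.
void setDateUnlessCached(HeaderMapSketch& headers, const std::string& now) {
  assert(!now.empty());  // the caller must supply a formatted timestamp
  if (looksServedFromCache(headers) && headers.count("date") > 0) {
    return;
  }
  headers["date"] = now;
}
```

Setting a date when a cached response lacks one is a choice made for this sketch; the comment above only says that an existing date on a cached response should not be changed.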

Contributor

Oh interesting.
That's not what's happening today. So in that case this PR LGTM, but someone should do the inverse and not set the date header if the response passed through the cache filter. I'm happy to pick that up, since I noticed the caching code fails to set rc details.

Member

> The date header is defined to represent the time the message was originated, regardless of whether the message happens to spend some time in a cache.

Interesting. I guess Envoy's default behavior is wrong then? In any case, I think we can address this out of the scope of this PR per other discussion.

alyssawilk
alyssawilk previously approved these changes May 7, 2020
Contributor

@alyssawilk alyssawilk left a comment

@mattklein123 want to take a look or shall I merge?

@alyssawilk
Contributor

Also I think this needs a master merge, sorry

Craig Radcliffe added 2 commits May 7, 2020 10:20
Signed-off-by: Craig Radcliffe <craig.radcliffe@broadcom.com>
Signed-off-by: Craig Radcliffe <craig.radcliffe@broadcom.com>
@chradcliffe
Contributor Author

/retest

@repokitteh-read-only

🐴 hold your horses - no failures detected, yet.

🐱

Caused by: a #11077 (comment) was created by @chradcliffe.

see: more, trace.

alyssawilk
alyssawilk previously approved these changes May 7, 2020
Signed-off-by: Craig Radcliffe <craig.radcliffe@broadcom.com>
@mattklein123 mattklein123 self-assigned this May 7, 2020
Member

@mattklein123 mattklein123 left a comment

LGTM w/ small comments. Thank you!

/wait

setUpEncoderAndDecoder(false, false);
sendRequestHeadersAndData();
preserve_upstream_date_ = false;
const auto* modifiedHeaders = sendResponseHeaders(
Member

nit: s/modifiedHeaders/modified_headers here and below (I think we can get clang-tidy to check this at some point)

@@ -897,6 +899,56 @@ TEST_F(HttpConnectionManagerImplTest, RouteShouldUseSantizedPath) {
conn_manager_->onData(fake_input, false);
}

TEST_F(HttpConnectionManagerImplTest, PreseveUpstreamDateDisabledDateNotSet) {
Member

Typo "Preseve" here and below

Craig Radcliffe added 2 commits May 7, 2020 17:03
Signed-off-by: Craig Radcliffe <craig.radcliffe@broadcom.com>
…/preserve-response-date

Signed-off-by: Craig Radcliffe <craig.radcliffe@broadcom.com>
Member

@mattklein123 mattklein123 left a comment

Thanks!

@mattklein123
Member

/azp run envoy-presubmit

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@mattklein123
Member

/azp run envoy-presubmit

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@mattklein123 mattklein123 merged commit 10c755e into envoyproxy:master May 8, 2020
mattklein123 added a commit that referenced this pull request May 8, 2020
This reverts commit 10c755e.

Signed-off-by: Matt Klein <mklein@lyft.com>
alyssawilk pushed a commit that referenced this pull request May 14, 2020
…ard (#11132)

Commit Message: http conn man: always preserve upstream date response header

Additional Description:
Reintroduces the change to preserve the upstream date response header (introduced in #11077, reverted in #11116) but removes the configuration and adds a runtime guard instead (see #11110).

Risk Level: Low
Testing: Unit testing
Docs Changes: N/A
Release Notes: yes
Runtime guard: http_connection_manager.preserve_upstream_date

Signed-off-by: Craig Radcliffe <craig.radcliffe@broadcom.com>
Linked issue: Support preserving the date header in HTTP responses
5 participants