Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

filters: add generic compressor filter #7057

Merged
merged 94 commits into from
Mar 5, 2020

Conversation

rojkov
Copy link
Member

@rojkov rojkov commented May 23, 2019

Description:

Add generic compressor_lib which can be reused by all HTTP compression filters to

  1. parse Accepted-Encoding;
  2. update HTTP headers;
  3. decide which encoding to use for compression in case there are more than one compression filters in the same chain.

Risk Level: High
Testing: unit tests, manual tests
Docs Changes: N/A
Release Notes: N/A

Contributes to #4429

Design doc: Envoy HTTP Filters: Generic Compressor and Decompressor

Generalize `GzipFilter` into `CompressorFilter` accepting
a compressor, e.g. gzip, performing actual data transformation.

Signed-off-by: Dmitry Rozhkov <dmitry.rozhkov@linux.intel.com>
@rojkov
Copy link
Member Author

rojkov commented May 23, 2019

/retest

@repokitteh-read-only
Copy link

🔨 rebuilding ci/circleci: coverage (failed build)
🔨 rebuilding ci/circleci: tsan (failed build)
🔨 rebuilding ci/circleci: clang_tidy (failed build)
🔨 rebuilding ci/circleci: asan (failed build)
🔨 rebuilding ci/circleci: compile_time_options (failed build)
🔨 rebuilding ci/circleci: release (failed build)
🔨 rebuilding ci/circleci: api (failed build)
🔨 rebuilding ci/circleci: format (failed build)
🔨 rebuilding ci/circleci: docs (failed build)
🔨 rebuilding ci/circleci: ipv6_tests (failed build)

🐱

Caused by: a #7057 (comment) was created by @rojkov.

see: more, trace.

Dmitry Rozhkov added 2 commits May 23, 2019 15:02
Signed-off-by: Dmitry Rozhkov <dmitry.rozhkov@linux.intel.com>
Signed-off-by: Dmitry Rozhkov <dmitry.rozhkov@linux.intel.com>
@dio
Copy link
Member

dio commented May 24, 2019

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@dio
Copy link
Member

dio commented May 24, 2019

@gsagula FYI.

@gsagula
Copy link
Member

gsagula commented May 24, 2019

@dio I can do the first pass if someone else is already doing it.

@dio
Copy link
Member

dio commented May 27, 2019

@gsagula yes, please. Thanks!

@dio dio requested a review from gsagula May 27, 2019 04:01
Copy link
Member

@gsagula gsagula left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still looking at AcceptEncodingAllowed member function. I will add more comments later. Thanks!

Signed-off-by: Dmitry Rozhkov <dmitry.rozhkov@linux.intel.com>
@dio
Copy link
Member

dio commented May 29, 2019

@rojkov please merge master to pick up #7090. Thanks!

@rojkov
Copy link
Member Author

rojkov commented May 29, 2019

@dio Thanks! I've merged the master and added one more test case.

@dio
Copy link
Member

dio commented May 29, 2019

We have a formatting error in master due to s/absl::make_unique/std::make_unique check. The fix is here: #7081 (think it will be merged soon). Sorry.

@rojkov
Copy link
Member Author

rojkov commented May 29, 2019

Ok, I'll pull it in when it gets merged.

@stale
Copy link

stale bot commented Jun 6, 2019

This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@stale stale bot added the stale stalebot believes this issue/PR has not been touched recently label Jun 6, 2019
@rojkov
Copy link
Member Author

rojkov commented Jun 10, 2019

@gsagula are you still looking at the PR?

@stale stale bot removed the stale stalebot believes this issue/PR has not been touched recently label Jun 10, 2019
@gsagula
Copy link
Member

gsagula commented Jun 10, 2019

@rojkov Yes, sorry. I had a pretty busy week. I will look at the rest today. Thanks for pinging me.

@stale
Copy link

stale bot commented Jun 19, 2019

This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@mattklein123
Copy link
Member

@rojkov just to clarify you want to review this PR, right? Can you update the title/description and then ping me when it's ready to review? Thank you.

@rojkov rojkov changed the title gzip: generalize gzip into a general compression filter filters: add generic compressor filter Feb 27, 2020
@rojkov
Copy link
Member Author

rojkov commented Feb 27, 2020

@mattklein123 I've updated the description. Please take a look.

@rojkov
Copy link
Member Author

rojkov commented Feb 28, 2020

Also added a link to the design doc to put the PR in perspective.

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a ton for working on this. At a high level this LGTM but I would like to take another pass once these comments are addressed. Thank you!

/wait

/**
* A filter that compresses data dispatched from the upstream upon client request.
*/
class CompressorFilter : public Http::StreamFilter {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please derive from PassThroughFilter to avoid a bunch of boilerplate below.

Comment on lines 84 to 89
StringUtil::CaseUnorderedSet content_type_values_;
bool disable_on_etag_header_;
bool remove_accept_encoding_header_;
CompressorStats stats_;
Runtime::Loader& runtime_;
std::string content_encoding_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: please constify all possible member variables here and below, using the initializer list where possible if not already doing so.

accept_encoding_ = std::make_unique<std::string>(accept_encoding->value().getStringView());
}

if (config_->runtime().snapshot().featureEnabled(config_->featureName(), 100)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does feature name come from here exactly? Is this just for purposes of having a kill switch? In new code we prefer an explicit kill switch on the config if needed. See RuntimeFeatureFlag

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a slightly modified copy-paste from the gzip filter. I presume this is used for canary tests.

Probably RuntimeFractionalPercent would be a better candidate.

featureName is supposed to be a constant set in a class derived from CompressorFilterConfig, e.g. GzipFilterConfig.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think this is all before we had the explicit RuntimeFeatureFlag. Can you add this to the new config and use that instead if we are doing a new thing? That will be better for the future and we can remove this explicit featureName() variable as a hard coded thing. Thank you!

}

std::unique_ptr<CompressorFilter::EncodingDecision>
CompressorFilter::chooseEncoding(const Http::ResponseHeaderMap& headers) const {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function will need an explicit fuzzer. Also, should any of this be split out into a generic accept-encoding parser in HTTP utilities? Please add a TODO for writing the fuzzer. cc @asraa

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, should any of this be split out into a generic accept-encoding parser in HTTP utilities?

I'm not sure. Where such utility could be re-used?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anything that parses accept-encoding/q values? It's fine to keep here for now but I could see this easily being reimplemented later. cc @jmarantz

config_->stats().header_not_valid_.inc();
break;
default:
config_->stats().header_compressor_overshadowed_.inc();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does overshadowed mean? Note that this will need documentation once the filter itself is actually created.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the refactored gzip filter includes this stat documented in its .rst file. For now I added the description to compressor.h.

Comment on lines 157 to 159
// There could be many compressors registered for the same content encoding, e.g. consider a case
// when there are two gzip filters using different compression levels for different content sizes.
// In such case we ignore duplicates (or different filters for the same encoding) registered last.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In particular this method needs massively more comments, both at the high level and throughout. Please describe in detail how filter state is used, how things interact, etc.

and reduce its complexity a bit by moving stat updates to a
new private metod shouldCompress().

Signed-off-by: Dmitry Rozhkov <dmitry.rozhkov@linux.intel.com>
@rojkov
Copy link
Member Author

rojkov commented Mar 3, 2020

@mattklein123 Please take a look. The PR is ready for another pass.

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks this looks great to ship and iterate. Can you check/fix format and also update the killswitch mechanism and then I think we can ship? Thank you!

/wait

accept_encoding_ = std::make_unique<std::string>(accept_encoding->value().getStringView());
}

if (config_->runtime().snapshot().featureEnabled(config_->featureName(), 100)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think this is all before we had the explicit RuntimeFeatureFlag. Can you add this to the new config and use that instead if we are doing a new thing? That will be better for the future and we can remove this explicit featureName() variable as a hard coded thing. Thank you!

Signed-off-by: Dmitry Rozhkov <dmitry.rozhkov@linux.intel.com>
Dmitry Rozhkov added 3 commits March 4, 2020 13:22
Signed-off-by: Dmitry Rozhkov <dmitry.rozhkov@linux.intel.com>
Signed-off-by: Dmitry Rozhkov <dmitry.rozhkov@linux.intel.com>
@rojkov
Copy link
Member Author

rojkov commented Mar 4, 2020

@mattklein123 Thank you! I've added the kill switch.

Though the coverage check started to fail. But the failures don't seem to be related to this PR.

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks looks great, just a small spelling nit. Thank you!

/wait

@@ -401,6 +401,7 @@ boolean
booleans
bools
borks
br
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry but to avoid adding this, can you potentially try adding backticks around the thing that triggered this? I think this is what @zuercher did in the associated PR. @zuercher perhaps we should have some error text that guides people on different ways of fixing the issue so we avoid more things going in the dictionary that shouldn't?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I removed br completely.

Signed-off-by: Dmitry Rozhkov <dmitry.rozhkov@linux.intel.com>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@repokitteh-read-only repokitteh-read-only bot removed the api label Mar 5, 2020
@mattklein123 mattklein123 merged commit 884bc7d into envoyproxy:master Mar 5, 2020
@rojkov rojkov deleted the brotli-multi-filters branch March 9, 2020 14:40
@htuch
Copy link
Member

htuch commented Mar 9, 2020

I think a dedicate fuzzer for this filter makes sense, since there is a fair bit of parsing and dependent logic CC @asraa.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants