Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix AppSec crash when parsing integer http headers #3790

Merged

Conversation

vpellan
Copy link
Contributor

@vpellan vpellan commented Jul 17, 2024

What does this PR do?

This fixes a crash when malformed headers are sent (e.g. headers with numbers)

Motivation:

This fixes issue #3782

Additional Notes:

How to test the change?

bundle exec appraisal ruby-x.x-rack-x rake spec:appsec:rack

@vpellan vpellan self-assigned this Jul 17, 2024
@vpellan vpellan requested a review from a team as a code owner July 17, 2024 09:31
@github-actions github-actions bot added appsec Application Security monitoring product integrations Involves tracing integrations labels Jul 17, 2024
@vpellan vpellan force-pushed the vpellan/3782-appsec-crashes-when-parsing-integer-http-headers branch from 446e749 to 50a7ef9 Compare July 17, 2024 09:35
@pr-commenter
Copy link

pr-commenter bot commented Jul 17, 2024

Benchmarks

Benchmark execution time: 2024-07-23 09:32:24

Comparing candidate commit 2701a76 in PR branch vpellan/3782-appsec-crashes-when-parsing-integer-http-headers with baseline commit 6604ee1 in branch master.

Found 1 performance improvements and 0 performance regressions! Performance is the same for 9 metrics, 2 unstable metrics.

scenario:Gem loading

  • 🟩 throughput [+0.067op/s; +0.073op/s] or [+2.296%; +2.484%]

@codecov-commenter
Copy link

codecov-commenter commented Jul 17, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.91%. Comparing base (6604ee1) to head (2701a76).
Report is 31 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3790      +/-   ##
==========================================
- Coverage   97.91%   97.91%   -0.01%     
==========================================
  Files        1246     1246              
  Lines       74999    75006       +7     
  Branches     3627     3627              
==========================================
+ Hits        73436    73442       +6     
- Misses       1563     1564       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@@ -41,7 +41,7 @@ def method

def headers
result = request.env.each_with_object({}) do |(k, v), h|
h[k.gsub(/^HTTP_/, '').downcase!.tr('_', '-')] = v if k =~ /^HTTP_/
h[k.gsub(/^HTTP_/, '').tap(&:downcase!).tap { |s| s.tr!('_', '-') }] = v if k =~ /^HTTP_/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test indicates that you are keeping all of the values, therefore in my opinion:

  1. There should also be a test for mixed-case string input, e.g. HTTP_Foo
  2. I think the logic as currently proposed will result in the mixed case being preserved which I think is not the right behavior

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you pointed in your second comment, HTTP header names are uppercased by Rack (as HTTP header names are case-insensitive according to RFC 2616) and Rack will also drop empty headers so 'HTTP_' should not be possible either. But there's the case of someone adding a middleware to Rack that changes the headers for exemple. I believe that in these case, we still want to send everything to the WAF, as an attack could still be present in the value of these headers.

@p-datadog
Copy link
Contributor

p-datadog commented Jul 17, 2024

Given the original report was in reference to invalid headers being sent to the application by external clients, would it also make sense to check for duplicate headers being sent? Including in varying case? E.g. HTTP_FOO and HTTP_Foo in the same request.

I suppose mixed case shouldn't be happening (since incoming HTTP header names I am guessing are uppercased by rack?) but if the idea is to be defensive I'd say it is not a bad idea to handle case problems here also.

And I'm thinking HTTP_FOO_2 and HTTP_2_FOO are also interesting cases to check/add to the test suite, and these should be actually obtainable in real world.

@vpellan vpellan force-pushed the vpellan/3782-appsec-crashes-when-parsing-integer-http-headers branch from 709bd1e to 86b882a Compare July 19, 2024 14:43
@vpellan
Copy link
Contributor Author

vpellan commented Jul 19, 2024

Given the original report was in reference to invalid headers being sent to the application by external clients, would it also make sense to check for duplicate headers being sent? Including in varying case? E.g. HTTP_FOO and HTTP_Foo in the same request.

I added support for duplicate headers, including in varying case, as stated by RFC 2616, header keys are case insensitive, so HTTP_FOO and HTTP_Foo are the same header key, and thus it is semantically equal to HTTP_FOO: "value1, value2" (for HTTP_FOO: value1 and HTTP_Foo: value2). It is also possible to send the values as an array to the WAF but that would mean replacing all the strings by arrays in tests.

And I'm thinking HTTP_FOO_2 and HTTP_2_FOO are also interesting cases to check/add to the test suite, and these should be actually obtainable in real world.

Thank you for that suggestion ! I added these tests too

Copy link
Contributor

@p-datadog p-datadog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see several optimization opportunities, don't know how performance-critical you think this code is. Functionally I think it's good to go.

# When multiple headers with the same name are present, they are concatenated with a comma
# https://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.2
# Because headers are case insensitive, HTTP_FOO and HTTP_Foo is the same, and should be merged
if k =~ /^HTTP_/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can use start_with?

# Because headers are case insensitive, HTTP_FOO and HTTP_Foo is the same, and should be merged
if k =~ /^HTTP_/
key = k.gsub(/^HTTP_/, '').tap(&:downcase!).tap { |s| s.tr!('_', '-') }
h[key] = h[key].nil? ? v : "#{h[key]}, #{v}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assigning h[key] to a variable will save one hash access

# https://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.2
# Because headers are case insensitive, HTTP_FOO and HTTP_Foo is the same, and should be merged
if k =~ /^HTTP_/
key = k.gsub(/^HTTP_/, '').tap(&:downcase!).tap { |s| s.tr!('_', '-') }
Copy link
Contributor

@p-datadog p-datadog Jul 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And, technically, the regular expression should use \A to refer to the beginning of the string in Ruby, not ^.


key = k.delete_prefix('HTTP_').tap(&:downcase!).tap { |s| s.tr!('_', '-') }
current_val = h[key]
h[key] = current_val.nil? ? v : "#{current_val}, #{v}"
Copy link
Contributor

@lloeki lloeki Jul 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should work to transform an individual string for the single header case into an array, delegating the meaning of having multiple identical HTTP headers to whatever is downstream of the Reactive Engine.

Suggested change
h[key] = current_val.nil? ? v : "#{current_val}, #{v}"
h[key] = current_val.nil? ? v : [current_val, v]

h[k.gsub(/^HTTP_/, '').downcase!.tr('_', '-')] = v if k =~ /^HTTP_/
# When multiple headers with the same name are present, they are concatenated with a comma
# https://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.2
# Because headers are case insensitive, HTTP_FOO and HTTP_Foo is the same, and should be merged
Copy link
Contributor

@lloeki lloeki Jul 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When could this happen?

  • The Rack spec means upcasing HTTP headers into the HTTP_UPCASED_HEADER key format.
  • Given that request.env.each_with_object iterates on the rack env (which is a hash) there can be no duplicates except for mixed case, which cannot happen when Rack compliant due to the above.
  • The Rack env does not concatenate multiple identical headers into a single hash key, instead making use of a single one (can't recall if first or last).

It follows that the only thing that could fill in some HTTP_MiXeD_CaSe is some non-Rack compliant middleware or monkeypatch that inject pseudo headers that actually don't exist on the HTTP request AND would execute before Datadog's middlewares, which are designed to be run first at the top of the Rack stack.

According to the HTTP spec concatenation MAY happen but MUST only happen for headers that are lists. As is this code does so for ALL headers so it's very much incorrect, and possibly prone to injections or opportunities for a WAF bypass (e.g Authorization headers being concatenated in a way that make it valid for libddwaf rules but would in isolation - as seen by the application - be dangerous)

I'm OK with making code robust in face of strange stuff, but trying to make sense of hypothetical non-compliant nonsense can only lead to problems (GIGO), therefore I question the cost of bearing the handling of these hypothetical use cases on every request.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this is an extremely rare edge case, that would be introduced by some sort of weird middleware, and I believe that we should not be responsible for a client modifying the core behaviour of Rack, especially when it is only hypothetical. And a quick benchmark shows that this is 50x slower than your first suggestion with a bit of modification (replacing gsub by delete_prefix and regex in condition by start_with), which I believe is more important for our clients.

@vpellan vpellan merged commit 5d1295c into master Jul 24, 2024
171 checks passed
@vpellan vpellan deleted the vpellan/3782-appsec-crashes-when-parsing-integer-http-headers branch July 24, 2024 08:18
@github-actions github-actions bot added this to the 2.3.0 milestone Jul 24, 2024
@TonyCTHsu TonyCTHsu mentioned this pull request Aug 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
appsec Application Security monitoring product integrations Involves tracing integrations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants