Fix AppSec crash when parsing integer http headers #3790
446e749 to 50a7ef9
Benchmarks
Benchmark execution time: 2024-07-23 09:32:24
Comparing candidate commit 2701a76 in PR branch
Found 1 performance improvements and 0 performance regressions! Performance is the same for 9 metrics, 2 unstable metrics.
scenario:Gem loading

Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## master #3790 +/- ##
==========================================
- Coverage 97.91% 97.91% -0.01%
==========================================
Files 1246 1246
Lines 74999 75006 +7
Branches 3627 3627
==========================================
+ Hits 73436 73442 +6
- Misses 1563 1564 +1
@@ -41,7 +41,7 @@ def method

 def headers
   result = request.env.each_with_object({}) do |(k, v), h|
-    h[k.gsub(/^HTTP_/, '').downcase!.tr('_', '-')] = v if k =~ /^HTTP_/
+    h[k.gsub(/^HTTP_/, '').tap(&:downcase!).tap { |s| s.tr!('_', '-') }] = v if k =~ /^HTTP_/
The test indicates that you are keeping all of the values, therefore in my opinion:
- There should also be a test for mixed-case string input, e.g. HTTP_Foo
- I think the logic as currently proposed will result in the mixed case being preserved which I think is not the right behavior
As you pointed out in your second comment, Rack uppercases HTTP header names (header names are case-insensitive according to RFC 2616), and Rack also drops empty headers, so 'HTTP_' should not be possible either. But a middleware added to the Rack stack could still change the headers, for example. I believe that in these cases we still want to send everything to the WAF, as an attack could still be present in the value of these headers.
Given the original report was about invalid headers being sent to the application by external clients, would it also make sense to check for duplicate headers being sent, including in varying case? E.g. HTTP_FOO and HTTP_Foo in the same request. I suppose mixed case shouldn't happen (I am guessing incoming HTTP header names are uppercased by Rack?), but if the idea is to be defensive, it is not a bad idea to handle case problems here too. I also think HTTP_FOO_2 and HTTP_2_FOO are interesting cases to check and add to the test suite, and these should actually be obtainable in the real world.
709bd1e to 86b882a
I added support for duplicate headers, including in varying case. As stated by RFC 2616, header keys are case-insensitive, so HTTP_FOO and HTTP_Foo are the same header key. Sending HTTP_FOO: value1 and HTTP_Foo: value2 is therefore semantically equal to HTTP_FOO: "value1, value2". It is also possible to send the values as an array to the WAF, but that would mean replacing all the strings with arrays in the tests.
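The merging behaviour described here can be sketched as a standalone method. This is a minimal illustration, not the PR's actual code: `merged_headers` and the sample env are hypothetical names, and the env is a plain Hash standing in for a Rack env that some middleware has populated with mixed-case keys.

```ruby
# Hypothetical helper: builds a header hash from a Rack-like env,
# joining values of keys that normalize to the same header name.
def merged_headers(env)
  env.each_with_object({}) do |(k, v), h|
    next unless k.start_with?('HTTP_')

    # delete_prefix returns a new String, so the in-place calls below
    # never touch the env key itself.
    key = k.delete_prefix('HTTP_').tap(&:downcase!).tap { |s| s.tr!('_', '-') }
    current = h[key]
    h[key] = current.nil? ? v : "#{current}, #{v}"
  end
end

env = { 'REQUEST_METHOD' => 'GET', 'HTTP_FOO' => 'value1', 'HTTP_Foo' => 'value2' }
puts merged_headers(env)['foo'] # prints "value1, value2"
```

Both keys normalize to `foo`, so the second value is appended to the first with a comma separator.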
Thank you for that suggestion! I added these tests too.
I see several optimization opportunities, don't know how performance-critical you think this code is. Functionally I think it's good to go.
# When multiple headers with the same name are present, they are concatenated with a comma
# https://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.2
# Because headers are case insensitive, HTTP_FOO and HTTP_Foo is the same, and should be merged
if k =~ /^HTTP_/
Can use start_with?
# Because headers are case insensitive, HTTP_FOO and HTTP_Foo is the same, and should be merged
if k =~ /^HTTP_/
  key = k.gsub(/^HTTP_/, '').tap(&:downcase!).tap { |s| s.tr!('_', '-') }
  h[key] = h[key].nil? ? v : "#{h[key]}, #{v}"
Assigning h[key] to a variable will save one hash access
# https://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.2
# Because headers are case insensitive, HTTP_FOO and HTTP_Foo is the same, and should be merged
if k =~ /^HTTP_/
  key = k.gsub(/^HTTP_/, '').tap(&:downcase!).tap { |s| s.tr!('_', '-') }
And, technically, the regular expression should use \A to refer to the beginning of the string in Ruby, not ^.
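A short illustration of the difference, using a contrived key that contains a newline (an assumption for demonstration only; such a key is not expected in a real Rack env):

```ruby
# In Ruby, ^ matches at the start of every line, while \A matches only
# at the start of the whole string.
key = "X_IGNORED\nHTTP_FOO"

puts key.match?(/^HTTP_/)       # prints "true"  (matches after the newline)
puts key.match?(/\AHTTP_/)      # prints "false" (string starts with X_IGNORED)
puts key.start_with?('HTTP_')   # prints "false" (same semantics as \A here)
```

`start_with?` with a string argument avoids the question entirely, which is one reason the earlier suggestion to use it fits well here.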
key = k.delete_prefix('HTTP_').tap(&:downcase!).tap { |s| s.tr!('_', '-') }
current_val = h[key]
h[key] = current_val.nil? ? v : "#{current_val}, #{v}"
This should work to transform an individual string for the single header case into an array, delegating the meaning of having multiple identical HTTP headers to whatever is downstream of the Reactive Engine.
- h[key] = current_val.nil? ? v : "#{current_val}, #{v}"
+ h[key] = current_val.nil? ? v : [current_val, v]
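A sketch of the array-based alternative follows. Note that `[current_val, v]` as written would nest on a third occurrence of the same key (`[["a", "b"], "c"]`), so this sketch flattens with a splat instead; that flattening is an assumption about the desired shape, not part of the suggestion. `collected_headers` is a hypothetical name.

```ruby
# Hypothetical helper: keeps a single value as a String and collects
# repeated values for the same normalized key into a flat Array.
def collected_headers(env)
  env.each_with_object({}) do |(k, v), h|
    next unless k.start_with?('HTTP_')

    key = k.delete_prefix('HTTP_').tap(&:downcase!).tap { |s| s.tr!('_', '-') }
    current = h[key]
    # [*current, v] wraps a String in an Array and extends an existing Array.
    h[key] = current.nil? ? v : [*current, v]
  end
end

env = { 'HTTP_FOO' => 'a', 'HTTP_Foo' => 'b', 'HTTP_foo' => 'c', 'HTTP_BAR' => 'x' }
result = collected_headers(env)
puts result['foo'].join('|') # prints "a|b|c"
puts result['bar']           # prints "x"
```

With this shape, the meaning of repeated headers is left to whatever consumes the hash rather than being decided at extraction time.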
h[k.gsub(/^HTTP_/, '').downcase!.tr('_', '-')] = v if k =~ /^HTTP_/
# When multiple headers with the same name are present, they are concatenated with a comma
# https://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.2
# Because headers are case insensitive, HTTP_FOO and HTTP_Foo is the same, and should be merged
When could this happen?
- The Rack spec means upcasing HTTP headers into the HTTP_UPCASED_HEADER key format.
- Given that request.env.each_with_object iterates on the Rack env (which is a hash), there can be no duplicates except for mixed case, which cannot happen when Rack compliant due to the above.
- The Rack env does not concatenate multiple identical headers into a single hash key, instead making use of a single one (can't recall if first or last).
It follows that the only thing that could fill in some HTTP_MiXeD_CaSe is some non-Rack-compliant middleware or monkeypatch that injects pseudo headers that don't actually exist on the HTTP request AND would execute before Datadog's middlewares, which are designed to be run first at the top of the Rack stack.
According to the HTTP spec, concatenation MAY happen but MUST only happen for headers that are lists. As is, this code does so for ALL headers, so it's very much incorrect, and possibly prone to injections or opportunities for a WAF bypass (e.g. Authorization headers being concatenated in a way that makes the result valid for libddwaf rules but that would, in isolation (as seen by the application), be dangerous).
I'm OK with making code robust in face of strange stuff, but trying to make sense of hypothetical non-compliant nonsense can only lead to problems (GIGO), therefore I question the cost of bearing the handling of these hypothetical use cases on every request.
I agree that this is an extremely rare edge case that would be introduced by some sort of unusual middleware, and I believe we should not be responsible for a client modifying the core behaviour of Rack, especially when it is only hypothetical. A quick benchmark also shows that this is 50x slower than your first suggestion with a bit of modification (replacing gsub with delete_prefix and the regex in the condition with start_with?), and I believe performance matters more for our clients.
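A rough way to run such a comparison is sketched below with Ruby's `benchmark` library. The two method bodies, the sample env, and the iteration count are all assumptions made for illustration; this does not reproduce the benchmark mentioned above, and the measured ratio will vary by machine and Ruby version.

```ruby
require 'benchmark'

# Variant with regex matching, gsub, and comma-merging of duplicates.
def headers_merging(env)
  env.each_with_object({}) do |(k, v), h|
    if k =~ /^HTTP_/
      key = k.gsub(/^HTTP_/, '').tap(&:downcase!).tap { |s| s.tr!('_', '-') }
      current = h[key]
      h[key] = current.nil? ? v : "#{current}, #{v}"
    end
  end
end

# Variant with start_with? and delete_prefix, plain assignment.
def headers_simple(env)
  env.each_with_object({}) do |(k, v), h|
    next unless k.start_with?('HTTP_')

    h[k.delete_prefix('HTTP_').tap(&:downcase!).tap { |s| s.tr!('_', '-') }] = v
  end
end

sample_env = {
  'REQUEST_METHOD' => 'GET',
  'HTTP_USER_AGENT' => 'curl',
  'HTTP_ACCEPT' => '*/*',
  'HTTP_X_FORWARDED_FOR' => '10.0.0.1'
}

n = 20_000
merging = Benchmark.realtime { n.times { headers_merging(sample_env) } }
simple  = Benchmark.realtime { n.times { headers_simple(sample_env) } }
puts format('merging: %.3fs, simple: %.3fs', merging, simple)
```

For an env without duplicate keys, both variants return the same hash, so the timing difference reflects only the cost of the extraction strategy.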
What does this PR do?
This fixes a crash when malformed headers are sent (e.g. headers with numbers)
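The crash can be reproduced in isolation. `String#downcase!` returns nil when the string has nothing to downcase, so chaining `.tr` onto it fails for a key such as HTTP_123. The method names below are hypothetical and exist only for this sketch:

```ruby
# Old approach: downcase! returns nil if no change was made,
# so .tr is then called on nil.
def normalize_unsafe(key)
  key.gsub(/^HTTP_/, '').downcase!.tr('_', '-')
end

# Fixed approach: tap always returns the receiver, regardless of what
# the bang method returns.
def normalize_safe(key)
  key.delete_prefix('HTTP_').tap(&:downcase!).tap { |s| s.tr!('_', '-') }
end

begin
  normalize_unsafe('HTTP_123')
rescue NoMethodError => e
  puts "crash: #{e.class}" # prints "crash: NoMethodError"
end

puts normalize_safe('HTTP_123')        # prints "123"
puts normalize_safe('HTTP_USER_AGENT') # prints "user-agent"
```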
Motivation:
This fixes issue #3782
Additional Notes:
How to test the change?
bundle exec appraisal ruby-x.x-rack-x rake spec:appsec:rack