Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(cors) properly validate flat strings #3872

Merged
merged 2 commits into from
Dec 3, 2018
Merged

fix(cors) properly validate flat strings #3872

merged 2 commits into from
Dec 3, 2018

Conversation

subnetmarco
Copy link
Member

@subnetmarco subnetmarco commented Oct 18, 2018

Closes #3832.

This pull request changes the behavior of the CORS plugin as follows:

These changes seem reasonable to me, but happy to hear feedback.

for _, domain in ipairs(conf.origins) do
local from, _, err = re_find(req_origin, domain, "jo")
local from, _, err = re_find(parsed_req_origin, "^"..(url.parse(domain).host or domain).."$", "jo")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we cache the value of url.parse here?

@subnetmarco
Copy link
Member Author

subnetmarco commented Oct 19, 2018

Added some caching as per @p0pr0ck5 suggestion with 628ef54

@@ -16,6 +16,9 @@ CorsHandler.PRIORITY = 2000
CorsHandler.VERSION = "0.1.0"


local parsed_domains = { }
Copy link
Member Author

@subnetmarco subnetmarco Oct 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if adding this variable here is the intended style.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This table can grow indefinitely and become a vector for an OOM attack since it is populated from the client. We either need to specify it as a weak table, or better yet, use a lua-resty-lrucache instance.

@@ -16,6 +16,9 @@ CorsHandler.PRIORITY = 2000
CorsHandler.VERSION = "0.1.0"


local parsed_domains = { }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This table can grow indefinitely and become a vector for an OOM attack since it is populated from the client. We either need to specify it as a weak table, or better yet, use a lua-resty-lrucache instance.

parsed_domains[domain] = parsed_domain
end

local from, _, err = re_find(parsed_req_origin, "^"..parsed_domain.."$", "jo")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: we use spaces around or concatenation operators (..). This can be fixed at merge time.

@@ -50,8 +53,18 @@ local function configure_origin(ngx, conf)

local req_origin = ngx.var.http_origin
if req_origin then

local parsed_req_origin = url.parse(req_origin).host or req_origin
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

url.parse() can return nil (https://github.com/diegonehab/luasocket/blob/master/src/url.lua#L148), which, since the call is currently enclosed within an if req_origin condition, should never happen as of today. But it is safer not to index .host unconditionally in case the surrounding code changes in the future.

Same for the below call to url.parse() in the loop.

@subnetmarco
Copy link
Member Author

subnetmarco commented Oct 24, 2018

Last commit addresses @thibaultcha review: d76ab4c

@@ -16,6 +17,11 @@ CorsHandler.PRIORITY = 2000
CorsHandler.VERSION = "0.1.0"


-- per-worker cache of parsed origins
local CACHE_SIZE = 10 ^ 4
Copy link
Member Author

@subnetmarco subnetmarco Oct 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will store at 10000 origin values at a time, seems reasonable.

if not parsed_domain then
local parsed_domain_obj = url.parse(domain)
parsed_domain = parsed_domain_obj and parsed_domain_obj.host or domain
parsed_domains[domain] = parsed_domain
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a proper use of lua-resty-lrucache. The module offers set() and get() methods to interact with the cache: https://github.com/openresty/lua-resty-lrucache#methods

@@ -50,8 +56,25 @@ local function configure_origin(ngx, conf)

local req_origin = ngx.var.http_origin
if req_origin then

local parsed_req_origin_obj = url.parse(req_origin)
local parsed_req_origin = parsed_req_origin_obj and parsed_req_origin_obj.host
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RFC 6454 says on "Comparing Origins" that:

Two origins are "the same" if, and only if, they are identical. In
particular:
If the two origins are scheme/host/port triples, the two origins
are the same if, and only if, they have identical schemes, hosts,
and ports.

Also see:

Q: Why not just use the host?
A: Including the scheme in the origin tuple is essential for
security. If user agents did not include the scheme, there would be
no isolation between http://example.com and https://example.com
because the two have the same host. However, without this isolation,
an active network attacker could corrupt content retrieved from
http://example.com and have that content instruct the user agent to
compromise the confidentiality and integrity of content retrieved
from https://example.com, bypassing the protections afforded by TLS
[RFC5246].

The comparisons we are doing now do not take scheme and port into consideration (nor do we have tests asserting them). See section 3.2.1 of RFC 6454 for examples of resources with similar and different origins, we could use this list to write our tests.

@thibaultcha thibaultcha added the pr/changes requested Changes were requested to this PR by a maintainer. Please address them and ping back once done. label Oct 31, 2018
@CLAassistant
Copy link

CLAassistant commented Nov 13, 2018

CLA assistant check
All committers have signed the CLA.

@subnetmarco subnetmarco added pr/please review and removed pr/changes requested Changes were requested to this PR by a maintainer. Please address them and ping back once done. labels Nov 13, 2018
@subnetmarco subnetmarco self-assigned this Nov 13, 2018
@thibaultcha thibaultcha added this to the 1.0.0rc4 milestone Nov 30, 2018
thibaultcha pushed a commit that referenced this pull request Dec 1, 2018
@thibaultcha:

Prior to this patch, configuring the plugin with, e.g.:

    "config": {
      "origins": [
        "http://my-site.com",
        "http://my-other-site.com"
      ]
    }

Would make the following request Origin validate comparison and
injecting the ACAO response header:

    curl -X OPTIONS \
      http://localhost:8000 \
      -H 'Origin: http://my-site.com.bad-guys.com'

This patch's main goal is to anchor the comparison regex in order to
ensure we match the full configured origin.

Additionally, we now ensure that protocol, host, and port are all taken
in consideration when comparing origins, as per RFC 3654 on "Comparing
Origins".

Fix #3832
From #3872

Signed-off-by: Thibault Charbonnier <thibaultcha@me.com>
@thibaultcha
Copy link
Member

I have added a commit to this PR. Should be good to go now, but waiting for any additional review just in case. Will be merged for 1.0rc4.

@thibaultcha
Copy link
Member

CI reports a failing test for this plugin. Will have a look shortly.

subnetmarco and others added 2 commits December 3, 2018 11:32
@thibaultcha:

Prior to this patch, configuring the plugin with, e.g.:

    "config": {
      "origins": [
        "http://my-site.com",
        "http://my-other-site.com"
      ]
    }

Would make the following request Origin validate comparison and
injecting the ACAO response header:

    curl -X OPTIONS \
      http://localhost:8000 \
      -H 'Origin: http://my-site.com.bad-guys.com'

This patch's main goal is to anchor the comparison regex in order to
ensure we match the full configured origin.

Additionally, we now ensure that protocol, host, and port are all taken
in consideration when comparing origins, as per RFC 3654 on "Comparing
Origins".

Fix #3832
From #3872

Signed-off-by: Thibault Charbonnier <thibaultcha@me.com>
A refactor on top of the previous commit. We now distinguish the
requests origins' LRU cache from the plugin's configured normalized
origins.
@thibaultcha thibaultcha merged commit 62f6f30 into master Dec 3, 2018
@thibaultcha thibaultcha deleted the fix/cors branch December 3, 2018 21:04
thibaultcha pushed a commit that referenced this pull request Dec 3, 2018
@thibaultcha:

Prior to this patch, configuring the plugin with, e.g.:

    "config": {
      "origins": [
        "http://my-site.com",
        "http://my-other-site.com"
      ]
    }

Would make the following request Origin validate comparison and
injecting the ACAO response header:

    curl -X OPTIONS \
      http://localhost:8000 \
      -H 'Origin: http://my-site.com.bad-guys.com'

This patch's main goal is to anchor the comparison regex in order to
ensure we match the full configured origin.

Additionally, we now ensure that protocol, host, and port are all taken
in consideration when comparing origins, as per RFC 3654 on "Comparing
Origins".

Fix #3832
From #3872

Signed-off-by: Thibault Charbonnier <thibaultcha@me.com>
hishamhm added a commit that referenced this pull request Jan 29, 2019
This reverts a regression introduced in #3872, in which regexes were being
forcibly anchored and matched against normalized domains, leading to a
breaking change with regard to the 0.x behavior.

In 0.x, regexes such as `(.*[.])?foo\.test` would accept subdomain entries
(but were subject to bug #3832 as it would also accept `foo.test.evil.test`);
in 1.0.2, the latter is not accepted, but the regular uses failed as well,
because regexes were translated to `^(.*[.])?foo\.test$` but were then matched
against the normalized domains in a way that always included the port.

With this commit, both configured origins and the input header value are
normalized so that default ports are not an issue for both regex and non-regex
cases. We verify if each configured origin is a regex or not, then:

* for non-regex entries, we do plain equality matching against the normalized
  domain
* For regex entries,
  * if the regex contains `:`, we do an anchored match against the normalized
    domain
  * otherwise, we do an anchored match against the host component only
    (to account for the 0.x behavior where ports were not considered)

Matching domains with regexes must be done with care (for starters,
note that dots must be escaped), so we recommend using plain-text full-domain
matching whenever possible.

This change in behavior is arguably a breaking change, but it is a compromise
between a workable behavior and backwards compatibility. Good domain-matching
regexes such as `(.*[.])?foo\.test` will remain matching against the host
component as intended (without being subject to #3832). Naive regexes such as
`.foo.test` will stop "working", but these were vulnerable to #3832 anyway. In
particular, thorough regexes such as
`^https?://(.*[.])?foo\\.test(:(80|90))?$` that performed their own anchoring
remain working as well.
thibaultcha pushed a commit that referenced this pull request Jan 30, 2019
This reverts a regression introduced in #3872, in which regexes were
being forcibly anchored and matched against normalized domains, leading
to a breaking change with regard to the 0.x behavior.

In 0.x, regexes such as `(.*[.])?foo\.test` would accept subdomain
entries (but were subject to bug #3832 as it would also accept
`foo.test.evil.test`); in 1.0.2, the latter is not accepted, but the
regular uses failed as well, because regexes were translated to
`^(.*[.])?foo\.test$` but were then matched against the normalized
domains in a way that always included the port.

With this commit, both configured origins and the input header value are
normalized so that default ports are not an issue for both regex and
non-regex cases. We verify if each configured origin is a regex or not,
then:

* for non-regex entries, we do plain equality matching against the normalized
  domain
* For regex entries,
  * if the regex contains `:`, we do an anchored match against the normalized
    domain
  * otherwise, we do an anchored match against the host component only
    (to account for the 0.x behavior where ports were not considered)

Matching domains with regexes must be done with care (for starters, note
that dots must be escaped), so we recommend using plain-text full-domain
matching whenever possible.

This change in behavior is arguably a breaking change, but it is a
compromise between a workable behavior and backwards compatibility. Good
domain-matching regexes such as `(.*[.])?foo\.test` will remain matching
against the host component as intended (without being subject to #3832).
Naive regexes such as `.foo.test` will stop "working", but these were
vulnerable to #3832 anyway. In particular, thorough regexes such as
`^https?://(.*[.])?foo\\.test(:(80|90))?$` that performed their own
anchoring remain working as well.

From #4261

Signed-off-by: Thibault Charbonnier <thibaultcha@me.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants