Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify multiple slash in URI path detection #54

Merged
merged 6 commits into from
Dec 1, 2022

Conversation

weierophinney
Copy link
Contributor

Q A
Bug fix? yes
New feature? yes
BC breaks? potentially
Deprecations? no
Related tickets fixes laminas/laminas-diactoros#74, closes laminas/laminas-diactoros#77
Documentation (TBD)
License MIT

What's in this PR?

Per an issue created for Diactoros its related pull request, and the discussion to that pull request, this patch does the following:

  • It modifies testPathWithMultipleSlashes() to only validate that multiple slashes not at the beginning of a path are retained intact.
  • It adds testProperlyTrimsLeadingSlashesToPreventXSS(), which validates that when multiple leading slashes are present in a path, they are reduced to a single slash.

Why?

This approach is done to mitigate ZF2015-05 which was also reported as CVE-2015-3257.
While RFC 3986 allows for multiple slashes anywhere in the path, when security conflicts with a specification, security concerns win.
Without the mitigation, an implementation is vulnerable to XSS and open redirects if only the path portion of a URI is used within HTML content (common!) or within headers (also common).

Example Usage

When testing a PSR-7 imlementation, ensure that URLs with multiple leading slashes in the path are rewritten to a single leading slash in order to pass the integration suite tests.

Checklist

  • Updated CHANGELOG.md to describe BC breaks / deprecations | new feature | bugfix
  • Documentation pull request created (if not simply a bugfix)

Per an [issue created for Diactoros](laminas/laminas-diactoros#74) its [related pull request](laminas/laminas-diactoros#77), and the discussion to that pull request, this patch does the following:

- It modifies `testPathWithMultipleSlashes()` to only validate that multiple slashes _not at the beginning_ of a path are retained intact.
- It adds `testProperlyTrimsLeadingSlashesToPreventXSS()`, which validates that when multiple leading slashes are present in a path, they are reduced to a single slash.

This approach is done to mitigate [ZF2015-05](https://framework.zend.com/security/advisory/ZF2015-05.html) which was also reported as [CVE-2015-3257](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2015-3257).
While RFC 3986 allows for multiple slashes anywhere in the path, when security conflicts with a specification, security concerns win.
Without the mitigation, an implementation is vulnerable to XSS and open redirects if only the path portion of a URI is used within HTML content (common!) or within headers (also common).

Signed-off-by: Matthew Weier O'Phinney <matthew@weierophinney.net>
Copy link
Contributor

@dbu dbu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems convincing to me.

@Nyholm any objections?

$this->assertSame('/valid///path', $uri->getPath());
}

public function testProperlyTrimsLeadingSlashesToPreventXSS()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we add a comment here with the reference to the vulnerability warnings?

you did a very good commit message explaining the change, but this seems relevant enough to also have a doc comment explaining why we want this.

@nyamsprod
Copy link
Contributor

I have a genuine question about this PR. If I understand correctly the issue is not when using or calling Uri::__toString method but when using Uri::getPath.

so the correct fix should be to apply the fix on the Uri::getPath method only not on the full representation.

If we follow RFC3986 if the authority and the scheme is removed an URI will be invalid if it starts with a path with multiple /

TL;DR:

Using a UriFactory class I believe the following should be true for all implementing URI class:

$uri = $uriFactory->createUri('http://example.com///google.com');
echo $uri, PHP_EOL; // returns 'http://example.com///google.com'
echo $uri->getPath(); // returns '/google.com'; (multiple leading slashes strip for security
//and 
$uri->withScheme('')->withHost(''); // should throw and InvalidArgumentException because
//If there is no authority the path `///google.com` can not start with a `//`

Is my interpretation correct ?

@kelunik
Copy link

kelunik commented Sep 9, 2022

@nyamsprod I don't see why getPath()should strip the leading slashes while removing scheme and host doesn't do it.

@nyamsprod
Copy link
Contributor

nyamsprod commented Sep 9, 2022

@kelunik the issue is that UriInterface::getPath is used directly on redirection not UriInterface::__toString.

Most HTTP Clients that I see when redirecting on the same base URI will only do

echo "GET ". $uri->getPath(); //pseudo code

RFC3986 allows: https://example.com///wrong-website.tld but the same RFC disallows the full URI with only the following path: ///wrong-website.tld to avoid the security issue described by @weierophinney.

IMO to avoid redirecting the client to the wrong website you only need to strip multiple leading slashes on UriInterface::getPath not on the full URI.

Removing the scheme and the host is already cover by the RFC which states that a URI without authority and without scheme can not have its path starting with multiple forward slashes.
What's not covered is how you return the path for an URI object whose string representation is https://example.com///wrong-website.tld.

To me at least, normalizing the full URI only to secure getPath behaviour seems overkill. We should secure getPath and all will be OK.

Hope this is more clear.

last but not least why removing the scheme or the host should affect the path. the issue is with the path and no other URI component.

@weierophinney
Copy link
Contributor Author

The approach to only normalize on getPath() makes sense to me, except in one scenario: an instance where there is no scheme or authority defined. In that case, the string representation contains only the path, which in this case would have multiple leading slashes, leading to the vulnerability.

HOWEVER, that said, the PSR-7 spec does account for this in the docblock of the request __toString() method:

  • If the path is starting with more than one "/" and no authority is present, the starting slashes MUST be reduced to one.

Regarding this:

What's not covered is how you return the path for an URI object whose string representation is https://example.com///wrong-website.tld

The URI is considered valid per RFC-3986, but currently triggers the CVE noted in the patch description when emitting only the path without normalizing the leading slashes. That's what we're trying to avoid here. If we go the route you are suggesting, getPath() would return /wrong-website.tld, while __toString() would return it verbatim. The full URI string representation is fine; it's only when emitting only the path that the CVE is triggered, and that includes using it in Location headers, or when using origin-form for creating an HTTP request (where the path + query string is used).

As such, I'd be willing to update this patch as follows:

  • Validate that getPath() normalizes the path to reduce multiple leading slashes to one.
  • Validate that __toString() normalizes the path to reduce multiple leading slashes to one IF AND ONLY IF no authority is present in the instance; otherwise, the any leading slashes would be emitted as-is.
  • Validate that RequestInterface::getRequestTarget(), when emitting origin-form, normalizes the path to reduce multiple leading slashes to one. (This should always be the case if the implementation uses getPath(), but it's good to validate.)

If everyone agrees to this, I'll make the changes for the final review.

@kelunik
Copy link

kelunik commented Sep 9, 2022

To me at least, normalizing the full URI only to secure getPath behaviour seems overkill. We should secure getPath and all will be OK.

IMO manually querying the path should have the same serialization as making the URL relative by removing host and scheme and then serializing the whole URL.

@kelunik
Copy link

kelunik commented Sep 9, 2022

@weierophinney Sounds good to me.

@nyamsprod
Copy link
Contributor

an instance where there is no scheme or authority defined. In that case, the string representation contains only the path, which in this case would have multiple leading slashes, leading to the vulnerability.

@weierophinney according to RFC3986 that instance is already invalid see section https://www.rfc-editor.org/rfc/rfc3986#section-3

When authority is present, the path must either be empty or begin with a slash ("/") character. When authority is not present, the path cannot begin with two slash characters ("//").

So to my understanding this is already covered (ie the instance should throw or be invalid). Either way I am OK with your suggestions.

- `testPathWithMultipleSlashes()` modified to have a path with multiple slashes only in the middle of the path.
- Renames `testProperlyTrimsLeadingSlashesToPreventXSS()` to `testGetPathNormalizesMultipleLeadingSlashesToSingleSlashToPreventXSS()`, and has it assert only against results of `getPath()`; on completion, it returns an associative array with the expected URI string and the URI instance.
- Adds `testStringRepresentationWithMultipleSlashes()`; depends on `testGetPathNormalizesMultipleLeadingSlashesToSingleSlashToPreventXSS()`, and asserts that the string representation is identical to the original URI string.

Signed-off-by: Matthew Weier O'Phinney <matthew@weierophinney.net>
Adds RequestIntegrationTest::testGetRequestTargetInOriginFormNormalizesUriWithMultipleLeadingSlashesInPath(), which verifies that calling `getRequestTarget()` with a URI that contains a path with multiple leading slashes normalizes those slases to a single leading slash, in order to prevent XSS attacks.

Signed-off-by: Matthew Weier O'Phinney <matthew@weierophinney.net>
Signed-off-by: Matthew Weier O'Phinney <matthew@weierophinney.net>
@weierophinney
Copy link
Contributor Author

Finally finished up this patch. I've covered all test cases discussed previously for URIs containing paths with multiple leading slashes:

  • UriInterface::getPath() should normalize multiple leading slashes to a single slash.
  • UriInterface::__toString() should leave the path intact.
  • RequestInterface::getRequestTarget() should normalize multiple leading slashes to a single slash when returning origin-form (which is the default specified by the spec).

I modified the original test, UriIntegrationTest::testPathWithMultipleSlashes(), such that the path does NOT contain multiple leading slashes. Two new tests were added to that test case to cover the first two bullet points, and one new test added to RequestIntegrationTest to cover the last.

Copy link
Contributor

@dbu dbu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you, i agree with the changes.

can you please

  • add a note to the changelog
  • add your explanation of the CVE to the XSS test, and also mention where we are ok to keep the multiple slashes because there is no risk?

- Documents the CVE that led to the new tests.
- Documents the differences in behavior between using UriInterface::getPath(), UriInterface::__toString(), and RequestInterface::getRequestTarget().
@weierophinney
Copy link
Contributor Author

can you please

  • add a note to the changelog
  • add your explanation of the CVE to the XSS test, and also mention where we are ok to keep the multiple slashes because there is no risk?

Done!

Copy link
Contributor

@dbu dbu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you!

@dbu dbu merged commit d11c832 into php-http:master Dec 1, 2022
@dbu
Copy link
Contributor

dbu commented Dec 1, 2022

should i do a new release with this change?

@weierophinney
Copy link
Contributor Author

should i do a new release with this change?

Yes, please. When you do, I'll run laminas-diactoros against it.

@weierophinney weierophinney deleted the hotfix/uri-xss-double-slash-test branch December 1, 2022 15:05
@dbu
Copy link
Contributor

dbu commented Dec 1, 2022

weierophinney added a commit to weierophinney/laminas-diactoros that referenced this pull request Dec 14, 2022
Updates to 1.2.0, which adds the tests we wrote for mitigating ZF2015-05, with a few changes:

- When creating the string representation of the URL, we DO NOT normalize the path to remove multiple leading slashes.
  In its absolute form, this is not necessary.
- All normalization is done via `getPath()`; this mitigates the common XSS scenario.
- It adds a test to validate that when using origin-form during a `RequestInterface::getRequestTarget()` call, it will use the results of `getPath()`, as this is a scenario where the XSS could also occur.

I have removed one test from `UriTest`, as it contradicts the first point above.
Since the scenario is covered in the PSR-7 integration tests, we are covered.

See php-http/psr7-integration-tests#54 for more details.

Signed-off-by: Matthew Weier O'Phinney <matthew@weierophinney.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PSR-7 Integration Tests Failing
4 participants