Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

http: decode username and password before encoding #31450

Closed
wants to merge 4 commits into from

Conversation

addaleax
Copy link
Member

@nodejs/url @nodejs/security I’m marking this as blocked, because I’d like somebody to take a look at this who is somewhat familiar with the security implications of this – there are no obvious downsides to me here, but I know this can be quite a sensitive area.

Fixes: #31439

/cc @izonder

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • commit message follows commit guidelines

@addaleax addaleax added http Issues or PRs related to the http subsystem. blocked PRs that are blocked by other issues or PRs. whatwg-url Issues and PRs related to the WHATWG URL implementation. labels Jan 21, 2020
@mcollina
Copy link
Member

I think this PR should also change the docs: https://nodejs.org/api/url.html#url_url_username.

Technically this can be seen as a patch or semver-major change, and I'm uncertain between the two.

@addaleax
Copy link
Member Author

I think this PR should also change the docs: https://nodejs.org/api/url.html#url_url_username.

Done, PTAL

@lpinca
Copy link
Member

lpinca commented Jan 22, 2020

semver-patch in my opinion.

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

- version: REPLACEME
pr-url: https://github.com/nodejs/node/pull/31450
description: When used with `http.request()`, this field will now be
percent-decoded.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing closing -->.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The section on http.request() in http.md seems like a more logical place to me for this note.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The section on http.request() in http.md seems like a more logical place to me for this note.

Why not both? :)

- version: REPLACEME
pr-url: https://github.com/nodejs/node/pull/31450
description: When used with `http.request()`, this field will now be
percent-decoded.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The section on http.request() in http.md seems like a more logical place to me for this note.

@@ -1284,7 +1284,8 @@ function urlToOptions(url) {
options.port = Number(url.port);
}
if (url.username || url.password) {
options.auth = `${url.username}:${url.password}`;
options.auth =
`${decodeURIComponent(url.username)}:${decodeURIComponent(url.password)}`;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This introduces a spec violation. From RFC 7617, section 2:

 Furthermore, a user-id containing a colon character is invalid, as
 the first colon in a user-pass string separates user-id and password
 from one another; text after the first colon is part of the password.
 User-ids containing colons cannot be encoded in user-pass strings.

With this change, I can now write something like:

https.get('https://not%3Agood:god@example.com')

And the server will interpret that as user="not" and pass="good:god" instead of pass="god" (which is the password we still all use for our root accounts, right? Right!)

Copy link

@izonder izonder Jan 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

User-ids containing colons cannot be encoded in user-pass strings.

But what would be proper behavior in case of passing : in username? Encoding username - the wrong way for the computation Authorization header. Pre-validation with throwing an exception? Maybe...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t really want to introduce new exceptions here … maybe just leave it encoded?

The same question also pops up for e.g. %2F for / in the password.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I think : and / are the only problematic ones; for @ it's "last at-sign wins."

It's probably okay to leave those encoded, that's no worse from how it was before.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bnoordhuis Done, PTAL

@jasnell
Copy link
Member

jasnell commented Jan 23, 2020

I'm generally unconvinced we should do this; and if we do make any changes here, it should be done in a manner that is consistent with the WHATWG URL parser. For instance, given the input: https://not%3Agood:g%3aod@example.com, the WHATWG URL parser produces:

URL {
  href: 'https://not%3Agood:g%3aod@example.com/',
  origin: 'https://example.com',
  protocol: 'https:',
  username: 'not%3Agood',
  password: 'g%3aod',
  host: 'example.com',
  hostname: 'example.com',
  port: '',
  pathname: '/',
  search: '',
  searchParams: URLSearchParams {},
  hash: ''
}

(note that https://not%3Agood:g:od@example.com produces the same result.)

@addaleax
Copy link
Member Author

if we do make any changes here, it should be done in a manner that is consistent with the WHATWG URL parser. For instance, given the input: https://not%3Agood:g%3aod@example.com, the WHATWG URL parser produces:

@jasnell I’m not 100 % sure what “consistent with the WHATWG URL parser” means here – the parser itself is not changed in any way. If you do pass in special characters as part of the original URL that are not : or /, this will be treated properly:

> url = new URL('https://not"good:g^od@example.com')
URL {
  href: 'https://not%22good:g%5Eod@example.com/',
  origin: 'https://example.com',
  protocol: 'https:',
  username: 'not%22good',
  password: 'g%5Eod',
  host: 'example.com',
  hostname: 'example.com',
  port: '',
  pathname: '/',
  search: '',
  searchParams: URLSearchParams {},
  hash: ''
}
> urlToOptions(url)
{
  protocol: 'https:',
  hostname: 'example.com',
  hash: '',
  search: '',
  pathname: '/',
  path: '/',
  href: 'https://not%22good:g%5Eod@example.com/',
  auth: 'not"good:g^od'
}

@Trott
Copy link
Member

Trott commented Jan 25, 2020

(needs a rebase)

@jasnell
Copy link
Member

jasnell commented Jan 27, 2020

@addaleax, what I mean is that any decoding/encoding that happens with the username and password here should be identical to what is produced when parsing/serializing with the WHATWG URL parser. This is close, the URL standard does specify some specifics for username and password that are not covered by decodeURIComponent and encodeURIComponent.

@izonder
Copy link

izonder commented Feb 20, 2020

Any updates on the MR?

@@ -1265,6 +1265,10 @@ function domainToUnicode(domain) {
return _domainToUnicode(`${domain}`);
}

function decodeAuth(str) {
return decodeURIComponent(str).replace(':', '%3A').replace('/', '%2F');

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think you need to use global replace at least ;)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@addaleax This is still unaddressed (and, apparently, untested.)

@@ -1265,6 +1265,10 @@ function domainToUnicode(domain) {
return _domainToUnicode(`${domain}`);
}

function decodeAuth(str) {
return decodeURIComponent(str).replace(':', '%3A').replace('/', '%2F');

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be unexpected outcome with undefined url.password or url.username

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible for url.username to be undefined? Isn't it an empty string instead?

@jasnell
Copy link
Member

jasnell commented May 29, 2020

@addaleax ... I took another look at this and I want to remove any objection I have on this. The change looks good

@BridgeAR BridgeAR force-pushed the master branch 2 times, most recently from 8ae28ff to 2935f72 Compare May 31, 2020 12:19
@ronag
Copy link
Member

ronag commented Jun 21, 2020

@addaleax: Is this still blocked?

@addaleax
Copy link
Member Author

@ronag No, I don’t think so, although it’s not ready to land either (there’s a few valid comments above and this needs a rebase)

@@ -1265,6 +1265,10 @@ function domainToUnicode(domain) {
return _domainToUnicode(`${domain}`);
}

function decodeAuth(str) {
return decodeURIComponent(str).replace(':', '%3A').replace('/', '%2F');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/addaleax/node/blob/5ebbb79a3bd56b99821081c23e1f5db9c879fce8/lib/internal/url.js#L1296

const forwardSlashRegEx = /\//g;
+const colonRegEx = /:/g;
Suggested change
return decodeURIComponent(str).replace(':', '%3A').replace('/', '%2F');
return decodeURIComponent(str).replace(colonRegEx, '%3A').replace(forwardSlashRegEx, '%2F');

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only thing todo here I think...

@szmarczak
Copy link
Member

Friendly ping :)

@ronag
Copy link
Member

ronag commented Aug 15, 2020

Conflicts

@ronag
Copy link
Member

ronag commented Aug 15, 2020

I'm removing the blocked label as I suspect it might make collaborators less likely to engage.

@ronag No, I don’t think so, although it’s not ready to land either (there’s a few valid comments above and this needs a rebase)

@ronag ronag removed the blocked PRs that are blocked by other issues or PRs. label Aug 15, 2020
Copy link
Member

@ronag ronag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a request changes to avoid landing while there are still things to address.

@sindresorhus
Copy link

@addaleax Would you be able to address the remaining feedback? 🙏

@ronag
Copy link
Member

ronag commented Dec 27, 2020

@addaleax friendly reminder

@szmarczak
Copy link
Member

Friendly ping :)

@Uzlopak
Copy link
Contributor

Uzlopak commented Jan 6, 2024

@jasnell

Do you think we can close this old PR, as the underlying issue was closed by you few years ago?

@mcollina mcollina closed this Jan 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
http Issues or PRs related to the http subsystem. whatwg-url Issues and PRs related to the WHATWG URL implementation.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Passing username from URL object to http.clientRequest without decoding