Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add URI encoding to mongo auth parameters #986

Merged
merged 4 commits into from
Mar 17, 2016
Merged

Conversation

bgw
Copy link
Contributor

@bgw bgw commented Mar 12, 2016

The mongodb driver requires auth values be URI encoded: mongodb/node-mongodb-native@0440630

This uses node's built-in url module to encode the auth portion, by parsing and re-formatting it, which causes special characters to get URI encoded properly: https://nodejs.org/api/url.html#url_escaped_characters

This is all a bit silly since mongodb just takes our passed uri, and runs it through the same url parser again, but not before explicitly erroring on '@' characters in the uri.

This is similiar to #148 (reverted by #297), but with much less code, and hopefully less breakage. Also, note that uri_decode_auth is no longer needed. That was removed in the above referenced node-mongodb-native commit.

I've tested this on usernames and passwords with @, !, +, and a space.

Presumably this would also work with usernames and passwords that are already URI encoded (since parseUrl will simply unescape it, and formatUrl will escape it again).

@codecov-io
Copy link

Current coverage is 91.63%

Merging #986 into master will increase coverage by +0.01% as of 3be01d7

@@            master    #986   diff @@
======================================
  Files           73      73       
  Stmts         4432    4434     +2
  Branches       888     888       
  Methods          0       0       
======================================
+ Hit           4061    4063     +2
  Partial         10      10       
  Missed         361     361       

Review entire Coverage Diff as of 3be01d7

Powered by Codecov. Updated on successful CI builds.

@drew-gross
Copy link
Contributor

Can you add a test that includes a replicaset URI and a URI with @ in the password?

@facebook-github-bot
Copy link

@bgw updated the pull request.

@facebook-github-bot
Copy link

@bgw updated the pull request.

@bgw
Copy link
Contributor Author

bgw commented Mar 14, 2016

You probably don't want to merge this.

In writing a test for the replica set, I discovered that url.parse breaks on the replica set syntax. I included a patch to node's url module that works around this.

I understand that we probably don't want to merge this (for a lot of reasons), but at least this shows exactly where the issue is, and will hopefully yield some useful discussion.

(code >= 65/*A*/ && code <= 90/*Z*/) ||
code === 43/*+*/ ||
code === 95/*_*/ ||
/* BEGIN MONGO URI PATCH */
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's where the patch to url is.

@drew-gross
Copy link
Contributor

Patching the url module does seem rather drastic. In go, the replicaset URL just works with the default parser. In Ruby, we remove characters from the end of the URL one at a time until it parses successfully, do the modifications we need (for Ruby we are just replacing the password with ********) then add back the characters we removed. @bgw what do you think of that approach?

@bgw
Copy link
Contributor Author

bgw commented Mar 14, 2016

@drew-gross The parser doesn't throw an exception. Instead, parse followed by format converts

mongodb://test:testpass@ds056315-a0.mongolab.com:59325,ds059315-a1.mongolab.com:59315/testDBname?replicaSet=rs-ds059415

to

mongodb://test:testpass@ds056315-a0.mongolab.com:59315/:59325,ds059315-a1.mongolab.com/testDBname?replicaSet=rs-ds059415

and the Url object is:

Url {
  protocol: 'mongodb:',
  slashes: true,
  auth: 'test:testpass',
  host: 'ds056315-a0.mongolab.com:59315',
  port: '59315',
  hostname: 'ds056315-a0.mongolab.com',
  hash: null,
  search: '?replicaSet=rs-ds059415',
  query: 'replicaSet=rs-ds059415',
  pathname: '/:59325,ds059315-a1.mongolab.com/testDBname',
  path: '/:59325,ds059315-a1.mongolab.com/testDBname?replicaSet=rs-ds059415',
  href: 'mongodb://test:testpass@ds056315-a0.mongolab.com:59315/:59325,ds059315-a1.mongolab.com/testDBname?replicaSet=rs-ds059415' }

While unlikely, it's possible that the resulting hostname and path are the result of a correct parsing operation. I could implement a heuristic to figure out when to cut characters off of the end (eg, does the pathname match /^\/:\d+,/?), but I feel that that would also be fragile.

Browsers have a URL constructor that gives a more-easily identifiable wrong result:

URL { href: "mongodb://test:testpass@ds056315-a0…", origin: "null", protocol: "mongodb:", username: "", password: "", host: "", hostname: "", port: "", pathname: "//test:testpass@ds056315-a0.mongola…", search: "" }

but unfortunately node doesn't include it (presumably because it's considered a DOM API).

@flovilmart
Copy link
Contributor

It seems that the mongoclient properly interpolates those replica URL's, http://mongodb.github.io/node-mongodb-native/2.1/reference/connecting/connection-settings/ didnt we start that thread for improperly parsed special characters in usernames/passwords?

The solution you were moving forward was involving a first parser to make sure those values would be properly encoded. Let's drop URL parser, and interpolate the username/password ourselves no?

@bgw
Copy link
Contributor Author

bgw commented Mar 14, 2016

@flovilmart node-mongodb-native has a nearly 400 line function for parsing these URIs using a combination of url.parse and a pile of .split() and regex. They explicitly disallow @ symbols, and I believe that's because it will cause issues with their later split call.

It would be great if the mongo client had a proper parser, but that's a much larger task.

@flovilmart
Copy link
Contributor

Unless we encode the @. Should we check if we have multiple @ chars in the URL and throw in that case?

@bgw
Copy link
Contributor Author

bgw commented Mar 14, 2016

Unless we encode the @.

Right, but how do you find what @ symbols should be encoded and which shouldn't, short of writing a real parser.

Should we check if we have multiple @ chars in the URL and throw in that case?

That's what's happening already in master. This PR is because users have @ symbols in their password or username, and we want to auto-encode them, like parse.com does.

IMHO, another valid approach might be to simply throw a more descriptive error, and/or improve the docs, so users can now to replace @ with %40.

@flovilmart
Copy link
Contributor

@ and : are problematic. I recall another user having that char in their URI provided by mLab.

@drew-gross
Copy link
Contributor

There seems to be a handful of mongo url parsers on npm, could we use one of those? If not I would be fine with either vendoring and patching the url library, or with throwing a better error and/or noting the issue in the docs.

@bgw
Copy link
Contributor Author

bgw commented Mar 15, 2016

@drew-gross

There seems to be a handful of mongo url parsers on npm, could we use one of those?

None of these seem to support symbols in usernames/password properly.

If you think it's a good idea, I could pick whatever one least-bad and submit a patch. The first and last seem to just be an old version of the parser in the node driver extracted into an npm module. The second is developed my mongolab, but looks like a similar but different implementation than the one used in the upstream driver. The third is a literate-coffeescript implementation, and uses a bunch of regex.

As another possibility, I could simply write a real parser with peg.js (I have experience with this kind of stuff). If I could make it fully compatible with the mongo driver's existing parser (which is already well unit tested), we could possibly get support for symbols in the username and password fixed upstream, and clean up some of their code in the process. The downside here is of course that it would take a few days, as opposed to up to a few hours to vendor the url library.

I'm also okay with vendoring the url library at this point (this was supposed to be an easy task!), but I'd like to know your opinion of where the best place to stick it would be.

parse-mongo-url

require("parse-mongo-url")("mongodb://user!with@+ symbols:password!with@+ symbols@localhost:1234/parse")

selection_030

mongodb-uri

var mongodbUri = require("mongodb-uri")
var uri = mongodbUri.format(mongodbUri.parse('mongodb://user!with@+ symbols:password!with@+ symbols@localhost:1234/parse'));

selection_033

mongo-uri

require("mongo-uri").parse("mongodb://user!with@+ symbols:password!with@+ symbols@localhost:1234/parse")

selection_031

mongodb-url

require("mongodb-url")("mongodb://user!with@+ symbols:password!with@+ symbols@localhost:1234/parse")

selection_032

@drew-gross
Copy link
Contributor

Yeah, I definitely underestimated the difficulty of this task, sorry about that :( vendoring something seems like the idea solution to me right now, I'd say into a new src/vendor folder.

bgw added 4 commits March 16, 2016 14:31
The mongodb driver requires auth values be URI encoded:
mongodb/node-mongodb-native@0440630

This uses node's built-in url module to encode the auth portion, by
parsing and re-formatting it, which causes special characters to get URI
encoded properly:
https://nodejs.org/api/url.html#url_escaped_characters

This is all a bit silly since mongodb just takes our passed uri, and
runs it through the same url parser again, but not before explicitly
erroring on '@' characters in the uri.

This is similiar to #148 (reverted by #297), but with much less code,
and hopefully less breakage. Also, note that `uri_decode_auth` is no
longer needed. That was removed in the above referenced
node-mongodb-native commit.

I've tested this on usernames and passwords with @, !, +, and a space.

Presumably this would also work with usernames and passwords that are
already URI encoded (since parseUrl will simply unescape it, and
formatUrl will escape it again).
This uses a *slightly* patched version of node's uri module to allow
commas and colons in hostnames, which causes the parsed representation
of replica sets to be less-awful.

Hostname are still somewhat broken in this representation, because we
have an array of hosts expressed as a list, but this is the
minimum-effort solution to getting format to be able to reprint a parsed
replica set correctly.

I understand that we probably don't want to merge this (for a lot of
reasons), but at least this shows exactly where the issue is, and yields
some useful discussion.
And add a README to src/vendor
@bgw bgw force-pushed the mongo-uri-encode-auth branch from 7a295fa to 530fad5 Compare March 16, 2016 21:55
@bgw
Copy link
Contributor Author

bgw commented Mar 16, 2016

Vendor-ified. Let me know if you want me to squash these commits before merging.

@facebook-github-bot
Copy link

@bgw updated the pull request.

@drew-gross
Copy link
Contributor

Thanks! Glad to see this issue finally put to rest. Normally we ask to squash if the PR contains many unrelated or reverted commits, but this one seems fine as is. Up to you. I've given you write access to this repo as well, we also have the reviewer-merge tag here, just like the dashboard repo.

bgw added a commit that referenced this pull request Mar 17, 2016
Add URI encoding to mongo auth parameters
@bgw bgw merged commit 14d3062 into master Mar 17, 2016
@drew-gross drew-gross deleted the mongo-uri-encode-auth branch March 17, 2016 18:02
@vladkosarev
Copy link

Is this supposed to be live now (2.2.6)? I had the password problem with a complex password which were resolved by manually URL encoding the password.

@drew-gross
Copy link
Contributor

Yes it should be. Can you post a mongo url similar the one you were using that fails? We can add it to our test cases.

@vladkosarev
Copy link

This is the password that won't work properly (complains about access to _SCHEMA) -
^U7#V6W$NA$nE2bf

It also behaves differently if you remove the first character (can't auth) -
U7#V6W$NA$nE2bf

Both work fine if I manually encode them.

@tony-pizza
Copy link

Running into a similar issue with a password that contains uppercase letters before a forward slash /. It works if I manually encode the slash. Otherwise everything before the slash is lowercased:

before formatting: MY/PASSWORD
 after formatting: my/PASSWORD

@tony-pizza tony-pizza mentioned this pull request Apr 14, 2016
3 tasks
@drew-gross
Copy link
Contributor

@vladkosarev @peeinears I think we are going to start recommending manual encoding. We've already had to vendor and patch a URL parsing library, and I think further patches are only going to cause a mess later.

@vladkosarev
Copy link

@drew-gross I'm all for that. If official docs mentioned encoding of passwords it would've saved me hours of frustration.

@vitaly-t
Copy link
Contributor

vitaly-t commented Mar 24, 2019

Could all this be simplified by using the generic connection-string? That MongoDB legacy parser is an overkill anyhow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants