Handling IDNA by adding Punycode encoding in urlParse. #1149 #1174
Looking good. Punycode stuff should be moved to a separate file with its own copyright header and the LICENSE file should be updated.
I would like to keep those two tests. They were added because people use the url.parse function to screen out "valid" URLs from user input, so keeping invalid characters out of the hostname is important. I agree with @ry that punycode should be in lib/punycode.js. We can update the invalid host chars regexps to screen out only the invalid characters that are in the ASCII range.
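A minimal sketch of what "screen out only the invalid characters in the ASCII range" could look like. The helper name `firstInvalidHostChar` and the exact set of allowed characters (letters, digits, hyphen, dot) are assumptions for illustration, not the regexps actually used in lib/url.js.

```javascript
// Hypothetical helper, not the real lib/url.js code.
// Matches any ASCII character that is not a letter, digit, '-' or '.'.
// Characters above U+007F are deliberately NOT matched, so they survive
// validation and can be Punycode-encoded in a later step.
const INVALID_ASCII_HOST_CHAR = /[\x00-\x2C\x2F\x3A-\x40\x5B-\x60\x7B-\x7F]/;

// Returns the index of the first invalid ASCII character, or -1 if none.
function firstInvalidHostChar(hostname) {
  return hostname.search(INVALID_ASCII_HOST_CHAR);
}

console.log(firstInvalidHostChar('example.com'));      // -1
console.log(firstInvalidHostChar('exa mple.com'));     // 3
console.log(firstInvalidHostChar('m\u00fcller.de'));   // -1 (non-ASCII is left alone)
```

With a check of this shape, the two existing tests for invalid hostname characters can stay in place while internationalized hostnames still pass through to the encoding step.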
…k 2 test cases and update urlParse to make sure we skip only invalid chars in the ASCII range.
I've re-enabled the 2 test cases. I've tweaked urlParse to skip only chars within the ASCII range; maybe the way I did it is a little bit crappy? By doing this I can apply the IDNA encoding after the validation, and I'm sure the encoding is done on the hostname only.
I'm uncomfortable with pulling code off a random website like this and attaching a random copyright header to it. Especially since the author says he created it by modifying a Python version. Who created the Python version, and what's its license? Does it allow that? Can we please rewrite this from scratch using only the RFC?
He derived his work from the C algorithm that comes from the RFC (see: http://tools.ietf.org/html/rfc3492#appendix-C). If you think that's not enough, I'll rewrite the whole implementation based on the RFC and the C sample provided in it.
@ry Punycode or full IDNA?
@jeremys I have some C code I could tidy up and publish but I'll hold off if you pick up this one.
The one-stop MIT-licensed solution to all your JS punycode encoding and decoding needs: https://gist.github.com/1035853
Ok, I'll replace the code in question in my pull request with your gist. Thank you very much @bnoordhuis.
…ib to handle parsing of domain before encoding.
So I've replaced the punycode lib with @bnoordhuis's one. I've added one more test case and updated your code to pass JSLint. I've also moved the parsing of the domain, and the detection of whether punycode is needed, into url.js. Let me know if it needs anything else.
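For readers following along, here is a rough sketch of the split described above: the URL side walks the hostname label by label and only hands labels containing non-ASCII characters to a Punycode encoder. This is not the code from the pull request or the gist. `encodeLabel` is a compact encoder written from the RFC 3492 pseudocode, `hostnameToASCII` is a hypothetical name, and overflow checks and Nameprep normalization (case folding and so on) are left out.

```javascript
// RFC 3492 bootstring parameters for Punycode.
const BASE = 36, TMIN = 1, TMAX = 26, SKEW = 38, DAMP = 700;
const INITIAL_BIAS = 72, INITIAL_N = 128;

// Digit value 0..25 maps to 'a'..'z', 26..35 maps to '0'..'9'.
function digitToChar(d) {
  return String.fromCharCode(d < 26 ? 97 + d : 22 + d);
}

// Bias adaptation, following RFC 3492 section 6.1.
function adapt(delta, numPoints, firstTime) {
  delta = firstTime ? Math.floor(delta / DAMP) : delta >> 1;
  delta += Math.floor(delta / numPoints);
  let k = 0;
  while (delta > ((BASE - TMIN) * TMAX) >> 1) {
    delta = Math.floor(delta / (BASE - TMIN));
    k += BASE;
  }
  return k + Math.floor(((BASE - TMIN + 1) * delta) / (delta + SKEW));
}

// Encodes one label (no "xn--" prefix). Sketch only: no overflow checks.
function encodeLabel(label) {
  const input = Array.from(label, (c) => c.codePointAt(0));
  let output = input.filter((cp) => cp < 0x80)
    .map((cp) => String.fromCharCode(cp)).join('');
  const basicCount = output.length;
  let handled = basicCount;
  if (basicCount > 0) output += '-';
  let n = INITIAL_N, delta = 0, bias = INITIAL_BIAS;
  while (handled < input.length) {
    // Smallest code point not yet handled.
    const m = Math.min(...input.filter((cp) => cp >= n));
    delta += (m - n) * (handled + 1);
    n = m;
    for (const cp of input) {
      if (cp < n) delta++;
      if (cp === n) {
        // Emit delta as a generalized variable-length integer.
        let q = delta;
        for (let k = BASE; ; k += BASE) {
          const t = k <= bias ? TMIN : k >= bias + TMAX ? TMAX : k - bias;
          if (q < t) break;
          output += digitToChar(t + ((q - t) % (BASE - t)));
          q = Math.floor((q - t) / (BASE - t));
        }
        output += digitToChar(q);
        bias = adapt(delta, handled + 1, handled === basicCount);
        delta = 0;
        handled++;
      }
    }
    delta++;
    n++;
  }
  return output;
}

// Hypothetical URL-side helper: encode only the labels that need it.
function hostnameToASCII(hostname) {
  return hostname.split('.').map((label) =>
    /[^\x00-\x7F]/.test(label) ? 'xn--' + encodeLabel(label) : label
  ).join('.');
}

console.log(hostnameToASCII('m\u00fcller.example.com')); // xn--mller-kva.example.com
```

Keeping the label splitting and the non-ASCII detection outside the encoder means the Punycode file stays a pure implementation of the RFC algorithm, while the URL parser decides when encoding applies.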
Hey there, is there something wrong with my pull request that I should change?
@jeremys: Ryan is on holiday so it's quiet on the node front.
Ok @bnoordhuis, thanks for the info, I was just wondering if I was holding up something :)
@ry With the release of 0.5.0, should I update my pull request? Or is there something wrong with it?
@jeremys Nothing wrong with it. Just been buried under some other things, sorry.
No worries!
I haven't tested - but after a cursory glance it looks good to me. @bnoordhuis or @isaacs can test and land at will.
merging this now.
Squashed into 2a848fa |
Using @bnoordhuis's punycode lib. Close nodejs#1174 also
We could also export both functions toASCII and toUnicode.
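To illustrate what exporting such a pair might look like, here is a small sketch. It does not use the punycode lib from this pull request. It assumes a modern Node.js release, where `url.domainToASCII` and `url.domainToUnicode` are available, and wraps them under the names suggested above purely for illustration.

```javascript
const url = require('url');

// Thin wrappers; the names toASCII/toUnicode mirror the suggestion above,
// but the implementation here is Node's built-in url module (an assumption
// for this sketch, not the lib merged in this pull request).
function toASCII(domain) {
  return url.domainToASCII(domain);
}

function toUnicode(domain) {
  return url.domainToUnicode(domain);
}

module.exports = { toASCII, toUnicode };

console.log(toASCII('espa\u00f1ol.com'));       // xn--espaol-zwa.com
console.log(toUnicode('xn--espaol-zwa.com'));   // español.com
```

Exposing both directions lets callers display a human-readable hostname while still sending the ASCII form over the wire.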