-
Notifications
You must be signed in to change notification settings - Fork 30.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
url: ignore IDN errors when domainname have two hyphens #12966
Changes from all commits
2ee246f
9dee867
311667b
5e0f58c
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -435,8 +435,7 @@ bool InitializeICUDirectory(const std::string& path) { | |
|
||
int32_t ToUnicode(MaybeStackBuffer<char>* buf, | ||
const char* input, | ||
size_t length, | ||
bool lenient) { | ||
size_t length) { | ||
UErrorCode status = U_ZERO_ERROR; | ||
uint32_t options = UIDNA_DEFAULT; | ||
options |= UIDNA_NONTRANSITIONAL_TO_UNICODE; | ||
|
@@ -461,14 +460,7 @@ int32_t ToUnicode(MaybeStackBuffer<char>* buf, | |
&status); | ||
} | ||
|
||
// UTS #46's ToUnicode operation applies no validation of domain name length | ||
// (nor a flag requesting it to do so, like VerifyDnsLength for ToASCII). For | ||
// that reason, unlike ToASCII below, ICU4C correctly accepts long domain | ||
// names. However, ICU4C still sets the EMPTY_LABEL error in contrary to UTS | ||
// #46. Therefore, explicitly filters out that error here. | ||
info.errors &= ~UIDNA_ERROR_EMPTY_LABEL; | ||
|
||
if (U_FAILURE(status) || (!lenient && info.errors != 0)) { | ||
if (U_FAILURE(status)) { | ||
len = -1; | ||
buf->SetLength(0); | ||
} else { | ||
|
@@ -517,6 +509,21 @@ int32_t ToASCII(MaybeStackBuffer<char>* buf, | |
info.errors &= ~UIDNA_ERROR_LABEL_TOO_LONG; | ||
info.errors &= ~UIDNA_ERROR_DOMAIN_NAME_TOO_LONG; | ||
|
||
// These error conditions are mandated unconditionally by UTS #46 version | ||
// 9.0.0 (rev. 17), but were found to be incompatible with many actual domain | ||
// names in the wild. As such, in the current UTS #46 draft (rev. 18) these | ||
// checks are made optional depending on the CheckHyphens flag, which will be | ||
// disabled in WHATWG URL's "domain to ASCII" algorithm as soon as the UTS #46 | ||
// draft becomes standard. | ||
// Refs: | ||
// - https://github.com/whatwg/url/issues/53 | ||
// - http://www.unicode.org/review/pri317/ | ||
// - http://www.unicode.org/reports/tr46/tr46-18.html | ||
// - https://www.icann.org/news/announcement-2000-01-07-en | ||
info.errors &= ~UIDNA_ERROR_HYPHEN_3_4; | ||
info.errors &= ~UIDNA_ERROR_LEADING_HYPHEN; | ||
info.errors &= ~UIDNA_ERROR_TRAILING_HYPHEN; | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Unfortunately I was mistaken in my understanding of ToUnicode, which means that the entire block (ll. 464-485) shouldn't be necessary. While at it, can you delete this block, the erroneous There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done. |
||
if (U_FAILURE(status) || (!lenient && info.errors != 0)) { | ||
len = -1; | ||
buf->SetLength(0); | ||
|
@@ -533,11 +540,9 @@ static void ToUnicode(const FunctionCallbackInfo<Value>& args) { | |
CHECK_GE(args.Length(), 1); | ||
CHECK(args[0]->IsString()); | ||
Utf8Value val(env->isolate(), args[0]); | ||
// optional arg | ||
bool lenient = args[1]->BooleanValue(env->context()).FromJust(); | ||
|
||
MaybeStackBuffer<char> buf; | ||
int32_t len = ToUnicode(&buf, *val, val.length(), lenient); | ||
int32_t len = ToUnicode(&buf, *val, val.length()); | ||
|
||
if (len < 0) { | ||
return env->ThrowError("Cannot convert name to Unicode"); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
domain to Unicode
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. done all changes.