-
Notifications
You must be signed in to change notification settings - Fork 7.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix several bugs with collations. #8051
Conversation
Now we will throw an exception with some trash locales like |
dbms/src/Columns/Collator.cpp
Outdated
} | ||
return false; | ||
} | ||
|
||
Collator::Collator(const std::string & locale_) : locale(Poco::toLower(locale_)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if we use toLowerInPlace
, then can pass locale_
by value and maybe not copy string somewhere ...
it's not important also.
UErrorCode status = U_ZERO_ERROR; | ||
|
||
collator = ucol_open(locale.c_str(), &status); | ||
if (status != U_ZERO_ERROR) | ||
if (U_FAILURE(status)) | ||
{ | ||
ucol_close(collator); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RAII
would be better, but ok for now.
dbms/src/Interpreters/sortBlock.cpp
Outdated
@@ -14,15 +15,21 @@ namespace ErrorCodes | |||
} | |||
|
|||
|
|||
static inline bool needCollation(const IColumn * column, const SortColumnDescription & description) | |||
static inline const IColumn * needCollation(const IColumn * column, const SortColumnDescription & description) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now function's semantic has changes, but function name looks like it is still bool.
dbms/src/Interpreters/sortBlock.cpp
Outdated
{ | ||
const ColumnString & column_string = typeid_cast<const ColumnString &>(*it->first); | ||
const ColumnString & column_string = typeid_cast<const ColumnString &>(*column_string_ptr); | ||
res = column_string.compareAtWithCollation(a, b, *it->first, *it->second.collator); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that if needCollation
have returned nested column, it would have size=1. But we compare rows with numbers a
and b
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll write a test.
dbms/src/Interpreters/sortBlock.cpp
Outdated
{ | ||
const ColumnString & column_string = typeid_cast<const ColumnString &>(*column); | ||
const ColumnString & column_string = typeid_cast<const ColumnString &>(*column_string_ptr); | ||
column_string.getPermutationWithCollation(*description[0].collator, reverse, limit, perm); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same question. If column was const, then nested has size=1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost ok.
Performance tests flap. |
Fix several bugs with collations. (cherry picked from commit af7b8f9)
Fix several bugs with collations. (cherry picked from commit af7b8f9)
Fix several bugs with collations. (cherry picked from commit af7b8f9)
Fix several bugs with collations. (cherry picked from commit af7b8f9)
Fix several bugs with collations. (cherry picked from commit af7b8f9)
I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en
Changelog category (leave one):
Changelog entry (up to few sentences, required except for Non-significant/Documentation categories):
Restore support of all ICU locales, add the ability to apply collations for constant expressions and add language name to system.collations table.