-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-12567: [C++][Gandiva] Implement ILIKE SQL function #10179
ARROW-12567: [C++][Gandiva] Implement ILIKE SQL function #10179
Conversation
cpp/src/gandiva/ilike_holder.h
Outdated
namespace gandiva { | ||
|
||
/// Function Holder for SQL 'ilike' | ||
class GANDIVA_EXPORT IlikeHolder : public FunctionHolder { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can reuse the LikeHolder class and provide the case-sensitivity option in the make method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I created a Make method and constructor for the ILIKE function on LikeHolder, and deleted the IlikeHolder
cpp/src/gandiva/ilike_holder.cc
Outdated
std::string& pattern = holder->pattern_; | ||
auto literal_type = node.children().at(1)->return_type(); | ||
|
||
if (RE2::FullMatch(pattern, starts_with_regex_)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this seems incorrect, they should be case-insensitive
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not really, the pattern has (?i)(google/re2#197) to turn case-insensitive on for the pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this matching doesn't matter..actual starts_with, ends_with, sub_str are not case-insensitive
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we match on the pattern for starts_with, ends_with cases, and replace with them..which won't be case-insensitive
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, now both like and ilike are using the same method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not yet fixed. You are still replacing with starts_with, ends_with which are case-sensitive. You need to disable TryOptimise for ILIKE
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, sorry I thought first that it had no difference.
cpp/src/gandiva/like_holder.h
Outdated
@@ -48,6 +51,9 @@ class GANDIVA_EXPORT LikeHolder : public FunctionHolder { | |||
private: | |||
explicit LikeHolder(const std::string& pattern) : pattern_(pattern), regex_(pattern) {} | |||
|
|||
explicit LikeHolder(const std::string& pattern, RE2::Options regex_op) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need explicit here. explicit is used to prevent implicit conversions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
RE2::Options regex_op; | ||
regex_op.set_case_sensitive(false); // set case-insensitive for ilike function. | ||
|
||
return Make(arrow::util::get<std::string>(literal->holder()), holder, regex_op); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why create a new method for this. Modify the existing one take options
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
cpp/src/gandiva/like_holder.cc
Outdated
std::string pcre_pattern; | ||
ARROW_RETURN_NOT_OK(RegexUtil::SqlLikePatternToPcre(sql_pattern, pcre_pattern)); | ||
|
||
auto lholder = std::shared_ptr<LikeHolder>(new LikeHolder(pcre_pattern)); | ||
std::shared_ptr<LikeHolder> lholder; | ||
if (regex_op.case_sensitive()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can just call holder = std::shared_ptr<LikeHolder>(new LikeHolder(pcre_pattern, regex_op));
No need for if here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@projjal needs a rebase |
608e08f
to
f160880
Compare
Closes apache#10179 from jvictorhuguenin/feature/implement-sql-ilike and squashes the following commits: f160880 <frank400> Optimize holder constructor call 97e6e2d <frank400> Remove unnecessary Make method c2363b1 <frank400> Disable TryOptimize for ilike a484149 <frank400> Fix checkstyle on cmake file c6a8372 <frank400> Delete unnecessary holder 4be6cc6 <frank400> Fix redefined function b78085a <frank400> Fix miss include 2efd43e <frank400> Implement ilike function Authored-by: frank400 <j.victorhuguenin2018@gmail.com> Signed-off-by: Praveen <praveen@dremio.com> (cherry picked from commit 0072c67)
Closes apache#10179 from jvictorhuguenin/feature/implement-sql-ilike and squashes the following commits: f160880 <frank400> Optimize holder constructor call 97e6e2d <frank400> Remove unnecessary Make method c2363b1 <frank400> Disable TryOptimize for ilike a484149 <frank400> Fix checkstyle on cmake file c6a8372 <frank400> Delete unnecessary holder 4be6cc6 <frank400> Fix redefined function b78085a <frank400> Fix miss include 2efd43e <frank400> Implement ilike function Authored-by: frank400 <j.victorhuguenin2018@gmail.com> Signed-off-by: Praveen <praveen@dremio.com>
* ARROW-11960: [C++][Gandiva] Support escape in LIKE Add gdv_fn_like_utf8_utf8_int8 function in Gandiva to support escape char in LIKE. An escape char is stored in an int8 type which is compatible with char type in C++. Closes apache#9700 from Crystrix/arrow-11960 Authored-by: crystrix <chenxi.li@live.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com> * ARROW-12567: [C++][Gandiva] Implement ILIKE SQL function Closes apache#10179 from jvictorhuguenin/feature/implement-sql-ilike and squashes the following commits: f160880 <frank400> Optimize holder constructor call 97e6e2d <frank400> Remove unnecessary Make method c2363b1 <frank400> Disable TryOptimize for ilike a484149 <frank400> Fix checkstyle on cmake file c6a8372 <frank400> Delete unnecessary holder 4be6cc6 <frank400> Fix redefined function b78085a <frank400> Fix miss include 2efd43e <frank400> Implement ilike function Authored-by: frank400 <j.victorhuguenin2018@gmail.com> Signed-off-by: Praveen <praveen@dremio.com> * ARROW-12410: [C++][Gandiva] Implement regexp_replace function on Gandiva Closes apache#10059 from rodrigojdebem/feature/implement-regexp-replace and squashes the following commits: baf2778 <rodrigojdebem> Add implementation for REGEXP_REPLACE Authored-by: rodrigojdebem <rodrigodebem1@gmail.com> Signed-off-by: Praveen <praveen@dremio.com> Co-authored-by: crystrix <chenxi.li@live.com> Co-authored-by: frank400 <j.victorhuguenin2018@gmail.com> Co-authored-by: rodrigojdebem <rodrigodebem1@gmail.com>
No description provided.