Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-12567: [C++][Gandiva] Implement ILIKE SQL function #10179

Closed

Conversation

jvictorhuguenin
Copy link
Contributor

No description provided.

@github-actions
Copy link

namespace gandiva {

/// Function Holder for SQL 'ilike'
class GANDIVA_EXPORT IlikeHolder : public FunctionHolder {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can reuse the LikeHolder class and provide the case-sensitivity option in the make method

Copy link
Contributor Author

@jvictorhuguenin jvictorhuguenin Apr 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I created a Make method and constructor for the ILIKE function on LikeHolder, and deleted the IlikeHolder

std::string& pattern = holder->pattern_;
auto literal_type = node.children().at(1)->return_type();

if (RE2::FullMatch(pattern, starts_with_regex_)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems incorrect, they should be case-insensitive

Copy link
Contributor Author

@jvictorhuguenin jvictorhuguenin Apr 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not really, the pattern has (?i)(google/re2#197) to turn case-insensitive on for the pattern.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this matching doesn't matter..actual starts_with, ends_with, sub_str are not case-insensitive

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we match on the pattern for starts_with, ends_with cases, and replace with them..which won't be case-insensitive

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, now both like and ilike are using the same method.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not yet fixed. You are still replacing with starts_with, ends_with which are case-sensitive. You need to disable TryOptimise for ILIKE

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, sorry I thought first that it had no difference.

@@ -48,6 +51,9 @@ class GANDIVA_EXPORT LikeHolder : public FunctionHolder {
private:
explicit LikeHolder(const std::string& pattern) : pattern_(pattern), regex_(pattern) {}

explicit LikeHolder(const std::string& pattern, RE2::Options regex_op)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need explicit here. explicit is used to prevent implicit conversions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

RE2::Options regex_op;
regex_op.set_case_sensitive(false); // set case-insensitive for ilike function.

return Make(arrow::util::get<std::string>(literal->holder()), holder, regex_op);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why create a new method for this. Modify the existing one take options

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

std::string pcre_pattern;
ARROW_RETURN_NOT_OK(RegexUtil::SqlLikePatternToPcre(sql_pattern, pcre_pattern));

auto lholder = std::shared_ptr<LikeHolder>(new LikeHolder(pcre_pattern));
std::shared_ptr<LikeHolder> lholder;
if (regex_op.case_sensitive()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can just call holder = std::shared_ptr<LikeHolder>(new LikeHolder(pcre_pattern, regex_op));
No need for if here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@praveenbingo
Copy link
Contributor

@projjal needs a rebase

@jvictorhuguenin jvictorhuguenin force-pushed the feature/implement-sql-ilike branch from 608e08f to f160880 Compare June 7, 2021 13:00
jvictorhuguenin added a commit to s1mbi0se/arrow that referenced this pull request Sep 21, 2021
Closes apache#10179 from jvictorhuguenin/feature/implement-sql-ilike and squashes the following commits:

f160880 <frank400> Optimize holder constructor call
97e6e2d <frank400> Remove unnecessary Make method
c2363b1 <frank400> Disable TryOptimize for ilike
a484149 <frank400> Fix checkstyle on cmake file
c6a8372 <frank400> Delete unnecessary holder
4be6cc6 <frank400> Fix redefined function
b78085a <frank400> Fix miss include
2efd43e <frank400> Implement ilike function

Authored-by: frank400 <j.victorhuguenin2018@gmail.com>
Signed-off-by: Praveen <praveen@dremio.com>
(cherry picked from commit 0072c67)
@anthonylouisbsb anthonylouisbsb deleted the feature/implement-sql-ilike branch September 28, 2021 22:32
zhouyuan pushed a commit to zhouyuan/arrow that referenced this pull request Nov 24, 2021
Closes apache#10179 from jvictorhuguenin/feature/implement-sql-ilike and squashes the following commits:

f160880 <frank400> Optimize holder constructor call
97e6e2d <frank400> Remove unnecessary Make method
c2363b1 <frank400> Disable TryOptimize for ilike
a484149 <frank400> Fix checkstyle on cmake file
c6a8372 <frank400> Delete unnecessary holder
4be6cc6 <frank400> Fix redefined function
b78085a <frank400> Fix miss include
2efd43e <frank400> Implement ilike function

Authored-by: frank400 <j.victorhuguenin2018@gmail.com>
Signed-off-by: Praveen <praveen@dremio.com>
zhouyuan added a commit to oap-project/arrow that referenced this pull request Nov 25, 2021
* ARROW-11960: [C++][Gandiva] Support escape in LIKE

Add gdv_fn_like_utf8_utf8_int8 function in Gandiva to support escape char in LIKE. An escape char is stored in an int8 type which is compatible with char type in C++.

Closes apache#9700 from Crystrix/arrow-11960

Authored-by: crystrix <chenxi.li@live.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>

* ARROW-12567: [C++][Gandiva] Implement ILIKE SQL function

Closes apache#10179 from jvictorhuguenin/feature/implement-sql-ilike and squashes the following commits:

f160880 <frank400> Optimize holder constructor call
97e6e2d <frank400> Remove unnecessary Make method
c2363b1 <frank400> Disable TryOptimize for ilike
a484149 <frank400> Fix checkstyle on cmake file
c6a8372 <frank400> Delete unnecessary holder
4be6cc6 <frank400> Fix redefined function
b78085a <frank400> Fix miss include
2efd43e <frank400> Implement ilike function

Authored-by: frank400 <j.victorhuguenin2018@gmail.com>
Signed-off-by: Praveen <praveen@dremio.com>

* ARROW-12410: [C++][Gandiva] Implement regexp_replace function on Gandiva

Closes apache#10059 from rodrigojdebem/feature/implement-regexp-replace and squashes the following commits:

baf2778 <rodrigojdebem> Add implementation for REGEXP_REPLACE

Authored-by: rodrigojdebem <rodrigodebem1@gmail.com>
Signed-off-by: Praveen <praveen@dremio.com>

Co-authored-by: crystrix <chenxi.li@live.com>
Co-authored-by: frank400 <j.victorhuguenin2018@gmail.com>
Co-authored-by: rodrigojdebem <rodrigodebem1@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants