Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature] Support InPredicate in delete statement #4006

Merged
merged 22 commits into from
Aug 6, 2020

Conversation

caiconghui
Copy link
Contributor

This PR is to add inPredicate support to delete statement, and add max_allowed_in_element_num_of_delete variable to limit element num of InPredicate in delete statement.

@caiconghui
Copy link
Contributor Author

for #4008

@morningman
Copy link
Contributor

In fact, we are also trying to implement a batch delete function that can support the deletion of a large number of specified keys (including specifying multiple values in inPredicate)

In your current implementation, if there are too many values in InPredicate (for example, tens of thousands), then each column must be looped through tens of thousands of conditions during query, which may be inefficient.

Maybe you can discuss it with @yangzhg about it.

fe/src/main/java/org/apache/doris/load/DeleteHandler.java Outdated Show resolved Hide resolved
fe/src/main/java/org/apache/doris/qe/SessionVariable.java Outdated Show resolved Hide resolved
be/src/olap/delete_handler.cpp Outdated Show resolved Hide resolved
be/src/olap/olap_cond.cpp Outdated Show resolved Hide resolved
be/src/olap/olap_cond.cpp Outdated Show resolved Hide resolved
be/src/olap/olap_cond.cpp Outdated Show resolved Hide resolved
be/src/olap/olap_cond.h Outdated Show resolved Hide resolved
wrapperField.release_field();
return ret;
}
case OP_NOT_IN: {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These 2 case (OP_IN and OP_NOT_IN) looks same?

Copy link
Contributor Author

@caiconghui caiconghui Jul 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one is "!="(line 196), the other is "=="(line 202)

@@ -176,6 +150,14 @@ OLAPStatus Cond::init(const TCondition& tcond, const TabletColumn& column) {
tcond.column_name.c_str(), operand.c_str(), op);
return res;
}
if (min_value_field == nullptr || f->cmp(min_value_field) > 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a little confused.

A->cmp(B) > 0 means A > B or A < B?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my bad, I will fix it A->cmp(B) > 0 means A > B

typedef std::unordered_set<const WrapperField*, FieldHash, FieldEqual> FieldSet;
FieldSet operand_set;
WrapperField* min_value_field;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add comment to explains these 2 new fields.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@morningman morningman self-assigned this Jul 29, 2020
@morningman morningman added area/delete Issues or PRs related to DELETE operation kind/feature Categorizes issue or PR as related to a new feature. labels Jul 29, 2020
morningman
morningman previously approved these changes Aug 1, 2020
Copy link
Contributor

@morningman morningman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@morningman
Copy link
Contributor

UT failed.
FE:
DeleteStmtTest.testAnalyze

string condition_str = construct_sub_predicates(condition);
del_pred->add_sub_predicates(condition_str);
LOG(INFO) << "store one sub-delete condition. condition=" << condition_str;
if (condition.condition_values.size() > 1) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to check the condition.condition_op to see if this condition is InPredicate, not by the size of condition.condition_values.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because inPredicate with one element to stored as BinaryPredicate which is more more light and efficient.

morningman
morningman previously approved these changes Aug 1, 2020
Copy link
Contributor

@morningman morningman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@morningman morningman added approved Indicates a PR has been approved by one committer. branch-0.13 PR which need to merge to branch 0.13 labels Aug 1, 2020
kangkaisen
kangkaisen previously approved these changes Aug 3, 2020
Copy link
Contributor

@kangkaisen kangkaisen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

protobuf LGTM

Copy link
Contributor

@morningman morningman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@morningman morningman merged commit eefad13 into apache:master Aug 6, 2020
@EmmyMiao87 EmmyMiao87 mentioned this pull request Aug 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by one committer. area/delete Issues or PRs related to DELETE operation branch-0.13 PR which need to merge to branch 0.13 kind/feature Categorizes issue or PR as related to a new feature.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants