[fix](judge-partition) Fix incorrect logic in determining partitioned table #27515
…is a partitioned table
run buildall
(From new machine) TeamCity pipeline, clickbench performance test result:
TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
LGTM
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
LGTM
…is a partitioned table (apache#27515)
The old logic decided whether a table was partitioned from its number of buckets, so a partitioned table with only one partition, and only one bucket in that partition, was mistakenly recognized as a non-partitioned table:
```
Table[test_load_doris_to_hive_2] is not partitioned
```
This PR #27515 changed the logic of Table's `isPartitioned()` method. But this method has 2 usages:
1. To check whether a table is range or list partitioned, for operations such as Alter and Export. In this case, it should return true if the table is range or list partitioned, even if it has only one partition and one bucket.
2. To check whether the data is distributed (either by partitions or by buckets), for the query planner. In this case, it should return true if the table has more than one bucket, even if the table is not range or list partitioned.

So we should separate this method into two, one for each usage. Otherwise, it may cause some unreasonable plan shapes.
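The split described above could look roughly like the following. This is only a minimal sketch, not the Doris code: `TableSketch`, `PartitionType`, and both method names are hypothetical, and it assumes that "more than one bucket" means the total bucket count across all partitions.

```java
import java.util.List;

// Hypothetical stand-in for a table's partitioning metadata (not a Doris class).
enum PartitionType { UNPARTITIONED, RANGE, LIST }

final class TableSketch {
    private final PartitionType partitionType;
    // Bucket count of each partition; an unpartitioned table has one entry.
    private final List<Integer> bucketsPerPartition;

    TableSketch(PartitionType partitionType, List<Integer> bucketsPerPartition) {
        this.partitionType = partitionType;
        this.bucketsPerPartition = List.copyOf(bucketsPerPartition);
    }

    // Usage 1 (Alter, Export, ...): true when the table is declared range or
    // list partitioned, regardless of how many partitions or buckets exist.
    boolean isPartitionedTable() {
        return partitionType == PartitionType.RANGE
                || partitionType == PartitionType.LIST;
    }

    // Usage 2 (query planner): true when the data is spread over more than
    // one bucket. Assumption: buckets are summed across all partitions.
    boolean isPartitionDistributed() {
        int totalBuckets = 0;
        for (int buckets : bucketsPerPartition) {
            totalBuckets += buckets;
        }
        return totalBuckets > 1;
    }
}
```

Under these assumptions, a range-partitioned table with one partition and one bucket answers true to the first question and false to the second, while an unpartitioned table with eight buckets answers the other way round, which is why a single method cannot serve both callers.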
Proposed changes
Issue Number: close #xxx
The old logic decided whether a table was partitioned from its number of buckets, so a partitioned table with only one partition, and only one bucket in that partition, was mistakenly recognized as a non-partitioned table.