Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support import true or false as boolean value #3898

Merged
merged 5 commits into from
Jul 2, 2020

Conversation

yangzhg
Copy link
Member

@yangzhg yangzhg commented Jun 18, 2020

Fixes #3831
After this PR
insert into: 1/"1" -> 1, 0/"0"->0, true/"true"->1, false/"false" -> 0, "10"->null, "xxxx" -> null
load: 1/true -> 1, 0/false -> 0, other -> null

@morningman morningman added area/load Issues or PRs related to all kinds of load kind/feature Categorizes issue or PR as related to a new feature. labels Jun 18, 2020
be/src/exprs/cast_functions.cpp Outdated Show resolved Hide resolved
be/src/olap/utils.cpp Show resolved Hide resolved
@yangzhg yangzhg requested a review from chaoyli June 18, 2020 11:35
@caiconghui
Copy link
Contributor

caiconghui commented Jun 19, 2020

@yangzhg Hi, I think the result of processing bool value for stream load and insert stmt should keep consistent.
For now, like insert into xx values(1), result=true, insert into xx values(0) result=false, insert into xxx values("false") result=true insert into xx values(false) result=false insert into xx values("0") result=true, insert into xx values(true) result=true.
This is because insert stmt can analyze the real data type for insert value, so like 1 2 3, will use int_cast_to_bool, even can analyze boolean type. but for stream load, we lost the data type information. we don't know the real data type of the value, we treat all column values as string, so I think may be stream load should be processed specially to keep consistent with insert stmt. maybe need more discussion.

@yangzhg
Copy link
Member Author

yangzhg commented Jun 22, 2020

@yangzhg Hi, I think the result of processing bool value for stream load and insert stmt should keep consistent.
For now, like insert into xx values(1), result=true, insert into xx values(0) result=false, insert into xxx values("false") result=true insert into xx values(false) result=false insert into xx values("0") result=true, insert into xx values(true) result=true.
This is because insert stmt can analyze the real data type for insert value, so like 1 2 3, will use int_cast_to_bool, even can analyze boolean type. but for stream load, we lost the data type information. we don't know the real data type of the value, we treat all column values as string, so I think may be stream load should be processed specially to keep consistent with insert stmt. maybe need more discussion.

@chaoyli Insert into xx values("0") and stream load import file 0 is not consistent in the current code. Insert into xx values("0") result=true, stream load 0 result=false, and now cast_to_bool Only occurs when the type of the column is bool, so it is not contradictory to the current insert

@chaoyli
Copy link
Contributor

chaoyli commented Jun 24, 2020

I think the meaning of caiconghui is to make consistent of stream load and insert in any situation.
Otherwise, it will cause some misunderstanding. @yangzhg

@yangzhg
Copy link
Member Author

yangzhg commented Jun 24, 2020

@yangzhg Hi, I think the result of processing bool value for stream load and insert stmt should keep consistent.
For now, like insert into xx values(1), result=true, insert into xx values(0) result=false, insert into xxx values("false") result=true insert into xx values(false) result=false insert into xx values("0") result=true, insert into xx values(true) result=true.
This is because insert stmt can analyze the real data type for insert value, so like 1 2 3, will use int_cast_to_bool, even can analyze boolean type. but for stream load, we lost the data type information. we don't know the real data type of the value, we treat all column values as string, so I think may be stream load should be processed specially to keep consistent with insert stmt. maybe need more discussion.

@caiconghui I will change the insert statement and stream load to keep consistent with mysql

@yangzhg yangzhg force-pushed the support_bool branch 2 times, most recently from d3ad424 to 2a15dff Compare June 29, 2020 06:38
@morningman morningman added the approved Indicates a PR has been approved by one committer. label Jul 2, 2020
@morningman morningman merged commit 5ade21b into apache:master Jul 2, 2020
@yangzhg yangzhg deleted the support_bool branch July 3, 2020 06:34
@EmmyMiao87 EmmyMiao87 mentioned this pull request Sep 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by one committer. area/load Issues or PRs related to all kinds of load kind/feature Categorizes issue or PR as related to a new feature.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] load data with bool value will convert to all true value;
5 participants