Incorrect result from DuckDB execution #556
Comments
Hmm. Strange... Somehow that
Copy-pastable reproduction script:
CREATE TABLE s (a text[]);
INSERT INTO s VALUES (ARRAY['abc', 'def', 'ghi']);
CREATE TABLE t AS TABLE s;
SELECT * FROM s;
SELECT * FROM t;
SET duckdb.force_execution TO true;
SELECT * FROM s;
SELECT * FROM t;
Not sure if it is related to TOAST. For varchar, btw, detoast will
Hi. While investigating the problem, I observed two inconsistent behaviors:
So I suspect the issue lies in the definition of the corresponding table in DuckDB. (I can't figure out how the corresponding table is created.) I haven't quite understood how the
In DuckDB the canonical name for an unlimited size text column is `VARCHAR`[1], in Postgres this is `TEXT`[2]. In DuckDB `TEXT` is simply an alias for `VARCHAR` type, and there's no way to know what was provided by the user. In Postgres these types are actually distinct, although behave exactly the same for unlimited length. Basically everyone uses `TEXT` instead of `VARCHAR`.
Currently we convert the DuckDB type to a Postgres `VARCHAR`. In many cases this doesn't really matter, because pretty much all clients handle VARCHAR and TEXT the same too. There's one place where this leaks through though: DDL coming from a query. For example if you do a CTAS with a DuckDB query the resulting table columns will be of type `character varying` instead of `text`[3].
[1]: https://duckdb.org/docs/sql/data_types/text.html
[2]: https://www.postgresql.org/docs/current/datatype-character.html
[3]: #556 (comment)
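One way to observe this from the Postgres side is to inspect the column types of a table after the CTAS. This is a minimal sketch, assuming a hypothetical table named `ctas_result` that was created with `CREATE TABLE ... AS` from a DuckDB-executed query. The table name is made up for illustration; `pg_attribute` and `format_type` are standard Postgres.

```sql
-- ctas_result is a hypothetical table name used only for illustration.
SELECT attname,
       format_type(atttypid, atttypmod) AS column_type
FROM pg_attribute
WHERE attrelid = 'ctas_result'::regclass
  AND attnum > 0          -- skip system columns
  AND NOT attisdropped;   -- skip dropped columns
```

A column reported as `character varying` where `text` was expected would be the leak described above.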
@kysshsy The issue you're describing is a separate issue from the wrong result one. It happens when you run the
@dpxcc I tested if this was a regression, but this is also happening on 0.2.0. So while it's a serious issue, I don't think it needs to block the 0.3.0 release.
@YuweiXiao thanks a lot for that investigation!
As explained in my comment above, the fact that it is VARCHARARRAYOID is tracked in #583 (and is not relevant to the incorrect result). The
To be clear, that
The ndims property is simply ignored during the CTAS.
https://github.com/postgres/postgres/blob/80d7f990496b1c7be61d9a00a2635b7d96b96197/src/backend/commands/createas.c#L495
I suggest using pg_type's
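To check the declared dimensions directly, the catalog can be queried. This is a minimal sketch, assuming the tables `s` and `t` from the reproduction script exist in the current search path:

```sql
-- Compare the declared array dimensions of the original table (s)
-- and the table produced by CTAS (t).
SELECT attrelid::regclass AS table_name,
       attname,
       atttypid::regtype  AS column_type,
       attndims
FROM pg_attribute
WHERE attrelid IN ('s'::regclass, 't'::regclass)
  AND attnum > 0          -- skip system columns
  AND NOT attisdropped    -- skip dropped columns
ORDER BY attrelid, attnum;
```

If CTAS ignores ndims as described, the `a` column of `t` would be expected to show a different `attndims` than the `a` column of `s`, even though both report an array type.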
Thanks again. I'm working on a fix/workaround for this problem now.
What happens?
Turning on duckdb.force_execution produces incorrect results. Please see the repro below.
To Reproduce
OS:
Linux
pg_duckdb Version (if built from source use commit hash):
0.2.0
Postgres Version (if built from source use commit hash):
17.2
Hardware:
No response
Full Name:
Cheng Chen
Affiliation:
Mooncake Labs
What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.
I have tested with a stable release
Did you include all relevant data sets for reproducing the issue?
Not applicable - the reproduction does not require a data set
Did you include all code required to reproduce the issue?
Did you include all relevant configuration (e.g., CPU architecture, Linux distribution) to reproduce the issue?