Many problems with formal text affinity values in foreign tables #66
Comments
Thank you for reporting. As you know, SQLite FDW assumes that the column type of the foreign table and the data affinity in SQLite are the same. ---Just a comment:--- Additionally, we need to take care of the logic for pushing down WHERE conditions. |
Thanks for comments, @t-kataym ! I made a draft for Line 214 in 9fd31f4
|
For not ugly data for |
Looks like we need
Depends on affinity (data is from the first example):
SELECT i, b, typeof (b) FROM "BLOB" where b = 'テスト'; -- 1 row
SELECT i, b, typeof (b) FROM "BLOB" where b = x'e38386e382b9e38388'; -- 1 row
Depends on data only (data is from the first example):
SELECT i, b, typeof (b) FROM "BLOB" where cast (b as text) = 'テスト'; -- 2 rows
SELECT i, b, typeof (b) FROM "BLOB" where cast (b as blob) = x'e38386e382b9e38388'; -- 2 rows |
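The difference between the two pairs of queries can be modelled in a few lines of C. This is only a sketch under stated assumptions: the `value` struct, the `storage_class` enum and all function names are hypothetical and are not part of SQLite or sqlite_fdw, and the model assumes a UTF-8 database in which a cast between text and blob keeps the bytes unchanged. Without a cast, a value matches only when both the storage class and the bytes agree; with a cast on the column, only the bytes matter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a stored SQLite value: storage class plus raw bytes. */
typedef enum { CLASS_TEXT, CLASS_BLOB } storage_class;

typedef struct {
    storage_class cls;
    const char   *bytes;
    size_t        len;
} value;

/* No cast on the column: a text value never equals a blob value. */
int equal_by_storage_class(value a, value b)
{
    return a.cls == b.cls && a.len == b.len &&
           memcmp(a.bytes, b.bytes, a.len) == 0;
}

/* Cast on the column (assumed byte-preserving): only the bytes are compared. */
int equal_by_bytes(value a, value b)
{
    return a.len == b.len && memcmp(a.bytes, b.bytes, a.len) == 0;
}

/* Count the rows that match key under the given equality rule. */
size_t count_matches(const value *rows, size_t n, value key,
                     int (*eq)(value, value))
{
    size_t hits = 0;

    assert(rows != NULL && eq != NULL);
    for (size_t i = 0; i < n; i++)
        if (eq(rows[i], key))
            hits++;
    return hits;
}
```

With one text row and one blob row that hold the same nine bytes, `count_matches` with `equal_by_storage_class` finds one row for either key, while `equal_by_bytes` finds both, which mirrors the "1 row" and "2 rows" comments in the queries.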
Now for PostgreSQL with |
@t-kataym , what about https://github.com/mkgrgis/sqlite_fdw/tree/readme-fix-and-additions#datatypes ? |
@mkgrgis Thank you for summarizing it. Please give us a time to confirm it.
Please refer
The correct behavior of the FDW is not to push down WHERE conditions if "empty_as_non_text_null" is enabled, I think.
If user execute |
Yes, please. There is an updated table now. I understand it will be hard to decide about the 17×5=85 datatype×affinity combinations with only 8 obvious cases.
Thanks, @t-kataym !
Yes, it's better for SQL:2016 behaviour on PostgreSQL side.
I understand your example, @t-kataym, thanks. There really is a problem, but a different one. I think it's a bad example. Neither for |
@mkgrgis Thank you for correcting me. Just my example was not suitable. |
Yes, @t-kataym. Let's study SQL:2016 behaviour in the context of prefetched
First I have thought about UUID
Conclusion: anyway, no pushdown is needed for int
What about joining and
Conclusion: we can push down conditions for
About other datatypes, let's wait for your opinion and the opinion of your department's employees, @t-kataym. It's very important to decide about all combinations, because the data transformations are the de facto kernel of |
Hello @mkgrgis, sorry for the late response. I've checked the table and have some comments.
About the description 'x - error': this description is meaningless; it seems to be the result of a manual test. Some type transformations do not always raise an error, such as SQLite BLOB to timestamp, as I describe in note 2 below. I think it should be 'not supported' for cases that always raise an error and V for the other cases.
1. About SQLite INT & REAL transparent:
2. About SQLite BLOB transparent: the table is not correct for conversion to date, json, name, text, timestamp, timestamptz, varchar. It should be 'V' instead of 'x':
As you can see, the error occurs just because the text representation does not match the target type. If the blob column has the expected data for the text type, the conversion is OK:
3. About SQLite TEXT: in my understanding, sqlite_fdw does not support the cast (raises ERROR) only when casting from SQLite TEXT to Postgres int2, int4, int8, bool, float4, float8, numeric and bytea. Also, the sqlite API may return the TEXT type for a BLOB column. However, the description of these types is confusing: '-' for float4, float8; 'x' for int2, int4; 'V+' for numeric...
About SQLite TEXT -> date, json, numeric, time,... (V+): the cast behavior is the same: SQLite TEXT -> C string -> Postgres input function -> Postgres data. The behavior is the same as for SQLite TEXT -> text(✔), varchar(✔), name (V), but the description is not the same. |
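The chain "SQLite TEXT -> C string -> Postgres input function -> Postgres data" fails only when the C string is not valid input for the target type. A minimal sketch of that idea follows; `iso_date_from_cstring` is a hypothetical stand-in and is not PostgreSQL's real date input function (it accepts only `YYYY-MM-DD` and checks the ranges loosely).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for a type input function: C string -> date parts.
 * Returns false when the text representation does not match the target type,
 * which is the case where a foreign data wrapper would raise an error.
 */
bool iso_date_from_cstring(const char *str, int *year, int *month, int *day)
{
    int consumed = 0;

    assert(str != NULL && year != NULL && month != NULL && day != NULL);

    /* Expect exactly three numeric fields separated by '-'. */
    if (sscanf(str, "%4d-%2d-%2d%n", year, month, day, &consumed) != 3)
        return false;

    /* Reject trailing garbage after the date. */
    if (str[consumed] != '\0')
        return false;

    /* Loose range check only; real calendars need more care. */
    return *month >= 1 && *month <= 12 && *day >= 1 && *day <= 31;
}
```

Under this sketch the same bytes succeed or fail depending only on their content: `"2021-03-15"` converts, while `"テスト"` does not, regardless of whether SQLite stored the value as text or as a blob.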
Update data transformation table by pgspider#66
Hello, @nxhai98! What are we discussing? There are 3 aspects of the transformation table
and also there are 2 C language data transformations
My table from previous URL is about future behaviour and mainly (but not only) about
No, this means only C-language code of Line 94 in e865a4d
Yes, you are right.
I have described this case here. It's possible to show SQLite
SQLite
Please note here was interpretation of
Yes, with a cast to the encoding of the current PostgreSQL database all these cases can be marked as V (transparent transformation), but I added a new T value: "cast to text in encoding of current PostgreSQL database and then transparent transformation if applicable". Max length of
Yes, it's normal for ISO SQL:2016 and in my opinion.
As SQLite C language text-like output, SQLite BLOB data really can be more flexible for subsequent transformations. But for a PostgreSQL C function this input will not be obvious. Yes,
Marked as V+, but most well-formed values will be stored as values with
In ISO SQL |
About SQLite BLOB transparent
Not only representation, @nxhai98, but also encoding! Please note that Unicode databases such as UTF-8/UTF-16/UTF-32 aren't the only possibility. Yes, SQLite recommends treating all text as Unicode data, but we have no such recommendation about BLOBs. |
Thanks for the feedback @mkgrgis,
Yes, correct.
Thanks for explaining. However, SQLite INT/REAL => bytea is still 'x' in the datatype table. Could you update this?
Thanks for explaining, I understood.
Currently, sqlite_fdw supports only UTF-8 SQLite databases and doesn't consider the current Postgres encoding.
Currently, sqlite_fdw always uses sqlite3 APIs to cast values to text/int/float and always uses UTF-8 for text. So, the T value
sqlite_fdw only gets BLOB data directly when the Postgres column type is bytea; with other data types, the BLOB is first cast to int/float/text by the sqlite3 API, and only UTF-8 is supported.
Do you mean:
If so, this may not be feasible in the FDW because there is an implicit cast in OP: bytea_col = 'test' corresponds to bytea_col = 'test'::bytea
I checked the table again:
=> this mark "b - show per-bit form" is not correct -> it should be the same as the other casts of SQLite BLOB (T).
|
Yes, fixed, @nxhai98!
It's important that the current Postgres encoding is not considered! UTF-8 output of SQLite is not a problem, and UTF-16 SQLite databases are not a problem either. What if there will be some
Clearer would be: ... cast to text in SQLite UTF-8 encoding, then to PostgreSQL text with the current encoding of the database, and then transparent transformation if applicable ...
Thanks. So, SQLite blob as text will be utf-8. It's good
No, I meant for both cases something like
Yes, I am about similar problem.
Marked as
Fixed.
Fixed. Please carefully review the table again, @nxhai98. I'll come back at the end of next week after studying about testing |
@mkgrgis Thanks for fixing, I confirmed the table.
I understood there is a problem with mismatched character sets. However, sqlite_fdw does not support encoding conversion; it assumes that the encodings of SQLite and Postgres are the same. If you want to fix it, here is a suggestion. Postgres core provides a function:
/*
 * Convert src string to another encoding (general case).
 *
 * See the notes about string conversion functions at the top of this file.
 */
unsigned char *
pg_do_encoding_conversion(unsigned char *src, int len, int src_encoding, int dest_encoding);
It can be applied to sqlite_fdw as below:
/* assumes SQLite text is always UTF-8; casts match the unsigned char * signature above */
char *utf8 = (char *) sqlite3_column_text(stmt, colid);
/* convert from UTF-8 to the current database encoding */
char *valstr = (char *) pg_do_encoding_conversion((unsigned char *) utf8, strlen(utf8), PG_UTF8, GetDatabaseEncoding());
value_datum = InputFunctionCall(&attinmeta->attinfuncs[attnum],
                                valstr,
                                attinmeta->attioparams[attnum],
                                attinmeta->atttypmods[attnum]);
But it is not a complete solution.
I understood your spec, but it seems not feasible because we do not know when
For bytea search: treat the BLOB column as a BYTEA column; we add an implicit cast for all blob columns when building the SQLite query (deparse):
For text search: treat the BLOB column as a TEXT column; the user must define the BLOB column as a text column in the foreign table, and we add an implicit cast for all text columns when building the SQLite query (deparse):
Here is an example:
sqlite> CREATE TABLE BLOB_TBL(c1 BLOB);
sqlite> INSERT INTO BLOB_TBL VALUES (x'e38386e382b9e38388');
sqlite> INSERT INTO BLOB_TBL VALUES ('test');
sqlite> INSERT INTO BLOB_TBL VALUES (x'74657374');
sqlite> SELECT c1, typeof(c1) FROM BLOB_TBL;
テスト|blob
test|text
test|blob
PostgreSQL:
test=# CREATE FOREIGN TABLE "BLOB_TBL" (c1 text) SERVER sqlite_svr ; <- define BLOB column as TEXT column
CREATE FOREIGN TABLE
test=# select * from "BLOB_TBL";
c1
--------
テスト
test
test
(3 rows)
test=# explain verbose select * from "BLOB_TBL" where c1 = 'test';
QUERY PLAN
--------------------------------------------------------------------------------------------------------
Foreign Scan on public."BLOB_TBL" (cost=10.00..7.00 rows=7 width=32)
Output: c1
SQLite query: SELECT CAST (`c1` AS TEXT) FROM main."BLOB_TBL" WHERE ((CAST (`c1` AS TEXT) = 'test'))
(3 rows)
test=# select * from "BLOB_TBL" where c1 = 'test';
c1
------
test <- this is TEXT type on SQLite
test <- this is BLOB type on SQLite
(2 rows) |
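The deparse step described in this comment could look roughly like the sketch below. It is an illustration only, not the sqlite_fdw implementation: `pg_col_kind` and `deparse_column` are hypothetical names, the `CAST (... AS TEXT)` form is copied from the EXPLAIN output above, and the `CAST (... AS BLOB)` form for bytea columns is an assumption, since the thread does not show that query.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical classification of the PostgreSQL column type of a foreign column. */
typedef enum { PG_COL_TEXT, PG_COL_BYTEA, PG_COL_OTHER } pg_col_kind;

/*
 * Write the SQLite expression for a column reference into buf.
 * Text columns are wrapped in CAST (... AS TEXT), bytea columns in
 * CAST (... AS BLOB) (assumed form); other columns are only quoted.
 * Returns what snprintf returns. The column name is not escaped here,
 * so real code must handle backticks inside identifiers.
 */
int deparse_column(char *buf, size_t size, const char *colname, pg_col_kind kind)
{
    assert(buf != NULL && colname != NULL);

    switch (kind) {
    case PG_COL_TEXT:
        return snprintf(buf, size, "CAST (`%s` AS TEXT)", colname);
    case PG_COL_BYTEA:
        return snprintf(buf, size, "CAST (`%s` AS BLOB)", colname);
    default:
        return snprintf(buf, size, "`%s`", colname);
    }
}
```

Using the same helper for both the SELECT list and the WHERE clause keeps the comparison on the SQLite side dependent on the data only, which is why both the text row and the blob row holding `test` come back in the example.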
Thanks, @nxhai98! I am sorry for the wait.
Let's use this table ;-)
Yes, this is full Also important function is
Nice! Your elegant solution is better than my mechanical solution. I consider these proposals and tests as the SQL:2016 recommended behaviour for our cases. How will we begin a draft for the table, @nxhai98? I hope some of my drafts will be useful for the full data transformation implementation:
What do you think about the algorithm structure? Maybe it will be useful to pack all the transformation table logic only in |
@mkgrgis, thanks for your feedback,
Yes, I think it may be better. I saw your draft. BTW, we have just pushed the bug fix related to |
@t-kataym and @nxhai98, I have made a PR without changing SQL behaviour to prepare implementation of data transformation table here discussed. Please review #74.
|
Thanks @t-kataym and @nxhai98 ! I have made PR #75 with separate function for data type error message.
|
@t-kataym, can anybody help me with the testing concept (plan) in #76 (comment)? I am frustrated while adding new tests with database encodings. |
@mkgrgis Could you ask @bichht0608 about testing? |
Yes, @t-kataym. I have asked in #76 (comment). |
@t-kataym, in #76 the extended tests are ready after review by @bichht0608. When should I get the opinion of the pgspider team? In the next PR there will be formally implemented
In another PR after the next one there will be new tests and a new implementation according to some cells of the table. |
@t-kataym, I have made a new PR with formally
In the next PRs some elements of the table will be implemented and discussed. |
@mkgrgis
Our member will confirm it and comment in the PR. |
@t-kataym, still no reaction and unclear prospects... I have moved the #79 contribution to #82 as a part of UUID support. I have also restored, with new tests,
The name of this issue is
I want to know your opinion about a common problem of any SQLite
For example, the FDW can read mixed-case
Another example: PostgreSQL
Third example: SQLite
I hope this discussion will be continued this month, @t-kataym. |
I'm sorry. I cannot spend enough time on this topic. It is not a high priority for us.
My basic thought is that:
|
A new information about Infinity in SQLite, see https://stackoverflow.com/questions/72113935/how-do-i-select-floating-point-infinity-literals-from-a-sqlite-database |
@t-kataym, now my PR #83 is ready for review after a rebase to git mainstream. This PR implements, limited to 64 bits in length,
After this PR I am implementing ISO:SQL mixed affinity |
@mkgrgis , Thank you for your contribution. Yes, we will confirm the PR. |
The next milestone is |
@t-kataym , after |
Now there are no problems with data with |
Now ISO:SQL data processing for a SQLite data with |
Only non-STRICT tables in legacy SQLite databases are affected.
bytea column. In SQLite, such values usually belong to blob affinity. A SQLite BLOB column may also contain blob images of simple UTF-8 or ASCII text files, for example with content such as テスト or Hello world. These values are stored with text affinity, and converting them to bytea is normal. Why is there an error in this case?
sqlite_fdw/sqlite_query.c
Line 93 in 9fd31f4
bigint column. Sometimes in SQLite there are empty string values (formally text affinity) in this column, which obviously mean NULL. I think we can't always formally treat empty strings in non-text columns as text affinity. If the PostgreSQL data type of the foreign column doesn't belong to the text/varchar family, we can treat an empty string as NULL. Maybe there is a reason to add a new option for "treat empty string as NULL for non-text and non-blob PostgreSQL columns"?
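The proposed option can be sketched as a small decision function. Everything here is hypothetical and only illustrates the rule stated in the issue text: the enum names, `treat_as_null` and the `option_enabled` flag are not taken from sqlite_fdw.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the affinity of the value returned by SQLite. */
typedef enum { AFF_INTEGER, AFF_REAL, AFF_TEXT, AFF_BLOB, AFF_NULL } sqlite_affinity;

/* Hypothetical grouping of the PostgreSQL type of the foreign column. */
typedef enum { PGT_TEXT_FAMILY, PGT_BYTEA, PGT_OTHER } pg_type_family;

/*
 * Decide whether a fetched value should become a PostgreSQL NULL.
 * A real NULL is always NULL. With the option enabled, an empty string
 * with text affinity also becomes NULL, but only when the target column
 * is neither in the text/varchar family nor bytea.
 */
bool treat_as_null(sqlite_affinity aff, size_t value_len,
                   pg_type_family target, bool option_enabled)
{
    if (aff == AFF_NULL)
        return true;
    if (!option_enabled)
        return false;
    return aff == AFF_TEXT && value_len == 0 && target == PGT_OTHER;
}
```

For a bigint column with the option enabled, an empty string maps to NULL, while the same empty string in a text or bytea column stays an empty value.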