Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: list the conversion functions in pg_type #12526

Closed
knz opened this issue Dec 21, 2016 · 28 comments · Fixed by #14529
Closed

sql: list the conversion functions in pg_type #12526

knz opened this issue Dec 21, 2016 · 28 comments · Fixed by #14529
Labels
A-sql-pgcompat Semantic compatibility with PostgreSQL A-sql-semantics
Milestone

Comments

@knz
Copy link
Contributor

knz commented Dec 21, 2016

Reported in #5582 (comment):

Explained further in #5582 (comment)

The columns "typsend" "typreceive" etc should list the names of conversion functions to/from that data type.

Although we do not support these functions yet it would be useful to fill the columns with the names of these functions nonetheless, so that the client application can do something useful with them (e.g. in the case of Postgrex, run client-side callbacks with the same names).

@tlvenn
Copy link
Contributor

tlvenn commented Dec 21, 2016

I also think the names used should be the ones that exist in postgresql and not how cockroachdb choose to name them.

For example, bytes should be named bytea for oid 17 and typname column and this name should be used to derivated the other names. Same thing for decimal vs numeric.

Ecto will throw if it encounter a type it does not support and this is directly correlated with the values postgrex report from those columns.

And please see the issue I filled regarding int/float precision: #12530

@tlvenn
Copy link
Contributor

tlvenn commented Dec 21, 2016

This is what it should look like:

+------+-------------+------------------+------------------+-----------------+----------------+
| oid  |   typname   |     typsend      |    typreceive    |    typoutput    |    typinput    | 
+------+-------------+------------------+------------------+-----------------+----------------+
|   16 | bool        | boolsend         | boolrecv         | boolout         | boolin         | 
|   17 | bytea       | byteasend        | bytearecv        | byteaout        | byteain        | 
|   20 | int8        | int8send         | int8recv         | int8out         | int8in         |
|   21 | int2        | int2send         | int2recv         | int2out         | int2in         | 
|   23 | int4        | int4send         | int4recv         | int4out         | int4in         | 
|   25 | text        | textsend         | textrecv         | textout         | textin         |
|  700 | float4      | float4send       | float4recv       | float4out       | float4in       |
|  701 | float8      | float8send       | float8recv       | float8out       | float8in       |
| 1005 | _int2       | array_send       | array_recv       | array_out       | array_in       |
| 1007 | _int4       | array_send       | array_recv       | array_out       | array_in       |
| 1009 | _text       | array_send       | array_recv       | array_out       | array_in       |
| 1016 | _int8       | array_send       | array_recv       | array_out       | array_in       |
| 1043 | varchar     | varcharsend      | varcharrecv      | varcharout      | varcharin      |
| 1082 | date        | date_send        | date_recv        | date_out        | date_in        |
| 1114 | timestamp   | timestamp_send   | timestamp_recv   | timestamp_out   | timestamp_in   |
| 1184 | timestamptz | timestamptz_send | timestamptz_recv | timestamptz_out | timestamptz_in |
| 1186 | interval    | interval_send    | interval_recv    | interval_out    | interval_in    |
| 1700 | numeric     | numeric_send     | numeric_recv     | numeric_out     | numeric_in     |
| 2283 | anyelement  | -                | -                | anyelement_out  | anyelement_in  |
+------+-------------+------------------+------------------+-----------------+----------------+

20170206: Updated the table for array types

@tlvenn
Copy link
Contributor

tlvenn commented Dec 21, 2016

To illustrate my point regarding the naming of those columns:

https://github.com/elixir-ecto/postgrex/blob/master/lib/postgrex/extensions/int8.ex#L4

I also wonder how arrays will be working given postgres seems to have a single non typed array with oid 2277 -> anyarray and cockroachdb seems to have only typed arrays at the moment.

@tlvenn
Copy link
Contributor

tlvenn commented Jan 15, 2017

Just to clarify, Postgrex is not using those function names only to provide end user extension points but it maps itself on them to provide encoders / decoders for all types.

Given those tables exists for ORM compatibility only and are not meant to be used internally, I believe it should match postgresql types naming as to not confuse ORMs.

@knz
Copy link
Contributor Author

knz commented Jan 17, 2017

I've been exploring solutions to this issue.

  1. Would it be adequate to provide procedure names for the types recognized by CockroachDB that are named after the CockroachDB data types? (e.g. bytesin/bytesout since CockroachDB's byte array is bytes)

  2. we map int2, int4 etc to the same type int. Would it be OK in your use case to have all the int types use the same type conversion procedure names?

@tlvenn
Copy link
Contributor

tlvenn commented Jan 17, 2017

Hi @knz, the issue is, as soon as Postgrex encounter a typsend it does not know / map on, it is unable to decode / encode it and therefore just crash. So both your proposals would fail with Postgrex out of the box without coding specific support for CockroachDB types into Postgrex.

Conceptually, what is the issue with mapping on names already defined by Postgresql ?

@tlvenn
Copy link
Contributor

tlvenn commented Feb 6, 2017

Hi @knz , any update on this issue ? I am particularly interested on how to deal with typed arrays given postgresql has only one oid for arrays and postgrex is mapping solely on that. Right now i have a few queries that crash with things like int8[] is unknown type.

@jordanlewis
Copy link
Member

@tlvenn, we just merged #13355 which solves a similar issue. We didn't add the correct typsend and typrecieve fields yet, but perhaps the correctly-named type names will solve your issue. Could you please give it another try and let us know how it goes?

@tlvenn
Copy link
Contributor

tlvenn commented Feb 6, 2017

Hi @jordanlewis, your PR is definitely solving part of the problem but until the correct typsend and typreceive are populated, it will not fix the issue.

On the typed arrays, I actually was mistaken when I said Postgres does not have typed arrays, I dont really know how I missed that but actually any typname starting with an _ is an array type on Postgres side... ;)

So looks like once the typsend and typreceive are fixed, it will fix this issue for good.
I have updated the table above to reflect the expected values for typed arrays.

@tlvenn
Copy link
Contributor

tlvenn commented Feb 6, 2017

For clarification on the strictness of sticking with Postgres typname (See #4244), as far as Elixir/ Postgrex is concerned, I don't think it matters too much, typsend and typreceive are the crucial ones but I generally think it would be a good idea to strictly adhere to Postgres naming in pg_type as to minimize any side effect in relation with ORMs.

@jordanlewis
Copy link
Member

Thank you @tlvenn for the information. Just to be clear, it's my understanding that all we must provide is the proper string names of these send and receive functions in each of the rows of pg_type - without corresponding entries in the pg_proc table. Is that correct?

We have no plans to actually implement any of these send and receive builtins, as they are truly internal to Postgres.

@tlvenn
Copy link
Contributor

tlvenn commented Feb 7, 2017

Hi @jordanlewis, yes your understanding is correct, matching entries with postgres in pg_type is all what is needed.

@tlvenn
Copy link
Contributor

tlvenn commented Feb 9, 2017

For typed arrays to work, the typelem must be filled correctly as well. For example for _int2, the typelem is 21.

Should I open a separate issue for this ?

@jordanlewis
Copy link
Member

Okay - yes, please open a separate issue. Thanks!

@tlvenn
Copy link
Contributor

tlvenn commented Feb 10, 2017

Done in #13524

@jordanlewis
Copy link
Member

@tlvenn I'm worried that this is not as straightforward as it seems. In Postgres, the typreceive and typsend functions have type regproc, which is a special Postgres type that appears as a string but is an OID under the hood. We don't support this special kind of type at the moment.

That gives us two options, assuming we don't want to implement this special kind of type: represent the column as a string, or represent the column as an OID, which will show up as an integer.

Both of these options seem slightly problematic to me. What happens if your program gets the integer form of an OID in the relevant queries? I assume it'll break, right? On the other hand, if we change these columns to string types and just return the names of the builtins, we run the risk of breaking other programs that assume they can join that field against pg_proc.

Thoughts? Is there some way we could meet in the middle here? Why exactly do you need to use the typreceive columns when they're so easily programmatically generated - couldn't you use typname for whatever your purpose is?

@tlvenn
Copy link
Contributor

tlvenn commented Feb 10, 2017

Pulling in @fishcakez, who can probably better shed some light on this. Postgrex is indeed expecting strings for typreceive and typsend as far as I know.

@fishcakez
Copy link

Postgrex uses strings because multiple typnames can have the same typreceive and typesend. The later fields describe how the value is encoded over the wire. For example it is possible to create a custom type with the same wire encoding as int4 but a different typname. Postgrex uses the wire encoding because it does not fallback to text format for types it does not know, as these could be confused with string/byte values in the users application.

However it is possible to hook into Postgrex type handling to override the encode/decode for any typname, typreceive, typesend etc. Therefore it would be possible for someone to write type extensions that use the typnames used by cockroach. However these would not be included in Postgrex itself.

The maintainers of Postgrex think its type handling is a key feature of the client and a great benefit compared to the behaviour of other clients so is not going to change in this regard unfortunately.

@jordanlewis
Copy link
Member

Thanks for the info guys! It seems like we'll need to actually implement the regproc type properly to support your use case. It's on our roadmap now, as we'll also need to support that type for some other ORMs including npgsql.

@tlvenn
Copy link
Contributor

tlvenn commented Mar 9, 2017

Hi @jordanlewis, it seems you have merged support for regproc 👍
With that in place, is it correct to assume that the only remaining bit is to properly provide typsend and typreceive ?

@jordanlewis
Copy link
Member

Yes, I believe that is correct! Adding these fields should be pretty simple, if a bit tedious. They'll need to be matched by corresponding entries in the pg_proc table. I am hoping to get to this soon.

@tlvenn
Copy link
Contributor

tlvenn commented Mar 9, 2017

Thanks for the update @jordanlewis

@tlvenn
Copy link
Contributor

tlvenn commented Mar 28, 2017

Hi @jordanlewis, any chance to close that issue soon? It's kinda the last mile to be able to use the stock Postgrex driver.

Thanks a lot in advance.

@jordanlewis
Copy link
Member

Hi @tlvenn, we've been quite busy with preparing for our 1.0 release, so I haven't gotten to this. Sorry for the delay! This is still on my plate, and I still hope to get to it soon.

@jordanlewis
Copy link
Member

@tlvenn please let us know if you're encounter further difficulties here as I'm not yet entirely confident that all of the details of the table are properly filled in.

@tlvenn
Copy link
Contributor

tlvenn commented Apr 3, 2017

Hi @jordanlewis, thanks a lot for this. This looks pretty good but there is still one tiny issue remaining with array typnames.

Right now a [] is suffixed while the expected convention is to prefix the typename with _. So for example, it should read _int2 instead of int2[].

@tlvenn
Copy link
Contributor

tlvenn commented Apr 3, 2017

I also encountering the following error now:

execRequest: error: unsupported comparison operator: <oid> NOT IN <tuple{int}>

Seems like there is a slight regression with the work you did to properly support the OID type.
I have created a separate issue to track it: #14554

@tlvenn
Copy link
Contributor

tlvenn commented Apr 3, 2017

I have created a dedicated issue for the array typname: #14556

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-sql-pgcompat Semantic compatibility with PostgreSQL A-sql-semantics
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants