
sql: make the CockroachDB integer types more compatible with postgres #26925

Open
knz opened this issue Jun 22, 2018 · 44 comments
Labels
A-sql-datatypes SQL column types usable in table descriptors. A-sql-pgcompat Semantic compatibility with PostgreSQL C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) X-anchored-telemetry The issue number is anchored by telemetry references.

Comments

@knz
Contributor

knz commented Jun 22, 2018

| Type | CockroachDB | Postgres |
| --- | --- | --- |
| int | primitive type, 64-bit | alias for int4, never shows up in vtables |
| int2 | alias for int (thus wrong width), shows up as "int2" in certain vtables | primitive type, 16-bit |
| int4 | alias for int (thus wrong width), shows up as "int4" in certain vtables | primitive type, 32-bit |
| int8 | alias for int, shows up as "int8" in certain vtables | primitive type, 64-bit |
| bigint | alias for int, shows up as "bigint" in certain vtables | alias for int8, never shows up in vtables |
| smallint | alias for int (thus wrong width), shows up as "smallint" in certain vtables | alias for int2, never shows up in vtables |
| serial | special form of int (thus wrong width) with default unique_rowid() | special form of int4, creates a sequence and defaults to nextval(seqname) |
| bigserial | alias for serial | special form of int8, creates a sequence and defaults to nextval(seqname) |
| smallserial | alias for serial (thus wrong width) | special form of int2, creates a sequence and defaults to nextval(seqname) |
| bit | near-alias for int, shows up as "bit" in certain vtables, max 64-bit | primitive type, arbitrary-length byte array with bit input/output representations |

Problems:

  • The CockroachDB types int, int2, int4, serial, smallserial, and bit have data widths that are incompatible with PostgreSQL.
  • CockroachDB incorrectly preserves the names "bigint" and "smallint", whereas Postgres desugars them to int8 and int2 during parsing.
  • PostgreSQL serial types use sequences; CockroachDB does not.
  • bit incorrectly has a maximum size (and its underlying implementation is currently incorrect).
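
To make the width incompatibility concrete, here is a minimal sketch (hypothetical table and values; the PostgreSQL error is standard behavior, and the CockroachDB behavior is the alias-for-64-bit-int behavior described in the table above):

```
-- PostgreSQL: int2 is a true 16-bit type, so out-of-range values are rejected.
CREATE TABLE widths (a INT2);
INSERT INTO widths VALUES (40000);  -- ERROR: smallint out of range

-- CockroachDB (at the time this issue was filed): INT2 was an alias for the
-- 64-bit INT, so the same INSERT succeeded and 40000 was stored and returned
-- as a 64-bit value.
```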

Fixing this would comprehensively address #24062 #22607 #26730 #25098 #24686 and possibly others.

Informs #26128.

Jira issue: CRDB-4979

@knz knz added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-sql-pgcompat Semantic compatibility with PostgreSQL A-sql-datatypes SQL column types usable in table descriptors. labels Jun 22, 2018
@knz
Contributor Author

knz commented Jun 22, 2018

@awoods187 I'd like to work to address this over the summer (next milestone or the one after that). This would solve a large class of compatibility problems.

@benesch
Contributor

benesch commented Jul 3, 2018

/cc @andy-kimball

@knz
Contributor Author

knz commented Aug 7, 2018

  • PostgreSQL serial types use sequences, CockroachDB does not do this -- planned for 2.1
  • CockroachDB incorrectly preserves the names "bigint" and "smallint" whereas postgres desugars them to int8 and int2 during parsing -- planned for 2.1
  • bit incorrectly has a maximum size (and its underlying implementation is currently incorrect) -- planned in the 2.2 time frame, but may be treated as a bugfix and backported to 2.1 (with a documented known limitation in the 2.1 release)
  • CockroachDB types int, int2, int4, serial, smallserial, bit have data widths that are incompatible with PostgreSQL -- actually that's 2 different issues:
    • the visible data width, announced in the introspection tables (information_schema, pg_catalog), is wrong - we can fix this in 2.1 by lying to clients and making them believe what they need to believe
    • the actual data width, which restricts the set of possible values for a given data type - this becomes a known limitation in 2.1 and is planned to be fixed in 2.2

craig bot pushed a commit that referenced this issue Aug 23, 2018
28945: sql: fix the reporting of types in information_schema.columns r=knz a=knz

First commits from #28944 and priors.
Forked off #28690.
Fixes #27601.
Largely addresses the concerns that led to issue #26925.

Prior to this patch, CockroachDB incorrectly placed the "input syntax"
of each SQL type in the column `data_type` of
`information_schema.columns`.

The input syntax is the one reported in SHOW COLUMNS, SHOW CREATE
TABLE and other places, and is suitable to reproduce the exact type of
a table.

In contrast, `information_schema.columns.data_type` is constrained by
compatibility with third party tools and PostgreSQL clients.  It must
report the name of the type like PostgreSQL does, which in turn is
constrained by the SQL standard. A text column must be reported as
"text" not "string"; a decimal column as "numeric" not "decimal", a
float8 column as "double precision" not "float8", and so on.

By reporting the wrong string in that column CockroachDB is confusing
ORMs, which subsequently decide that the current on-disk type is not
the one expected by the app and then initiate a schema change (ALTER
COLUMN SET TYPE).

This patch corrects this incompatibility by introducing logic that
produces the proper information schema names for column types. This is
expected to reduce ORM complaints about insufficient support for ALTER
COLUMN SET TYPE (but will be tested/evaluated separately).

Release note (bug fix): CockroachDB now populates the `data_type`
column of `information_schema.columns` like PostgreSQL for
compatibility with 3rd party tools and ORMs.

Co-authored-by: Raphael 'kena' Poss <knz@cockroachlabs.com>
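
As an illustrative check of the mapping this commit describes (a hedged sketch; the table and column names are hypothetical, and the expected values follow the PostgreSQL naming spelled out above):

```
CREATE TABLE typecheck (s STRING, d DECIMAL, f FLOAT8);

-- After the fix, data_type should use PostgreSQL's names:
-- 'text' (not 'string'), 'numeric' (not 'decimal'),
-- 'double precision' (not 'float8').
SELECT column_name, data_type
  FROM information_schema.columns
 WHERE table_name = 'typecheck';
```
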
craig bot pushed a commit that referenced this issue Aug 23, 2018
28690: sql: fix the handling of integer types r=knz a=knz

Addresses a large chunk of #26925.
Fixes #25098.
Informs #24686.

Prior to this patch, CockroachDB maintained an unnecessary distinction
between "INT" and "INTEGER", between "BIGINT" and "INT8", etc.

This distinction is unnecessary but also costly, as we were paying the
price of a "name" attribute in coltypes.TInt, with a string comparison
and hash table lookup on every use of the type.

What really matters is that the type shows up properly in
introspection; this has already been ensured by various
OID-to-pgcatalog mappings and the recently introduced
`InformationSchemaTypeName()`.

Any distinction beyond that is unnecessary and can be dropped from the
implementation.

Release note: None


Co-authored-by: Raphael 'kena' Poss <knz@cockroachlabs.com>
@bobvawter bobvawter added this to the 2.2 milestone Oct 30, 2018
@bobvawter bobvawter self-assigned this Oct 30, 2018
@bobvawter
Member

bobvawter commented Nov 8, 2018

Here's a micro-RFC that I'd like to get feedback on before starting on the implementation.

Updated 2018-12-05

Motivation:

We currently assume that INT really means INT8; however, this is inconsistent with PostgreSQL and various (Java) ORMs that assume INT means INT4. We want a transition path such that INT means INT4 in CockroachDB.

Goals

  • Code that works correctly with INT --> INT8 on 2.1 will work correctly on 2.2 and also on 2.Next with only a flag change.
  • Existing database schemas will not change, but it's fine to tweak their introspection metadata to accurately reflect their configuration.
  • Handle the INT --> INTx mapping as close to parse time as possible to avoid needing to plumb type-mapping logic down into all of the various ColumnType support functions or schema introspection tables.
  • Present identical data and table-introspection information to all callers, regardless of whether they assume 4- or 8-byte INT values, to minimize cognitive load and change-sprawl.

Plan

  • [2.2] PR sql: Make INT an alias for INT8 #32831 makes INT an alias for INT8
  • [2.2] PR sql: Add default_int_size to control INT alias #32848 adds a cluster and session setting default_int_size whose valid values are 8 (the default) and 4 (see the sketch after this list).
  • [2.2] Ensure that all parsing of type INT or INTEGER reflects the default_int_size setting by substituting INT4 or INT8 as appropriate.
    • Exception: When a column's default is set to exactly unique_rowid(), it should be an INT8.
  • [2.2] Add a cluster-version ratchet which will upgrade all existing TableDescriptors to ensure that any SemanticType=INT columns with unspecified Width are reset to Width=64.
    • This won't be necessary since all user inputs where INT := INT4 will simply have INT4 in the parse tree. Other SQL not coming in from user input will continue to use INT := INT8 to preserve the historical behavior.
  • [2.2] Bugfix sql: Alias Type Information is not propagated to pgwire or pg_attribute #16769: pg_attribute should return correct atttypid values, instead of always returning 20 (INT8) for all int types.
  • [2.2] Bugfix: information_schema.columns should not populate character_maximum_length for int types; pg leaves it unpopulated. The docs on ColumnType.MaxCharacterLength() are inconsistent with the implementation.
  • [2.2] Test: The existing information_schema.columns.crdb_sql_type column should return pre-2.2 values until the INT --> INT8 upgrade ratchet has fired.
  • [2.2] Test: Update the information_schema and pg_class logic tests to match pgsql for e.g. create table x(a int, aa integer, b int4, c int8, d bigint, e int2, f smallint); when default_int_size=4 and document differences when in 8-byte mode.
  • [2.2] Test: Ensure that cast operations '1'::INT result in the correct data width.
  • [2.2] Telemetry: Record how often default_int_size is set.
  • [2.Next] Seeing an Int-type ColumnType with Width == 0 is an error (e.g. types.go: FromColumnType()).
  • [2.Next] Switch default_int_size to 4.
  • [2.Next] Remove the unnecessary ColumnType.VisibleType values, since we can always back them out of the non-zero `ColumnType.Width`.
  • [2.Future] Remove default_int_size settings.
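
A sketch of the intended effect of default_int_size (hypothetical table names; exact output and notices may differ):

```
-- With the default (8), a bare INT still means INT8.
CREATE TABLE t8 (a INT);                         -- a is INT8

-- After switching the session setting, bare INT/INTEGER resolves to INT4 at parse time.
SET default_int_size = 4;
CREATE TABLE t4 (a INT);                         -- a is INT4

-- Explicitly sized types are unaffected by the setting.
CREATE TABLE te (b INT2, c INT4, d INT8);

-- Exception from the plan above: a column whose default is exactly unique_rowid()
-- should remain INT8 regardless of default_int_size.
CREATE TABLE tu (id INT DEFAULT unique_rowid()); -- id is INT8
```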

Mixed-version notes

There shouldn't be any need to alter the ColumnType message. The only observable change for a 2.1 node is that 2.2 gateways would never create a ColumnType { SemanticType: INT, Width: 0 }; the width would always be set to some explicit value.

If a column is created on a 2.1 gateway in the mixed-mode case, the upgrade ratchet will update the TableDescriptor to act as though it had been created with INT --> INT8 semantics on a 2.2 gateway.

@bdarnell
Contributor

bdarnell commented Nov 9, 2018

LGTM

Exception: When a column's default is set to an INT8-returning function, e.g. unique_rowid(), it should be an INT8.

When this rule is invoked we should emit a warning.

@knz
Contributor Author

knz commented Nov 12, 2018

I recommend you file separate issues linked to this one for the various required bug fixes.

Exception: When a column's default is set to an INT8-returning function, e.g. unique_rowid(), it should be an INT8.

I would restrict that to the expression unique_rowid() specifically, not just any INT expression.

LGTM otherwise! Thanks for the analysis.

@knz
Contributor Author

knz commented Nov 12, 2018

@bdarnell

When this rule is invoked we should emit a warning.

We don't have infrastructure for pgwire warnings yet.

Personally for UX I would prefer introducing table/column comments (#19472) and putting the warning in a comment instead.

@nocduro

nocduro commented Nov 23, 2018

Hello, thought I would chime in with my experiences with this as a new cockroachdb user.

I was trying to experiment with cockroachdb instead of postgres with a webapp I'm making, and ran into the problem of INT not being equivalent to INT4, specifically in the postgres SERIAL type.

The app is written in Rust using the diesel ORM, so the types are pretty strictly enforced.

This seems like a very common setup in postgres/sql, so supporting it would make migration easier...
Here's a search on github showing over 200k instances of this:

https://github.com/search?q=%22+SERIAL+PRIMARY+KEY%22&type=Code

I'm wondering if there will ever be a way to have a SERIAL column in cockroachdb that is an INT4? Seems weird to advertise postgres compatibility when this isn't supported.

edit: my post came off a bit too negative. Now that I think about this some more, it isn't too bad. I would just have to migrate my db to use bigserial and change my app code from i32 to i64 in a couple places. The downsides of that are that the ids in my urls are a bit longer (not that big of a deal), and that I 'waste' 4 extra bytes per entry. I have fewer than a million rows so that change is pretty insignificant.

edit2: ran into another problem now. My ORM assumes that the returned value is the width of the data type. So I have an INT4 stored and it gets returned to me from cockroach as an INT8... erasing some type data I had, and taking up more space. I don't need 64 bits to represent a number between 1 and 100.

@knz
Contributor Author

knz commented Nov 23, 2018

@nocduro we are working on it.

In the meantime, please know that we have introduced a partial compatibility mode in CockroachDB, so you can use INT4 with a sequence for SERIAL. See for yourself:

root@127.24.137.227:42763/defaultdb> set experimental_serial_normalization = sql_sequence;
SET

root@127.24.137.227:42763/defaultdb> create table t(x serial4);
CREATE TABLE


root@127.24.137.227:42763/defaultdb> show create t;
  table_name |                     create_statement
+------------+-----------------------------------------------------------+
  t          | CREATE TABLE t (
             |     x INT4 NOT NULL DEFAULT nextval('t_x_seq1':::STRING),
             |     FAMILY "primary" (x, rowid)
             | )
(1 row)

@knz
Contributor Author

knz commented Nov 23, 2018

The gist of this is that you need 1) to set experimental_serial_normalization and 2) use INT4, SERIAL4 etc explicitly (the word "INT" without a size will still automatically translate to INT8).

Regarding your final note:

edit2: ran into another problem now. My ORM assumes that the returned value is the width of the data type. So i have an INT4 stored and it gets returned to me from cockroach as an INT8...

Please provide more information. If this is a bug we need to fix it.

@bobvawter
Member

bobvawter commented Nov 27, 2018

Status update:

Right now, I’ve explored three different ways of getting this INT change in:

  1. Plumb EvalContext into everywhere that INT-as-a-type gets used:
  • cfd53f6...bobvawter:int4
  • I'm running into what appears to be type "erasure" in opt when converting from coltypes.T -> types.T -> coltypes.T
  • This manifests in tests like SELECT a::INT FROM (SELECT '4294967296' as a) when in INT4 mode: the optimizer unwinds this to a DInt, and we lose both the cast to what should be INT4 and the resulting range-check error.
  2. Tweak the parser to never kick out an INT into our AST
  • 9363293...bobvawter:int4-2
  • This approach adds a field to our Scanner type, which implements sqlLexer. The sql.y grammar is changed so that receiving an INT token delegates to the Scanner.defaultIntType field.
  • Parsing of SQL statements is performed in conn, which adds the generated AST to a buffer to be drained by connExecutor. I was running into problems finding a path from conn to the SessionData owned by connExecutor.
  3. If we can't tweak the parser, then we ought to be able to modify the incoming AST in connExecutor
  • The tree.Statement and tree.Expr types are currently hit-or-miss in terms of being able to run a visitor over them.
  • It's a huge surface area to manually maintain the traversal code.
  • I'm sketching out an idea for what a code-generated visitor API could look like here: https://github.com/cockroachdb/cockroach/pull/32628/files

Based on a chat yesterday with @jordanlewis, I think option 2 is once again viable since we have identified a way to get to the SessionData from conn.

Regardless of implementation complexity, option 2 would be the obvious approach if we could make a knife-edge cutover: we want to make INT behave like FLOAT, where we assign an explicit size to the type whenever it is used.

Open question:

  • There are many places in the code where SQL fragments get re-parsed (e.g. CHECK statements, computed column expressions). Presumably, this new behavior should be applied only to statements coming in from the user. When it's time to do the cleanup migration, having a more robust visitor API could help make that much more fool-proof to implement.

@andy-kimball
Contributor

Could we leave the cluster default at int8 if it's an existing cluster, but default to int4 for new clusters? i.e. treat it as a cluster setting rather than a session setting

@bdarnell
Contributor

Could we leave the cluster default at int8 if it's an existing cluster, but default to int4 for new clusters? i.e. treat it as a cluster setting rather than a session setting

It's actually both a cluster and a session setting (the cluster setting is used as the default for the session setting), so that's exactly what would happen if we made this change. Existing clusters would continue to use integer=int8 while new ones would use integer=int4. So we wouldn't break any existing clusters, but we would cause a divergence between older prod clusters and newer dev/test clusters.

@jordanlewis
Member

Another potentially scary thing is that integer tables in a cluster that got upgraded from 19.1 to 19.2 would have a different width than integer tables created in the same cluster after the upgrade was finalized. I think that could be rather confusing for applications.

I still think that making this change would be more disruptive than helpful. Note that I'm assuming that drivers work properly because of the OID change I mentioned above. I haven't seen any evidence that refutes that assumption - if there were such evidence I think I'd need to reconsider.

jordanlewis added a commit to jordanlewis/cockroach that referenced this issue Oct 2, 2019
The spreadsheet we discussed is unwieldy - hard to edit and impossible to keep
up to date. If we write down blacklists in code, then we can use an approach
like this to always have an up to date aggregation.

So far it seems like there's just a lot of unknowns to categorize still.

The output today:

```
=== RUN   TestBlacklists
 648: unknown                                                (unknown)
 493: cockroachdb#5807   (sql: Add support for TEMP tables)
 151: cockroachdb#17511  (sql: support stored procedures)
  86: cockroachdb#26097  (sql: make TIMETZ more pg-compatible)
  56: cockroachdb#10735  (sql: support SQL savepoints)
  55: cockroachdb#32552  (multi-dim arrays)
  55: cockroachdb#26508  (sql: restricted DDL / DML inside transactions)
  52: cockroachdb#32565  (sql: support optional TIME precision)
  39: cockroachdb#243    (roadmap: Blob storage)
  33: cockroachdb#26725  (sql: support postgres' API to handle blob storage (incl lo_creat, lo_from_bytea))
  31: cockroachdb#27793  (sql: support custom/user-defined base scalar (primitive) types)
  24: cockroachdb#12123  (sql: Can't drop and replace a table within a transaction)
  24: cockroachdb#26443  (sql: support user-defined schemas between database and table)
  20: cockroachdb#21286  (sql: Add support for geometric types)
  18: cockroachdb#6583   (sql: explicit lock syntax (SELECT FOR {SHARE,UPDATE} {skip locked,nowait}))
  17: cockroachdb#22329  (Support XA distributed transactions in CockroachDB)
  16: cockroachdb#24062  (sql: 32 bit SERIAL type)
  16: cockroachdb#30352  (roadmap:when CockroachDB  will support cursor?)
  12: cockroachdb#27791  (sql: support RANGE types)
   8: cockroachdb#40195  (pgwire: multiple active result sets (portals) not supported)
   8: cockroachdb#6130   (sql: add support for key watches with notifications of changes)
   5: Expected Failure                                       (unknown)
   5: cockroachdb#23468  (sql: support sql arrays of JSONB)
   5: cockroachdb#40854  (sql: set application_name from connection string)
   4: cockroachdb#35879  (sql: `default_transaction_read_only` should also accept 'on' and 'off')
   4: cockroachdb#32610  (sql: can't insert self reference)
   4: cockroachdb#40205  (sql: add non-trivial implementations of FOR UPDATE, FOR NO KEY UPDATE, FOR SHARE, FOR NO KEY SHARE)
   4: cockroachdb#35897  (sql: unknown function: pg_terminate_backend())
   4: cockroachdb#4035   (sql/pgwire: missing support for row count limits in pgwire)
   3: cockroachdb#27796  (sql: support user-defined DOMAIN types)
   3: cockroachdb#3781   (sql: Add Data Type Formatting Functions)
   3: cockroachdb#40476  (sql: support `FOR {UPDATE,SHARE} {SKIP LOCKED,NOWAIT}`)
   3: cockroachdb#35882  (sql: support other character sets)
   2: cockroachdb#10028  (sql: Support view queries with star expansions)
   2: cockroachdb#35807  (sql: INTERVAL output doesn't match PG)
   2: cockroachdb#35902  (sql: large object support)
   2: cockroachdb#40474  (sql: support `SELECT ... FOR UPDATE OF` syntax)
   1: cockroachdb#18846  (sql: Support CIDR column type)
   1: cockroachdb#9682   (sql: implement computed indexes)
   1: cockroachdb#31632  (sql: FK options (deferrable, etc))
   1: cockroachdb#24897  (sql: CREATE OR REPLACE VIEW)
   1: pass?                                                  (unknown)
   1: cockroachdb#36215  (sql: enable setting standard_conforming_strings to off)
   1: cockroachdb#32562  (sql: support SET LOCAL and txn-scoped session variable changes)
   1: cockroachdb#36116  (sql: psychopg: investigate how `'infinity'::timestamp` is presented)
   1: cockroachdb#26732  (sql: support the binary operator: <int> / <float>)
   1: cockroachdb#23299  (sql: support coercing string literals to arrays)
   1: cockroachdb#36115  (sql: psychopg: investigate if datetimetz is being returned instead of datetime)
   1: cockroachdb#26925  (sql: make the CockroachDB integer types more compatible with postgres)
   1: cockroachdb#21085  (sql: WITH RECURSIVE (recursive common table expressions))
   1: cockroachdb#36179  (sql: implicity convert date to timestamp)
   1: cockroachdb#36118  (sql: Cannot parse '24:00' as type time)
   1: cockroachdb#31708  (sql: support current_time)
```

Release justification: non-production change
Release note: None
@awoods187
Contributor

I think we should consider defaulting to int4, as the vast majority of our users are still in front of us. We should have a bias towards compatibility and towards being forward-looking, as long as we have escape hatches for users upgrading.

@jordanlewis
Member

I still haven't found a compelling, real user problem caused by the mapping of integer to int8 instead of int4 by default. The real compatibility bugs that were discussed in this thread are solved now, as far as I know, but I'd love to see evidence to the contrary.

@bdarnell
Contributor

bdarnell commented Oct 4, 2019

Concrete example: the migration tool in knex (a javascript query builder). It creates a table with an INTEGER column to track its state. Since javascript doesn't have a 64-bit integer type, the pg driver converts INT8 values into decimal strings, but knex expects javascript number objects. It is therefore impossible to use the knex migration tool out of the box unless you set the cluster setting to make integer=int4.

This isn't a clear-cut case - the cluster setting workaround works, and even that could be avoided if the tool were changed to specify a smaller integer type (but this is tricky and DB-specific - int4 isn't available everywhere). The SQL standard doesn't specify the size of the standard types SMALLINT, INTEGER, and BIGINT, but JDBC gives us a de facto standard: SMALLINT is 16 bits, INTEGER is 32, and BIGINT is 64. Doing anything else will be an ongoing source of friction since anyone who uses java or JS will need to either change their schemas or set this cluster/session setting. And as Andy says, the majority of our usage is still in front of us.

@jordanlewis
Member

Okay, that's compelling enough.

Note that it wouldn't make sense to change integer to int4 without also changing serial to serial4 (and therefore defaulting experimental_serial_normalization to sql_sequence). That has a big performance impact, and I'm more scared of making this change than just changing the integer type.

Why doesn't it make sense to change just integer?

@bdarnell
Contributor

bdarnell commented Oct 4, 2019

This example used an INTEGER, but SERIAL is also very common and raises exactly the same issues. Changing one but not the other means that we cause the backwards-compatibility headaches for existing CockroachDB users, but we also have to continue to recommend the cluster/session settings for future users. I think if we're going to make a backwards-incompatible change for the sake of better compatibility with other databases, we should go all the way and change both types.

jordanlewis added a commit to jordanlewis/cockroach that referenced this issue Oct 24, 2019
craig bot pushed a commit that referenced this issue Nov 7, 2019
41252: roachtest: add test that aggregates orm blacklist failures r=jordanlewis a=jordanlewis


Co-authored-by: Jordan Lewis <jordanthelewis@gmail.com>
@bobvawter bobvawter removed their assignment May 27, 2020
@ajwerner
Contributor

@rafiss @jordanlewis should this be tracked in the @cockroachdb/sql-experience board?

@ajwerner ajwerner removed this from the 19.1 milestone Aug 16, 2021
@blathers-crl blathers-crl bot added the T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) label Aug 16, 2021
@ajwerner
Contributor

ajwerner commented Feb 2, 2022

We definitely shouldn't go and change the width of integers on already defined table data. If we retain the current default value for upgraded clusters and bootstrap new clusters with the new default, I think that could go a long way towards achieving the goals without breaking too much backwards compat. I vote we go with sql_sequence_cached for the serial normalization. On the whole I support the movement, and it does make sense for compat, but we should retain our escape hatch.

@petermattis
Collaborator

Something to be aware of with regards to compatibility (both here and elsewhere): we don't just need to be backward compatible on existing clusters, but also for new clusters that are expected to have the same behavior as an existing cluster. Consider a common setup: a production cluster, a staging cluster, and N development clusters running locally on engineers' workstations. The production cluster is likely to have been created a long time ago and upgraded through many major versions. Similar story for the staging cluster, though it may get wiped and recreated periodically if something goes horrifically wrong. The local development clusters will get wiped frequently. And if engineers didn't wipe their development clusters frequently, new engineers would create new local development clusters which have definitely not gone through the same upgrade path as the clusters of other engineers. This isn't a theoretical problem. The difference in the history of clusters has caused migrations on the CC control plane DB to behave differently.

@beikov

beikov commented Sep 15, 2022

I didn't read the whole discussion but just wanted to say that through JDBC, when asking for the column metadata of an integer array column, it will report the column type _int8 whereas it should report _int4. Kind of annoying in Hibernate tests that rely on proper types being reported, but not a blocker.

@knz
Contributor Author

knz commented Sep 16, 2022

@beikov you can influence this by either:

  • forcing crdb to treat 'integer' as int4, using the session variable default_int_size
  • or creating your column as int4[] explicitly (instead of integer[]); see the sketch after this list.
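
A hedged sketch of both options (hypothetical table names; the array element type reported over JDBC should then be int4/_int4):

```
-- Option 1: make bare INTEGER resolve to INT4 for this session.
SET default_int_size = 4;
CREATE TABLE a1 (xs INTEGER[]);  -- element type INT4

-- Option 2: keep the default and size the column explicitly.
CREATE TABLE a2 (xs INT4[]);
```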

@beikov

beikov commented Sep 19, 2022

Thanks for the hint. I'll adapt the CockroachDialect in Hibernate then as part of https://hibernate.atlassian.net/browse/HHH-15528

@beikov

beikov commented Sep 22, 2022

The next issue I have run into is that serial4 produces a value that is too big to fit into a 4-byte integer. With a table:

    create table TriggerEntity (
       id serial4 not null,
        name varchar(255),
        primary key (id)
    )

extracting the generated value with ResultSet#getInt results in:

Caused by: org.postgresql.util.PSQLException: Bad value for type int : 798864890501038081
	at app//org.postgresql.jdbc.PgResultSet.toInt(PgResultSet.java:3218)
	at app//org.postgresql.jdbc.PgResultSet.getInt(PgResultSet.java:2408)
	at app//org.postgresql.jdbc.PgResultSet.getInt(PgResultSet.java:2836)

Pseudo-code:

var stmt = connection.prepareStatement( "insert into TriggerEntity(name) values ('abc')", PreparedStatement.RETURN_GENERATED_KEYS );
stmt.executeUpdate();
var rs = stmt.getGeneratedKeys();
rs.next();
var id = rs.getInt( 1 );

@knz
Contributor Author

knz commented Sep 22, 2022

If you run the create table in an interactive session, you'd see what CockroachDB thinks about serial4:

NOTICE: upgrading the column x to INT8 to utilize the session serial_normalization setting
HINT: change the serial_normalization to sql_sequence or sql_sequence_cached if you wish to use 
a smaller sized serial column at the cost of performance. 
See https://www.cockroachlabs.com/docs/v22.1/serial.html

If you want serial4 to behave exactly like in postgres, at the expense of performance, there are some options. I encourage you to peruse the linked documentation page.
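
For illustration, a hedged sketch of what the notice suggests (assuming the serial_normalization session variable and the table from the earlier comment; exact behavior may differ by version):

```
-- Make SERIAL4 keep its 4-byte width (backed by a sequence) instead of being
-- upgraded to INT8 with a unique_rowid() default, at some performance cost.
SET serial_normalization = sql_sequence_cached;

CREATE TABLE TriggerEntity (
    id SERIAL4 NOT NULL,
    name VARCHAR(255),
    PRIMARY KEY (id)
);
-- id should now be an INT4 column with a nextval() default, so the generated
-- key fits in a Java int.
```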
