backport-2.1: fix the handling of SQL types #29006

knz · 2018-08-23T13:00:34Z

Backport 1/1 commits from #28937.
Backport 1/1 commits from #28942.
Backport 17/17 commits from #28943.
Backport 1/1 commits from #28944.
Backport 1/1 commits from #28945.
Backport 1/1 commits from #28949.
Backport 2/2 commits from #28950.
Backport 1/1 commits from #28951 (#28959).
Backport 1/1 commits from #28952 (#28959).
Backport 1/1 commits from #28955 (#28959).
Backport 1/1 commits from #28959.
Backport 1/1 commits from #28959.
Backport 1/1 commits from #28690.

cc @cockroachdb/release

cockroach-teamcity · 2018-08-23T13:00:42Z

This change is

BramGruneir

This is a ton to backport. I've reviewed the individual commits, but I have no idea how to review this.

I think the commits are good, just a question of if anyone else has any objections.

Reviewable status: complete! 0 of 0 LGTMs obtained

knz · 2018-08-23T20:58:45Z

For reference there was zero merge conflict. So if you're good with the base PRs I'd say there's nothing special to see here.

BramGruneir

SGTM

Reviewable status: complete! 0 of 0 LGTMs obtained

knz · 2018-08-23T22:59:20Z

bors r+

29006: backport-2.1: fix the handling of SQL types r=knz a=knz Backport 1/1 commits from #28937. Backport 1/1 commits from #28942. Backport 17/17 commits from #28943. Backport 1/1 commits from #28944. Backport 1/1 commits from #28945. Backport 1/1 commits from #28949. Backport 2/2 commits from #28950. Backport 1/1 commits from #28951 (#28959). Backport 1/1 commits from #28952 (#28959). Backport 1/1 commits from #28955 (#28959). Backport 1/1 commits from #28959. Backport 1/1 commits from #28959. Backport 1/1 commits from #28690. cc @cockroachdb/release Co-authored-by: Raphael 'kena' Poss <knz@cockroachlabs.com>

craig · 2018-08-23T23:33:48Z

Build failed

GitHub CI (Cockroach)

The functions to retrieve the sizes of value encodings have not been used in a long time. Remove them. Release note: None

The `ColumnType` struct is not trivially small. Also calling a by-pointer method on a copy may cause it to escape on the heap. Avoid making copies until necessary. Release note: None

These will be used to break up table.go in smaller pieces. Release note: None

Release note: None

…estutils Release note: None

Release note: None

Since it's super small and used only in one place. Release note: None

Release note: None

to EncodeDatumKeyAscending and EncodeDatumsKeyAscending, respectively. Also, add comments and move them to column_type_encoding.go. Release note: None

This was not well defined; the UNION code that required it really has particular semantics and it is thus best defined close to the implementation of UNION. Release note: None

Release note: None

It is unneeded; we already have a perfectly useful `TypeName` map in package `oid` which does the same. Release note: None

Prior to this patch, CockroachDB incorrectly placed the "input syntax" of each SQL type in the column `data_type` of `information_schema.columns`. The input syntax is the one reported in SHOW COLUMNS, SHOW CREATE TABLE and other places, and is suitable to reproduce the exact type of at able. In contrast, `information_schema.columns.data_type` is constrained by compatibility with third party tools and PostgreSQL clients. It must report the name of the type like PostgreSQL does, which in turn is constrained by the SQL standard. A text column must be reported as "text" not "string"; a decimal column as "numeric" not "decimal", a float8 column as "double precision" not "float8", and so on. By reporting the wrong string in that column CockroachDB is confusing ORMs, which subsequently decide that the current on-disk type is not the one expected by the app and then initiate a schema change (ALTER COLUMN SET TYPE). This patch corrects this incompatibility by introducing logic that produces the proper information schema names for column types. This is expected to reduce ORM complaints about insufficient support for ALTER COLUMN SET TYPE (but will be tested/evaluated separately). Release note (bug fix): CockroachDB now populates the `data_type` column of `information_schema.columns` like PostgreSQL for compatibility with 3rd party tools and ORMs.

There are major problems in `cockroach dump` described separately in cockroachdb#28948. Part of this is `cockroach dump` trying to reverse-engineering a datum type from the announced column type in `information_schema.columns`, by parsing the type name. Prior to this patch, this parsing was incomplete (it went out of sync with the SQL parser over time, naturally as it was bound to do). This entire approach is flawed, but this is no time to redesign `cockroach dump`. Instead, this patch re-syncs the dump-specific type parser with the general SQL parser. Release note (bug fix): `cockroach dump` is now again better able to operate across multiple CockroachDB versions.

The distinction between "TEXT" and "STRING" inside CockroachDB is inconsequential. Remove it. The data type is already properly reported in pg_catalog and information_schema as "text". Release note: None

Release note: None

This patch corrects two long-standing bugs in CockroachDB which were confusing both 3rd party tools/ORMs and users: - the SQL standard type called "CHAR" is supposed to have a default maximum width of 1 character. Prior to this patch, CockroachDB did not support this and instead set no default maximum width on CHAR. This would cause CockroachDB to fail validating the width of values inserted in tables, fail to constrain the width of values during casts to CHAR, and report incorrect information in introspection tables. - PostgresSQL's dialect distinguishes the table column types TEXT, CHAR, VARCHAR and a special thing called `"char"` (this is a legacy PostgreSQL type which is, unfortunately, still used by some pg_catalog tables). Prior to this patch, CockroachDB would map all of these types into the same column type "STRING". In turn this caused them to show up as "text" in introspection. While this was not incorrect from the perspective of value handling (all these types are, indeed, handled with about the same value semantics in both PostgreSQL and CockroachDB), ORMs do pay attention to this distinction and become confused if they see a different type in introspection (e.g. "text") than what the application model requires (e.g. "char"). The result of this confusion was to trigger a schema change to adapt the type, which in turn fails due to missing features in ALTER COLUMN TYPE. This patch corrects both problems by giving the appropriate default width to CHAR and preserving the distinction between the various string types in column descriptors and thus introspection. Additionally, a width constraint check on collated string columns was missing. This is now fixed too. Tables and columns created prior to this patch are unaffected. Release note (bug fix): CockroachDB now distinguishes "CHAR" and "VARCHAR" as mandated by the SQL standard and PostgreSQL compatibility. When a width is not specified (e.g. `CHAR(3)`), the maximum width of `VARCHAR` remains unconstrained whereas the maximum width for `CHAR` is now 1 character. Release note (bug fix): CockroachDB now properly checks the width of strings inserted in a collated string column with a specified width. Release note (sql change): CockroachDB now preserves the distinction between different column types for string values like in PostgreSQL, for compatibility with 3rd party tools and ORMs.

The distinction between the type names "JSON" and "JSONB" is inconsequential. The types are already reported properly in introspection using the right alias. This patch removes this distinction in code. Release note: None

Prior to this patch the canonical name for the timestamptz type in CockroachDB was "TIMESTAMP WITH TIME ZONE" with spaces and all glorious 24 characters. This is unfortunate because this is a pretty common type, and is moreover used to disambiguate the return type of `now()` and thus will be emitted pretty often in distsql physical plans. Having such a long type name is unnecessary and indeed adds 50%+ storage, communication and parsing overhead for every occurrence of this type in distsql plans. This patch shortens the canonical CockroachDB name to "TIMESTAMPTZ". The representation of the type in introspection tables (pg_catalog, information_schema) is unaffected. Release note: None

It was not needed. This simplifies. Release note: None

Prior to this patch, CockroachDB maintained an unnecessary distinction between "INT" and "INTEGER", between "BIGINT" and "INT8", etc. This distinction is unnecessary but also costly, as we were paying the price of a "name" attribute in coltypes.TInt, with a string comparison and hash table lookup on every use of the type. What really matters is that the type shows up properly in introspection; this has already been ensured by various OID-to-pgcatalog mappings and the recently introduced `InformationSchemaTypeName()`. Any distinction beyond that is unnecessary and can be dropped from the implementation. Release note: None

knz · 2018-08-24T06:56:42Z

Merge skew with an opt backport, rebased.

bors r+

29006: backport-2.1: fix the handling of SQL types r=knz a=knz Backport 1/1 commits from #28937. Backport 1/1 commits from #28942. Backport 17/17 commits from #28943. Backport 1/1 commits from #28944. Backport 1/1 commits from #28945. Backport 1/1 commits from #28949. Backport 2/2 commits from #28950. Backport 1/1 commits from #28951 (#28959). Backport 1/1 commits from #28952 (#28959). Backport 1/1 commits from #28955 (#28959). Backport 1/1 commits from #28959. Backport 1/1 commits from #28959. Backport 1/1 commits from #28690. cc @cockroachdb/release Co-authored-by: Raphael 'kena' Poss <knz@cockroachlabs.com>

craig · 2018-08-24T07:24:26Z

Build succeeded

GitHub CI (Cockroach)

knz requested a review from BramGruneir August 23, 2018 13:00

knz requested a review from a team August 23, 2018 13:00

knz requested a review from a team as a code owner August 23, 2018 13:00

knz requested a review from a team August 23, 2018 13:00

knz requested a review from a team as a code owner August 23, 2018 13:00

knz requested review from a team August 23, 2018 13:00

knz requested a review from a team as a code owner August 23, 2018 13:00

knz requested review from a team August 23, 2018 13:00

BramGruneir approved these changes Aug 23, 2018

View reviewed changes

BramGruneir reviewed Aug 23, 2018

View reviewed changes

knz force-pushed the 20180823-backport-28690 branch from 257112b to 7e9633b Compare August 23, 2018 22:51

knz added 7 commits August 24, 2018 08:55

sql: remove dead code

029a023

The functions to retrieve the sizes of value encodings have not been used in a long time. Remove them. Release note: None

sql/distsqlplan: avoid excessive data copying

6889cfb

The `ColumnType` struct is not trivially small. Also calling a by-pointer method on a copy may cause it to escape on the heap. Avoid making copies until necessary. Release note: None

sqlbase: create new files

074013e

These will be used to break up table.go in smaller pieces. Release note: None

sqlbase: move DatumAlloc to its own file

6752529

Release note: None

sqlbase: move index encoding functions to their own file

bccce73

Release note: None

sqlbase: move helper functions down and add comments

025ecf6

Release note: None

sqlbase: MakePrimaryIndexKey -> TestingMakePrimaryIndexKey, move to t…

fb6946a

…estutils Release note: None

knz added 20 commits August 24, 2018 08:55

sqlbase: unexport DatumTypeToColumnSemanticType

0d3f530

Release note: None

sqlbase: move encoding functions to their own file

563f3a8

Release note: None

sqlbase: place decode functions next to encode functions

4382f10

Release note: None

sqlbase: parserTypeToEncodingType -> datumTypeToArrayElementEncodingType

9598a74

Release note: None

sqlbase: remove checkElementType

3210bc0

Since it's super small and used only in one place. Release note: None

sqlbase: add missing comments

2f8b7a4

Release note: None

sqlbase: rename EncodeDatum and EncodeDatums

6b18801

to EncodeDatumKeyAscending and EncodeDatumsKeyAscending, respectively. Also, add comments and move them to column_type_encoding.go. Release note: None

sqlbase: remove (*ColumnType).Equivalent()

005cff3

This was not well defined; the UNION code that required it really has particular semantics and it is thus best defined close to the implementation of UNION. Release note: None

sqlbase: improve comments pertaining to index encodings

63301a7

Release note: None

sqlbase: add a clarifying comment on MarshalColumnValue

485aeac

Release note: None

sql/sem/tree,sql/coltypes: remove PGDisplayName

ad981b9

It is unneeded; we already have a perfectly useful `TypeName` map in package `oid` which does the same. Release note: None

sql: desugar TEXT to STRING

7311a4f

The distinction between "TEXT" and "STRING" inside CockroachDB is inconsequential. Remove it. The data type is already properly reported in pg_catalog and information_schema as "text". Release note: None

sql/sem/tree: mark TestPretty testdata as generated

979897a

Release note: None

sql: desugar JSON to JSONB

40a14d8

The distinction between the type names "JSON" and "JSONB" is inconsequential. The types are already reported properly in introspection using the right alias. This patch removes this distinction in code. Release note: None

sql/coltypes: remove the field Name in TArray

4f49daf

It was not needed. This simplifies. Release note: None

knz force-pushed the 20180823-backport-28690 branch from 7e9633b to 4f27ce0 Compare August 24, 2018 06:56

craig bot merged commit 4f27ce0 into cockroachdb:release-2.1 Aug 24, 2018

knz deleted the 20180823-backport-28690 branch August 24, 2018 07:26

This was referenced Sep 4, 2018

backport-2.1: fix the handling of SQL types cockroachdb/docs#3688

Closed

backport-2.1: fix the handling of SQL types cockroachdb/docs#3692

Closed

backport-2.1: fix the handling of SQL types cockroachdb/docs#3693

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

backport-2.1: fix the handling of SQL types #29006

backport-2.1: fix the handling of SQL types #29006

knz commented Aug 23, 2018

cockroach-teamcity commented Aug 23, 2018

BramGruneir left a comment

knz commented Aug 23, 2018

BramGruneir left a comment

knz commented Aug 23, 2018

craig bot commented Aug 23, 2018

knz commented Aug 24, 2018

craig bot commented Aug 24, 2018

backport-2.1: fix the handling of SQL types #29006

backport-2.1: fix the handling of SQL types #29006

Conversation

knz commented Aug 23, 2018

cockroach-teamcity commented Aug 23, 2018

BramGruneir left a comment

Choose a reason for hiding this comment

knz commented Aug 23, 2018

BramGruneir left a comment

Choose a reason for hiding this comment

knz commented Aug 23, 2018

craig bot commented Aug 23, 2018

Build failed

knz commented Aug 24, 2018

craig bot commented Aug 24, 2018

Build succeeded