-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
backport-2.1: fix the handling of SQL types #29006
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a ton to backport. I've reviewed the individual commits, but I have no idea how to review this.
I think the commits are good, just a question of if anyone else has any objections.
Reviewable status: complete! 0 of 0 LGTMs obtained
For reference there was zero merge conflict. So if you're good with the base PRs I'd say there's nothing special to see here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM
Reviewable status: complete! 0 of 0 LGTMs obtained
257112b
to
7e9633b
Compare
bors r+ |
29006: backport-2.1: fix the handling of SQL types r=knz a=knz Backport 1/1 commits from #28937. Backport 1/1 commits from #28942. Backport 17/17 commits from #28943. Backport 1/1 commits from #28944. Backport 1/1 commits from #28945. Backport 1/1 commits from #28949. Backport 2/2 commits from #28950. Backport 1/1 commits from #28951 (#28959). Backport 1/1 commits from #28952 (#28959). Backport 1/1 commits from #28955 (#28959). Backport 1/1 commits from #28959. Backport 1/1 commits from #28959. Backport 1/1 commits from #28690. cc @cockroachdb/release Co-authored-by: Raphael 'kena' Poss <knz@cockroachlabs.com>
Build failed |
The functions to retrieve the sizes of value encodings have not been used in a long time. Remove them. Release note: None
The `ColumnType` struct is not trivially small. Also calling a by-pointer method on a copy may cause it to escape on the heap. Avoid making copies until necessary. Release note: None
These will be used to break up table.go in smaller pieces. Release note: None
Release note: None
Release note: None
Release note: None
…estutils Release note: None
Release note: None
Release note: None
Release note: None
Release note: None
Since it's super small and used only in one place. Release note: None
Release note: None
to EncodeDatumKeyAscending and EncodeDatumsKeyAscending, respectively. Also, add comments and move them to column_type_encoding.go. Release note: None
This was not well defined; the UNION code that required it really has particular semantics and it is thus best defined close to the implementation of UNION. Release note: None
Release note: None
Release note: None
It is unneeded; we already have a perfectly useful `TypeName` map in package `oid` which does the same. Release note: None
Prior to this patch, CockroachDB incorrectly placed the "input syntax" of each SQL type in the column `data_type` of `information_schema.columns`. The input syntax is the one reported in SHOW COLUMNS, SHOW CREATE TABLE and other places, and is suitable to reproduce the exact type of at able. In contrast, `information_schema.columns.data_type` is constrained by compatibility with third party tools and PostgreSQL clients. It must report the name of the type like PostgreSQL does, which in turn is constrained by the SQL standard. A text column must be reported as "text" not "string"; a decimal column as "numeric" not "decimal", a float8 column as "double precision" not "float8", and so on. By reporting the wrong string in that column CockroachDB is confusing ORMs, which subsequently decide that the current on-disk type is not the one expected by the app and then initiate a schema change (ALTER COLUMN SET TYPE). This patch corrects this incompatibility by introducing logic that produces the proper information schema names for column types. This is expected to reduce ORM complaints about insufficient support for ALTER COLUMN SET TYPE (but will be tested/evaluated separately). Release note (bug fix): CockroachDB now populates the `data_type` column of `information_schema.columns` like PostgreSQL for compatibility with 3rd party tools and ORMs.
There are major problems in `cockroach dump` described separately in cockroachdb#28948. Part of this is `cockroach dump` trying to reverse-engineering a datum type from the announced column type in `information_schema.columns`, by parsing the type name. Prior to this patch, this parsing was incomplete (it went out of sync with the SQL parser over time, naturally as it was bound to do). This entire approach is flawed, but this is no time to redesign `cockroach dump`. Instead, this patch re-syncs the dump-specific type parser with the general SQL parser. Release note (bug fix): `cockroach dump` is now again better able to operate across multiple CockroachDB versions.
The distinction between "TEXT" and "STRING" inside CockroachDB is inconsequential. Remove it. The data type is already properly reported in pg_catalog and information_schema as "text". Release note: None
Release note: None
This patch corrects two long-standing bugs in CockroachDB which were confusing both 3rd party tools/ORMs and users: - the SQL standard type called "CHAR" is supposed to have a default maximum width of 1 character. Prior to this patch, CockroachDB did not support this and instead set no default maximum width on CHAR. This would cause CockroachDB to fail validating the width of values inserted in tables, fail to constrain the width of values during casts to CHAR, and report incorrect information in introspection tables. - PostgresSQL's dialect distinguishes the table column types TEXT, CHAR, VARCHAR and a special thing called `"char"` (this is a legacy PostgreSQL type which is, unfortunately, still used by some pg_catalog tables). Prior to this patch, CockroachDB would map all of these types into the same column type "STRING". In turn this caused them to show up as "text" in introspection. While this was not incorrect from the perspective of value handling (all these types are, indeed, handled with about the same value semantics in both PostgreSQL and CockroachDB), ORMs do pay attention to this distinction and become confused if they see a different type in introspection (e.g. "text") than what the application model requires (e.g. "char"). The result of this confusion was to trigger a schema change to adapt the type, which in turn fails due to missing features in ALTER COLUMN TYPE. This patch corrects both problems by giving the appropriate default width to CHAR and preserving the distinction between the various string types in column descriptors and thus introspection. Additionally, a width constraint check on collated string columns was missing. This is now fixed too. Tables and columns created prior to this patch are unaffected. Release note (bug fix): CockroachDB now distinguishes "CHAR" and "VARCHAR" as mandated by the SQL standard and PostgreSQL compatibility. When a width is not specified (e.g. `CHAR(3)`), the maximum width of `VARCHAR` remains unconstrained whereas the maximum width for `CHAR` is now 1 character. Release note (bug fix): CockroachDB now properly checks the width of strings inserted in a collated string column with a specified width. Release note (sql change): CockroachDB now preserves the distinction between different column types for string values like in PostgreSQL, for compatibility with 3rd party tools and ORMs.
The distinction between the type names "JSON" and "JSONB" is inconsequential. The types are already reported properly in introspection using the right alias. This patch removes this distinction in code. Release note: None
Prior to this patch the canonical name for the timestamptz type in CockroachDB was "TIMESTAMP WITH TIME ZONE" with spaces and all glorious 24 characters. This is unfortunate because this is a pretty common type, and is moreover used to disambiguate the return type of `now()` and thus will be emitted pretty often in distsql physical plans. Having such a long type name is unnecessary and indeed adds 50%+ storage, communication and parsing overhead for every occurrence of this type in distsql plans. This patch shortens the canonical CockroachDB name to "TIMESTAMPTZ". The representation of the type in introspection tables (pg_catalog, information_schema) is unaffected. Release note: None
It was not needed. This simplifies. Release note: None
Prior to this patch, CockroachDB maintained an unnecessary distinction between "INT" and "INTEGER", between "BIGINT" and "INT8", etc. This distinction is unnecessary but also costly, as we were paying the price of a "name" attribute in coltypes.TInt, with a string comparison and hash table lookup on every use of the type. What really matters is that the type shows up properly in introspection; this has already been ensured by various OID-to-pgcatalog mappings and the recently introduced `InformationSchemaTypeName()`. Any distinction beyond that is unnecessary and can be dropped from the implementation. Release note: None
7e9633b
to
4f27ce0
Compare
Merge skew with an opt backport, rebased. bors r+ |
29006: backport-2.1: fix the handling of SQL types r=knz a=knz Backport 1/1 commits from #28937. Backport 1/1 commits from #28942. Backport 17/17 commits from #28943. Backport 1/1 commits from #28944. Backport 1/1 commits from #28945. Backport 1/1 commits from #28949. Backport 2/2 commits from #28950. Backport 1/1 commits from #28951 (#28959). Backport 1/1 commits from #28952 (#28959). Backport 1/1 commits from #28955 (#28959). Backport 1/1 commits from #28959. Backport 1/1 commits from #28959. Backport 1/1 commits from #28690. cc @cockroachdb/release Co-authored-by: Raphael 'kena' Poss <knz@cockroachlabs.com>
Build succeeded |
Backport 1/1 commits from #28937.
Backport 1/1 commits from #28942.
Backport 17/17 commits from #28943.
Backport 1/1 commits from #28944.
Backport 1/1 commits from #28945.
Backport 1/1 commits from #28949.
Backport 2/2 commits from #28950.
Backport 1/1 commits from #28951 (#28959).
Backport 1/1 commits from #28952 (#28959).
Backport 1/1 commits from #28955 (#28959).
Backport 1/1 commits from #28959.
Backport 1/1 commits from #28959.
Backport 1/1 commits from #28690.
cc @cockroachdb/release