Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

backport-2.1: fix the handling of SQL types #29006

Merged
merged 29 commits into from
Aug 24, 2018

Commits on Aug 24, 2018

  1. sql: remove dead code

    The functions to retrieve the sizes of value encodings have not been
    used in a long time. Remove them.
    
    Release note: None
    knz committed Aug 24, 2018
    Configuration menu
    Copy the full SHA
    029a023 View commit details
    Browse the repository at this point in the history
  2. sql/distsqlplan: avoid excessive data copying

    The `ColumnType` struct is not trivially small. Also calling a
    by-pointer method on a copy may cause it to escape on the heap. Avoid
    making copies until necessary.
    
    Release note: None
    knz committed Aug 24, 2018
    Configuration menu
    Copy the full SHA
    6889cfb View commit details
    Browse the repository at this point in the history
  3. sqlbase: create new files

    These will be used to break up table.go in smaller pieces.
    
    Release note: None
    knz committed Aug 24, 2018
    Configuration menu
    Copy the full SHA
    074013e View commit details
    Browse the repository at this point in the history
  4. sqlbase: move DatumAlloc to its own file

    Release note: None
    knz committed Aug 24, 2018
    Configuration menu
    Copy the full SHA
    6752529 View commit details
    Browse the repository at this point in the history
  5. sqlbase: move index encoding functions to their own file

    Release note: None
    knz committed Aug 24, 2018
    Configuration menu
    Copy the full SHA
    bccce73 View commit details
    Browse the repository at this point in the history
  6. sqlbase: move helper functions down and add comments

    Release note: None
    knz committed Aug 24, 2018
    Configuration menu
    Copy the full SHA
    025ecf6 View commit details
    Browse the repository at this point in the history
  7. sqlbase: MakePrimaryIndexKey -> TestingMakePrimaryIndexKey, move to t…

    …estutils
    
    Release note: None
    knz committed Aug 24, 2018
    Configuration menu
    Copy the full SHA
    fb6946a View commit details
    Browse the repository at this point in the history
  8. sqlbase: move functions to column_type_properties.go

    Release note: None
    knz committed Aug 24, 2018
    Configuration menu
    Copy the full SHA
    d0f949d View commit details
    Browse the repository at this point in the history
  9. sqlbase: add missing documentation

    Release note: None
    knz committed Aug 24, 2018
    Configuration menu
    Copy the full SHA
    6c2a005 View commit details
    Browse the repository at this point in the history
  10. sqlbase: unexport DatumTypeToColumnSemanticType

    Release note: None
    knz committed Aug 24, 2018
    Configuration menu
    Copy the full SHA
    0d3f530 View commit details
    Browse the repository at this point in the history
  11. sqlbase: move encoding functions to their own file

    Release note: None
    knz committed Aug 24, 2018
    Configuration menu
    Copy the full SHA
    563f3a8 View commit details
    Browse the repository at this point in the history
  12. sqlbase: place decode functions next to encode functions

    Release note: None
    knz committed Aug 24, 2018
    Configuration menu
    Copy the full SHA
    4382f10 View commit details
    Browse the repository at this point in the history
  13. Configuration menu
    Copy the full SHA
    9598a74 View commit details
    Browse the repository at this point in the history
  14. sqlbase: remove checkElementType

    Since it's super small and used only in one place.
    
    Release note: None
    knz committed Aug 24, 2018
    Configuration menu
    Copy the full SHA
    3210bc0 View commit details
    Browse the repository at this point in the history
  15. sqlbase: add missing comments

    Release note: None
    knz committed Aug 24, 2018
    Configuration menu
    Copy the full SHA
    2f8b7a4 View commit details
    Browse the repository at this point in the history
  16. sqlbase: rename EncodeDatum and EncodeDatums

    to EncodeDatumKeyAscending and EncodeDatumsKeyAscending, respectively.
    Also, add comments and move them to column_type_encoding.go.
    
    Release note: None
    knz committed Aug 24, 2018
    Configuration menu
    Copy the full SHA
    6b18801 View commit details
    Browse the repository at this point in the history
  17. sqlbase: remove (*ColumnType).Equivalent()

    This was not well defined; the UNION code that required it really has
    particular semantics and it is thus best defined close to the
    implementation of UNION.
    
    Release note: None
    knz committed Aug 24, 2018
    Configuration menu
    Copy the full SHA
    005cff3 View commit details
    Browse the repository at this point in the history
  18. sqlbase: improve comments pertaining to index encodings

    Release note: None
    knz committed Aug 24, 2018
    Configuration menu
    Copy the full SHA
    63301a7 View commit details
    Browse the repository at this point in the history
  19. sqlbase: add a clarifying comment on MarshalColumnValue

    Release note: None
    knz committed Aug 24, 2018
    Configuration menu
    Copy the full SHA
    485aeac View commit details
    Browse the repository at this point in the history
  20. sql/sem/tree,sql/coltypes: remove PGDisplayName

    It is unneeded; we already have a perfectly useful `TypeName` map in
    package `oid` which does the same.
    
    Release note: None
    knz committed Aug 24, 2018
    Configuration menu
    Copy the full SHA
    ad981b9 View commit details
    Browse the repository at this point in the history
  21. sql: fix the reporting of types in information_schema.columns

    Prior to this patch, CockroachDB incorrectly placed the "input syntax"
    of each SQL type in the column `data_type` of
    `information_schema.columns`.
    
    The input syntax is the one reported in SHOW COLUMNS, SHOW CREATE
    TABLE and other places, and is suitable to reproduce the exact type of
    at able.
    
    In contrast, `information_schema.columns.data_type` is constrained by
    compatibility with third party tools and PostgreSQL clients.  It must
    report the name of the type like PostgreSQL does, which in turn is
    constrained by the SQL standard. A text column must be reported as
    "text" not "string"; a decimal column as "numeric" not "decimal", a
    float8 column as "double precision" not "float8", and so on.
    
    By reporting the wrong string in that column CockroachDB is confusing
    ORMs, which subsequently decide that the current on-disk type is not
    the one expected by the app and then initiate a schema change (ALTER
    COLUMN SET TYPE).
    
    This patch corrects this incompatibility by introducing logic that
    produces the proper information schema names for column types. This is
    expected to reduce ORM complaints about insufficient support for ALTER
    COLUMN SET TYPE (but will be tested/evaluated separately).
    
    Release note (bug fix): CockroachDB now populates the `data_type`
    column of `information_schema.columns` like PostgreSQL for
    compatibility with 3rd party tools and ORMs.
    knz committed Aug 24, 2018
    Configuration menu
    Copy the full SHA
    8ac12cd View commit details
    Browse the repository at this point in the history
  22. cli,sql: add another crutch to the handling of column types in dump

    There are major problems in `cockroach dump` described separately in
    cockroachdb#28948.
    
    Part of this is `cockroach dump` trying to reverse-engineering a datum
    type from the announced column type in `information_schema.columns`,
    by parsing the type name. Prior to this patch, this parsing was
    incomplete (it went out of sync with the SQL parser over time,
    naturally as it was bound to do).
    
    This entire approach is flawed, but this is no time to redesign
    `cockroach dump`. Instead, this patch re-syncs the dump-specific type
    parser with the general SQL parser.
    
    Release note (bug fix): `cockroach dump` is now again better able to
    operate across multiple CockroachDB versions.
    knz committed Aug 24, 2018
    Configuration menu
    Copy the full SHA
    89bd0c7 View commit details
    Browse the repository at this point in the history
  23. sql: desugar TEXT to STRING

    The distinction between "TEXT" and "STRING" inside CockroachDB is
    inconsequential. Remove it. The data type is already properly reported
    in pg_catalog and information_schema as "text".
    
    Release note: None
    knz committed Aug 24, 2018
    Configuration menu
    Copy the full SHA
    7311a4f View commit details
    Browse the repository at this point in the history
  24. sql/sem/tree: mark TestPretty testdata as generated

    Release note: None
    knz committed Aug 24, 2018
    Configuration menu
    Copy the full SHA
    979897a View commit details
    Browse the repository at this point in the history
  25. sql: fix the handling of string types

    This patch corrects two long-standing bugs in CockroachDB which were
    confusing both 3rd party tools/ORMs and users:
    
    - the SQL standard type called "CHAR" is supposed to have a default
      maximum width of 1 character.
    
      Prior to this patch, CockroachDB did not support this and instead
      set no default maximum width on CHAR. This would cause CockroachDB
      to fail validating the width of values inserted in tables, fail
      to constrain the width of values during casts to CHAR, and report
      incorrect information in introspection tables.
    
    - PostgresSQL's dialect distinguishes the table column types TEXT,
      CHAR, VARCHAR and a special thing called `"char"` (this is a legacy
      PostgreSQL type which is, unfortunately, still used by some
      pg_catalog tables).
    
      Prior to this patch, CockroachDB would map all of these types into
      the same column type "STRING". In turn this caused them to show
      up as "text" in introspection.
    
      While this was not incorrect from the perspective of value
      handling (all these types are, indeed, handled with about the same
      value semantics in both PostgreSQL and CockroachDB), ORMs
      do pay attention to this distinction and become confused if they
      see a different type in introspection (e.g. "text") than what
      the application model requires (e.g. "char"). The result of this
      confusion was to trigger a schema change to adapt the type, which
      in turn fails due to missing features in ALTER COLUMN TYPE.
    
    This patch corrects both problems by giving the appropriate default
    width to CHAR and preserving the distinction between the various string
    types in column descriptors and thus introspection.
    
    Additionally, a width constraint check on collated string columns was
    missing. This is now fixed too.
    
    Tables and columns created prior to this patch are unaffected.
    
    Release note (bug fix): CockroachDB now distinguishes "CHAR" and
    "VARCHAR" as mandated by the SQL standard and PostgreSQL
    compatibility. When a width is not specified (e.g. `CHAR(3)`), the
    maximum width of `VARCHAR` remains unconstrained whereas the maximum
    width for `CHAR` is now 1 character.
    
    Release note (bug fix): CockroachDB now properly checks the width of
    strings inserted in a collated string column with a specified width.
    
    Release note (sql change): CockroachDB now preserves the distinction
    between different column types for string values like in PostgreSQL,
    for compatibility with 3rd party tools and ORMs.
    knz committed Aug 24, 2018
    Configuration menu
    Copy the full SHA
    9374455 View commit details
    Browse the repository at this point in the history
  26. sql: desugar JSON to JSONB

    The distinction between the type names "JSON" and "JSONB" is
    inconsequential. The types are already reported properly in
    introspection using the right alias. This patch removes this
    distinction in code.
    
    Release note: None
    knz committed Aug 24, 2018
    Configuration menu
    Copy the full SHA
    40a14d8 View commit details
    Browse the repository at this point in the history
  27. sql: desugar "TIMESTAMP WITH TIME ZONE" to "TIMESTAMPTZ"

    Prior to this patch the canonical name for the timestamptz type in
    CockroachDB was "TIMESTAMP WITH TIME ZONE" with spaces and all
    glorious 24 characters.
    
    This is unfortunate because this is a pretty common type, and is
    moreover used to disambiguate the return type of `now()` and thus will
    be emitted pretty often in distsql physical plans. Having such a long
    type name is unnecessary and indeed adds 50%+ storage, communication
    and parsing overhead for every occurrence of this type in
    distsql plans.
    
    This patch shortens the canonical CockroachDB name to "TIMESTAMPTZ".
    
    The representation of the type in introspection tables (pg_catalog,
    information_schema) is unaffected.
    
    Release note: None
    knz committed Aug 24, 2018
    Configuration menu
    Copy the full SHA
    621f421 View commit details
    Browse the repository at this point in the history
  28. sql/coltypes: remove the field Name in TArray

    It was not needed. This simplifies.
    
    Release note: None
    knz committed Aug 24, 2018
    Configuration menu
    Copy the full SHA
    4f49daf View commit details
    Browse the repository at this point in the history
  29. sql: fix the handling of integer types

    Prior to this patch, CockroachDB maintained an unnecessary distinction
    between "INT" and "INTEGER", between "BIGINT" and "INT8", etc.
    
    This distinction is unnecessary but also costly, as we were paying the
    price of a "name" attribute in coltypes.TInt, with a string comparison
    and hash table lookup on every use of the type.
    
    What really matters is that the type shows up properly in
    introspection; this has already been ensured by various
    OID-to-pgcatalog mappings and the recently introduced
    `InformationSchemaTypeName()`.
    
    Any distinction beyond that is unnecessary and can be dropped from the
    implementation.
    
    Release note: None
    knz committed Aug 24, 2018
    Configuration menu
    Copy the full SHA
    4f27ce0 View commit details
    Browse the repository at this point in the history