Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: add assignment casts for INSERTs #70722

Merged
merged 4 commits into from
Oct 13, 2021

Conversation

mgartner
Copy link
Collaborator

@mgartner mgartner commented Sep 24, 2021

sql: rename pkg/sql/sem/tree/casts.go to cast.go

Release note: None

tree: add OID->OID cast map

This commit adds a new map that describes valid casts from OID to OID.
It introduces three cast contexts: explicit casts, assignment casts, and
implicit casts. See the comments for CastContext and castMap for
details.

This map will enable us to properly follow Postgres's casting behavior.
Most immediately, it will allow us to support assignment casts.

Future work includes moving volatility in castMap. In the longer term,
cast functions can be moved into castMap as well.

Release note: None

sql: set "char" type width to 1

The "char" type is a special single-character type. This commit
adds a types.QChar with a width one. It removes the types.MakeQChar
function so that it is impossible to create "char" types with any
other width.

Release note: None

sql: add assignment casts for INSERTs

Casts in Postgres are performed in one of three contexts [1]:

  1. An explicit context with CAST(x AS T) or x::T.
  2. An assignment context performed implicitly during an INSERT,
    UPSERT, or UPDATE.
  3. An implicit context during the evaluation of an expression. For
    example the DATE in '2021-01-02'::DATE < now() will be implicitly
    cast to a TIMESTAMPTZ so that the values can be compared.

Not all casts can be performed in all contexts. Postgres's pg_cast table
lists valid casts and specifies the maximum cast context in which each
can be performed. A cast with a max context of explicit can only be
performed in an explicit context. A cast with an assignment max context
can be performed in an explicit context or an assignment context. A cast
with an implicit max context can be performed in all contexts.

Much to my personal disappointment and frustration, there are valid
casts that are not listed in Postgres's pg_cast table. These casts are
called "automatic I/O conversions" and they allow casting most types to
and from the string types: TEXT, VARCHAR, CHAR, NAME, and "char" [2].
We cannot determine these casts' maximum cast context from the pg_cast
table, so we rely on the documentation which states that conversions to
string types are assignment casts and conversions from string types are
explicit casts [3].

--

This commit implements assignment casts for INSERTs. Follow up work will
implement assignment casts for UPSERTs and UPDATEs.

A cast performed in an assignment context has slightly different
behavior than the same cast performed in an explicit context. In an
assignment context, the cast will error if the width of the value is too
large for the given type. In an explicit context, the value will be
truncated to match the width. The one exception is assignment casts to
the special "char" type which do truncate values.

To support different cast behaviors for different contexts, a new
built-in, crdb_internal.assignment_cast has been introduced. This
function takes two arguments: a value and a type. Because SQL
does not have first-class types, a type cannot be passed directly to the
built-in. Instead, a NULL cast to a type is used as a workaround,
similar to the json_populate_record built-in. For example, an integer
can be assignment-cast to a string with:

crdb_internal.assignment_cast(1::INT, NULL::STRING)

The optimizer is responsible for wrapping INSERT columns with the
assignment cast built-in function. If an insert column type T1 is not
identical to the table's corresponding target column type T2, the
optimizer will check if there is a valid cast from T1 to T2 with a
maximum context that allows an assignment cast. If there is a such a
cast, a projection will wrap the column in the assignment cast built-in
function. If there is no such cast, a user error will be produced.

Some changes to prepared statement placeholder type inference were
required in order to better match Postgres's behavior (this is a
best-effort match thus far and there are still inconsistencies). Most
notably, widths and precision are no longer inferred for the types of
placeholders. The effect of this is that assignment casts will be
correctly added by the optimizer in order to make sure that values for
placeholders are correctly coerced to the target column type during
execution of a prepared insert.

The introduction of assignment casts fixes minor bugs and addresses some
inconsistencies with Postgres's behavior. In general, INSERTS now
successfully cast values to target table column types in more cases. As
one example, inserting a string into an integer column now succeeds:

CREATE TABLE t (i INT)
INSERT INTO t VALUES ('1'::STRING)

Prior to this commit there was logic that mimicked assignment casts, but
it was not correct. Bugs in the implementation caused incorrect behavior
when inserting into tables with computed columns. Most notably, a
computed column expression that referenced another column c was
evaluated with the value of c before the assignment cast was
performed. This resulted in incorrect values for computed columns in
some cases.

In addition, assignment casts make the special logic for rounding
decimal values in optbuilder obsolete. The builtin function
crdb_internal.round_decimal_values and related logic in optbuilder
will be removed once assignment casts are implemented for UPSERTs and
UPDATEs.

Fixes #69327
Fixes #69665

[1] https://www.postgresql.org/docs/current/typeconv.html
[2] https://www.postgresql.org/docs/13/catalog-pg-cast.html#CATALOG-PG-CAST
[3] https://www.postgresql.org/docs/13/sql-createcast.html#SQL-CREATECAST-NOTES

Release note (sql change): Implicit casts performed during INSERT
statements now more closely follow Postgres's behavior. Several minor
bugs related to these types of casts have been fixed.

@mgartner mgartner requested review from rytaft, otan, RaduBerinde and a team September 24, 2021 20:37
@mgartner mgartner requested a review from a team as a code owner September 24, 2021 20:37
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've only skimmed this a bit, but it looks like a very nice work!

Reviewed 2 of 2 files at r1, 1 of 31 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @otan, @RaduBerinde, @rytaft, and @yuzefovich)


pkg/sql/sem/tree/type_check.go, line 460 at r2 (raw file):

	default:
		// TODO(mgartner): Use OID cast map.
		cast := lookupCastInfo(fromFamily, toFamily, intervalStyleEnabled, dateStyleEnabled)

nit: the change to lookupCastInfo is currently in the second commit but probably should be in the third commit.


pkg/sql/sem/tree/cast.go, line 74 at r2 (raw file):

// context. It is only used to annotate entries in castMap and to perform
// assertions on cast entries in the init function. It has no effect on the
// behavior

nit: missing period.

@mgartner mgartner force-pushed the add-assignment-casts branch from bd658c0 to f4d59bb Compare September 24, 2021 21:19
Copy link
Collaborator Author

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @otan, @RaduBerinde, @rytaft, and @yuzefovich)


pkg/sql/sem/tree/type_check.go, line 460 at r2 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: the change to lookupCastInfo is currently in the second commit but probably should be in the third commit.

Great catch! Fixed.


pkg/sql/sem/tree/cast.go, line 74 at r2 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: missing period.

Done.

Copy link
Contributor

@otan otan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

focused mainly on experience-y bit but did skim the queries-y bits. small comments.
sorry it's on github - reviewable drives me crazy with multi-updates on multi-commits

WHEN 'i' THEN 'CastContextImplicit'
END
),
provolatile
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we doing anything with volatile? should we?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're not doing anything with it yet. I've removed it for now, but we'll likely want to use it to make sure we volatilities correct once we move them to this map.

_ contextOrigin = iota
// pgCast specifies that a cast's maximum context is based on information in
// Postgres's pg_cast table.
pgCast
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optional but i feel like prefixing these with contextOrigin makes them easier to find / read.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -1493,12 +1546,6 @@ func performCastWithoutPrecisionTruncation(ctx *EvalContext, d Datum, t *types.T
res = NewDInt(DInt(v.Unix()))
case *DTimestampTZ:
res = NewDInt(DInt(v.Unix()))
case *DDate:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you'll need to put in a release note for this.
is this something we want to do for this commit?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I'll do this in another PR.

# INT4, and the INT4 -> "char" cast is explicit. Our default integer type
# is an INT8 and INT8 -> "char" is an assignment cast.
#
# TODO(mgartner): This should return '{'.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

up to you: i think rather than giving the TODO for yourself for these, file an issue and use TODO(#issue_num). that way you can abandon the baby ;) (and maybe pawn it off to uhhh an unsuspecting team)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created #70771, but I might try to get to it before you do 😉

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Impressive work!

Reviewed 2 of 2 files at r1, 2 of 4 files at r2, 32 of 32 files at r4, 32 of 32 files at r5, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @RaduBerinde)


pkg/sql/logictest/testdata/logic_test/cast, line 34 at r5 (raw file):

INSERT INTO assn_cast(vc) VALUES ('abc')

query T

looks like this case and the one below are duplicates


pkg/sql/opt/optbuilder/mutation_builder.go, line 1421 at r5 (raw file):

				colName := string(mb.tab.Column(ord).ColName())
				err := pgerror.Newf(pgcode.DatatypeMismatch,
					"value type %s doesn't match type %s of column %q",

nit: I don't think this matches the Postgres error message -- did you intend to use something different? On Postgres I got "ERROR: column "x" is of type integer but expression is of type date" for this error.


pkg/sql/sem/tree/cast_map_test.go, line 46 at r4 (raw file):

		_, err := tree.PerformCast(&evalCtx, srcDatum, tgtType)
		if err != nil && pgerror.HasCandidateCode(err) && pgerror.GetPGCode(err) == pgcode.CannotCoerce {
			t.Errorf("cast from %s to %s failed: %s", srcType, tgtType, err)

Do we care about any other errors produced? Maybe add a comment to explain


pkg/sql/sem/tree/cast.go, line 72 at r4 (raw file):

// contextOrigin indicates the source of information for a cast's maximum
// context. It is only used to annotate entries in castMap and to perform

What is the "maximum" context? Is this according to the order above where explicit < assignment < implicit? Could a cast have multiple contexts?

Edit: or you could just reference the maxContext description below


pkg/sql/sem/tree/cast.go, line 1241 at r5 (raw file):

// function will error if the given value is too wide for the given type. Width
// truncation should be performed before this function for explicit casts and
// parsing, not during assignment casts. The one exception is assignment casts

nit: this sentence is a bit confusing. It implies that explicit casts happen before assignment casts, but I think you're just trying to distinguish between the two types, right?


pkg/sql/sem/tree/cast.go, line 1435 at r5 (raw file):

// check on whether the datum fits the type. If truncateWidth is true, widths
// are truncated to match the target type t. If truncateWidth is false, the
// input datum is not truncated.

what happens if the datum doesn't fit?

Also, seems like this only applies to some types, right? Looks like values are still being rounded etc. for numeric types.

It's a bit weird that this function is called "withoutPrecisionTruncation", but you can optionally truncate the width. Maybe add a sentence or two to explain the difference? (maybe this is what is confusing me...)

@mgartner mgartner force-pushed the add-assignment-casts branch from f4d59bb to 91d5589 Compare September 27, 2021 17:06
@blathers-crl blathers-crl bot requested a review from otan September 27, 2021 17:06
@mgartner mgartner force-pushed the add-assignment-casts branch from 91d5589 to d85a380 Compare September 27, 2021 17:39
Copy link
Collaborator Author

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @otan, @RaduBerinde, and @rytaft)


pkg/sql/logictest/testdata/logic_test/cast, line 34 at r5 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

looks like this case and the one below are duplicates

Good catch. I meant these to insert into vc. Fixed.


pkg/sql/opt/optbuilder/mutation_builder.go, line 1421 at r5 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

nit: I don't think this matches the Postgres error message -- did you intend to use something different? On Postgres I got "ERROR: column "x" is of type integer but expression is of type date" for this error.

My intention was to leave these error messages as-is in this commit. I can follow up in a future commit to make the error message match Postgres's.


pkg/sql/sem/tree/cast_map_test.go, line 46 at r4 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Do we care about any other errors produced? Maybe add a comment to explain

Done.


pkg/sql/sem/tree/cast.go, line 72 at r4 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

What is the "maximum" context? Is this according to the order above where explicit < assignment < implicit? Could a cast have multiple contexts?

Edit: or you could just reference the maxContext description below

Done.


pkg/sql/sem/tree/cast.go, line 1241 at r5 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

nit: this sentence is a bit confusing. It implies that explicit casts happen before assignment casts, but I think you're just trying to distinguish between the two types, right?

Good catch. I've rewritten the paragraph and hopefully it reads more clearly.


pkg/sql/sem/tree/cast.go, line 1435 at r5 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

what happens if the datum doesn't fit?

Also, seems like this only applies to some types, right? Looks like values are still being rounded etc. for numeric types.

It's a bit weird that this function is called "withoutPrecisionTruncation", but you can optionally truncate the width. Maybe add a sentence or two to explain the difference? (maybe this is what is confusing me...)

I've updated the comment to make things a little more clear. This is a code stink that we're hoping to clean up in the future. The combination of performCastWithoutPrecisionTruncation and AdjustValueToType is confusing. It's hacked together to work, but ideally the casting logic for each source and target type can live in the castMap. What makes things a bit worse is that PG's assignment cast truncation behavior is not documented anywhere as far as I can tell. So this logic is based on a few experimental observations rather than a unified fundamental theory.

@mgartner mgartner force-pushed the add-assignment-casts branch from d85a380 to 0e2a7ec Compare September 27, 2021 17:51
Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 33 files at r6, 2 of 2 files at r7, 32 of 32 files at r8, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner, @otan, and @RaduBerinde)


pkg/sql/sem/tree/cast.go, line 1244 at r8 (raw file):

// called so that an error is not returned. For assignment casts, inVal should
// not be truncated before this function is called, so that an error is
// returned. The one exception is for assignment casts to the special "char"

nit: "The one exception is for assignment casts to the special "char" type" -> maybe "The one exception for assignment casts is for the special "char" type"?


pkg/sql/sem/tree/cast.go, line 1245 at r8 (raw file):

// not be truncated before this function is called, so that an error is
// returned. The one exception is for assignment casts to the special "char"
// type, which truncate and do not error if the width of the datum is wider than

nit: truncate and do not error -> truncates and does not error

@mgartner mgartner force-pushed the add-assignment-casts branch from 0e2a7ec to 8dcc9ff Compare September 28, 2021 13:28
Copy link
Collaborator Author

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some tests revealed that I completely forgot about casts from NULL values. Please take another look. 🙏

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @otan, @RaduBerinde, and @rytaft)


pkg/sql/sem/tree/cast.go, line 1244 at r8 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

nit: "The one exception is for assignment casts to the special "char" type" -> maybe "The one exception for assignment casts is for the special "char" type"?

Done.


pkg/sql/sem/tree/cast.go, line 1245 at r8 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

nit: truncate and do not error -> truncates and does not error

Done.

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 32 of 32 files at r9, 34 of 34 files at r10, 1 of 1 files at r11, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @mgartner, @otan, and @RaduBerinde)


pkg/sql/sem/tree/cast.go, line 1811 at r11 (raw file):

				return nil, errors.Wrapf(err, "type %s", t.SQLString())
			}
			return &dd, err

nit: now you can return nil here instead of err

@mgartner
Copy link
Collaborator Author

mgartner commented Oct 1, 2021

There's a problem with this implementation and prepared statements. When the expression is built during PREPARE we assuming a placeholder decimal has the same precision and scale as the target column's, so no assignment cast expression is added. However, during EXECUTE the placeholder value could have a different scale which is never adjusted because there is no assignment cast expression.

One solution would be to add an assignment cast around every insert input column, then remove them during normalization if it can be proven they are not needed. However, this would require moving the check that the assignment cast is valid into the evaluation of the cast rather than during opt-build - we can't determine that the assignment cast in invalid if we don't know the type of the placeholder. As a result, the error you get when trying to insert an incompatibly-typed value into a column would not occur until execution-time.

@rytaft
Copy link
Collaborator

rytaft commented Oct 4, 2021

One solution would be...

Sounds like a reasonable solution to me.

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice work!

I noticed the cases where a new stable shows up. Are there any cases where the an assignment cast is more volatile than the regular cast between those types? If not, we can use the existing information we have about cast volatility.

Regarding placeholders - we infer a certain type for placeholders. If the placeholder have type hints that are different, we should be able to insert assignment casts as needed during opt-build. Beyond that, any discrepancy between the expected placeholder type and what we get over the wire should be resolved by execBind -> DecodeDatum() - we probably just want to run assignment casts in there. (note that there are also a couple of other code paths which fill in placeholders, one for Execute and one for internal queries).

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @otan)

Copy link
Collaborator Author

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed the cases where a new stable shows up. Are there any cases where the an assignment cast is more volatile than the regular cast between those types? If not, we can use the existing information we have about cast volatility.

There are no cases that I am aware of. The only difference in behavior is the width truncation in an explicit cast vs an error in an assignment cast, which is not context dependent or side-effecting.

Are you suggesting that I add a mechanism for builtins so that they can have argument-dependent volatility?

Regarding placeholders - we infer a certain type for placeholders. If the placeholder have type hints that are different, we should be able to insert assignment casts as needed during opt-build. Beyond that, any discrepancy between the expected placeholder type and what we get over the wire should be resolved by execBind -> DecodeDatum() - we probably just want to run assignment casts in there. (note that there are also a couple of other code paths which fill in placeholders, one for Execute and one for internal queries).

Brilliant. Thanks for the pointers. I'll give it a shot.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @otan)

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you suggesting that I add a mechanism for builtins so that they can have argument-dependent volatility?

I was thinking of special code in the FunctionExpr case in BuildSharedProps: if the name matches the assignment cast function, then use tree.LookupCastVolatility. Maybe it would even make sense to make this a separate operator (AssignmentCastExpr) or use CastExpr and add a flag (even if it ultimately gets built as a function call).

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @otan)

@mgartner mgartner force-pushed the add-assignment-casts branch from bf0b409 to 7530160 Compare October 11, 2021 18:11
@craig
Copy link
Contributor

craig bot commented Oct 13, 2021

Build succeeded:

@craig craig bot merged commit c908959 into cockroachdb:master Oct 13, 2021
@mgartner mgartner deleted the add-assignment-casts branch October 14, 2021 00:32
mgartner added a commit to mgartner/cockroach that referenced this pull request Nov 15, 2021
When assignment casts for inserts were add in cockroachdb#70722, type resolution of
placeholders in prepared statements was changed so that during the
prepare phase the optimizer would only plan assignment casts when it
could not be guaranteed that an insert column would be an identical type
to the target column. Unfortunately, the change to placeholder type
resolution change strays from Postgres's behavior and breaks some
third-party tools. This commit reverts the type resolution changes, and
as a result, assignment casts are always planned to convert each insert
column to its target column type, even if the cast ends up being a no-op
because the types are identical.

I attempted to implement a solution that did not plan assignment casts
in all cases. It required annotating each placeholder expression in the
memo with a context: a non-assignment context or an assignment context.
This context would be used during the EXECUTE to determine whether a
placeholder value should undergo a regular cast or an assignment cast: a
critical decision due to their slightly different behaviors (see
`PerformAssignmentCast`). Annotating each placeholder with a context was
tedious and error-prone, so I opted for this much simpler solution for
now.

Fixes cockroachdb#71576

There is no release note because this bug was never present in a
release.

Release note: None
mgartner added a commit to mgartner/cockroach that referenced this pull request Nov 16, 2021
When assignment casts for inserts were add in cockroachdb#70722, type resolution of
placeholders in prepared statements was changed so that during the
prepare phase the optimizer would only plan assignment casts when it
could not be guaranteed that an insert column would be an identical type
to the target column. Unfortunately, the change to placeholder type
resolution change strays from Postgres's behavior and breaks some
third-party tools. This commit reverts the type resolution changes, and
as a result, assignment casts are always planned to convert each insert
column to its target column type, even if the cast ends up being a no-op
because the types are identical.

I attempted to implement a solution that did not plan assignment casts
in all cases. It required annotating each placeholder expression in the
memo with a context: a non-assignment context or an assignment context.
This context would be used during the EXECUTE to determine whether a
placeholder value should undergo a regular cast or an assignment cast: a
critical decision due to their slightly different behaviors (see
`PerformAssignmentCast`). Annotating each placeholder with a context was
tedious and error-prone, so I opted for this much simpler solution for
now.

Fixes cockroachdb#71576

There is no release note because this bug was never present in a
release.

Release note: None
mgartner added a commit to mgartner/cockroach that referenced this pull request Nov 19, 2021
When assignment casts for inserts were add in cockroachdb#70722, type resolution of
placeholders in prepared statements was changed so that during the
prepare phase the optimizer would only plan assignment casts when it
could not be guaranteed that an insert column would be an identical type
to the target column. Unfortunately, the change to placeholder type
resolution change strays from Postgres's behavior and breaks some
third-party tools. This commit reverts the type resolution changes, and
as a result, assignment casts are always planned to convert each insert
column to its target column type, even if the cast ends up being a no-op
because the types are identical.

I attempted to implement a solution that did not plan assignment casts
in all cases, but it proved to be tedious and error prone, so I opted
for this simpler solution for now. My attempt required annotating each
placeholder expression in the memo with a context: a non-assignment
context or an assignment context. This context would be used during
the EXECUTE to determine whether a placeholder value should undergo a
regular cast or an assignment cast: a critical decision due to their
slightly different behaviors (see `PerformAssignmentCast`). The
problem is that correctly annotating placeholders in complex mutations
with expressions like subqueries or CTEs is difficult.

Fixes cockroachdb#71576

There is no release note because this bug was never present in a
release.

Release note: None
mgartner added a commit to mgartner/cockroach that referenced this pull request Nov 19, 2021
When assignment casts for inserts were add in cockroachdb#70722, type resolution of
placeholders in prepared statements was changed so that during the
prepare phase the optimizer would only plan assignment casts when it
could not be guaranteed that an insert column would be an identical type
to the target column. Unfortunately, the change to placeholder type
resolution change strays from Postgres's behavior and breaks some
third-party tools. This commit reverts the type resolution changes, and
as a result, assignment casts are always planned to convert each insert
column to its target column type, even if the cast ends up being a no-op
because the types are identical.

I attempted to implement a solution that did not plan assignment casts
in all cases, but it proved to be tedious and error prone, so I opted
for this simpler solution for now. My attempt required annotating each
placeholder expression in the memo with a context: a non-assignment
context or an assignment context. This context would be used during
the EXECUTE to determine whether a placeholder value should undergo a
regular cast or an assignment cast: a critical decision due to their
slightly different behaviors (see `PerformAssignmentCast`). The
problem is that correctly annotating placeholders in complex mutations
with expressions like subqueries or CTEs is difficult.

Fixes cockroachdb#71576

There is no release note because this bug was never present in a
release.

Release note: None
mgartner added a commit to mgartner/cockroach that referenced this pull request Nov 30, 2021
When assignment casts for inserts were add in cockroachdb#70722, type resolution of
placeholders in prepared statements was changed so that during the
prepare phase the optimizer would only plan assignment casts when it
could not be guaranteed that an insert column would be an identical type
to the target column. Unfortunately, the change to placeholder type
resolution change strays from Postgres's behavior and breaks some
third-party tools. This commit reverts the type resolution changes, and
as a result, assignment casts are always planned to convert each insert
column to its target column type, even if the cast ends up being a no-op
because the types are identical.

I attempted to implement a solution that did not plan assignment casts
in all cases, but it proved to be tedious and error prone, so I opted
for this simpler solution for now. My attempt required annotating each
placeholder expression in the memo with a context: a non-assignment
context or an assignment context. This context would be used during
the EXECUTE to determine whether a placeholder value should undergo a
regular cast or an assignment cast: a critical decision due to their
slightly different behaviors (see `PerformAssignmentCast`). The
problem is that correctly annotating placeholders in complex mutations
with expressions like subqueries or CTEs is difficult.

Fixes cockroachdb#71576

There is no release note because this bug was never present in a
release.

Release note: None
mgartner added a commit to mgartner/cockroach that referenced this pull request Dec 3, 2021
When assignment casts for inserts were add in cockroachdb#70722, type resolution of
placeholders in prepared statements was changed so that during the
prepare phase the optimizer would only plan assignment casts when it
could not be guaranteed that an insert column would be an identical type
to the target column. Unfortunately, the change to placeholder type
resolution change strays from Postgres's behavior and breaks some
third-party tools. This commit reverts the type resolution changes, and
as a result, assignment casts are always planned to convert each insert
column to its target column type, even if the cast ends up being a no-op
because the types are identical.

I attempted to implement a solution that did not plan assignment casts
in all cases, but it proved to be tedious and error prone, so I opted
for this simpler solution for now. My attempt required annotating each
placeholder expression in the memo with a context: a non-assignment
context or an assignment context. This context would be used during
the EXECUTE to determine whether a placeholder value should undergo a
regular cast or an assignment cast: a critical decision due to their
slightly different behaviors (see `PerformAssignmentCast`). The
problem is that correctly annotating placeholders in complex mutations
with expressions like subqueries or CTEs is difficult.

Fixes cockroachdb#71576

There is no release note because this bug was never present in a
release.

Release note: None
mgartner added a commit to mgartner/cockroach that referenced this pull request Dec 6, 2021
When assignment casts for inserts were add in cockroachdb#70722, type resolution of
placeholders in prepared statements was changed so that during the
prepare phase the optimizer would only plan assignment casts when it
could not be guaranteed that an insert column would be an identical type
to the target column. Unfortunately, the change to placeholder type
resolution change strays from Postgres's behavior and breaks some
third-party tools. This commit reverts the type resolution changes, and
as a result, assignment casts are always planned to convert each insert
column to its target column type, even if the cast ends up being a no-op
because the types are identical.

I attempted to implement a solution that did not plan assignment casts
in all cases, but it proved to be tedious and error prone, so I opted
for this simpler solution for now. My attempt required annotating each
placeholder expression in the memo with a context: a non-assignment
context or an assignment context. This context would be used during
the EXECUTE to determine whether a placeholder value should undergo a
regular cast or an assignment cast: a critical decision due to their
slightly different behaviors (see `PerformAssignmentCast`). The
problem is that correctly annotating placeholders in complex mutations
with expressions like subqueries or CTEs is difficult.

Fixes cockroachdb#71576

There is no release note because this bug was never present in a
release.

Release note: None
craig bot pushed a commit that referenced this pull request Dec 8, 2021
72793: sql: fix placeholder type resolution r=mgartner a=mgartner

#### sql: fix placeholder type resolution

When assignment casts for inserts were add in #70722, type resolution of
placeholders in prepared statements was changed so that during the
prepare phase the optimizer would only plan assignment casts when it
could not be guaranteed that an insert column would be an identical type
to the target column. Unfortunately, the change to placeholder type
resolution change strays from Postgres's behavior and breaks some
third-party tools. This commit reverts the type resolution changes, and
as a result, assignment casts are always planned to convert each insert
column to its target column type, even if the cast ends up being a no-op
because the types are identical.

I attempted to implement a solution that did not plan assignment casts
in all cases, but it proved to be tedious and error prone, so I opted
for this simpler solution for now. My attempt required annotating each
placeholder expression in the memo with a context: a non-assignment
context or an assignment context. This context would be used during
the EXECUTE to determine whether a placeholder value should undergo a
regular cast or an assignment cast: a critical decision due to their
slightly different behaviors (see `PerformAssignmentCast`). The
problem is that correctly annotating placeholders in complex mutations
with expressions like subqueries or CTEs is difficult.

Fixes #71576

There is no release note because this bug was never present in a
release.

Release note: None

#### opt: add assignment cast normalization rules

This commit adds three new normalization rules related to assignment
casts. These rules work together to eliminate and fold assignment casts
and projections when possible.

These rules are important for allowing fast-path inserts to be planned
in more cases. Fast-path inserts can only be planned when the insert
input is a Values expression, so eliminating assignment cast projections
in mutation inputs is required for this optimization.

These rules also help eliminate uniqueness checks for
`gen_random_uuid()` values in more cases. By pushing down and
eliminating assignment casts that wrap `gen_random_uuid()`, the
optimizer will not plan uniqueness checks in some cases because it can
recognize the builtin function expression.

Release note: None

#### opt: add assign-placeholders-build opt test directive

This commit adds the `assign-placeholders-build` opt test directive. It
behaves similar to `assign-placeholders-norm`, but disables
normalization rules.

Release note: None

Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com>
mgartner added a commit to mgartner/cockroach that referenced this pull request Dec 14, 2021
The width and precision of placeholder values are not known during
PREPARE, therefore we remove any type modifiers from placeholder types
during type checking so that a value of any width will fit within the
palceholder type.

Note that this change is similar to the changes made in cockroachdb#70722 to
placeholder type checking that were later reverted in cockroachdb#72793. In cockroachdb#70722
the type OIDs of placeholders could be altered, e.g., a placeholder
originally with type `INT2` would be converted to an `INT`. In this
commit, type OIDs of placeholders are not changing, only type modifiers
are.

This commit should allow some logic added in cockroachdb#72793 to be simplified or
removed entirely.

Release note: None
mgartner added a commit to mgartner/cockroach that referenced this pull request Dec 16, 2021
The width and precision of placeholder values are not known during
PREPARE, therefore we remove any type modifiers from placeholder types
during type checking so that a value of any width will fit within the
palceholder type.

Note that this change is similar to the changes made in cockroachdb#70722 to
placeholder type checking that were later reverted in cockroachdb#72793. In cockroachdb#70722
the type OIDs of placeholders could be altered, e.g., a placeholder
originally with type `INT2` would be converted to an `INT`. In this
commit, type OIDs of placeholders are not changing, only type modifiers
are.

This commit should allow some logic added in cockroachdb#72793 to be simplified or
removed entirely.

Release note: None
mgartner added a commit to mgartner/cockroach that referenced this pull request Dec 16, 2021
The width and precision of placeholder values are not known during
PREPARE, therefore we remove any type modifiers from placeholder types
during type checking so that a value of any width will fit within the
palceholder type.

Note that this change is similar to the changes made in cockroachdb#70722 to
placeholder type checking that were later reverted in cockroachdb#72793. In cockroachdb#70722
the type OIDs of placeholders could be altered, e.g., a placeholder
originally with type `INT2` would be converted to an `INT`. In this
commit, type OIDs of placeholders are not changing, only type modifiers
are.

This commit should allow some logic added in cockroachdb#72793 to be simplified or
removed entirely.

Release note: None
craig bot pushed a commit that referenced this pull request Dec 17, 2021
73762: opt: eliminate assignment casts with identical source and target types r=mgartner a=mgartner

#### sql: remove type modifiers from placeholder types

The width and precision of placeholder values are not known during
PREPARE, therefore we remove any type modifiers from placeholder types
during type checking so that a value of any width will fit within the
palceholder type.

Note that this change is similar to the changes made in #70722 to
placeholder type checking that were later reverted in #72793. In #70722
the type OIDs of placeholders could be altered, e.g., a placeholder
originally with type `INT2` would be converted to an `INT`. In this
commit, type OIDs of placeholders are not changing, only type modifiers
are.

This commit should allow some logic added in #72793 to be simplified or
removed entirely.

Release note: None

#### opt: eliminate assignment casts with identical source and target types

The `EliminateAssignmentCast` rule has been combined with the
`EliminateCast` rule. Now an assignment cast is eliminated if the source
and target type are identical. This now possible thanks to some changes
to type resolution, including:

  1. Placeholder types are resolved with unspecified type modifiers.
     This ensures that assignment casts won't be eliminated if the a
     placeholder value does not conform to the target type's modifiers.

  2. When constant integer `NumVal`s are resolved as a typed-value,
     they are validated to ensure they fit within their type's width.
     There may be more types we need to perform similar validation for,
     such as floats (see #73743).

  3. Columns in values expressions with values that have non-identical
     types but the same type OID will be typed with type modifiers. For
     example, if a values expression has a CHAR(1) value and a CHAR(3)
     value in the same column, the column will be typed as a CHAR
     without an explicit width.

  4. Type modifiers are now correctly omitted from array content types
     when arrays contain constants.

Fixes #73450

Release note (bug fix): A bug has been fixed that caused incorrect
evaluation of placeholder values in EXECUTE statements. The bug
presented when the PREPARE statement cast a placeholder value, for
example `PREPARE s AS SELECT $1::INT2`. If the assigned value for `$1`
exceeded the maximum width value of the cast target type, the result
value of the cast could be incorrect. This bug has been present since
version 19.1 or earlier.


73819: bazel: upgrade `rules_foreign_cc` to 0.7 r=rail a=rickystewart

Also add `-fcommon` to compile flags for `krb5`.

Closes #71306.

Release note: None

73832: sql/opt/exec: output index/expiry in EXPLAIN SPLIT/RELOCATE statements r=RaduBerinde a=knz

Release note (sql change): The output of `EXPLAIN ALTER INDEX/TABLE
... RELOCATE/SPLIT` now includes the target table/index name and, for
the SPLIT AT variants, the expiry timestamp.

Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com>
Co-authored-by: Ricky Stewart <ricky@cockroachlabs.com>
Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>
gustasva pushed a commit to gustasva/cockroach that referenced this pull request Jan 4, 2022
The width and precision of placeholder values are not known during
PREPARE, therefore we remove any type modifiers from placeholder types
during type checking so that a value of any width will fit within the
palceholder type.

Note that this change is similar to the changes made in cockroachdb#70722 to
placeholder type checking that were later reverted in cockroachdb#72793. In cockroachdb#70722
the type OIDs of placeholders could be altered, e.g., a placeholder
originally with type `INT2` would be converted to an `INT`. In this
commit, type OIDs of placeholders are not changing, only type modifiers
are.

This commit should allow some logic added in cockroachdb#72793 to be simplified or
removed entirely.

Release note: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
6 participants