-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: add assignment casts for INSERTs #70722
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've only skimmed this a bit, but it looks like a very nice work!
Reviewed 2 of 2 files at r1, 1 of 31 files at r3.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @otan, @RaduBerinde, @rytaft, and @yuzefovich)
pkg/sql/sem/tree/type_check.go, line 460 at r2 (raw file):
default: // TODO(mgartner): Use OID cast map. cast := lookupCastInfo(fromFamily, toFamily, intervalStyleEnabled, dateStyleEnabled)
nit: the change to lookupCastInfo
is currently in the second commit but probably should be in the third commit.
pkg/sql/sem/tree/cast.go, line 74 at r2 (raw file):
// context. It is only used to annotate entries in castMap and to perform // assertions on cast entries in the init function. It has no effect on the // behavior
nit: missing period.
bd658c0
to
f4d59bb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @otan, @RaduBerinde, @rytaft, and @yuzefovich)
pkg/sql/sem/tree/type_check.go, line 460 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: the change to
lookupCastInfo
is currently in the second commit but probably should be in the third commit.
Great catch! Fixed.
pkg/sql/sem/tree/cast.go, line 74 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: missing period.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
focused mainly on experience-y bit but did skim the queries-y bits. small comments.
sorry it's on github - reviewable drives me crazy with multi-updates on multi-commits
pkg/sql/sem/tree/cast_map_gen.sh
Outdated
WHEN 'i' THEN 'CastContextImplicit' | ||
END | ||
), | ||
provolatile |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are we doing anything with volatile? should we?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're not doing anything with it yet. I've removed it for now, but we'll likely want to use it to make sure we volatilities correct once we move them to this map.
pkg/sql/sem/tree/cast.go
Outdated
_ contextOrigin = iota | ||
// pgCast specifies that a cast's maximum context is based on information in | ||
// Postgres's pg_cast table. | ||
pgCast |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
optional but i feel like prefixing these with contextOrigin
makes them easier to find / read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
pkg/sql/sem/tree/cast.go
Outdated
@@ -1493,12 +1546,6 @@ func performCastWithoutPrecisionTruncation(ctx *EvalContext, d Datum, t *types.T | |||
res = NewDInt(DInt(v.Unix())) | |||
case *DTimestampTZ: | |||
res = NewDInt(DInt(v.Unix())) | |||
case *DDate: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you'll need to put in a release note for this.
is this something we want to do for this commit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I'll do this in another PR.
# INT4, and the INT4 -> "char" cast is explicit. Our default integer type | ||
# is an INT8 and INT8 -> "char" is an assignment cast. | ||
# | ||
# TODO(mgartner): This should return '{'. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
up to you: i think rather than giving the TODO for yourself for these, file an issue and use TODO(#issue_num). that way you can abandon the baby ;) (and maybe pawn it off to uhhh an unsuspecting team)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created #70771, but I might try to get to it before you do 😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Impressive work!
Reviewed 2 of 2 files at r1, 2 of 4 files at r2, 32 of 32 files at r4, 32 of 32 files at r5, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @RaduBerinde)
pkg/sql/logictest/testdata/logic_test/cast, line 34 at r5 (raw file):
INSERT INTO assn_cast(vc) VALUES ('abc') query T
looks like this case and the one below are duplicates
pkg/sql/opt/optbuilder/mutation_builder.go, line 1421 at r5 (raw file):
colName := string(mb.tab.Column(ord).ColName()) err := pgerror.Newf(pgcode.DatatypeMismatch, "value type %s doesn't match type %s of column %q",
nit: I don't think this matches the Postgres error message -- did you intend to use something different? On Postgres I got "ERROR: column "x" is of type integer but expression is of type date" for this error.
pkg/sql/sem/tree/cast_map_test.go, line 46 at r4 (raw file):
_, err := tree.PerformCast(&evalCtx, srcDatum, tgtType) if err != nil && pgerror.HasCandidateCode(err) && pgerror.GetPGCode(err) == pgcode.CannotCoerce { t.Errorf("cast from %s to %s failed: %s", srcType, tgtType, err)
Do we care about any other errors produced? Maybe add a comment to explain
pkg/sql/sem/tree/cast.go, line 72 at r4 (raw file):
// contextOrigin indicates the source of information for a cast's maximum // context. It is only used to annotate entries in castMap and to perform
What is the "maximum" context? Is this according to the order above where explicit < assignment < implicit? Could a cast have multiple contexts?
Edit: or you could just reference the maxContext
description below
pkg/sql/sem/tree/cast.go, line 1241 at r5 (raw file):
// function will error if the given value is too wide for the given type. Width // truncation should be performed before this function for explicit casts and // parsing, not during assignment casts. The one exception is assignment casts
nit: this sentence is a bit confusing. It implies that explicit casts happen before assignment casts, but I think you're just trying to distinguish between the two types, right?
pkg/sql/sem/tree/cast.go, line 1435 at r5 (raw file):
// check on whether the datum fits the type. If truncateWidth is true, widths // are truncated to match the target type t. If truncateWidth is false, the // input datum is not truncated.
what happens if the datum doesn't fit?
Also, seems like this only applies to some types, right? Looks like values are still being rounded etc. for numeric types.
It's a bit weird that this function is called "withoutPrecisionTruncation", but you can optionally truncate the width. Maybe add a sentence or two to explain the difference? (maybe this is what is confusing me...)
f4d59bb
to
91d5589
Compare
91d5589
to
d85a380
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @otan, @RaduBerinde, and @rytaft)
pkg/sql/logictest/testdata/logic_test/cast, line 34 at r5 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
looks like this case and the one below are duplicates
Good catch. I meant these to insert into vc
. Fixed.
pkg/sql/opt/optbuilder/mutation_builder.go, line 1421 at r5 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
nit: I don't think this matches the Postgres error message -- did you intend to use something different? On Postgres I got "ERROR: column "x" is of type integer but expression is of type date" for this error.
My intention was to leave these error messages as-is in this commit. I can follow up in a future commit to make the error message match Postgres's.
pkg/sql/sem/tree/cast_map_test.go, line 46 at r4 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
Do we care about any other errors produced? Maybe add a comment to explain
Done.
pkg/sql/sem/tree/cast.go, line 72 at r4 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
What is the "maximum" context? Is this according to the order above where explicit < assignment < implicit? Could a cast have multiple contexts?
Edit: or you could just reference the
maxContext
description below
Done.
pkg/sql/sem/tree/cast.go, line 1241 at r5 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
nit: this sentence is a bit confusing. It implies that explicit casts happen before assignment casts, but I think you're just trying to distinguish between the two types, right?
Good catch. I've rewritten the paragraph and hopefully it reads more clearly.
pkg/sql/sem/tree/cast.go, line 1435 at r5 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
what happens if the datum doesn't fit?
Also, seems like this only applies to some types, right? Looks like values are still being rounded etc. for numeric types.
It's a bit weird that this function is called "withoutPrecisionTruncation", but you can optionally truncate the width. Maybe add a sentence or two to explain the difference? (maybe this is what is confusing me...)
I've updated the comment to make things a little more clear. This is a code stink that we're hoping to clean up in the future. The combination of performCastWithoutPrecisionTruncation and AdjustValueToType is confusing. It's hacked together to work, but ideally the casting logic for each source and target type can live in the castMap. What makes things a bit worse is that PG's assignment cast truncation behavior is not documented anywhere as far as I can tell. So this logic is based on a few experimental observations rather than a unified fundamental theory.
d85a380
to
0e2a7ec
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 33 files at r6, 2 of 2 files at r7, 32 of 32 files at r8, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner, @otan, and @RaduBerinde)
pkg/sql/sem/tree/cast.go, line 1244 at r8 (raw file):
// called so that an error is not returned. For assignment casts, inVal should // not be truncated before this function is called, so that an error is // returned. The one exception is for assignment casts to the special "char"
nit: "The one exception is for assignment casts to the special "char" type" -> maybe "The one exception for assignment casts is for the special "char" type"?
pkg/sql/sem/tree/cast.go, line 1245 at r8 (raw file):
// not be truncated before this function is called, so that an error is // returned. The one exception is for assignment casts to the special "char" // type, which truncate and do not error if the width of the datum is wider than
nit: truncate and do not error -> truncates and does not error
0e2a7ec
to
8dcc9ff
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some tests revealed that I completely forgot about casts from NULL values. Please take another look. 🙏
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @otan, @RaduBerinde, and @rytaft)
pkg/sql/sem/tree/cast.go, line 1244 at r8 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
nit: "The one exception is for assignment casts to the special "char" type" -> maybe "The one exception for assignment casts is for the special "char" type"?
Done.
pkg/sql/sem/tree/cast.go, line 1245 at r8 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
nit: truncate and do not error -> truncates and does not error
Done.
8dcc9ff
to
bf0b409
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 32 of 32 files at r9, 34 of 34 files at r10, 1 of 1 files at r11, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @mgartner, @otan, and @RaduBerinde)
pkg/sql/sem/tree/cast.go, line 1811 at r11 (raw file):
return nil, errors.Wrapf(err, "type %s", t.SQLString()) } return &dd, err
nit: now you can return nil here instead of err
There's a problem with this implementation and prepared statements. When the expression is built during One solution would be to add an assignment cast around every insert input column, then remove them during normalization if it can be proven they are not needed. However, this would require moving the check that the assignment cast is valid into the evaluation of the cast rather than during opt-build - we can't determine that the assignment cast in invalid if we don't know the type of the placeholder. As a result, the error you get when trying to insert an incompatibly-typed value into a column would not occur until execution-time. |
Sounds like a reasonable solution to me. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice work!
I noticed the cases where a new stable
shows up. Are there any cases where the an assignment cast is more volatile than the regular cast between those types? If not, we can use the existing information we have about cast volatility.
Regarding placeholders - we infer a certain type for placeholders. If the placeholder have type hints that are different, we should be able to insert assignment casts as needed during opt-build. Beyond that, any discrepancy between the expected placeholder type and what we get over the wire should be resolved by execBind -> DecodeDatum()
- we probably just want to run assignment casts in there. (note that there are also a couple of other code paths which fill in placeholders, one for Execute and one for internal queries).
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @otan)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed the cases where a new stable shows up. Are there any cases where the an assignment cast is more volatile than the regular cast between those types? If not, we can use the existing information we have about cast volatility.
There are no cases that I am aware of. The only difference in behavior is the width truncation in an explicit cast vs an error in an assignment cast, which is not context dependent or side-effecting.
Are you suggesting that I add a mechanism for builtins so that they can have argument-dependent volatility?
Regarding placeholders - we infer a certain type for placeholders. If the placeholder have type hints that are different, we should be able to insert assignment casts as needed during opt-build. Beyond that, any discrepancy between the expected placeholder type and what we get over the wire should be resolved by execBind -> DecodeDatum() - we probably just want to run assignment casts in there. (note that there are also a couple of other code paths which fill in placeholders, one for Execute and one for internal queries).
Brilliant. Thanks for the pointers. I'll give it a shot.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @otan)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you suggesting that I add a mechanism for builtins so that they can have argument-dependent volatility?
I was thinking of special code in the FunctionExpr
case in BuildSharedProps
: if the name matches the assignment cast function, then use tree.LookupCastVolatility
. Maybe it would even make sense to make this a separate operator (AssignmentCastExpr) or use CastExpr and add a flag (even if it ultimately gets built as a function call).
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @otan)
bf0b409
to
7530160
Compare
Build succeeded: |
When assignment casts for inserts were add in cockroachdb#70722, type resolution of placeholders in prepared statements was changed so that during the prepare phase the optimizer would only plan assignment casts when it could not be guaranteed that an insert column would be an identical type to the target column. Unfortunately, the change to placeholder type resolution change strays from Postgres's behavior and breaks some third-party tools. This commit reverts the type resolution changes, and as a result, assignment casts are always planned to convert each insert column to its target column type, even if the cast ends up being a no-op because the types are identical. I attempted to implement a solution that did not plan assignment casts in all cases. It required annotating each placeholder expression in the memo with a context: a non-assignment context or an assignment context. This context would be used during the EXECUTE to determine whether a placeholder value should undergo a regular cast or an assignment cast: a critical decision due to their slightly different behaviors (see `PerformAssignmentCast`). Annotating each placeholder with a context was tedious and error-prone, so I opted for this much simpler solution for now. Fixes cockroachdb#71576 There is no release note because this bug was never present in a release. Release note: None
When assignment casts for inserts were add in cockroachdb#70722, type resolution of placeholders in prepared statements was changed so that during the prepare phase the optimizer would only plan assignment casts when it could not be guaranteed that an insert column would be an identical type to the target column. Unfortunately, the change to placeholder type resolution change strays from Postgres's behavior and breaks some third-party tools. This commit reverts the type resolution changes, and as a result, assignment casts are always planned to convert each insert column to its target column type, even if the cast ends up being a no-op because the types are identical. I attempted to implement a solution that did not plan assignment casts in all cases. It required annotating each placeholder expression in the memo with a context: a non-assignment context or an assignment context. This context would be used during the EXECUTE to determine whether a placeholder value should undergo a regular cast or an assignment cast: a critical decision due to their slightly different behaviors (see `PerformAssignmentCast`). Annotating each placeholder with a context was tedious and error-prone, so I opted for this much simpler solution for now. Fixes cockroachdb#71576 There is no release note because this bug was never present in a release. Release note: None
When assignment casts for inserts were add in cockroachdb#70722, type resolution of placeholders in prepared statements was changed so that during the prepare phase the optimizer would only plan assignment casts when it could not be guaranteed that an insert column would be an identical type to the target column. Unfortunately, the change to placeholder type resolution change strays from Postgres's behavior and breaks some third-party tools. This commit reverts the type resolution changes, and as a result, assignment casts are always planned to convert each insert column to its target column type, even if the cast ends up being a no-op because the types are identical. I attempted to implement a solution that did not plan assignment casts in all cases, but it proved to be tedious and error prone, so I opted for this simpler solution for now. My attempt required annotating each placeholder expression in the memo with a context: a non-assignment context or an assignment context. This context would be used during the EXECUTE to determine whether a placeholder value should undergo a regular cast or an assignment cast: a critical decision due to their slightly different behaviors (see `PerformAssignmentCast`). The problem is that correctly annotating placeholders in complex mutations with expressions like subqueries or CTEs is difficult. Fixes cockroachdb#71576 There is no release note because this bug was never present in a release. Release note: None
When assignment casts for inserts were add in cockroachdb#70722, type resolution of placeholders in prepared statements was changed so that during the prepare phase the optimizer would only plan assignment casts when it could not be guaranteed that an insert column would be an identical type to the target column. Unfortunately, the change to placeholder type resolution change strays from Postgres's behavior and breaks some third-party tools. This commit reverts the type resolution changes, and as a result, assignment casts are always planned to convert each insert column to its target column type, even if the cast ends up being a no-op because the types are identical. I attempted to implement a solution that did not plan assignment casts in all cases, but it proved to be tedious and error prone, so I opted for this simpler solution for now. My attempt required annotating each placeholder expression in the memo with a context: a non-assignment context or an assignment context. This context would be used during the EXECUTE to determine whether a placeholder value should undergo a regular cast or an assignment cast: a critical decision due to their slightly different behaviors (see `PerformAssignmentCast`). The problem is that correctly annotating placeholders in complex mutations with expressions like subqueries or CTEs is difficult. Fixes cockroachdb#71576 There is no release note because this bug was never present in a release. Release note: None
When assignment casts for inserts were add in cockroachdb#70722, type resolution of placeholders in prepared statements was changed so that during the prepare phase the optimizer would only plan assignment casts when it could not be guaranteed that an insert column would be an identical type to the target column. Unfortunately, the change to placeholder type resolution change strays from Postgres's behavior and breaks some third-party tools. This commit reverts the type resolution changes, and as a result, assignment casts are always planned to convert each insert column to its target column type, even if the cast ends up being a no-op because the types are identical. I attempted to implement a solution that did not plan assignment casts in all cases, but it proved to be tedious and error prone, so I opted for this simpler solution for now. My attempt required annotating each placeholder expression in the memo with a context: a non-assignment context or an assignment context. This context would be used during the EXECUTE to determine whether a placeholder value should undergo a regular cast or an assignment cast: a critical decision due to their slightly different behaviors (see `PerformAssignmentCast`). The problem is that correctly annotating placeholders in complex mutations with expressions like subqueries or CTEs is difficult. Fixes cockroachdb#71576 There is no release note because this bug was never present in a release. Release note: None
When assignment casts for inserts were add in cockroachdb#70722, type resolution of placeholders in prepared statements was changed so that during the prepare phase the optimizer would only plan assignment casts when it could not be guaranteed that an insert column would be an identical type to the target column. Unfortunately, the change to placeholder type resolution change strays from Postgres's behavior and breaks some third-party tools. This commit reverts the type resolution changes, and as a result, assignment casts are always planned to convert each insert column to its target column type, even if the cast ends up being a no-op because the types are identical. I attempted to implement a solution that did not plan assignment casts in all cases, but it proved to be tedious and error prone, so I opted for this simpler solution for now. My attempt required annotating each placeholder expression in the memo with a context: a non-assignment context or an assignment context. This context would be used during the EXECUTE to determine whether a placeholder value should undergo a regular cast or an assignment cast: a critical decision due to their slightly different behaviors (see `PerformAssignmentCast`). The problem is that correctly annotating placeholders in complex mutations with expressions like subqueries or CTEs is difficult. Fixes cockroachdb#71576 There is no release note because this bug was never present in a release. Release note: None
When assignment casts for inserts were add in cockroachdb#70722, type resolution of placeholders in prepared statements was changed so that during the prepare phase the optimizer would only plan assignment casts when it could not be guaranteed that an insert column would be an identical type to the target column. Unfortunately, the change to placeholder type resolution change strays from Postgres's behavior and breaks some third-party tools. This commit reverts the type resolution changes, and as a result, assignment casts are always planned to convert each insert column to its target column type, even if the cast ends up being a no-op because the types are identical. I attempted to implement a solution that did not plan assignment casts in all cases, but it proved to be tedious and error prone, so I opted for this simpler solution for now. My attempt required annotating each placeholder expression in the memo with a context: a non-assignment context or an assignment context. This context would be used during the EXECUTE to determine whether a placeholder value should undergo a regular cast or an assignment cast: a critical decision due to their slightly different behaviors (see `PerformAssignmentCast`). The problem is that correctly annotating placeholders in complex mutations with expressions like subqueries or CTEs is difficult. Fixes cockroachdb#71576 There is no release note because this bug was never present in a release. Release note: None
72793: sql: fix placeholder type resolution r=mgartner a=mgartner #### sql: fix placeholder type resolution When assignment casts for inserts were add in #70722, type resolution of placeholders in prepared statements was changed so that during the prepare phase the optimizer would only plan assignment casts when it could not be guaranteed that an insert column would be an identical type to the target column. Unfortunately, the change to placeholder type resolution change strays from Postgres's behavior and breaks some third-party tools. This commit reverts the type resolution changes, and as a result, assignment casts are always planned to convert each insert column to its target column type, even if the cast ends up being a no-op because the types are identical. I attempted to implement a solution that did not plan assignment casts in all cases, but it proved to be tedious and error prone, so I opted for this simpler solution for now. My attempt required annotating each placeholder expression in the memo with a context: a non-assignment context or an assignment context. This context would be used during the EXECUTE to determine whether a placeholder value should undergo a regular cast or an assignment cast: a critical decision due to their slightly different behaviors (see `PerformAssignmentCast`). The problem is that correctly annotating placeholders in complex mutations with expressions like subqueries or CTEs is difficult. Fixes #71576 There is no release note because this bug was never present in a release. Release note: None #### opt: add assignment cast normalization rules This commit adds three new normalization rules related to assignment casts. These rules work together to eliminate and fold assignment casts and projections when possible. These rules are important for allowing fast-path inserts to be planned in more cases. Fast-path inserts can only be planned when the insert input is a Values expression, so eliminating assignment cast projections in mutation inputs is required for this optimization. These rules also help eliminate uniqueness checks for `gen_random_uuid()` values in more cases. By pushing down and eliminating assignment casts that wrap `gen_random_uuid()`, the optimizer will not plan uniqueness checks in some cases because it can recognize the builtin function expression. Release note: None #### opt: add assign-placeholders-build opt test directive This commit adds the `assign-placeholders-build` opt test directive. It behaves similar to `assign-placeholders-norm`, but disables normalization rules. Release note: None Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com>
The width and precision of placeholder values are not known during PREPARE, therefore we remove any type modifiers from placeholder types during type checking so that a value of any width will fit within the palceholder type. Note that this change is similar to the changes made in cockroachdb#70722 to placeholder type checking that were later reverted in cockroachdb#72793. In cockroachdb#70722 the type OIDs of placeholders could be altered, e.g., a placeholder originally with type `INT2` would be converted to an `INT`. In this commit, type OIDs of placeholders are not changing, only type modifiers are. This commit should allow some logic added in cockroachdb#72793 to be simplified or removed entirely. Release note: None
The width and precision of placeholder values are not known during PREPARE, therefore we remove any type modifiers from placeholder types during type checking so that a value of any width will fit within the palceholder type. Note that this change is similar to the changes made in cockroachdb#70722 to placeholder type checking that were later reverted in cockroachdb#72793. In cockroachdb#70722 the type OIDs of placeholders could be altered, e.g., a placeholder originally with type `INT2` would be converted to an `INT`. In this commit, type OIDs of placeholders are not changing, only type modifiers are. This commit should allow some logic added in cockroachdb#72793 to be simplified or removed entirely. Release note: None
The width and precision of placeholder values are not known during PREPARE, therefore we remove any type modifiers from placeholder types during type checking so that a value of any width will fit within the palceholder type. Note that this change is similar to the changes made in cockroachdb#70722 to placeholder type checking that were later reverted in cockroachdb#72793. In cockroachdb#70722 the type OIDs of placeholders could be altered, e.g., a placeholder originally with type `INT2` would be converted to an `INT`. In this commit, type OIDs of placeholders are not changing, only type modifiers are. This commit should allow some logic added in cockroachdb#72793 to be simplified or removed entirely. Release note: None
73762: opt: eliminate assignment casts with identical source and target types r=mgartner a=mgartner #### sql: remove type modifiers from placeholder types The width and precision of placeholder values are not known during PREPARE, therefore we remove any type modifiers from placeholder types during type checking so that a value of any width will fit within the palceholder type. Note that this change is similar to the changes made in #70722 to placeholder type checking that were later reverted in #72793. In #70722 the type OIDs of placeholders could be altered, e.g., a placeholder originally with type `INT2` would be converted to an `INT`. In this commit, type OIDs of placeholders are not changing, only type modifiers are. This commit should allow some logic added in #72793 to be simplified or removed entirely. Release note: None #### opt: eliminate assignment casts with identical source and target types The `EliminateAssignmentCast` rule has been combined with the `EliminateCast` rule. Now an assignment cast is eliminated if the source and target type are identical. This now possible thanks to some changes to type resolution, including: 1. Placeholder types are resolved with unspecified type modifiers. This ensures that assignment casts won't be eliminated if the a placeholder value does not conform to the target type's modifiers. 2. When constant integer `NumVal`s are resolved as a typed-value, they are validated to ensure they fit within their type's width. There may be more types we need to perform similar validation for, such as floats (see #73743). 3. Columns in values expressions with values that have non-identical types but the same type OID will be typed with type modifiers. For example, if a values expression has a CHAR(1) value and a CHAR(3) value in the same column, the column will be typed as a CHAR without an explicit width. 4. Type modifiers are now correctly omitted from array content types when arrays contain constants. Fixes #73450 Release note (bug fix): A bug has been fixed that caused incorrect evaluation of placeholder values in EXECUTE statements. The bug presented when the PREPARE statement cast a placeholder value, for example `PREPARE s AS SELECT $1::INT2`. If the assigned value for `$1` exceeded the maximum width value of the cast target type, the result value of the cast could be incorrect. This bug has been present since version 19.1 or earlier. 73819: bazel: upgrade `rules_foreign_cc` to 0.7 r=rail a=rickystewart Also add `-fcommon` to compile flags for `krb5`. Closes #71306. Release note: None 73832: sql/opt/exec: output index/expiry in EXPLAIN SPLIT/RELOCATE statements r=RaduBerinde a=knz Release note (sql change): The output of `EXPLAIN ALTER INDEX/TABLE ... RELOCATE/SPLIT` now includes the target table/index name and, for the SPLIT AT variants, the expiry timestamp. Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com> Co-authored-by: Ricky Stewart <ricky@cockroachlabs.com> Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>
The width and precision of placeholder values are not known during PREPARE, therefore we remove any type modifiers from placeholder types during type checking so that a value of any width will fit within the palceholder type. Note that this change is similar to the changes made in cockroachdb#70722 to placeholder type checking that were later reverted in cockroachdb#72793. In cockroachdb#70722 the type OIDs of placeholders could be altered, e.g., a placeholder originally with type `INT2` would be converted to an `INT`. In this commit, type OIDs of placeholders are not changing, only type modifiers are. This commit should allow some logic added in cockroachdb#72793 to be simplified or removed entirely. Release note: None
sql: rename pkg/sql/sem/tree/casts.go to cast.go
Release note: None
tree: add OID->OID cast map
This commit adds a new map that describes valid casts from OID to OID.
It introduces three cast contexts: explicit casts, assignment casts, and
implicit casts. See the comments for CastContext and castMap for
details.
This map will enable us to properly follow Postgres's casting behavior.
Most immediately, it will allow us to support assignment casts.
Future work includes moving volatility in castMap. In the longer term,
cast functions can be moved into castMap as well.
Release note: None
sql: set "char" type width to 1
The
"char"
type is a special single-character type. This commitadds a
types.QChar
with a width one. It removes thetypes.MakeQChar
function so that it is impossible to create
"char"
types with anyother width.
Release note: None
sql: add assignment casts for INSERTs
Casts in Postgres are performed in one of three contexts [1]:
CAST(x AS T)
orx::T
.UPSERT, or UPDATE.
example the DATE in
'2021-01-02'::DATE < now()
will be implicitlycast to a TIMESTAMPTZ so that the values can be compared.
Not all casts can be performed in all contexts. Postgres's pg_cast table
lists valid casts and specifies the maximum cast context in which each
can be performed. A cast with a max context of explicit can only be
performed in an explicit context. A cast with an assignment max context
can be performed in an explicit context or an assignment context. A cast
with an implicit max context can be performed in all contexts.
Much to my personal disappointment and frustration, there are valid
casts that are not listed in Postgres's pg_cast table. These casts are
called "automatic I/O conversions" and they allow casting most types to
and from the string types: TEXT, VARCHAR, CHAR, NAME, and "char" [2].
We cannot determine these casts' maximum cast context from the pg_cast
table, so we rely on the documentation which states that conversions to
string types are assignment casts and conversions from string types are
explicit casts [3].
--
This commit implements assignment casts for INSERTs. Follow up work will
implement assignment casts for UPSERTs and UPDATEs.
A cast performed in an assignment context has slightly different
behavior than the same cast performed in an explicit context. In an
assignment context, the cast will error if the width of the value is too
large for the given type. In an explicit context, the value will be
truncated to match the width. The one exception is assignment casts to
the special "char" type which do truncate values.
To support different cast behaviors for different contexts, a new
built-in,
crdb_internal.assignment_cast
has been introduced. Thisfunction takes two arguments: a value and a type. Because SQL
does not have first-class types, a type cannot be passed directly to the
built-in. Instead, a
NULL
cast to a type is used as a workaround,similar to the
json_populate_record
built-in. For example, an integercan be assignment-cast to a string with:
The optimizer is responsible for wrapping INSERT columns with the
assignment cast built-in function. If an insert column type
T1
is notidentical to the table's corresponding target column type
T2
, theoptimizer will check if there is a valid cast from
T1
toT2
with amaximum context that allows an assignment cast. If there is a such a
cast, a projection will wrap the column in the assignment cast built-in
function. If there is no such cast, a user error will be produced.
Some changes to prepared statement placeholder type inference were
required in order to better match Postgres's behavior (this is a
best-effort match thus far and there are still inconsistencies). Most
notably, widths and precision are no longer inferred for the types of
placeholders. The effect of this is that assignment casts will be
correctly added by the optimizer in order to make sure that values for
placeholders are correctly coerced to the target column type during
execution of a prepared insert.
The introduction of assignment casts fixes minor bugs and addresses some
inconsistencies with Postgres's behavior. In general, INSERTS now
successfully cast values to target table column types in more cases. As
one example, inserting a string into an integer column now succeeds:
Prior to this commit there was logic that mimicked assignment casts, but
it was not correct. Bugs in the implementation caused incorrect behavior
when inserting into tables with computed columns. Most notably, a
computed column expression that referenced another column
c
wasevaluated with the value of
c
before the assignment cast wasperformed. This resulted in incorrect values for computed columns in
some cases.
In addition, assignment casts make the special logic for rounding
decimal values in optbuilder obsolete. The builtin function
crdb_internal.round_decimal_values
and related logic in optbuilderwill be removed once assignment casts are implemented for UPSERTs and
UPDATEs.
Fixes #69327
Fixes #69665
[1] https://www.postgresql.org/docs/current/typeconv.html
[2] https://www.postgresql.org/docs/13/catalog-pg-cast.html#CATALOG-PG-CAST
[3] https://www.postgresql.org/docs/13/sql-createcast.html#SQL-CREATECAST-NOTES
Release note (sql change): Implicit casts performed during INSERT
statements now more closely follow Postgres's behavior. Several minor
bugs related to these types of casts have been fixed.