Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

REGNAMESPACE function not found with optimizer rules disabled #98322

Closed
cockroach-teamcity opened this issue Mar 9, 2023 · 2 comments · Fixed by #99168
Closed

REGNAMESPACE function not found with optimizer rules disabled #98322

cockroach-teamcity opened this issue Mar 9, 2023 · 2 comments · Fixed by #99168
Assignees
Labels
branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot. T-sql-queries SQL Queries Team
Milestone

Comments

@cockroach-teamcity
Copy link
Member

cockroach-teamcity commented Mar 9, 2023

roachtest.unoptimized-query-oracle/disable-rules=half/rand-tables failed with artifacts on master @ 4dc9e98d8cc1b137ee4cc950a4a42e5895bb890b:

test artifacts and logs in: /artifacts/unoptimized-query-oracle/disable-rules=half/rand-tables/run_1
(query_comparison_util.go:251).runOneRoundQueryComparison: . 3305 statements run: expected unoptimized and optimized results to be equal
  []string{
  	... // 102 identical elements
  	"01010000A0E6100000E635FE3F844664C009767445B1DE50C061DD792C0784FE"...,
  	"01010000A0E6100000E635FE3F844664C009767445B1DE50C061DD792C0784FE"...,
- 	"01010000A0E6100000E635FE3F844664C009767445B1DE50C061DD792C0784FEC1,NULL,true,NULL,29,29,NULL,'eARRPyW',105",
  	"01010000A0E6100000E635FE3F844664C009767445B1DE50C061DD792C0784FE"...,
  	"01010000A0E6100000E635FE3F844664C009767445B1DE50C061DD792C0784FE"...,
  	... // 6 identical elements
  	"01010000A0E6100000E635FE3F844664C009767445B1DE50C061DD792C0784FE"...,
  	"01010000A0E6100000E635FE3F844664C009767445B1DE50C061DD792C0784FE"...,
+ 	"01010000A0E6100000E635FE3F844664C009767445B1DE50C061DD792C0784FEC1,NULL,true,NULL,public,29,NULL,'eARRPyW',105",
  	"01010000E0E6100000000000000000F87F000000000000F87F000000000000F8"...,
  	"01010000E0E6100000062898B6165960C034EF03C6A440504062DEC701EA4AFA"...,
  	... // 228 identical elements
  }
sql: SELECT
	tab2961.col2_5 AS col8615,
	tab2961.cől2_0 AS col8616,
	tab2961.col2_3 AS "c%pol(8617",
	tab2961.col2_7 AS col8618,
	regnamespace(tab2961.col2_8::INT8)::REGNAMESPACE AS col8619,
	tab2961.col2_8 AS co😱l8620,
	tab2961.cől2_0 AS col8621,
	tab2961."col2_%c31" AS "'co%pl8622",
	tab2961.col2_2 AS "co	🙃l8623"
FROM
	defaultdb.public.table2@[0] AS tab2961
ORDER BY
	tab2961.col2_7 DESC NULLS FIRST

Parameters: ROACHTEST_cloud=gce , ROACHTEST_cpu=4 , ROACHTEST_encrypted=false , ROACHTEST_fs=ext4 , ROACHTEST_localSSD=true , ROACHTEST_ssd=0

Help

See: roachtest README

See: How To Investigate (internal)

/cc @cockroachdb/sql-queries

This test on roachdash | Improve this report!

Jira issue: CRDB-25190

@cockroach-teamcity cockroach-teamcity added branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. labels Mar 9, 2023
@cockroach-teamcity cockroach-teamcity added this to the 23.1 milestone Mar 9, 2023
@blathers-crl blathers-crl bot added the T-sql-queries SQL Queries Team label Mar 9, 2023
@msirek
Copy link
Contributor

msirek commented Mar 9, 2023

Simplified test case:

SET testing_optimizer_random_seed = 8514702048362135519;
SET testing_optimizer_disable_rule_probability = 0.500000;

select regnamespace(29)::REGNAMESPACE AS col8619;
  col8619
-----------
  29

RESET testing_optimizer_disable_rule_probability;
select regnamespace(29)::REGNAMESPACE AS col8619;
  col8619
-----------
  public

The REGNAMESPACE function doesn't exist in v22.2, but isn't a blocker since a normalization rule has to be disabled to hit the problem.

@msirek msirek removed the release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. label Mar 9, 2023
@msirek msirek changed the title roachtest: unoptimized-query-oracle/disable-rules=half/rand-tables failed REGNAMESPACE function not found with optimizer rules disabled Mar 12, 2023
@msirek msirek self-assigned this Mar 20, 2023
@msirek
Copy link
Contributor

msirek commented Mar 21, 2023

This problem is occurring when the GenerateConstrainedScans rule is disabled. I tried disabling GenerateConstrainedScans all the time, and found the system won't even start up:

ERROR: server startup failed: cockroach server exited with error: migration-job-find-already-completed: unimplemented: cannot use bounded staleness for queries that may touch more than one range or require an index join
SQLSTATE: 0A000
HINT: You have attempted to use a feature that is not yet implemented.
See: https://go.crdb.dev/issue-v/67562/v23.1
Failed running "start"

So, I think for internal SQLs, this rule should never be disabled.

msirek pushed a commit to msirek/cockroach that referenced this issue Mar 21, 2023
Casting an OID to REGNAMESPACE runs this SQL using the internal executor:
```
SELECT pg_namespace.oid, nspname FROM pg_catalog.pg_namespace WHERE oid = $1
```
When the `GenerateConstrainedScans` rule is disabled, this returns no
rows and so doesn't apply the cast. The same SQL, not run via an
internal executor, but also with `GenerateConstrainedScans` disabled,
behaves correctly. Disabling `GenerateConstrainedScans` all the time
prevents the cluster from starting.

Internal SQL is sensitive to disabled optimizer rewrite rules, and it
could cause server instability to allow rules to ever be disabled for
internal SQL. Also, the `testing_optimizer_disable_rule_probability` session
setting is mainly meant to perturb query plans of foreground SQL in
randomized tests, and enabling it for internal SQL could make these
tests more noisy and create harder to debug issues.

Therefore the fix is to set the `testing_optimizer_disable_rule_probability`
setting to zero for internal SQL statements.

Fixes cockroachdb#98322

Release note: None
msirek pushed a commit to msirek/cockroach that referenced this issue Mar 22, 2023
Casting an OID to REGNAMESPACE runs this SQL using the internal executor:
```
SELECT pg_namespace.oid, nspname FROM pg_catalog.pg_namespace WHERE oid = $1
```
When the `GenerateConstrainedScans` rule is disabled, this returns no
rows and so doesn't apply the cast. The same SQL, not run via an
internal executor, but also with `GenerateConstrainedScans` disabled,
behaves correctly. Disabling `GenerateConstrainedScans` all the time
prevents the cluster from starting.

Internal SQL is sensitive to disabled optimizer rewrite rules, and it
could cause server instability to allow rules to ever be disabled for
internal SQL. Also, the `testing_optimizer_disable_rule_probability` session
setting is mainly meant to perturb query plans of foreground SQL in
randomized tests, and enabling it for internal SQL could make these
tests more noisy and create harder to debug issues.

Therefore the fix is to set the `testing_optimizer_disable_rule_probability`
setting to zero for internal SQL statements.

Fixes cockroachdb#98322

Release note: None
msirek pushed a commit to msirek/cockroach that referenced this issue Mar 22, 2023
Casting an OID to REGNAMESPACE runs this SQL using the internal executor:
```
SELECT pg_namespace.oid, nspname FROM pg_catalog.pg_namespace WHERE oid = $1
```
When the `GenerateConstrainedScans` rule is disabled, this returns no
rows and so doesn't apply the cast. The same SQL, not run via an
internal executor, but also with `GenerateConstrainedScans` disabled,
behaves correctly. Disabling `GenerateConstrainedScans` all the time
prevents the cluster from starting.

Internal SQL is sensitive to disabled optimizer rewrite rules, and it
could cause server instability to allow rules to ever be disabled for
internal SQL. Also, the `testing_optimizer_disable_rule_probability` session
setting is mainly meant to perturb query plans of foreground SQL in
randomized tests, and enabling it for internal SQL could make these
tests more noisy and create harder to debug issues.

Therefore the fix is to set the `testing_optimizer_disable_rule_probability`
setting to zero for internal SQL statements.

Fixes cockroachdb#98322

Release note: None
craig bot pushed a commit that referenced this issue Apr 5, 2023
99168: randgen: disable generation of REGNAMESPACE type expressions r=msirek a=msirek

Casting an OID to REGNAMESPACE runs this SQL using the internal executor:
```
SELECT pg_namespace.oid, nspname FROM pg_catalog.pg_namespace WHERE oid = $1
```
When the `GenerateConstrainedScans` rule is disabled, this returns no
rows and so doesn't apply the cast. The same SQL, not run via an
internal executor, but also with `GenerateConstrainedScans` disabled,
behaves correctly. Disabling `GenerateConstrainedScans` all the time
prevents the cluster from starting.

The reason a full scan of pg_namespace doesn't find the OID is because it
only checks for schemas in the current database:
https://github.com/cockroachdb/cockroach/blob/7b341ffa678e4a22416e2274351fd56f415f5421/pkg/sql/virtual_schema.go#L602
https://github.com/cockroachdb/cockroach/blob/d10c3dd42c3dc40cad82792a30ae47fd2a663f43/pkg/sql/pg_catalog.go#L2099-L2106
https://github.com/cockroachdb/cockroach/blob/a7e9c4a68b81436d1f9382518d4267f74cbdac94/pkg/sql/information_schema.go#L2295-L2296

Whereas with a constrained scan, all descriptors are searched:
https://github.com/cockroachdb/cockroach/blob/af6a72a622ae05f3733b5db637403b3eaa9455f1/pkg/sql/catalog/descs/descriptor.go#L168-L175

The correct result is from the constrained scan, because we currently
allow cross-database references. Once #55791 is fixed, both results
should match.

The fix alternatives are:
1. disallow disabling of the `GenerateConstrainedScans` rule for internal SQL
2. turn off generation of REGNAMESPACE expressions in randgen to avoid hitting this problem in tests.

The 2nd fix alternative is implemented to avoid losing any of the
rule-disabling test coverage provided by the
`testing_optimizer_disable_rule_probability` setting.

Fixes #98322

100652: cluster-ui: fix cached data invalidation on timescale change r=xinhaoz a=xinhaoz

In a prior change, we moved the invalidation of cached data depending on the timescale to the local storage saga for CC. This was so invaldiation would occur after updating the cache. The local storage saga created for the time scale action was not hooked up to fire after the action, thus the data would not have been invalidated. This commit properly subscribes the saga to the update time scale action in CC.

Epic: none

Release note: None

100760: build: make sure toolchain has correct arguments for linking shared lib r=rail a=rickystewart

`geos` has been failing to build without this change since #100313. This fixes it.

Epic: none
Release note: None

Co-authored-by: Mark Sirek <sirek@cockroachlabs.com>
Co-authored-by: Xin Hao Zhang <xzhang@cockroachlabs.com>
Co-authored-by: Ricky Stewart <rickybstewart@gmail.com>
@craig craig bot closed this as completed in 85b666e Apr 5, 2023
blathers-crl bot pushed a commit that referenced this issue Apr 5, 2023
Casting an OID to REGNAMESPACE runs this SQL using the internal executor:
```
SELECT pg_namespace.oid, nspname FROM pg_catalog.pg_namespace WHERE oid = $1
```
When the `GenerateConstrainedScans` rule is disabled, this returns no
rows and so doesn't apply the cast. The same SQL, not run via an
internal executor, but also with `GenerateConstrainedScans` disabled,
behaves correctly. Disabling `GenerateConstrainedScans` all the time
prevents the cluster from starting.

The reason a full scan of pg_namespace doesn't find the OID is because it
only checks for schemas in the current database:
https://github.com/cockroachdb/cockroach/blob/7b341ffa678e4a22416e2274351fd56f415f5421/pkg/sql/virtual_schema.go#L602
https://github.com/cockroachdb/cockroach/blob/d10c3dd42c3dc40cad82792a30ae47fd2a663f43/pkg/sql/pg_catalog.go#L2099-L2106
https://github.com/cockroachdb/cockroach/blob/a7e9c4a68b81436d1f9382518d4267f74cbdac94/pkg/sql/information_schema.go#L2295-L2296

Whereas with a constrained scan, all descriptors are searched:
https://github.com/cockroachdb/cockroach/blob/af6a72a622ae05f3733b5db637403b3eaa9455f1/pkg/sql/catalog/descs/descriptor.go#L168-L175

The correct result is from the constrained scan, because we currently
allow cross-database references. Once #55791 is fixed, both results
should match.

The fix alternatives are:
1. disallow disabling of the `GenerateConstrainedScans` rule for internal SQL
2. turn off generation of REGNAMESPACE expressions in randgen to avoid hitting this problem in tests.

The 2nd fix alternative is implemented to avoid losing any of the
rule-disabling test coverage provided by the
`testing_optimizer_disable_rule_probability` setting.

Fixes #98322

Release note: None
blathers-crl bot pushed a commit that referenced this issue Apr 5, 2023
Casting an OID to REGNAMESPACE runs this SQL using the internal executor:
```
SELECT pg_namespace.oid, nspname FROM pg_catalog.pg_namespace WHERE oid = $1
```
When the `GenerateConstrainedScans` rule is disabled, this returns no
rows and so doesn't apply the cast. The same SQL, not run via an
internal executor, but also with `GenerateConstrainedScans` disabled,
behaves correctly. Disabling `GenerateConstrainedScans` all the time
prevents the cluster from starting.

The reason a full scan of pg_namespace doesn't find the OID is because it
only checks for schemas in the current database:
https://github.com/cockroachdb/cockroach/blob/7b341ffa678e4a22416e2274351fd56f415f5421/pkg/sql/virtual_schema.go#L602
https://github.com/cockroachdb/cockroach/blob/d10c3dd42c3dc40cad82792a30ae47fd2a663f43/pkg/sql/pg_catalog.go#L2099-L2106
https://github.com/cockroachdb/cockroach/blob/a7e9c4a68b81436d1f9382518d4267f74cbdac94/pkg/sql/information_schema.go#L2295-L2296

Whereas with a constrained scan, all descriptors are searched:
https://github.com/cockroachdb/cockroach/blob/af6a72a622ae05f3733b5db637403b3eaa9455f1/pkg/sql/catalog/descs/descriptor.go#L168-L175

The correct result is from the constrained scan, because we currently
allow cross-database references. Once #55791 is fixed, both results
should match.

The fix alternatives are:
1. disallow disabling of the `GenerateConstrainedScans` rule for internal SQL
2. turn off generation of REGNAMESPACE expressions in randgen to avoid hitting this problem in tests.

The 2nd fix alternative is implemented to avoid losing any of the
rule-disabling test coverage provided by the
`testing_optimizer_disable_rule_probability` setting.

Fixes #98322

Release note: None
@mgartner mgartner moved this to Done in SQL Queries Jul 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot. T-sql-queries SQL Queries Team
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

2 participants