roachtest: schemachange workload times out due to corrupted table descriptor mutations blocking other schema changes #58344

Closed
cockroach-teamcity opened this issue Dec 29, 2020 · 3 comments
Labels
C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot. T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)
Milestone

Comments

@cockroach-teamcity

cockroach-teamcity commented Dec 29, 2020

(roachtest).schemachange/random-load failed on release-20.2@c6dadb73c8adfaf5e29baa0cac0d7c7d7bdeef6a:

		  | 35958.0s      831            0.0            0.2      0.0      0.0      0.0      0.0 opOk
		  | 35958.0s      831            0.0            0.0      0.0      0.0      0.0      0.0 txnCmtErr
		  | 35958.0s      831            0.0            0.0      0.0      0.0      0.0      0.0 txnOk
		  | 35959.0s      831            0.0            0.2      0.0      0.0      0.0      0.0 opOk
		  | 35959.0s      831            0.0            0.0      0.0      0.0      0.0      0.0 txnCmtErr
		  | 35959.0s      831            0.0            0.0      0.0      0.0      0.0      0.0 txnOk
		  | 35960.0s      831            0.0            0.2      0.0      0.0      0.0      0.0 opOk
		  | 35960.0s      831            0.0            0.0      0.0      0.0      0.0      0.0 txnCmtErr
		  | 35960.0s      831            0.0            0.0      0.0      0.0      0.0      0.0 txnOk
		  | 35961.0s      831            0.0            0.2      0.0      0.0      0.0      0.0 opOk
		  | _elapsed___errors__ops/sec(inst)___ops/sec(cum)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)
		  | 35961.0s      831            0.0            0.0      0.0      0.0      0.0      0.0 txnCmtErr
		  | 35961.0s      831            0.0            0.0      0.0      0.0      0.0      0.0 txnOk
		  | 35962.0s      831            0.0            0.2      0.0      0.0      0.0      0.0 opOk
		  | 35962.0s      831            0.0            0.0      0.0      0.0      0.0      0.0 txnCmtErr
		  | 35962.0s      831            0.0            0.0      0.0      0.0      0.0      0.0 txnOk
		  | 35963.0s      831            0.0            0.2      0.0      0.0      0.0      0.0 opOk
		  | 35963.0s      831            0.0            0.0      0.0      0.0      0.0      0.0 txnCmtErr
		  | 35963.0s      831            0.0            0.0      0.0      0.0      0.0      0.0 txnOk
		  | 35964.0s      831            0.0            0.2      0.0      0.0      0.0      0.0 opOk
		  | 35964.0s      831            0.0            0.0      0.0      0.0      0.0      0.0 txnCmtErr
		  | 35964.0s      831            0.0            0.0      0.0      0.0      0.0      0.0 txnOk
		  | 35965.0s      831            0.0            0.2      0.0      0.0      0.0      0.0 opOk
		  | 35965.0s      831            0.0            0.0      0.0      0.0      0.0      0.0 txnCmtErr
		  | 35965.0s      831            0.0            0.0      0.0      0.0      0.0      0.0 txnOk
		  | 35966.0s      831            0.0            0.2      0.0      0.0      0.0      0.0 opOk
		  | 35966.0s      831            0.0            0.0      0.0      0.0      0.0      0.0 txnCmtErr
		  | 35966.0s      831            0.0            0.0      0.0      0.0      0.0      0.0 txnOk
		  | 35967.0s      831            0.0            0.2      0.0      0.0      0.0      0.0 opOk
		  | 35967.0s      831            0.0            0.0      0.0      0.0      0.0      0.0 txnCmtErr
		  | 35967.0s      831            0.0            0.0      0.0      0.0      0.0      0.0 txnOk
		  | _elapsed___errors__ops/sec(inst)___ops/sec(cum)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)
		  | 35968.0s      831            0.0            0.2      0.0      0.0      0.0      0.0 opOk
		  | 35968.0s      831            0.0            0.0      0.0      0.0      0.0      0.0 txnCmtErr
		  | 35968.0s      831            0.0            0.0      0.0      0.0      0.0      0.0 txnOk
		  | 35969.0s      831            0.0            0.2      0.0      0.0      0.0      0.0 opOk
		  | 35969.0s      831            0.0            0.0      0.0      0.0      0.0      0.0 txnCmtErr
		  | 35969.0s      831            0.0            0.0      0.0      0.0      0.0      0.0 txnOk
		  | 35970.0s      831            0.0            0.2      0.0      0.0      0.0      0.0 opOk
		  | 35970.0s      831            0.0            0.0      0.0      0.0      0.0      0.0 txnCmtErr
		  | 35970.0s      831            0.0            0.0      0.0      0.0      0.0      0.0 txnOk
		  | 35971.0s      831            0.0            0.2      0.0      0.0      0.0      0.0 opOk
		  | 35971.0s      831            0.0            0.0      0.0      0.0      0.0      0.0 txnCmtErr
		  | 35971.0s      831            0.0            0.0      0.0      0.0      0.0      0.0 txnOk
		Wraps: (4) secondary error attachment
		  | signal: killed
		  | (1) signal: killed
		  | Error types: (1) *exec.ExitError
		Wraps: (5) context canceled
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *main.withCommandDetails (4) *secondary.withSecondaryError (5) *errors.errorString

Artifacts: /schemachange/random-load
Related:

See this test on roachdash
powered by pkg/cmd/internal/issues

Jira issue: CRDB-3408

@cockroach-teamcity cockroach-teamcity added branch-release-20.2 C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. labels Dec 29, 2020
@cockroach-teamcity cockroach-teamcity added this to the 20.2 milestone Dec 29, 2020
@ajwerner ajwerner removed branch-release-20.2 release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. labels Jan 5, 2021
@ajwerner

ajwerner commented Jan 5, 2021

This one ends up being super interesting. In a single transaction you have a schema change which drops a column (cascade) and adds a NOT NULL constraint on a different column. The NOT NULL constraint addition fails, but only after the indexes have reached DELETE_ONLY. Then, when reverting, we try to add back some of these indexes (which, of course, are unique). However, we have already lost the data for the column we were dropping, because we started by running the column backfill for it. Thus, when we re-backfill the unique indexes, we find that the unique constraint is violated, and the table descriptor is left in an invalid state. Very fun.

This is, in effect, a duplicate of #46541/#47712.
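
For readers who want to follow the sequence step by step, here is a toy model of the failure in Python. It is only a sketch and not CockroachDB code: the table, the column names `a`, `b`, `c`, the unique index on `(a, b)`, and the assumption that the re-added column is filled with a single default value of `0` are all hypothetical and chosen purely for illustration.

```python
from collections import Counter


def find_unique_violations(rows, index_columns):
    """Return the index keys that occur more than once in rows."""
    keys = Counter(tuple(row[col] for col in index_columns) for row in rows)
    return sorted(key for key, count in keys.items() if count > 1)


def simulate_failed_rollback():
    """Toy walk-through of the drop-column / failed-constraint / revert sequence."""
    # Hypothetical table with a unique index on (a, b). The index is valid
    # only because column b distinguishes the two rows.
    rows = [
        {"a": 1, "b": 10, "c": None},
        {"a": 1, "b": 20, "c": 5},
    ]
    unique_index = ("a", "b")
    assert find_unique_violations(rows, unique_index) == []

    # Step 1: DROP COLUMN b CASCADE. The column backfill discards b's data.
    for row in rows:
        del row["b"]

    # Step 2: adding NOT NULL on c fails because one row has c = NULL.
    not_null_holds = all(row["c"] is not None for row in rows)
    if not_null_holds:
        return []

    # Step 3: revert. Column b comes back, but its old values are gone.
    # Assumption for this sketch: every row gets the same default value.
    for row in rows:
        row["b"] = 0

    # Step 4: re-backfilling the unique index now finds duplicate keys.
    return find_unique_violations(rows, unique_index)


if __name__ == "__main__":
    print(simulate_failed_rollback())
```

In this model the revert cannot succeed, since the rebuilt index contains the key `(1, 0)` twice. That mirrors the stuck mutation described above, which then blocks later schema changes on the same table.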

@thoszhang thoszhang changed the title roachtest: schemachange/random-load failed roachtest: schemachange workload times out due to corrupted table descriptor mutations blocking other schema changes Jan 12, 2021
@thoszhang

We were thinking of just turning off DROP COLUMN in the workload for the time being.

@ajwerner

Another, fancier option is to disallow DROP COLUMN in the same transaction as other hazardous operations on the same table. It'd be a bit more work.
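
A rough sketch of what such a check could look like, written in Python for brevity. Everything here is hypothetical: the `HAZARDOUS_OPS` set, the operation names, and the `(table, operation)` representation are assumptions made for illustration, not the workload's or the database's actual data structures.

```python
# Assumed set of operations that can fail late and force a revert.
HAZARDOUS_OPS = {"SET NOT NULL", "ADD CONSTRAINT", "CREATE UNIQUE INDEX"}


def conflicting_tables(ops):
    """Return tables that mix DROP COLUMN with a hazardous operation.

    ops is a list of (table_name, operation_kind) pairs describing the
    schema changes queued in a single transaction.
    """
    dropped = set()
    hazardous = set()
    for table, kind in ops:
        if kind == "DROP COLUMN":
            dropped.add(table)
        elif kind in HAZARDOUS_OPS:
            hazardous.add(table)
    # Only tables that appear in both sets are a problem.
    return sorted(dropped & hazardous)


if __name__ == "__main__":
    txn = [("t", "DROP COLUMN"), ("t", "SET NOT NULL"), ("u", "ADD CONSTRAINT")]
    print(conflicting_tables(txn))
```

A transaction would be rejected (or, in the workload, simply not generated) whenever the returned list is non-empty. Operations on different tables stay allowed, which is what makes this option less blunt than turning off DROP COLUMN entirely.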

@jlinder jlinder added the T-sql-schema-deprecated Use T-sql-foundations instead label Jun 16, 2021
@rafiss rafiss closed this as completed May 1, 2023
@exalate-issue-sync exalate-issue-sync bot added T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) and removed T-sql-schema-deprecated Use T-sql-foundations instead labels May 10, 2023