roachtest: schemachange/mixed-versions failed #70204

Closed
cockroach-teamcity opened this issue Sep 14, 2021 · 16 comments
Labels
C-test-failure Broken test (automatically or manually discovered). GA-blocker O-roachtest O-robot Originated from a bot. T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)

Comments

@cockroach-teamcity
Member

roachtest.schemachange/mixed-versions failed with artifacts on release-21.2 @ 14411b999aae710ca0f4a6376d58e302b197281b:

The test failed on branch=release-21.2, cloud=gce:
test timed out (see artifacts for details)
Reproduce

See: roachtest README

Same failure on other branches

/cc @cockroachdb/sql-schema

This test on roachdash | Improve this report!

@cockroach-teamcity cockroach-teamcity added branch-release-21.2 C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. labels Sep 14, 2021
@blathers-crl blathers-crl bot added the T-sql-schema-deprecated Use T-sql-foundations instead label Sep 14, 2021
@cockroach-teamcity
Member Author

roachtest.schemachange/mixed-versions failed with artifacts on release-21.2 @ d3fc366bcba04ab19f4cb59844212e908e2a9aa8:

The test failed on branch=release-21.2, cloud=gce:
test timed out (see artifacts for details)
Reproduce

See: roachtest README

Same failure on other branches

/cc @cockroachdb/sql-schema

This test on roachdash | Improve this report!

@cockroach-teamcity
Member Author

roachtest.schemachange/mixed-versions failed with artifacts on release-21.2 @ 0babf97f52ed9e44036851b2a9868e17eeee95ed:

The test failed on branch=release-21.2, cloud=gce:
test timed out (see artifacts for details)
Reproduce

See: roachtest README

Same failure on other branches

/cc @cockroachdb/sql-schema

This test on roachdash | Improve this report!

@irfansharif irfansharif changed the title roachtest: schemachange/mixed-versions failed roachtest: schemachange/mixed-versions failed [should stop after beta1] Sep 20, 2021
@irfansharif irfansharif changed the title roachtest: schemachange/mixed-versions failed [should stop after beta1] roachtest: schemachange/mixed-versions failed Sep 20, 2021
@irfansharif
Contributor

(This was on release-21.2, so not caused by the fallout from #69887. Should be investigated separately.)

@ajwerner
Contributor

Thanks and sorry for the noise. This seems totally legit.

I210914 06:28:57.025508 1 util/log/flags.go:180  [-] 1  stderr capture started
panic: unexpected Oid: 4096 [recovered]
	panic: unexpected Oid: 4096

goroutine 41668 [running]:
panic(0x47d67c0, 0xc0026fe0a0)
	/usr/local/go/src/runtime/panic.go:1064 +0x545 fp=0xc003c072b0 sp=0xc003c071e8 pc=0x48b725
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).Recover(0xc000d0c380, 0x5ac8a20, 0xc00237b1b8)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:231 +0x126 fp=0xc003c07310 sp=0xc003c072b0 pc=0x1427366
runtime.call32(0x0, 0x534f0a0, 0xc002b0ff68, 0x1800000018)
	/usr/local/go/src/runtime/asm_amd64.s:540 +0x3e fp=0xc003c07340 sp=0xc003c07310 pc=0x4c311e
runtime.reflectcallSave(0xc003c07480, 0x534f0a0, 0xc002b0ff68, 0xc000000018)
	/usr/local/go/src/runtime/panic.go:881 +0x58 fp=0xc003c07370 sp=0xc003c07340 pc=0x48b118
runtime.runOpenDeferFrame(0xc0028c0d80, 0xc002b0ff20, 0x0)
	/usr/local/go/src/runtime/panic.go:855 +0x2cd fp=0xc003c07400 sp=0xc003c07370 pc=0x48afcd
panic(0x47d67c0, 0xc0026fe0a0)
	/usr/local/go/src/runtime/panic.go:969 +0x1b9 fp=0xc003c074c8 sp=0xc003c07400 pc=0x48b399
github.com/cockroachdb/cockroach/pkg/sql/types.(*T).SQLStandardNameWithTypmod(0xc0012960c0, 0xc001296000, 0x0, 0x87ed580, 0x0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/types/types.go:1557 +0x1845 fp=0xc003c07640 sp=0xc003c074c8 pc=0x148ba65
github.com/cockroachdb/cockroach/pkg/sql/types.(*T).SQLStandardName(...)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/types/types.go:1440
github.com/cockroachdb/cockroach/pkg/sql/types.(*T).InformationSchemaName(...)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/types/types.go:1639
github.com/cockroachdb/cockroach/pkg/sql.glob..func82.1(0xc003b0ff00, 0xc002c7ffe0, 0xa, 0x5c1da40, 0xc002e81200, 0x602f1180, 0x699c24273f4c3147)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/information_schema.go:477 +0x15fd fp=0xc003c078d8 sp=0xc003c07640 pc=0x36c6c1d
github.com/cockroachdb/cockroach/pkg/sql.forEachTableDesc.func1(0xc003b0ff00, 0xc002c7ffe0, 0xa, 0x5c1da40, 0xc002e81200, 0xc001dde1e0, 0xc003b0ff00, 0x0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/information_schema.go:1948 +0x57 fp=0xc003c07920 sp=0xc003c078d8 pc=0x3692b97
github.com/cockroachdb/cockroach/pkg/sql.forEachTableDescWithTableLookupInternalFromDescriptors(0x5aab0a0, 0xc003b27dc0, 0xc00203fdf8, 0xc000b77800, 0x0, 0x0, 0xc000aca000, 0xd2, 0xd2, 0xc003c07bb0, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/information_schema.go:2191 +0x527 fp=0xc003c07af8 sp=0xc003c07920 pc=0x35787a7
github.com/cockroachdb/cockroach/pkg/sql.forEachTableDescWithTableLookupInternal(0x5aab0a0, 0xc003b27dc0, 0xc00203fdf8, 0xc000b77800, 0x0, 0xc000b3bb00, 0xc003c07bb0, 0x4668240, 0x4adcba0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/information_schema.go:2064 +0xf4 fp=0xc003c07b68 sp=0xc003c07af8 pc=0x3577db4
github.com/cockroachdb/cockroach/pkg/sql.forEachTableDescWithTableLookup(...)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/information_schema.go:2014
github.com/cockroachdb/cockroach/pkg/sql.forEachTableDesc(0x5aab0a0, 0xc003b27dc0, 0xc00203fdf8, 0xc000b77800, 0x0, 0xc000b3bc98, 0x0, 0x0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/information_schema.go:1942 +0x8d fp=0xc003c07bd0 sp=0xc003c07b68 pc=0x357784d
github.com/cockroachdb/cockroach/pkg/sql.glob..func82(0x5aab0a0, 0xc003b27dc0, 0xc00203fdf8, 0xc000b77800, 0xc003a80100, 0xc0027296e8, 0x4bcb80)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/information_schema.go:423 +0x32f fp=0xc003c07d68 sp=0xc003c07bd0 pc=0x366616f
github.com/cockroachdb/cockroach/pkg/sql.(*virtualDefEntry).getPlanInfo.func1.1(0x5a0f440, 0xc00365f780, 0x2, 0x1)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/virtual_schema.go:526 +0x115 fp=0xc003c07de0 sp=0xc003c07d68 pc=0x36b2eb5
github.com/cockroachdb/cockroach/pkg/sql.setupGenerator.func3(0x5ac8a20, 0xc00237b1b8)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/virtual_table.go:121 +0x152 fp=0xc003c07f30 sp=0xc003c07de0 pc=0x36b4e12
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx.func2(0xc000d0c380, 0x5ac8a20, 0xc00237b1b8, 0xc00237b040, 0x0, 0xc003b1be60)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:446 +0xf3 fp=0xc003c07fb0 sp=0xc003c07f30 pc=0x1428873
runtime.goexit()
	/usr/local/go/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc003c07fb8 sp=0xc003c07fb0 pc=0x4c4b41
created by github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx
	/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:438 +0x22b

@blathers-crl blathers-crl bot added the T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) label Sep 20, 2021
@ajwerner
Contributor

@nvanbenschoten, @rafiss, @otan this seems to be related to regrole and #68877. Is there a missing version gate somewhere? My guess is that the test created a table involving a regrole and then boom.
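
A minimal sketch of the suspected failure mode, using hypothetical stand-ins rather than the real `types` package: a binary compiled before a type existed has no case for that type's OID, so it panics when it reads a descriptor that contains it. The value 4096 is taken from the panic message above; `oid`, `oldBinaryTypeName` and `describeColumns` are names invented for illustration.

```go
package main

import "fmt"

// oid is a hypothetical stand-in for a type OID.
type oid uint32

const (
	oidInt8    oid = 20
	oidRegRole oid = 4096 // the OID from the panic message above
)

// oldBinaryTypeName mimics a name lookup compiled before the new type
// existed: it has no case for oidRegRole, so it falls through to the panic.
func oldBinaryTypeName(o oid) string {
	switch o {
	case oidInt8:
		return "bigint"
	default:
		panic(fmt.Sprintf("unexpected Oid: %d", o))
	}
}

// describeColumns recovers the panic so the sketch can report it as an
// error instead of crashing the process.
func describeColumns(cols []oid) (names []string, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("%v", r)
		}
	}()
	for _, c := range cols {
		names = append(names, oldBinaryTypeName(c))
	}
	return names, nil
}

func main() {
	// A table created on a newer node, then read on an older one.
	_, err := describeColumns([]oid{oidInt8, oidRegRole})
	fmt.Println(err)
}
```

In the real trace the panic is raised while populating an `information_schema` virtual table and nothing turns it into an error, which is why the node dies rather than returning a failed query.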

@nvanbenschoten
Member

This seems likely. There was no version gating added in #68877. How do we usually perform version gating on newly introduced SQL types?

@rafiss
Collaborator

rafiss commented Sep 20, 2021

Agh, yeah it must be version gating, sorry for missing it in the review.

I think it makes sense to add the version gating in CREATE and ALTER TABLE around these two areas:

switch toType.Oid() {

switch defType.Oid() {

@ajwerner
Contributor

Last time we did it, we did it here:

// minimumTypeUsageVersions defines the minimum version needed for a new
// data type.
var minimumTypeUsageVersions = map[types.Family]clusterversion.VersionKey{
	types.TimeTZFamily:    clusterversion.VersionTimeTZType,
	types.GeographyFamily: clusterversion.VersionGeospatialType,
	types.GeometryFamily:  clusterversion.VersionGeospatialType,
}

// isTypeSupportedInVersion returns whether a given type is supported in the given version.
func isTypeSupportedInVersion(v clusterversion.ClusterVersion, t *types.T) (bool, error) {
	switch t.Family() {
	case types.TimeFamily, types.TimestampFamily, types.TimestampTZFamily, types.TimeTZFamily:
		if t.Precision() != 6 && !v.IsActive(clusterversion.VersionTimePrecision) {
			return false, nil
		}
	case types.IntervalFamily:
		itm, err := t.IntervalTypeMetadata()
		if err != nil {
			return false, err
		}
		if (t.Precision() != 6 || itm.DurationField != types.IntervalDurationField{}) &&
			!v.IsActive(clusterversion.VersionTimePrecision) {
			return false, nil
		}
	}
	minVersion, ok := minimumTypeUsageVersions[t.Family()]
	if !ok {
		return true, nil
	}
	return v.IsActive(minVersion), nil
}
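
The same pattern can be keyed by OID instead of by family, which may fit the `switch toType.Oid()` suggestion above when a new type is only distinguishable by its OID. The following is a self-contained sketch under that assumption, not the eventual fix: `version`, `minimumOidUsageVersions` and `checkTypeUsable` are hypothetical names, and the 21.2 entry for OID 4096 is an assumed value.

```go
package main

import "fmt"

// version is a hypothetical, simplified cluster version (major, minor).
type version struct{ major, minor int }

// atLeast reports whether v is at or past other.
func (v version) atLeast(other version) bool {
	if v.major != other.major {
		return v.major > other.major
	}
	return v.minor >= other.minor
}

// minimumOidUsageVersions maps a type OID to the first cluster version in
// which every node is assumed to understand it. The entry is illustrative.
var minimumOidUsageVersions = map[uint32]version{
	4096: {21, 2}, // assumed: the type behind "unexpected Oid: 4096"
}

// checkTypeUsable returns an error if the type must not be written into a
// descriptor yet because the active cluster version is too old.
func checkTypeUsable(active version, typeOid uint32) error {
	minV, gated := minimumOidUsageVersions[typeOid]
	if !gated || active.atLeast(minV) {
		return nil
	}
	return fmt.Errorf("type with OID %d requires cluster version %d.%d",
		typeOid, minV.major, minV.minor)
}

func main() {
	fmt.Println(checkTypeUsable(version{21, 1}, 4096)) // mixed-version cluster: rejected
	fmt.Println(checkTypeUsable(version{21, 2}, 4096)) // upgrade finalized: allowed
}
```

A check like this would run at CREATE TABLE and ALTER TABLE time, so the new type never reaches a descriptor that an older node might read.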

@nvanbenschoten
Member

On a related note, how do we handle version gating of newly introduced builtin functions? Can't these be stored in table descriptors (e.g. in DEFAULT clauses) and then evaluated on nodes running the old binary version?

@ajwerner
Contributor

Can't these be stored in table descriptors (e.g. in DEFAULT clauses) and then evaluated on nodes running the old binary version?

That's a really good point. We need to build something there. We haven't tested enough in that direction. I'll file an issue.
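
One possible shape for such a gate, sketched with invented names: a registry that records the minimum cluster version for each newly introduced builtin, consulted before an expression is persisted in a descriptor. `builtinMinVersion`, `checkPersistedBuiltins` and the `new_builtin` function are all hypothetical, and versions are encoded as `major*100+minor` purely for brevity.

```go
package main

import "fmt"

// builtinMinVersion maps a hypothetical builtin name to the minimum cluster
// version (encoded as major*100+minor) at which it may be persisted.
var builtinMinVersion = map[string]int{
	"new_builtin": 2102, // made-up function assumed to be introduced in 21.2
}

// checkPersistedBuiltins rejects an expression destined for a descriptor
// (for example a DEFAULT clause) if it calls a builtin that older nodes in
// the cluster may not have.
func checkPersistedBuiltins(activeVersion int, calledFuncs []string) error {
	for _, name := range calledFuncs {
		if minV, ok := builtinMinVersion[name]; ok && activeVersion < minV {
			return fmt.Errorf("builtin %q cannot be stored until the cluster is upgraded", name)
		}
	}
	return nil
}

func main() {
	// Function names referenced by a hypothetical DEFAULT expression.
	calls := []string{"now", "new_builtin"}
	fmt.Println(checkPersistedBuiltins(2101, calls)) // mixed-version cluster
	fmt.Println(checkPersistedBuiltins(2102, calls)) // upgrade finalized
}
```

This only covers builtins that are new; it does not address the case where an existing builtin's implementation changes between versions.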

@celiala celiala added GA-blocker and removed release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. labels Sep 20, 2021
@nvanbenschoten
Member

Last time we did it, we did it here:

This was unfortunately all removed in f3d95c5. It seems like the kind of infrastructure we'll want to keep in place across versions, even if there are times when all types are temporarily supported.

@rafiss is this something someone on your team would be able to pick up? I can as well, but likely won't get to this until Wed or Thurs.

@RichardJCai
Contributor

Last time we did it, we did it here:

This was unfortunately all removed in f3d95c5. It seems like the kind of infrastructure we'll want to keep in place across versions, even if there are times when all types are temporarily supported.

@rafiss is this something someone on your team would be able to pick up? I can as well, but likely won't get to this until Wed or Thurs.

I can probably get to this tomorrow.

@rafiss
Collaborator

rafiss commented Sep 20, 2021

Thanks @RichardJCai !

@nvanbenschoten
Member

Thanks Richard!

@rafiss
Collaborator

rafiss commented Sep 21, 2021

On a related note, how do we handle version gating of newly introduced builtin functions?

Also, we don't do any version gating (that I know of) when we change or fix the implementation of existing builtins.

@cockroach-teamcity
Member Author

roachtest.schemachange/mixed-versions failed with artifacts on release-21.2 @ 24021ba163e4ac438b169d575cf1527a4aae394d:

		  | I210925 08:02:01.044030 1 workload/pgx_helpers.go:61  [-] 3604  pgx logger [error]: connect failed logParams=map[err:dial tcp 10.142.0.227:26257: connect: connection refused]
		  | W210925 08:02:03.105389 1 workload/cli/run.go:417  [-] 3605  retrying after error while creating load: failed to initialize the load generator: dial tcp 10.142.0.227:26257: connect: connection refused
		  | I210925 08:02:03.107316 1 workload/pgx_helpers.go:61  [-] 3606  pgx logger [error]: connect failed logParams=map[err:dial tcp 10.142.0.227:26257: connect: connection refused]
		  | W210925 08:02:05.028628 1 workload/cli/run.go:417  [-] 3607  retrying after error while creating load: failed to initialize the load generator: dial tcp 10.142.0.227:26257: connect: connection refused
		  | I210925 08:02:05.030451 1 workload/pgx_helpers.go:61  [-] 3608  pgx logger [error]: connect failed logParams=map[err:dial tcp 10.142.0.227:26257: connect: connection refused]
		  | W210925 08:02:07.153761 1 workload/cli/run.go:417  [-] 3609  retrying after error while creating load: failed to initialize the load generator: dial tcp 10.142.0.227:26257: connect: connection refused
		  | I210925 08:02:07.155723 1 workload/pgx_helpers.go:61  [-] 3610  pgx logger [error]: connect failed logParams=map[err:dial tcp 10.142.0.227:26257: connect: connection refused]
		  | W210925 08:02:09.450669 1 workload/cli/run.go:417  [-] 3611  retrying after error while creating load: failed to initialize the load generator: dial tcp 10.142.0.227:26257: connect: connection refused
		  | I210925 08:02:09.452499 1 workload/pgx_helpers.go:61  [-] 3612  pgx logger [error]: connect failed logParams=map[err:dial tcp 10.142.0.227:26257: connect: connection refused]
		  | E210925 08:02:09.883827 1 workload/cli/run.go:431  [-] 3613  Attempt to create load generator failed. It's been more than 1h0m0s since we started trying to create the load generator so we're giving up. Last failure: failed to initialize the load generator: dial tcp 10.142.0.227:26257: connect: connection refused
		  | Error: failed to initialize the load generator: failed to connect to ``host=10.142.0.227 user=root database=schemachange``: dial error (dial tcp 10.142.0.227:26257: connect: connection refused)
		  | Error: COMMAND_PROBLEM: exit status 1
		  | (1) COMMAND_PROBLEM
		  | Wraps: (2) Node 4. Command with error:
		  |   | ``````
		  |   | ./workload run schemachange --verbose=1 --tolerate-errors=true --max-ops 100 --concurrency 5 {pgurl:1-4}
		  |   | ``````
		  | Wraps: (3) exit status 1
		  | Error types: (1) errors.Cmd (2) *hintdetail.withDetail (3) *exec.ExitError
		  |
		  | stdout:
		Wraps: (4) exit status 20
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *cluster.WithCommandDetails (4) *exec.ExitError

	cluster.go:1249,context.go:89,cluster.go:1237,test_runner.go:866: dead node detection: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor teamcity-3496754-1632552652-01-n4cpu4 --oneshot --ignore-empty-nodes: exit status 1 4: 12444
		1: 13317
		3: 11917
		2: dead (exit status 134)
		Error: UNCLASSIFIED_PROBLEM: 2: dead (exit status 134)
		(1) UNCLASSIFIED_PROBLEM
		Wraps: (2) attached stack trace
		  -- stack trace:
		  | main.glob..func14
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1173
		  | main.wrap.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:281
		  | github.com/spf13/cobra.(*Command).execute
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:856
		  | github.com/spf13/cobra.(*Command).ExecuteC
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:960
		  | github.com/spf13/cobra.(*Command).Execute
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:897
		  | main.main
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:2107
		  | runtime.main
		  | 	/usr/local/go/src/runtime/proc.go:225
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1371
		Wraps: (3) 2: dead (exit status 134)
		Error types: (1) errors.Unclassified (2) *withstack.withStack (3) *errutil.leafError
Reproduce

See: roachtest README

Same failure on other branches

/cc @cockroachdb/sql-schema

This test on roachdash | Improve this report!

@rafiss rafiss closed this as completed Sep 27, 2021
@exalate-issue-sync exalate-issue-sync bot removed the T-sql-schema-deprecated Use T-sql-foundations instead label May 10, 2023