roachtest: sqlsmith/setup=empty/setting=no-ddl failed #92505

Closed
cockroach-teamcity opened this issue Nov 26, 2022 · 2 comments
Labels
branch-release-22.1 Used to mark GA and release blockers, technical advisories, and bugs for 22.1 C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot. T-sql-queries SQL Queries Team

cockroach-teamcity commented Nov 26, 2022

roachtest.sqlsmith/setup=empty/setting=no-ddl failed with artifacts on release-22.1 @ 6b3f19ae0f6f47bb606d47a07fd5e0f8a80e4fe8:

The test failed on branch=release-22.1, cloud=gce:
test artifacts and logs in: /artifacts/sqlsmith/setup=empty/setting=no-ddl/run_1
	test_runner.go:1014,test_runner.go:913: test timed out (20m0s)
Help

See: roachtest README

See: How To Investigate (internal)

/cc @cockroachdb/sql-queries

This test on roachdash | Improve this report!

Jira issue: CRDB-21818

cockroach-teamcity added the branch-release-22.1, C-test-failure, O-roachtest, and O-robot labels on Nov 26, 2022
cockroach-teamcity added this to the 22.1 milestone on Nov 26, 2022
blathers-crl bot added the T-sql-queries label on Nov 26, 2022
DrewKimball (Collaborator) commented

There are no obvious culprits among the queries that ran, nor in memory or CPU usage. It seems likely that there were network latency or connectivity issues, possibly related to #91771. I'm seeing lines like

I221126 06:39:13.799780 27042 1@github.com/cockroachdb/circuitbreaker/external/com_github_cockroachdb_circuitbreaker/circuitbreaker.go:322 ⋮ [n1] 1153  circuitbreaker: ‹rpc [::]:26257 [n4]› tripped: failed to connect to n4 at ‹10.142.0.42:26257›: ‹initial connection heartbeat failed›: ‹operation "rpc heartbeat" timed out after 6s (given timeout 6s)›: ‹rpc error: code = DeadlineExceeded desc = context deadline exceeded›

and node 4 didn't even make it into the logs.
The test teardown log looks like this:

teardown: 06:43:53 test_runner.go:949: [w15] dumped stacks to __stacks
teardown: 06:43:53 cluster.go:636: test status: stopping nodes :1-4
teamcity-7697786-1669443350-16-n4cpu4: stopping
3: exit status 255: 
teardown: 06:43:58 test_runner.go:959: [w15] asked CRDB nodes to dump stacks; check their main (DEV) logs: cluster.StopE: one or more parallel execution failure
teardown: 06:44:01 cluster.go:1346: running (fast) consistency checks on node 1
teardown: 06:44:01 test_runner.go:1108: [w15] collecting cluster logs
teardown: 06:44:06 test_runner.go:1112: failed to fetch disk uage summary: output in run_064401.164717476_n1-4_du: du -c /mnt/data1 --exclude lost+found >> logs/diskusage.txt returned: SSH_PROBLEM: exit status 255
teardown: 06:44:06 cluster.go:1085: fetching logs
teardown: 06:44:06 cluster.go:636: test status: fetching logs
teardown: 06:44:06 cluster.go:636: test status: getting logs
teardown: 06:44:06 cluster_synced.go:1755: teamcity-7697786-1669443350-16-n4cpu4: getting (scp) logs /artifacts/sqlsmith/setup=empty/setting=no-ddl/run_1/logs/unredacted
teardown: 06:46:17 cluster.go:636: test status: 
teardown: 06:46:17 cluster.go:1096: failed to fetch logs: cluster.Get: get logs failed
teardown: 06:46:17 test_runner.go:1115: failed to download logs: operation "fetch logs" timed out after 2m10.849s (given timeout 2m0s): cluster.FetchLogs: cluster.Get: get logs failed
teardown: 06:46:17 cluster.go:1374: fetching dmesg
teardown: 06:46:17 cluster.go:636: test status: fetching dmesg
teardown: 06:46:22 cluster.go:1398: running dmesg failed on node 4: SSH_PROBLEM: exit status 255
teardown: 06:46:22 cluster.go:636: test status: getting dmesg.txt
teardown: 06:46:22 cluster_synced.go:1755: teamcity-7697786-1669443350-16-n4cpu4: getting (scp) dmesg.txt /artifacts/sqlsmith/setup=empty/setting=no-ddl/run_1/dmesg.txt
teardown: 06:46:22 cluster.go:636: test status: 
teardown: 06:46:22 test_runner.go:1118: failed to fetch dmesg: SSH_PROBLEM: exit status 255
teardown: 06:46:22 cluster.go:1425: fetching journalctl
teardown: 06:46:22 cluster.go:636: test status: fetching journalctl
teardown: 06:46:27 cluster.go:1449: running journalctl failed on node 4: SSH_PROBLEM: exit status 255
teardown: 06:46:27 cluster.go:636: test status: getting journalctl.txt
teardown: 06:46:27 cluster_synced.go:1755: teamcity-7697786-1669443350-16-n4cpu4: getting (scp) journalctl.txt /artifacts/sqlsmith/setup=empty/setting=no-ddl/run_1/journalctl.txt
teardown: 06:46:28 cluster.go:636: test status: 
teardown: 06:46:28 test_runner.go:1121: failed to fetch journalctl: SSH_PROBLEM: exit status 255
teardown: 06:46:28 cluster.go:1480: skipped fetching cores
teardown: 06:46:30 cluster.go:636: test status: getting tsdump.gob
teardown: 06:46:30 cluster_synced.go:1755: teamcity-7697786-1669443350-16-n4cpu4: getting (scp) tsdump.gob /artifacts/sqlsmith/setup=empty/setting=no-ddl/run_1/tsdump.gob
teardown: 06:46:31 cluster.go:636: test status: 
teardown: 06:46:31 cluster.go:1226: fetching debug zip
teardown: 06:46:31 cluster.go:636: test status: fetching debug zip
teardown: 06:46:36 cluster.go:1247: ./cockroach debug zip failed: output in run_064631.545077107_n1-4_cockroach_debug_zip: ./cockroach debug zip --exclude-files='*.log,*.txt,*.pprof' --url {pgurl:1} debug.zip returned: one or more parallel execution failure
teardown: 06:46:41 cluster.go:1247: ./cockroach debug zip failed: output in run_064636.579311674_n1-4_cockroach_debug_zip: ./cockroach debug zip --exclude-files='*.log,*.txt,*.pprof' --url {pgurl:2} debug.zip returned: one or more parallel execution failure
teardown: 06:46:46 cluster.go:1247: ./cockroach debug zip failed: output in run_064641.602661504_n1-4_cockroach_debug_zip: ./cockroach debug zip --exclude-files='*.log,*.txt,*.pprof' --url {pgurl:3} debug.zip returned: one or more parallel execution failure
teardown: 06:46:51 cluster.go:1247: ./cockroach debug zip failed: output in run_064646.622640425_n1-4_cockroach_debug_zip: ./cockroach debug zip --exclude-files='*.log,*.txt,*.pprof' --url {pgurl:4} debug.zip returned: one or more parallel execution failure
teardown: 06:46:51 test_runner.go:1133: failed to collect zip: output in run_064646.622640425_n1-4_cockroach_debug_zip: ./cockroach debug zip --exclude-files='*.log,*.txt,*.pprof' --url {pgurl:4} debug.zip returned: one or more parallel execution failure
teardown: 06:46:51 cluster.go:636: test status: stopping nodes :1-4
teamcity-7697786-1669443350-16-n4cpu4: stopping and waiting
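
For triaging a teardown log like the one above, a small script can pull out the failing steps and count which nodes they name. This is only a minimal sketch: the function name `summarize_teardown` and both regular expressions are hypothetical, written against the line shapes visible in this log rather than any documented roachtest format, and the sample lines are copied from the log above.

```python
import re
from collections import Counter

# Assumed line shape, based only on the log above:
#   teardown: 06:46:22 cluster.go:1398: <message>
TEARDOWN_RE = re.compile(r"^teardown: (\d{2}:\d{2}:\d{2}) (\S+\.go):(\d+): (.*)$")
# Some failure messages name the node, e.g. "failed on node 4".
NODE_RE = re.compile(r"failed on node (\d+)")


def summarize_teardown(lines):
    """Return (failures, nodes): failing teardown steps and a per-node count."""
    failures = []
    nodes = Counter()
    for line in lines:
        m = TEARDOWN_RE.match(line.strip())
        if not m:
            continue  # not a timestamped teardown line
        ts, src, lineno, msg = m.groups()
        # Heuristic: keep lines that mention a failure or an SSH problem.
        if "fail" not in msg and "SSH_PROBLEM" not in msg:
            continue
        failures.append((ts, f"{src}:{lineno}", msg))
        node = NODE_RE.search(msg)
        if node:
            nodes[int(node.group(1))] += 1
    return failures, nodes


sample = [
    "teardown: 06:46:17 cluster.go:1374: fetching dmesg",
    "teardown: 06:46:22 cluster.go:1398: running dmesg failed on node 4: SSH_PROBLEM: exit status 255",
    "teardown: 06:46:27 cluster.go:1449: running journalctl failed on node 4: SSH_PROBLEM: exit status 255",
    "teardown: 06:46:28 cluster.go:1480: skipped fetching cores",
]

failures, nodes = summarize_teardown(sample)
for ts, where, msg in failures:
    print(ts, where, msg)
print("nodes named in failures:", dict(nodes))
```

The substring match on "fail" is deliberately loose, so it also flags summary lines such as "failed to fetch dmesg"; the per-node count only reflects messages that name a node explicitly.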

DrewKimball (Collaborator) commented

I'll close this as a dup of #92557 for now, since it seems likely they have a common cause that will be fixed soon.
