
pkg/ccl/spanconfigccl/spanconfigsqlwatcherccl/spanconfigsqlwatcherccl_test: TestSQLWatcherOnEventError failed #76765

Closed
cockroach-teamcity opened this issue Feb 18, 2022 · 23 comments · Fixed by #99745
Labels: branch-master (Failures and bugs on the master branch) · C-test-failure (Broken test, automatically or manually discovered) · O-robot (Originated from a bot) · sync-me-8

@cockroach-teamcity commented Feb 18, 2022

pkg/ccl/spanconfigccl/spanconfigsqlwatcherccl/spanconfigsqlwatcherccl_test.TestSQLWatcherOnEventError failed with artifacts on master @ 1113ce6c6b53147821fd808a0ca215ef81509ee4:

=== RUN   TestSQLWatcherOnEventError
    sqlwatcher_test.go:437: handler should never run
--- FAIL: TestSQLWatcherOnEventError (0.96s)
Parameters: TAGS=bazel,gss,deadlock

Jira issue: CRDB-13266

cockroach-teamcity added the branch-master, C-test-failure, and O-robot labels Feb 18, 2022
@cockroach-teamcity

pkg/ccl/spanconfigccl/spanconfigsqlwatcherccl/spanconfigsqlwatcherccl_test.TestSQLWatcherOnEventError failed with artifacts on master @ 231fd420c61e39b6ad08c982c496d82cf1910bc5:

=== RUN   TestSQLWatcherOnEventError
    sqlwatcher_test.go:437: handler should never run
--- FAIL: TestSQLWatcherOnEventError (0.98s)
Parameters: TAGS=bazel,gss,deadlock

@cockroach-teamcity

pkg/ccl/spanconfigccl/spanconfigsqlwatcherccl/spanconfigsqlwatcherccl_test.TestSQLWatcherOnEventError failed with artifacts on master @ 6537583259df42cb98f6301e969089e5064df614:

=== RUN   TestSQLWatcherOnEventError
    sqlwatcher_test.go:454: handler should never run
--- FAIL: TestSQLWatcherOnEventError (0.80s)
Parameters: TAGS=bazel,gss,deadlock

@cockroach-teamcity

pkg/ccl/spanconfigccl/spanconfigsqlwatcherccl/spanconfigsqlwatcherccl_test.TestSQLWatcherOnEventError failed with artifacts on master @ a2e1910f51593bd2ef72e1d7c615e08f95791186:

=== RUN   TestSQLWatcherOnEventError
    sqlwatcher_test.go:454: handler should never run
--- FAIL: TestSQLWatcherOnEventError (0.94s)
Parameters: TAGS=bazel,gss,deadlock

@cockroach-teamcity

pkg/ccl/spanconfigccl/spanconfigsqlwatcherccl/spanconfigsqlwatcherccl_test.TestSQLWatcherOnEventError failed with artifacts on master @ bc1ee7c7c276984fce8ff5ba4fcfcdff335dde50:

=== RUN   TestSQLWatcherOnEventError
    sqlwatcher_test.go:454: handler should never run
--- FAIL: TestSQLWatcherOnEventError (1.09s)
Parameters: TAGS=bazel,gss,deadlock

@cockroach-teamcity

pkg/ccl/spanconfigccl/spanconfigsqlwatcherccl/spanconfigsqlwatcherccl_test.TestSQLWatcherOnEventError failed with artifacts on master @ bc5f5b7159ed2b7dbffa94bb2a590fdddb642b9b:

=== RUN   TestSQLWatcherOnEventError
    sqlwatcher_test.go:454: handler should never run
--- FAIL: TestSQLWatcherOnEventError (0.85s)
Parameters: TAGS=bazel,gss,deadlock

@cockroach-teamcity

pkg/ccl/spanconfigccl/spanconfigsqlwatcherccl/spanconfigsqlwatcherccl_test.TestSQLWatcherOnEventError failed with artifacts on master @ a35b848edb3f5562199b0f56bc27a6c576073d5d:

=== RUN   TestSQLWatcherOnEventError
    sqlwatcher_test.go:454: handler should never run
--- FAIL: TestSQLWatcherOnEventError (0.86s)
Parameters: TAGS=bazel,gss,deadlock

@cockroach-teamcity

pkg/ccl/spanconfigccl/spanconfigsqlwatcherccl/spanconfigsqlwatcherccl_test.TestSQLWatcherOnEventError failed with artifacts on master @ ce66acdbba801de88f1dd645eaedeb3834f23dbd:

=== RUN   TestSQLWatcherOnEventError
    sqlwatcher_test.go:454: handler should never run
--- FAIL: TestSQLWatcherOnEventError (0.96s)
Parameters: TAGS=bazel,gss,deadlock

@cockroach-teamcity

pkg/ccl/spanconfigccl/spanconfigsqlwatcherccl/spanconfigsqlwatcherccl_test.TestSQLWatcherOnEventError failed with artifacts on master @ 92947c29c55ff909f50a5e625811d34a1bbe71f7:

=== RUN   TestSQLWatcherOnEventError
    sqlwatcher_test.go:454: handler should never run
--- FAIL: TestSQLWatcherOnEventError (0.84s)
Parameters: TAGS=bazel,gss,deadlock

@cockroach-teamcity

pkg/ccl/spanconfigccl/spanconfigsqlwatcherccl/spanconfigsqlwatcherccl_test.TestSQLWatcherOnEventError failed with artifacts on master @ 5c400e31f33b97a7b3232bb54abdd55645f0e978:

=== RUN   TestSQLWatcherOnEventError
    sqlwatcher_test.go:454: handler should never run
--- FAIL: TestSQLWatcherOnEventError (1.03s)
Parameters: TAGS=bazel,gss,deadlock

@cockroach-teamcity

pkg/ccl/spanconfigccl/spanconfigsqlwatcherccl/spanconfigsqlwatcherccl_test.TestSQLWatcherOnEventError failed with artifacts on master @ 1cea73c8a18623949b81705eb5f75179e6cd8d86:

=== RUN   TestSQLWatcherOnEventError
    sqlwatcher_test.go:454: handler should never run
--- FAIL: TestSQLWatcherOnEventError (0.94s)
Parameters: TAGS=bazel,gss,deadlock

Same failure on other branches

@cockroach-teamcity

pkg/ccl/spanconfigccl/spanconfigsqlwatcherccl/spanconfigsqlwatcherccl_test.TestSQLWatcherOnEventError failed with artifacts on master @ 5a54758ce89e866b4fe28c0df74bd610973c6918:

=== RUN   TestSQLWatcherOnEventError
    sqlwatcher_test.go:454: handler should never run
--- FAIL: TestSQLWatcherOnEventError (0.95s)
Parameters: TAGS=bazel,gss,deadlock

Same failure on other branches

@cockroach-teamcity

pkg/ccl/spanconfigccl/spanconfigsqlwatcherccl/spanconfigsqlwatcherccl_test.TestSQLWatcherOnEventError failed with artifacts on master @ 3706c07f7332098d3eb0fcb6ae3b682a385b1306:

=== RUN   TestSQLWatcherOnEventError
    sqlwatcher_test.go:454: handler should never run
--- FAIL: TestSQLWatcherOnEventError (1.12s)

Parameters: TAGS=bazel,gss,deadlock

Same failure on other branches

@cockroach-teamcity

pkg/ccl/spanconfigccl/spanconfigsqlwatcherccl/spanconfigsqlwatcherccl_test.TestSQLWatcherOnEventError failed with artifacts on master @ 45ee8026c1997df738ee184fe6a606e927c0a8c5:

=== RUN   TestSQLWatcherOnEventError
    sqlwatcher_test.go:454: handler should never run
--- FAIL: TestSQLWatcherOnEventError (0.99s)

Parameters: TAGS=bazel,gss,deadlock

Same failure on other branches

@cockroach-teamcity

pkg/ccl/spanconfigccl/spanconfigsqlwatcherccl/spanconfigsqlwatcherccl_test.TestSQLWatcherOnEventError failed with artifacts on master @ d82ac30cc7b3cb5566e347ae1fecd2025c5e0623:

=== RUN   TestSQLWatcherOnEventError
    sqlwatcher_test.go:454: handler should never run
--- FAIL: TestSQLWatcherOnEventError (1.06s)

Parameters: TAGS=bazel,gss,deadlock

Same failure on other branches

@cockroach-teamcity

pkg/ccl/spanconfigccl/spanconfigsqlwatcherccl/spanconfigsqlwatcherccl_test.TestSQLWatcherOnEventError failed with artifacts on master @ 8d34ef1ea15850ee1c70470610b6652df4c317de:

=== RUN   TestSQLWatcherOnEventError
    sqlwatcher_test.go:460: handler should never run
--- FAIL: TestSQLWatcherOnEventError (1.19s)

Parameters: TAGS=bazel,gss,deadlock

Same failure on other branches

@cockroach-teamcity

pkg/ccl/spanconfigccl/spanconfigsqlwatcherccl/spanconfigsqlwatcherccl_test.TestSQLWatcherOnEventError failed with artifacts on master @ 8d34ef1ea15850ee1c70470610b6652df4c317de:

=== RUN   TestSQLWatcherOnEventError
    sqlwatcher_test.go:460: handler should never run
--- FAIL: TestSQLWatcherOnEventError (1.12s)

Parameters: TAGS=bazel,gss,deadlock

Same failure on other branches

@cockroach-teamcity

pkg/ccl/spanconfigccl/spanconfigsqlwatcherccl/spanconfigsqlwatcherccl_test.TestSQLWatcherOnEventError failed with artifacts on master @ 377b113bd7e7b679c55a0f93e0d8ab5dbb6f2e97:

=== RUN   TestSQLWatcherOnEventError
    sqlwatcher_test.go:460: handler should never run
--- FAIL: TestSQLWatcherOnEventError (1.07s)

Parameters: TAGS=bazel,gss,deadlock

Same failure on other branches

@irfansharif commented Feb 18, 2023

Looks like we're tripping the one assertion the test was written for; it also flakes on 22.1 (#82031) and 22.2 (#97319).
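
To make the failure concrete, the tripped assertion has roughly the following shape (a paraphrased sketch, not the test's exact code; the variable names and handler signature are approximations): a testing knob forces the watcher's internal event buffering to fail with `boom`, so the handler passed to the watcher should never be invoked, and any invocation fails the test.

```go
// Paraphrased sketch (names approximate): the OnEvent interceptor
// installed via testing knobs returns errors.New("boom"), so the
// watcher is expected to wind down without ever calling the handler.
// A spurious no-op checkpoint still invokes it, producing the
// "handler should never run" failure above.
err := sqlWatcher.WatchForSQLUpdates(ctx, startTS,
	func(ctx context.Context, updates []spanconfig.Update, checkpointTS hlc.Timestamp) error {
		t.Fatal("handler should never run")
		return nil
	},
)
require.True(t, testutils.IsError(err, "boom"))
```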

@cockroach-teamcity

pkg/ccl/spanconfigccl/spanconfigsqlwatcherccl/spanconfigsqlwatcherccl_test.TestSQLWatcherOnEventError failed with artifacts on master @ e9c96e7179e19aae2f8d386f67eb950db8c3354b:

W230220 07:28:32.799391 1307734 sql/sqlliveness/slinstance/slinstance.go:312  [T1,n1] 797  exiting heartbeat loop
I230220 07:28:32.797312 1315118 sql/stats/automatic_stats.go:507  [T1,n1] 794  quiescing auto stats refresher
I230220 07:28:32.799058 1298508 server/start_listen.go:103  [T1,n1] 798  server shutting down: instructing cmux to stop accepting
W230220 07:28:32.804331 1300705 kv/kvserver/replica_proposal.go:505  [T1,n1,s1,r34/1:/Table/3{2-3},raft] 799  lease <empty> expired before being followed by lease repl=(n1,s1):1 seq=1 start=0,0 epo=1 pro=1676878112.792404825,0; foreground traffic may have been impacted
W230220 07:28:32.805555 1307734 sql/sqlliveness/slinstance/slinstance.go:297  [T1,n1] 800  exiting heartbeat loop
E230220 07:28:32.806190 1307734 server/server_sql.go:474  [T1,n1] 801  failed to run update of instance with new session ID: node unavailable; try another peer
E230220 07:28:32.806571 1315294 jobs/registry.go:994  [T1,n1] 802  error expiring job sessions: expire-sessions: failed to read query result: context canceled
E230220 07:28:32.807569 1315294 jobs/registry.go:1002  [T1,n1] 803  failed to serve pause and cancel requests: txn exec: context canceled
E230220 07:28:32.807776 1315336 jobs/registry.go:1062  [T1,n1] 804  error processing claimed jobs: could not query for claimed jobs: select-running/get-claimed-jobs: failed to read query result: context canceled
W230220 07:28:32.807820 1315294 jobs/registry.go:1078  [T1,n1] 805  canceling all adopted jobs due to stopper quiescing
E230220 07:28:32.808298 1542216 jobs/registry.go:950  [T1,n1] 806  error getting live session: node unavailable; try another peer
E230220 07:28:32.810376 1574295 kv/kvclient/kvcoord/txn_interceptor_committer.go:489  [T1] 807  making txn commit explicit failed for "sql txn" meta={id=bd21503c key=/Table/3/1/104/2/1 pri=0.00547436 epo=0 ts=1676878112.695089141,0 min=1676878112.695089141,0 seq=3} lock=true stat=STAGING rts=1676878112.695089141,0 wto=false gul=1676878113.195089141,0 ifw=2: result is ambiguous: server shutdown
W230220 07:28:32.810583 1574342 1@sql/event_log.go:653  [T1,n1,client=127.0.0.1:47642,user=root] 808  unable to save 1 entries to system.eventlog: log-event: failed to read query result: context canceled
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809  failed node liveness heartbeat: txn exec: context canceled
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809 +(1) attached stack trace
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809 +  -- stack trace:
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809 +  | github.com/cockroachdb/cockroach/pkg/kv.(*Txn).exec
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809 +  | 	github.com/cockroachdb/cockroach/pkg/kv/txn.go:926
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809 +  | github.com/cockroachdb/cockroach/pkg/kv.runTxn
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809 +  | 	github.com/cockroachdb/cockroach/pkg/kv/db.go:959
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809 +  | github.com/cockroachdb/cockroach/pkg/kv.(*DB).TxnWithAdmissionControl
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809 +  | 	github.com/cockroachdb/cockroach/pkg/kv/db.go:922
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809 +  | github.com/cockroachdb/cockroach/pkg/kv.(*DB).Txn
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809 +  | 	github.com/cockroachdb/cockroach/pkg/kv/db.go:897
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809 +  | github.com/cockroachdb/cockroach/pkg/kv/kvserver/liveness.(*NodeLiveness).updateLivenessAttempt
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809 +  | 	github.com/cockroachdb/cockroach/pkg/kv/kvserver/liveness/liveness.go:1343
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809 +  | github.com/cockroachdb/cockroach/pkg/kv/kvserver/liveness.(*NodeLiveness).updateLiveness
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809 +  | 	github.com/cockroachdb/cockroach/pkg/kv/kvserver/liveness/liveness.go:1298
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809 +  | github.com/cockroachdb/cockroach/pkg/kv/kvserver/liveness.(*NodeLiveness).heartbeatInternal
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809 +  | 	github.com/cockroachdb/cockroach/pkg/kv/kvserver/liveness/liveness.go:965
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809 +  | github.com/cockroachdb/cockroach/pkg/kv/kvserver/liveness.(*NodeLiveness).Start.func1.1
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809 +  | 	github.com/cockroachdb/cockroach/pkg/kv/kvserver/liveness/liveness.go:780
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809 +  | github.com/cockroachdb/cockroach/pkg/util/contextutil.RunWithTimeout
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809 +  | 	github.com/cockroachdb/cockroach/pkg/util/contextutil/context.go:91
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809 +  | github.com/cockroachdb/cockroach/pkg/kv/kvserver/liveness.(*NodeLiveness).Start.func1
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809 +  | 	github.com/cockroachdb/cockroach/pkg/kv/kvserver/liveness/liveness.go:763
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809 +  | github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx.func2
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809 +  | 	github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:470
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809 +  | runtime.goexit
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809 +  | 	GOROOT/src/runtime/asm_amd64.s:1594
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809 +Wraps: (2) txn exec
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809 +Wraps: (3) context canceled
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809 +Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *errors.errorString
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809 +
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809 +An inability to maintain liveness will prevent a node from participating in a
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809 +cluster. If this problem persists, it may be a sign of resource starvation or
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809 +of network connectivity problems. For help troubleshooting, visit:
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809 +
W230220 07:28:32.811776 1301137 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 809 +    https://www.cockroachlabs.com/docs/stable/cluster-setup-troubleshooting.html#node-liveness-issues
--- FAIL: TestSQLWatcherOnEventError (3.29s)

Parameters: TAGS=bazel,gss,deadlock

Same failure on other branches

@cockroach-teamcity

pkg/ccl/spanconfigccl/spanconfigsqlwatcherccl/spanconfigsqlwatcherccl_test.TestSQLWatcherOnEventError failed with artifacts on master @ 5b2a5670cbbe895d76602c230390816e783e0caa:

I230311 07:14:02.617652 1350351 jobs/wait.go:142  [T1,n1] 671  waited for 1 [846883855870001153] queued jobs to complete 247.197513ms
I230311 07:14:02.619308 1350351 upgrade/upgrademanager/manager.go:268  [T1,n1] 672  running permanent upgrade for version 1000022.2-38
I230311 07:14:02.650941 1583069 jobs/registry.go:1495  [T1,n1] 673  KEY VISUALIZER job 100: stepping through state running
I230311 07:14:02.737624 1350351 upgrade/upgrademanager/manager.go:584  [T1,n1] 674  running Upgrade to 1000022.2-38: "create jobs metrics polling job"
I230311 07:14:02.737826 1593481 jobs/adopt.go:247  [T1,n1] 675  job 846883856775249921: resuming execution
I230311 07:14:02.773918 1595581 jobs/registry.go:1495  [T1,n1] 676  MIGRATION job 846883856775249921: stepping through state running
I230311 07:14:02.852662 1595581 jobs/registry.go:1495  [T1,n1] 677  MIGRATION job 846883856775249921: stepping through state succeeded
I230311 07:14:02.867549 1375149 jobs/adopt.go:110  [T1,n1] 678  claimed 1 jobs
I230311 07:14:02.894566 1350351 jobs/wait.go:142  [T1,n1] 679  waited for 1 [846883856775249921] queued jobs to complete 156.755366ms
I230311 07:14:02.894686 1350351 server/server_sql.go:1540  [T1,n1] 680  done ensuring all necessary startup migrations have run
I230311 07:14:02.899377 1608380 jobs/job_scheduler.go:407  [T1,n1] 681  waiting 4m0s before scheduled jobs daemon start
I230311 07:14:02.906811 1609546 kv/kvclient/rangefeed/rangefeedcache/watcher.go:335  [T1,n1] 682  tenant-settings-watcher: established range feed cache
I230311 07:14:02.914770 1610702 kv/kvclient/rangefeed/rangefeedcache/watcher.go:335  [T1,n1] 683  tenant-capability-watcher: established range feed cache
I230311 07:14:02.915057 1610799 jobs/adopt.go:247  [T1,n1] 684  job 101: resuming execution
I230311 07:14:02.927235 1610212 13@kv/kvserver/replicate_queue.go:819  [T1,n1,replicate,s1,r58/1:/{Table/57-Max}] 685  error processing replica: 0 of 1 live stores are able to take a new replica for the range (1 already has a voter, 0 already have a non-voter); likely not enough nodes in cluster
I230311 07:14:02.930231 1612855 13@kv/kvserver/replicate_queue.go:819  [T1,n1,replicate,s1,r12/1:/Table/{8-11}] 686  error processing replica: 0 of 1 live stores are able to take a new replica for the range (1 already has a voter, 0 already have a non-voter); likely not enough nodes in cluster
I230311 07:14:02.933221 1610702 multitenant/tenantcapabilities/tenantcapabilitieswatcher/watcher.go:147  [T1,n1] 687  received results of a full table scan for tenant capabilities
I230311 07:14:02.939414 1608497 server/auto_upgrade.go:78  [T1,n1] 688  no need to upgrade, cluster already at the newest version
I230311 07:14:02.963371 1617083 13@kv/kvserver/replicate_queue.go:819  [T1,n1,replicate,s1,r52/1:/Table/5{1-2}] 689  error processing replica: 0 of 1 live stores are able to take a new replica for the range (1 already has a voter, 0 already have a non-voter); likely not enough nodes in cluster
I230311 07:14:02.971188 1608486 sql/syntheticprivilegecache/cache.go:195  [T1,n1] 690  warmed privileges for virtual tables in 70.039203ms
I230311 07:14:02.992690 1618252 jobs/registry.go:1495  [T1,n1] 691  POLL JOBS STATS job 101: stepping through state running
I230311 07:14:03.039319 1625383 13@kv/kvserver/replicate_queue.go:819  [T1,n1,replicate,s1,r4/1:/System{/tsd-tse}] 692  error processing replica: 0 of 1 live stores are able to take a new replica for the range (1 already has a voter, 0 already have a non-voter); likely not enough nodes in cluster
I230311 07:14:03.064738 1358026 3@pebble/event.go:694  [T1,n1,s1,pebble] 693  [JOB 6] WAL created 000005
I230311 07:14:03.076032 1350351 1@server/server_sql.go:1627  [T1,n1] 694  serving sql connections
I230311 07:14:03.077388 1350351 testutils/testcluster/testcluster.go:1338  [T1] 695  WaitForFullReplication
I230311 07:14:03.077444 1350351 testutils/testcluster/testcluster.go:1342  [T1] 696  WaitForFullReplication took: 296ns
I230311 07:14:03.116459 1628549 util/log/event_log.go:32  [T1,n1,client=127.0.0.1:37824,user=root] 697 ={"Timestamp":1678518843100772625,"EventType":"set_cluster_setting","Statement":"SET CLUSTER SETTING \"kv.rangefeed.enabled\" = true","Tag":"SET CLUSTER SETTING","User":"root","SettingName":"kv.rangefeed.enabled","Value":"true"}
I230311 07:14:03.152803 1628549 util/log/event_log.go:32  [T1,n1,client=127.0.0.1:37824,user=root] 698 ={"Timestamp":1678518843138527355,"EventType":"set_cluster_setting","Statement":"SET CLUSTER SETTING \"kv.closed_timestamp.target_duration\" = '100ms'","Tag":"SET CLUSTER SETTING","User":"root","SettingName":"kv.closed_timestamp.target_duration","Value":"00:00:00.1"}
I230311 07:14:03.244013 1638877 13@kv/kvserver/replicate_queue.go:819  [T1,n1,replicate,s1,r7/1:/Table/{3-4}] 699  error processing replica: 0 of 1 live stores are able to take a new replica for the range (1 already has a voter, 0 already have a non-voter); likely not enough nodes in cluster
I230311 07:14:03.275555 1628549 5@util/log/event_log.go:32  [T1,n1,client=127.0.0.1:37824,user=root] 700 ={"Timestamp":1678518843155087733,"EventType":"create_table","Statement":"CREATE TABLE defaultdb.public.t ()","Tag":"CREATE TABLE","User":"root","DescriptorID":104,"TableName":"defaultdb.public.t"}
I230311 07:14:03.276951 1350351 spanconfig/spanconfigsqlwatcher/sqlwatcher.go:245  [T1] 701  established range feed over system.descriptors starting at time 1678518843.154843113,0
I230311 07:14:03.277138 1350351 spanconfig/spanconfigsqlwatcher/sqlwatcher.go:304  [T1] 702  established range feed over system.zones starting at time 1678518843.154843113,0
I230311 07:14:03.281844 1350351 spanconfig/spanconfigsqlwatcher/sqlwatcher.go:400  [T1] 703  established range feed over system.protected_ts_records starting at time 1678518843.154843113,0
    sqlwatcher_test.go:460: handler should never run
W230311 07:14:03.282505 1641306 spanconfig/spanconfigsqlwatcher/sqlwatcher.go:133  [T1,rangefeed=sql-watcher-descriptor-rangefeed] 704  error adding event {{1678518843155087733 0 false} {{104 relation} {false {0}}}}: boom
I230311 07:14:03.287030 1350351 testutils/testcluster/testcluster.go:148  [T1] 705  TestCluster quiescing nodes
I230311 07:14:03.288532 1583069 jobs/registry.go:1495  [T1,n1] 706  KEY VISUALIZER job 100: stepping through state succeeded
I230311 07:14:03.290247 1355838 server/start_listen.go:103  [T1,n1] 707  server shutting down: instructing cmux to stop accepting
W230311 07:14:03.290590 1370436 sql/sqlliveness/slinstance/slinstance.go:312  [T1,n1] 708  exiting heartbeat loop
I230311 07:14:03.287731 1374857 sql/stats/automatic_stats.go:507  [T1,n1] 709  quiescing auto stats refresher
I230311 07:14:03.291615 1455668 rpc/context.go:2249  [T1,n1,rnode=1,raddr=127.0.0.1:36951,class=default,rpc] 710  connection heartbeat loop ended with err: <nil>
W230311 07:14:03.291101 1370436 sql/sqlliveness/slinstance/slinstance.go:297  [T1,n1] 711  exiting heartbeat loop
E230311 07:14:03.297490 1370436 server/server_sql.go:477  [T1,n1] 712  failed to run update of instance with new session ID: node unavailable; try another peer
W230311 07:14:03.289016 1635212 kv/kvserver/intentresolver/intent_resolver.go:826  [T1] 713  failed to gc transaction record: could not GC completed transaction anchored at /Table/6/1/"kv.closed_timestamp.target_duration"/0: node unavailable; try another peer
W230311 07:14:03.300142 1641235 1@sql/event_log.go:653  [T1,n1,client=127.0.0.1:37824,user=root] 714  unable to save 1 entries to system.eventlog: log-event: failed to read query result: query execution canceled
E230311 07:14:03.303726 1375132 jobs/registry.go:1004  [T1,n1] 715  failed to serve pause and cancel requests: could not query jobs table: cancel/pause-requested: failed to read query result: query execution canceled
W230311 07:14:03.304153 1375132 jobs/registry.go:1080  [T1,n1] 716  canceling all adopted jobs due to stopper quiescing
E230311 07:14:03.304506 1618252 jobs/registry.go:952  [T1,n1] 717  error getting live session: node unavailable; try another peer
E230311 07:14:03.305435 1375149 jobs/registry.go:1064  [T1,n1] 718  error processing claimed jobs: could not query for claimed jobs: select-running/get-claimed-jobs: failed to read query result: query execution canceled
--- FAIL: TestSQLWatcherOnEventError (3.41s)

Parameters: TAGS=bazel,gss,deadlock

Same failure on other branches

@cockroach-teamcity

pkg/ccl/spanconfigccl/spanconfigsqlwatcherccl/spanconfigsqlwatcherccl_test.TestSQLWatcherOnEventError failed with artifacts on master @ 95684388bfbb372ac8c7e1a8ebd4a0d6f447e147:

W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +  | github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunTaskWithErr
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +  | 	github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:322
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +  | github.com/cockroachdb/cockroach/pkg/rpc.NewServerEx.func1
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +  | 	github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:233
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +  | github.com/cockroachdb/cockroach/pkg/rpc.bindUnaryServerInterceptorToHandler.func1
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +  | 	github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:938
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +  | github.com/cockroachdb/cockroach/pkg/rpc.makeInternalClientAdapter.func2
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +  | 	github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:837
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +  | github.com/cockroachdb/cockroach/pkg/util/tracing/grpcinterceptor.ClientInterceptor.func2
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +  | 	github.com/cockroachdb/cockroach/pkg/util/tracing/grpcinterceptor/grpc_interceptor.go:227
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +  | github.com/cockroachdb/cockroach/pkg/rpc.getChainUnaryInvoker.func1
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +  | 	github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:1022
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +  | github.com/cockroachdb/cockroach/pkg/rpc.makeInternalClientAdapter.func3
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +  | 	github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:907
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +  | github.com/cockroachdb/cockroach/pkg/rpc.internalClientAdapter.Batch
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +  | 	github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:1030
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +  | github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*grpcTransport).sendBatch
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +  | 	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/transport.go:210
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +  | github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*grpcTransport).SendNext
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +  | 	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/transport.go:189
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +  | github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*DistSender).sendToReplicas
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +  | 	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/dist_sender.go:2142
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +  | github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*DistSender).sendPartialBatch
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +  | 	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/dist_sender.go:1668
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +  | github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*DistSender).divideAndSendBatchToRanges
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +  | 	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/dist_sender.go:1239
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +  | github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*DistSender).Send
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +  | 	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/dist_sender.go:860
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +  | github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*txnLockGatekeeper).SendLocked
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +  | 	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_lock_gatekeeper.go:82
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +  | github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*txnMetricRecorder).SendLocked
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +  | 	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_interceptor_metric_recorder.go:47
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +  | github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*txnCommitter).SendLocked
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +  | 	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_interceptor_committer.go:203
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +Wraps: (2) aborted before proposing
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +Wraps: (3) context canceled
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +Error types: (1) *errbase.opaqueWrapper (2) *errutil.withPrefix (3) *errors.errorString
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +An inability to maintain liveness will prevent a node from participating in a
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +cluster. If this problem persists, it may be a sign of resource starvation or
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +of network connectivity problems. For help troubleshooting, visit:
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +
W230312 07:21:13.485826 1355808 kv/kvserver/liveness/liveness.go:792  [T1,n1,liveness-hb] 727 +    https://www.cockroachlabs.com/docs/stable/cluster-setup-troubleshooting.html#node-liveness-issues
E230312 07:21:13.486786 1371047 jobs/registry.go:1064  [T1,n1] 728  error processing claimed jobs: could not query for claimed jobs: select-running/get-claimed-jobs: failed to read query result: query execution canceled
E230312 07:21:13.489897 1370981 jobs/registry.go:996  [T1,n1] 729  error expiring job sessions: expire-sessions: failed to read query result: query execution canceled
E230312 07:21:13.490147 1370981 jobs/registry.go:1004  [T1,n1] 730  failed to serve pause and cancel requests: txn exec: context canceled
W230312 07:21:13.490273 1370981 jobs/registry.go:1080  [T1,n1] 731  canceling all adopted jobs due to stopper quiescing
E230312 07:21:13.490553 1607679 jobs/registry.go:952  [T1,n1] 732  error getting live session: node unavailable; try another peer
W230312 07:21:13.507231 1633896 1@sql/event_log.go:653  [T1,n1,client=127.0.0.1:56012,user=root] 733  unable to save 1 entries to system.eventlog: log-event: failed to read query result: query execution canceled
--- FAIL: TestSQLWatcherOnEventError (3.34s)

Parameters: TAGS=bazel,gss,deadlock

Same failure on other branches

@cockroach-teamcity

pkg/ccl/spanconfigccl/spanconfigsqlwatcherccl/spanconfigsqlwatcherccl_test.TestSQLWatcherOnEventError failed with artifacts on master @ 43f7cde6f036f720388912bde7310095c2f8ca9e:

I230314 06:17:56.053794 2148143 jobs/registry.go:1489  [T1,n1] 665  KEY VISUALIZER job 100: stepping through state running
I230314 06:17:56.123654 1823186 jobs/wait.go:142  [T1,n1] 666  waited for 1 [847722169889030145] queued jobs to complete 562.992225ms
I230314 06:17:56.125298 1823186 upgrade/upgrademanager/manager.go:268  [T1,n1] 667  running permanent upgrade for version 1000022.2-38
I230314 06:17:56.233229 1823186 upgrade/upgrademanager/manager.go:584  [T1,n1] 668  running Upgrade to 1000022.2-38: "create jobs metrics polling job"
I230314 06:17:56.233405 2169610 jobs/adopt.go:247  [T1,n1] 669  job 847722172008005633: resuming execution
I230314 06:17:56.245503 2170948 jobs/adopt.go:247  [T1,n1] 670  job 847722172008005633: resuming execution
I230314 06:17:56.314671 2171723 jobs/registry.go:1489  [T1,n1] 671  MIGRATION job 847722172008005633: stepping through state running
I230314 06:17:56.435489 1845359 jobs/adopt.go:110  [T1,n1] 672  claimed 1 jobs
I230314 06:17:56.446936 2171723 jobs/registry.go:1489  [T1,n1] 673  MIGRATION job 847722172008005633: stepping through state succeeded
I230314 06:17:56.481155 2193120 jobs/adopt.go:247  [T1,n1] 674  job 101: resuming execution
I230314 06:17:56.576648 1823186 jobs/wait.go:142  [T1,n1] 675  waited for 1 [847722172008005633] queued jobs to complete 343.259367ms
I230314 06:17:56.576776 1823186 server/server_sql.go:1540  [T1,n1] 676  done ensuring all necessary startup migrations have run
I230314 06:17:56.579181 2202498 jobs/job_scheduler.go:407  [T1,n1] 677  waiting 2m0s before scheduled jobs daemon start
I230314 06:17:56.582368 2203155 kv/kvclient/rangefeed/rangefeedcache/watcher.go:335  [T1,n1] 678  tenant-settings-watcher: established range feed cache
I230314 06:17:56.588501 2204143 kv/kvclient/rangefeed/rangefeedcache/watcher.go:335  [T1,n1] 679  tenant-capability-watcher: established range feed cache
I230314 06:17:56.598863 2204550 13@kv/kvserver/replicate_queue.go:834  [T1,n1,replicate,s1,r58/1:/{Table/57-Max}] 680  error processing replica: 0 of 1 live stores are able to take a new replica for the range (1 already has a voter, 0 already have a non-voter); likely not enough nodes in cluster
I230314 06:17:56.602891 2206300 13@kv/kvserver/replicate_queue.go:834  [T1,n1,replicate,s1,r12/1:/Table/{8-11}] 681  error processing replica: 0 of 1 live stores are able to take a new replica for the range (1 already has a voter, 0 already have a non-voter); likely not enough nodes in cluster
I230314 06:17:56.603587 2204143 multitenant/tenantcapabilities/tenantcapabilitieswatcher/watcher.go:147  [T1,n1] 682  received results of a full table scan for tenant capabilities
I230314 06:17:56.613826 2194723 jobs/registry.go:1489  [T1,n1] 683  POLL JOBS STATS job 101: stepping through state running
I230314 06:17:56.625367 2202535 server/auto_upgrade.go:78  [T1,n1] 684  no need to upgrade, cluster already at the newest version
I230314 06:17:56.653398 2213020 13@kv/kvserver/replicate_queue.go:834  [T1,n1,replicate,s1,r52/1:/Table/5{1-2}] 685  error processing replica: 0 of 1 live stores are able to take a new replica for the range (1 already has a voter, 0 already have a non-voter); likely not enough nodes in cluster
I230314 06:17:56.663806 2202508 sql/syntheticprivilegecache/cache.go:195  [T1,n1] 686  warmed privileges for virtual tables in 84.697575ms
I230314 06:17:56.752369 2222971 13@kv/kvserver/replicate_queue.go:834  [T1,n1,replicate,s1,r4/1:/System{/tsd-tse}] 687  error processing replica: 0 of 1 live stores are able to take a new replica for the range (1 already has a voter, 0 already have a non-voter); likely not enough nodes in cluster
I230314 06:17:56.762701 1823186 1@server/server_sql.go:1627  [T1,n1] 688  serving sql connections
I230314 06:17:56.764567 1823186 testutils/testcluster/testcluster.go:1338  [T1] 689  WaitForFullReplication
I230314 06:17:56.765171 1823186 testutils/testcluster/testcluster.go:1342  [T1] 690  WaitForFullReplication took: 366ns
I230314 06:17:56.797686 2225277 util/log/event_log.go:32  [T1,n1,client=127.0.0.1:51010,user=root] 691 ={"Timestamp":1678774676784755611,"EventType":"set_cluster_setting","Statement":"SET CLUSTER SETTING \"kv.rangefeed.enabled\" = true","Tag":"SET CLUSTER SETTING","User":"root","SettingName":"kv.rangefeed.enabled","Value":"true"}
I230314 06:17:56.821452 2225277 util/log/event_log.go:32  [T1,n1,client=127.0.0.1:51010,user=root] 692 ={"Timestamp":1678774676803272298,"EventType":"set_cluster_setting","Statement":"SET CLUSTER SETTING \"kv.closed_timestamp.target_duration\" = '100ms'","Tag":"SET CLUSTER SETTING","User":"root","SettingName":"kv.closed_timestamp.target_duration","Value":"00:00:00.1"}
I230314 06:17:56.930124 2225277 5@util/log/event_log.go:32  [T1,n1,client=127.0.0.1:51010,user=root] 693 ={"Timestamp":1678774676830241862,"EventType":"create_table","Statement":"CREATE TABLE defaultdb.public.t ()","Tag":"CREATE TABLE","User":"root","DescriptorID":104,"TableName":"defaultdb.public.t"}
I230314 06:17:56.932721 1823186 spanconfig/spanconfigsqlwatcher/sqlwatcher.go:245  [T1] 694  established range feed over system.descriptors starting at time 1678774676.824485738,0
I230314 06:17:56.933386 1823186 spanconfig/spanconfigsqlwatcher/sqlwatcher.go:304  [T1] 695  established range feed over system.zones starting at time 1678774676.824485738,0
I230314 06:17:56.933674 1823186 spanconfig/spanconfigsqlwatcher/sqlwatcher.go:400  [T1] 696  established range feed over system.protected_ts_records starting at time 1678774676.824485738,0
    sqlwatcher_test.go:460: handler should never run
I230314 06:17:56.936109 1823186 testutils/testcluster/testcluster.go:148  [T1] 697  TestCluster quiescing nodes
I230314 06:17:56.936980 1828039 server/start_listen.go:103  [T1,n1] 698  server shutting down: instructing cmux to stop accepting
I230314 06:17:56.937128 1844941 sql/stats/automatic_stats.go:507  [T1,n1] 699  quiescing auto stats refresher
I230314 06:17:56.938219 1921194 rpc/context.go:2249  [T1,n1,rnode=1,raddr=127.0.0.1:41747,class=default,rpc] 700  connection heartbeat loop ended with err: <nil>
W230314 06:17:56.938498 2231051 kv/kvserver/intentresolver/intent_resolver.go:826  [T1] 701  failed to gc transaction record: could not GC completed transaction anchored at /Table/6/1/"kv.closed_timestamp.target_duration"/0: node unavailable; try another peer
I230314 06:17:56.943069 2148143 jobs/registry.go:1489  [T1,n1] 702  KEY VISUALIZER job 100: stepping through state succeeded
E230314 06:17:56.944306 2235869 kv/kvclient/kvcoord/txn_interceptor_committer.go:489  [T1] 703  making txn commit explicit failed for "sql txn" meta={id=4b684a05 key=/Table/3/1/104/2/1 pri=0.01860429 epo=0 ts=1678774676.830241862,0 min=1678774676.830241862,0 seq=3} lock=true stat=STAGING rts=1678774676.830241862,0 wto=false gul=1678774677.330241862,0 ifw=2: result is ambiguous: server shutdown
E230314 06:17:56.946724 1845349 jobs/registry.go:965  [T1,n1] 704  error expiring job sessions: expire-sessions: failed to read query result: query execution canceled
E230314 06:17:56.947134 1845349 jobs/registry.go:973  [T1,n1] 705  failed to serve pause and cancel requests: txn exec: context canceled
W230314 06:17:56.947305 1845349 jobs/registry.go:1049  [T1,n1] 706  canceling all adopted jobs due to stopper quiescing
W230314 06:17:56.950381 2235945 1@sql/event_log.go:653  [T1,n1,client=127.0.0.1:51010,user=root] 707  unable to save 1 entries to system.eventlog: log-event: failed to read query result: query execution canceled
W230314 06:17:56.939153 1840017 sql/sqlliveness/slinstance/slinstance.go:312  [T1,n1] 708  exiting heartbeat loop
W230314 06:17:56.959703 1840017 sql/sqlliveness/slinstance/slinstance.go:297  [T1,n1] 709  exiting heartbeat loop
E230314 06:17:56.959763 1840017 server/server_sql.go:477  [T1,n1] 710  failed to run update of instance with new session ID: node unavailable; try another peer
W230314 06:17:56.965222 2148143 jobs/adopt.go:461  [T1,n1] 711  could not clear job claim: clear-job-claim: node unavailable; try another peer
W230314 06:17:56.975362 2194723 jobs/adopt.go:461  [T1,n1] 712  could not clear job claim: clear-job-claim: failed to send RPC: sending to all replicas failed; last error: ba: Scan [/Table/15/1/101,/Table/15/1/102), [txn: b31fd413], [can-forward-ts] RPC error: node unavailable; try another peer
--- FAIL: TestSQLWatcherOnEventError (5.01s)

Parameters: TAGS=bazel,gss,deadlock

Same failure on other branches

@cockroach-teamcity

pkg/ccl/spanconfigccl/spanconfigsqlwatcherccl/spanconfigsqlwatcherccl_test.TestSQLWatcherOnEventError failed with artifacts on master @ 9b43dc50d468171bc9919dc684b7274b63e13da2:

I230315 06:20:13.225633 2067962 jobs/adopt.go:247  [T1,n1] 662  job 100: resuming execution
I230315 06:20:13.239162 1745769 jobs/wait.go:142  [T1,n1] 663  waited for 1 [848005735232503809] queued jobs to complete 360.040468ms
I230315 06:20:13.240498 1745769 upgrade/upgrademanager/manager.go:268  [T1,n1] 664  running permanent upgrade for version 1000022.2-38
I230315 06:20:13.342239 2070264 jobs/registry.go:1489  [T1,n1] 665  KEY VISUALIZER job 100: stepping through state running
I230315 06:20:13.347661 1745769 upgrade/upgrademanager/manager.go:585  [T1,n1] 666  running Upgrade to 1000022.2-38: "create jobs metrics polling job"
I230315 06:20:13.348066 2079450 jobs/adopt.go:247  [T1,n1] 667  job 848005736826699777: resuming execution
I230315 06:20:13.359573 2081067 jobs/adopt.go:247  [T1,n1] 668  job 848005736826699777: resuming execution
I230315 06:20:13.432421 2083608 jobs/registry.go:1489  [T1,n1] 669  MIGRATION job 848005736826699777: stepping through state running
I230315 06:20:13.545161 2083608 jobs/registry.go:1489  [T1,n1] 670  MIGRATION job 848005736826699777: stepping through state succeeded
I230315 06:20:13.546257 1771613 jobs/adopt.go:110  [T1,n1] 671  claimed 1 jobs
I230315 06:20:13.637376 1745769 jobs/wait.go:142  [T1,n1] 672  waited for 1 [848005736826699777] queued jobs to complete 289.328757ms
I230315 06:20:13.637495 1745769 server/server_sql.go:1542  [T1,n1] 673  done ensuring all necessary startup migrations have run
I230315 06:20:13.645654 2110249 jobs/job_scheduler.go:407  [T1,n1] 674  waiting 4m0s before scheduled jobs daemon start
I230315 06:20:13.652257 2111111 kv/kvclient/rangefeed/rangefeedcache/watcher.go:335  [T1,n1] 675  tenant-settings-watcher: established range feed cache
I230315 06:20:13.668737 2112335 13@kv/kvserver/replicate_queue.go:834  [T1,n1,replicate,s1,r58/1:/{Table/57-Max}] 677  error processing replica: 0 of 1 live stores are able to take a new replica for the range (1 already has a voter, 0 already have a non-voter); likely not enough nodes in cluster
I230315 06:20:13.668191 2113083 jobs/adopt.go:247  [T1,n1] 676  job 101: resuming execution
I230315 06:20:13.670211 2113445 kv/kvclient/rangefeed/rangefeedcache/watcher.go:335  [T1,n1] 678  tenant-capability-watcher: established range feed cache
I230315 06:20:13.695952 2110398 server/auto_upgrade.go:78  [T1,n1] 679  no need to upgrade, cluster already at the newest version
I230315 06:20:13.696447 2116180 13@kv/kvserver/replicate_queue.go:834  [T1,n1,replicate,s1,r12/1:/Table/{8-11}] 680  error processing replica: 0 of 1 live stores are able to take a new replica for the range (1 already has a voter, 0 already have a non-voter); likely not enough nodes in cluster
I230315 06:20:13.697356 2113445 multitenant/tenantcapabilities/tenantcapabilitieswatcher/watcher.go:147  [T1,n1] 681  received results of a full table scan for tenant capabilities
I230315 06:20:13.698739 2117313 13@kv/kvserver/replicate_queue.go:834  [T1,n1,replicate,s1,r52/1:/Table/5{1-2}] 682  error processing replica: 0 of 1 live stores are able to take a new replica for the range (1 already has a voter, 0 already have a non-voter); likely not enough nodes in cluster
I230315 06:20:13.705070 2110387 sql/syntheticprivilegecache/cache.go:195  [T1,n1] 683  warmed privileges for virtual tables in 57.923954ms
I230315 06:20:13.792983 2118743 jobs/registry.go:1489  [T1,n1] 684  POLL JOBS STATS job 101: stepping through state running
I230315 06:20:13.805597 1745769 1@server/server_sql.go:1629  [T1,n1] 685  serving sql connections
I230315 06:20:13.829329 1745769 testutils/testcluster/testcluster.go:1338  [-] 686  WaitForFullReplication
I230315 06:20:13.829417 1745769 testutils/testcluster/testcluster.go:1342  [-] 687  WaitForFullReplication took: 265ns
I230315 06:20:13.869103 2134159 util/log/event_log.go:32  [T1,n1,client=127.0.0.1:44982,user=root] 688 ={"Timestamp":1678861213851667615,"EventType":"set_cluster_setting","Statement":"SET CLUSTER SETTING \"kv.rangefeed.enabled\" = true","Tag":"SET CLUSTER SETTING","User":"root","SettingName":"kv.rangefeed.enabled","Value":"true"}
I230315 06:20:13.901197 2134159 util/log/event_log.go:32  [T1,n1,client=127.0.0.1:44982,user=root] 689 ={"Timestamp":1678861213871292216,"EventType":"set_cluster_setting","Statement":"SET CLUSTER SETTING \"kv.closed_timestamp.target_duration\" = '100ms'","Tag":"SET CLUSTER SETTING","User":"root","SettingName":"kv.closed_timestamp.target_duration","Value":"00:00:00.1"}
I230315 06:20:14.019665 2134159 5@util/log/event_log.go:32  [T1,n1,client=127.0.0.1:44982,user=root] 690 ={"Timestamp":1678861213902285655,"EventType":"create_table","Statement":"CREATE TABLE defaultdb.public.t ()","Tag":"CREATE TABLE","User":"root","DescriptorID":104,"TableName":"defaultdb.public.t"}
I230315 06:20:14.020865 1745769 spanconfig/spanconfigsqlwatcher/sqlwatcher.go:245  [-] 691  established range feed over system.descriptors starting at time 1678861213.902025550,0
I230315 06:20:14.021013 1745769 spanconfig/spanconfigsqlwatcher/sqlwatcher.go:304  [-] 692  established range feed over system.zones starting at time 1678861213.902025550,0
I230315 06:20:14.021149 1745769 spanconfig/spanconfigsqlwatcher/sqlwatcher.go:400  [-] 693  established range feed over system.protected_ts_records starting at time 1678861213.902025550,0
    sqlwatcher_test.go:460: handler should never run
I230315 06:20:14.025135 1745769 testutils/testcluster/testcluster.go:148  [-] 694  TestCluster quiescing nodes
I230315 06:20:14.026516 1751303 server/start_listen.go:103  [T1,n1] 695  server shutting down: instructing cmux to stop accepting
I230315 06:20:14.026967 1771547 sql/stats/automatic_stats.go:507  [T1,n1] 696  quiescing auto stats refresher
W230315 06:20:14.027638 2137520 kv/kvserver/intentresolver/intent_resolver.go:826  [-] 697  failed to gc transaction record: could not GC completed transaction anchored at /Table/15/1/101/0: node unavailable; try another peer
E230315 06:20:14.027846 2146280 kv/kvclient/kvcoord/txn_interceptor_committer.go:489  [-] 698  making txn commit explicit failed for "sql txn" meta={id=e5c4cbc8 key=/Table/3/1/104/2/1 pri=0.02715676 epo=0 ts=1678861213.902285655,0 min=1678861213.902285655,0 seq=3} lock=true stat=STAGING rts=1678861213.902285655,0 wto=false gul=1678861214.402285655,0 ifw=2: result is ambiguous: server shutdown
I230315 06:20:14.029564 1853916 rpc/context.go:2249  [T1,n1,rnode=1,raddr=127.0.0.1:45891,class=default,rpc] 699  connection heartbeat loop ended with err: <nil>
W230315 06:20:14.029983 1765108 sql/sqlliveness/slinstance/slinstance.go:312  [n1] 701  exiting heartbeat loop
W230315 06:20:14.030041 1765108 sql/sqlliveness/slinstance/slinstance.go:297  [n1] 702  exiting heartbeat loop
E230315 06:20:14.030609 1765108 server/server_sql.go:477  [n1] 703  failed to run update of instance with new session ID: node unavailable; try another peer
I230315 06:20:14.028799 2070264 jobs/registry.go:1489  [T1,n1] 700  KEY VISUALIZER job 100: stepping through state succeeded
E230315 06:20:14.033367 1771613 jobs/registry.go:993  [T1,n1] 704  error claiming jobs: could not query jobs table: claim-jobs: failed to read query result: query execution canceled
E230315 06:20:14.035046 1771603 jobs/registry.go:965  [T1,n1] 705  error expiring job sessions: expire-sessions: failed to read query result: query execution canceled
E230315 06:20:14.035460 1771603 jobs/registry.go:973  [T1,n1] 706  failed to serve pause and cancel requests: txn exec: context canceled
W230315 06:20:14.035818 2146350 1@sql/event_log.go:653  [T1,n1,client=127.0.0.1:44982,user=root] 707  unable to save 1 entries to system.eventlog: log-event: failed to read query result: query execution canceled
W230315 06:20:14.036383 1771603 jobs/registry.go:1049  [T1,n1] 708  canceling all adopted jobs due to stopper quiescing
E230315 06:20:14.036719 2118743 jobs/registry.go:921  [T1,n1] 709  error getting live session: node unavailable; try another peer
--- FAIL: TestSQLWatcherOnEventError (4.48s)

Parameters: TAGS=bazel,gss,deadlock

Same failure on other branches

craig bot pushed a commit that referenced this issue Mar 28, 2023
…99752 #99774

99433: opt: fixup CTE stats on placeholder queries r=cucaroach a=cucaroach

During the optbuilder phase we copy the initial expression's stats into the fake-rel, but this value can change when placeholders are assigned, so add code in AssignPlaceholders to rebuild the CTE if the stats change.

Fixes: #99389
Epic: none
Release note: none

99516: metrics: improve ux around _status/vars output r=aadityasondhi a=dhartunian

Previously, the addition of the `tenant` metric label was applied uniformly and could result in confusion for customers who never enable multi-tenancy or c2c. The `tenant="system"` label carries little meaning when there's no tenancy in use.

This change modifies the system tenant label application to only happen when a non-system in-process tenant is created.

Additionally, an environment variable, `COCKROACH_DISABLE_NODE_AND_TENANT_METRIC_LABELS`, can be set to `true` to disable the new `tenant` and `node_id` labels. This can be used on single-process tenants to disable the `tenant` label.

Resolves: #94668

Epic: CRDB-18798

Release note (ops change): The
`COCKROACH_DISABLE_NODE_AND_TENANT_METRIC_LABELS` env var can be used to disable the newly introduced metric labels in the `_status/vars` output if they conflict with a customer's scrape configuration.

99522: jobsprofiler: store DistSQL diagram of jobs in job info r=dt a=adityamaru

This change teaches import, cdc, backup and restore
to store their DistSQL plans in the job_info table
under a timestamped info key. The generation and writing
of the plan diagram is done asynchronously so as to not
slow down the execution of the job. A new plan will be
stored every time the job sets up its DistSQL flow.

Release note: None
Epic: [CRDB-8964](https://cockroachlabs.atlassian.net/browse/CRDB-8964)
Informs: #99729

99574: streamingccl: skip acceptance/c2c on remote cluster setup r=stevendanna a=msbutler

acceptance/c2c currently fails when run on a remote cluster. This patch ensures the test gets skipped when run on a remote cluster. There's no need to run the test on a remote cluster because the other c2c roachtests provide sufficient coverage.

Fixes #99553

Release note: none

99691: codeowners: update sql obs to cluster obs r=maryliag a=maryliag

Update mentions of `sql-observability` to
`cluster-observability`.

Epic: none
Release note: None

99712: ui: connect metrics provider to metrics timescale object r=xinhaoz a=dhartunian

Previously, the `MetricsDataProvider` component queried the redux store for the `TimeScale` object which contained details of the currently active time window. This piece of state was assumed to update to account for the "live" moving window that metrics show when pre-set lookback time windows are selected.

A recent PR (#98331) removed the feature that polled new data from SQL pages, which also disabled polling on metrics pages due to the reuse of `TimeScale`.

This commit modifies the `MetricsDataProvider` to instead read the `metricsTime` field of the `TimeScaleState` object. This object was constructed for use by the `MetricsDataProvider` but was not wired up to the component.

Resolves #99524

Epic: None

Release note: None

99733: telemetry: add FIPS-specific channel r=knz a=rail

Previously, all official builds were reporting using the same telemetry channel.

This PR adds a new telemetry channel for the FIPS build target.

Fixes: CC-24110
Epic: DEVINF-478
Release note: None

99745: spanconfigsqlwatcher: deflake TestSQLWatcherOnEventError r=arulajmani a=arulajmani

Previously, this test set the no-op checkpoint duration to an hour to effectively disable checkpoints; doing so is integral to what the test is testing. However, this was a lie, given how `util.Every` works: a call to `ShouldProcess` returns true the very first time.

This patch achieves the original goal by introducing a new testing knob. Previously, the test would fail in < 40 runs locally; with the patch, it has been running strong for ~1000 runs.

Fixes #76765

Release note: None
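
To illustrate the `util.Every` gotcha described above, here is a minimal self-contained re-creation of the relevant behavior (a sketch for illustration, not CockroachDB's actual implementation): because the last-processed timestamp starts at the zero time, the very first `ShouldProcess` call returns true no matter how long the interval is, so one no-op checkpoint slips through and invokes the handler the test asserts should never run.

```go
package main

import (
	"fmt"
	"time"
)

// everyN mimics the behavior at issue: ShouldProcess returns true if at
// least n has elapsed since the last time it returned true. Because
// lastProcessed starts at the zero time, the first call always returns
// true, regardless of how large n is.
type everyN struct {
	n             time.Duration
	lastProcessed time.Time
}

func (e *everyN) ShouldProcess(now time.Time) bool {
	if now.Sub(e.lastProcessed) >= e.n {
		e.lastProcessed = now
		return true
	}
	return false
}

func main() {
	every := everyN{n: time.Hour}                // "disabling" checkpoints with a long interval
	fmt.Println(every.ShouldProcess(time.Now())) // true: one checkpoint fires anyway
	fmt.Println(every.ShouldProcess(time.Now())) // false for the next hour
}
```

This is why the patch adds an explicit testing knob to suppress checkpoints rather than relying on a long interval.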

99747: roachtest: use persistent disks for disk-stall tests r=jbowens a=nicktrav

Currently, the `disk-stall` tests use local SSDs. When run on GCE VMs, a higher test flake rate is observed due to known issues with fsync latency for local SSDs.

Switch the test to use persistent disks instead.

Touches: #99372.

Release note: None.

Epic: CRDB-20293

99752: kvserver: bump tolerance more r=ajwerner a=ajwerner

I'm not digging into this more, but the test is flakey.

Epic: none

https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_UnitTests_BazelUnitTests/9161972?showRootCauses=false&expandBuildChangesSection=true&expandBuildProblemsSection=true&expandBuildTestsSection=true

Release note: None

99774: *: identify remaining uses of TODOSQLCodec r=stevendanna a=knz

The `TODOSQLCodec` was a bug waiting to happen.

The only reasonable remaining purpose is for use in tests. As such, this change moves its definition to a test-only package (we have a linter that verifies that `testutils` is not included in non-test code).

This change then identifies the one non-reasonable remaining purpose and flags it properly as a bug linked to #48123.

Release note: None
Epic: None

Co-authored-by: Tommy Reilly <treilly@cockroachlabs.com>
Co-authored-by: David Hartunian <davidh@cockroachlabs.com>
Co-authored-by: adityamaru <adityamaru@gmail.com>
Co-authored-by: Michael Butler <butler@cockroachlabs.com>
Co-authored-by: maryliag <marylia@cockroachlabs.com>
Co-authored-by: Rail Aliiev <rail@iqchoice.com>
Co-authored-by: Arul Ajmani <arulajmani@gmail.com>
Co-authored-by: Nick Travers <travers@cockroachlabs.com>
Co-authored-by: ajwerner <awerner32@gmail.com>
Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>
craig bot closed this as completed in a7b6432 Mar 28, 2023
blathers-crl bot pushed a commit that referenced this issue Mar 28, 2023
Previously, this test set the no-op checkpoint duration to an hour to effectively disable checkpoints; doing so is integral to what the test is testing. However, this was a lie, given how `util.Every` works: a call to `ShouldProcess` returns true the very first time.

This patch achieves the original goal by introducing a new testing knob. Previously, the test would fail in < 40 runs locally; with the patch, it has been running strong for ~1000 runs.

Fixes #76765

Release note: None
blathers-crl bot pushed a commit that referenced this issue Mar 28, 2023
blathers-crl bot pushed a commit that referenced this issue Mar 28, 2023