Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

server: TestListenerFileCreation failed #43949

Closed
cockroach-teamcity opened this issue Jan 14, 2020 · 15 comments
Closed

server: TestListenerFileCreation failed #43949

cockroach-teamcity opened this issue Jan 14, 2020 · 15 comments
Assignees
Labels
branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot.
Milestone

Comments

@cockroach-teamcity
Copy link
Member

(server).TestListenerFileCreation failed on master@3d98dae853edbde083cc23b60ad6696958909849:

Fatal error:

F200114 03:18:16.212253 70735 sqlmigrations/migrations.go:531  [n1] not enough time left on migration lease, terminating for safety

Stack:

goroutine 70735 [running]:
github.com/cockroachdb/cockroach/pkg/util/log.getStacks(0x6285c01, 0xed5af29f8, 0x0, 0xc000875ec0)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/get_stacks.go:25 +0xb8
github.com/cockroachdb/cockroach/pkg/util/log.(*loggerT).outputLogEntry(0x6282c40, 0xc000000004, 0x5c52d33, 0x1b, 0x213, 0xc002f2fcc0, 0x44)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/clog.go:211 +0xa0c
github.com/cockroachdb/cockroach/pkg/util/log.addStructured(0x41367c0, 0xc002a07f50, 0x4, 0x2, 0x0, 0x0, 0xc000875e28, 0x1, 0x1)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/structured.go:66 +0x2c9
github.com/cockroachdb/cockroach/pkg/util/log.logDepth(0x41367c0, 0xc002a07f50, 0x1, 0xc000000004, 0x0, 0x0, 0xc000875e28, 0x1, 0x1)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/log.go:44 +0x8c
github.com/cockroachdb/cockroach/pkg/util/log.Fatal(...)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/log.go:164
github.com/cockroachdb/cockroach/pkg/sqlmigrations.(*Manager).EnsureMigrations.func2(0x41367c0, 0xc002a07f50)
	/go/src/github.com/cockroachdb/cockroach/pkg/sqlmigrations/migrations.go:531 +0x284
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTask.func1(0xc004d6c120, 0x41367c0, 0xc002a07f50, 0xc00142a6f0, 0x29, 0x0, 0x0, 0xc0003e7da0)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:322 +0x152
created by github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTask
	/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:317 +0x131
Log preceding fatal error

 Sum      0/0    0.00 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0      0.0      0.00              0.00         0    0.000       0      0
 Int      0/0    0.00 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0      0.0      0.00              0.00         0    0.000       0      0

** Compaction Stats [default] **
Priority    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Uptime(secs): 141.2 total, 14.2 interval
Flush(GB): cumulative 0.000, interval 0.000
AddFile(GB): cumulative 0.000, interval 0.000
AddFile(Total Files): cumulative 0, interval 0
AddFile(L0 Files): cumulative 0, interval 0
AddFile(Keys): cumulative 0, interval 0
Cumulative compaction: 0.00 GB write, 0.00 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.0 seconds
Interval compaction: 0.00 GB write, 0.00 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.0 seconds
Stalls(count): 0 level0_slowdown, 0 level0_slowdown_with_compaction, 0 level0_numfiles, 0 level0_numfiles_with_compaction, 0 stop for pending_compaction_bytes, 0 slowdown for pending_compaction_bytes, 0 memtable_compaction, 0 memtable_slowdown, interval 0 total count
estimated_pending_compaction_bytes: 0 B
W200114 03:17:52.827670 70535 storage/node_liveness.go:544  [n1,liveness-hb] slow heartbeat took 4.5s
W200114 03:17:52.827727 70535 storage/node_liveness.go:469  [n1,liveness-hb] failed node liveness heartbeat: operation "node liveness heartbeat" timed out after 4.5s
W200114 03:17:57.327946 70535 storage/node_liveness.go:544  [n1,liveness-hb] slow heartbeat took 4.5s
W200114 03:17:57.328010 70535 storage/node_liveness.go:469  [n1,liveness-hb] failed node liveness heartbeat: operation "node liveness heartbeat" timed out after 4.5s
I200114 03:17:59.140054 70509 gossip/gossip.go:566  [n1] gossip status (stalled, 1 node)
gossip client (0/3 cur/max conns)
gossip server (0/3 cur/max conns, infos 0/0 sent/received, bytes 0B/0B sent/received)
I200114 03:17:59.151121 70513 server/status/runtime.go:498  [n1] runtime stats: 263 MiB RSS, 183 goroutines, 46 MiB/131 MiB/92 MiB GO alloc/idle/total, 16 MiB/52 MiB CGO alloc/total, 10.0 CGO/sec, 0.6/0.1 %(u/s)time, 0.0 %gc (0x), 4.0 MiB/4.0 MiB (r/w)net
W200114 03:18:01.607831 70439 storage/store_raft.go:496  [n1,s1,r2/1:/System/NodeLiveness{-Max}] handle raft ready: 16.3s [applied=1, batches=1, state_assertions=0]
W200114 03:18:01.607891 70443 storage/store_raft.go:496  [n1,s1,r4/1:/System{/tsd-tse}] handle raft ready: 16.3s [applied=1, batches=1, state_assertions=1]
W200114 03:18:01.608026 70468 storage/store_raft.go:496  [n1,s1,r1/1:/{Min-System/NodeL…}] handle raft ready: 16.3s [applied=1, batches=1, state_assertions=0]
W200114 03:18:01.828264 70535 storage/node_liveness.go:544  [n1,liveness-hb] slow heartbeat took 4.5s
W200114 03:18:01.828323 70535 storage/node_liveness.go:469  [n1,liveness-hb] failed node liveness heartbeat: operation "node liveness heartbeat" timed out after 4.5s
W200114 03:18:05.072647 70225 storage/store_raft.go:496  [n1,s1,r2/1:/System/NodeLiveness{-Max}] handle raft ready: 3.4s [applied=1, batches=1, state_assertions=1]
W200114 03:18:05.561053 70458 storage/store_raft.go:496  [n1,s1,r1/1:/{Min-System/NodeL…}] handle raft ready: 3.9s [applied=1, batches=1, state_assertions=1]
W200114 03:18:05.581830 70442 storage/store_raft.go:496  [n1,s1,r2/1:/System/NodeLiveness{-Max}] handle raft ready: 0.5s [applied=1, batches=1, state_assertions=0]
W200114 03:18:05.582129 71882 storage/node_liveness.go:544  [n1,s1,r15/1:/Table/{19-20}] slow heartbeat took 20.3s
W200114 03:18:05.582301 71804 storage/node_liveness.go:544  [n1,s1,r6/1:/Table/{SystemCon…-11}] slow heartbeat took 20.2s
E200114 03:18:05.582329 71804 storage/replica_range_lease.go:339  [n1,s1,r6/1:/Table/{SystemCon…-11}] heartbeat failed on epoch increment
W200114 03:18:05.582392 71805 storage/node_liveness.go:544  [n1,s1,r8/1:/Table/1{2-3}] slow heartbeat took 20.2s
E200114 03:18:05.582406 71805 storage/replica_range_lease.go:339  [n1,s1,r8/1:/Table/1{2-3}] heartbeat failed on epoch increment
W200114 03:18:05.582564 71884 storage/node_liveness.go:544  [n1,s1,r3/1:/System/{NodeLive…-tsd}] slow heartbeat took 20.2s
E200114 03:18:05.582582 71884 storage/replica_range_lease.go:339  [n1,s1,r3/1:/System/{NodeLive…-tsd}] heartbeat failed on epoch increment
W200114 03:18:05.582741 72107 storage/node_liveness.go:544  [n1,s1,r4/1:/System{/tsd-tse}] slow heartbeat took 4.0s
E200114 03:18:05.582763 72107 storage/replica_range_lease.go:339  [n1,s1,r4/1:/System{/tsd-tse}] heartbeat failed on epoch increment
W200114 03:18:05.582820 70535 storage/node_liveness.go:544  [n1,liveness-hb] slow heartbeat took 3.8s
I200114 03:18:05.582833 70535 storage/node_liveness.go:459  [n1,liveness-hb] heartbeat failed on epoch increment; retrying
I200114 03:18:05.597058 70423 gossip/gossip.go:1531  [n1] node has connected to cluster via gossip
W200114 03:18:08.134792 70464 storage/store_raft.go:496  [n1,s1,r4/1:/System{/tsd-tse}] handle raft ready: 2.5s [applied=1, batches=1, state_assertions=0]
I200114 03:18:09.151757 70513 server/status/runtime.go:498  [n1] runtime stats: 263 MiB RSS, 176 goroutines, 51 MiB/126 MiB/92 MiB GO alloc/idle/total, 16 MiB/52 MiB CGO alloc/total, 56.2 CGO/sec, 0.9/0.1 %(u/s)time, 0.0 %gc (0x), 7.1 MiB/7.1 MiB (r/w)net
W200114 03:18:16.000469 70506 storage/store_rebalancer.go:223  [n1,s1,store-rebalancer] StorePool missing descriptor for local store
W200114 03:18:16.211328 70492 storage/store_raft.go:496  [n1,s1,r6/1:/Table/{SystemCon…-11}] handle raft ready: 10.5s [applied=1, batches=1, state_assertions=0]
W200114 03:18:16.211564 70446 storage/store_raft.go:496  [n1,s1,r3/1:/System/{NodeLive…-tsd}] handle raft ready: 10.4s [applied=1, batches=1, state_assertions=0]
W200114 03:18:16.212122 70486 storage/store_raft.go:496  [n1,s1,r1/1:/{Min-System/NodeL…}] handle raft ready: 8.8s [applied=1, batches=1, state_assertions=1]

Repro

Parameters:

  • GOFLAGS=-json
make stressrace TESTS=TestListenerFileCreation PKG=./pkg/server TESTTIMEOUT=5m STRESSFLAGS='-timeout 5m' 2>&1

powered by pkg/cmd/internal/issues

@cockroach-teamcity cockroach-teamcity added C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot. labels Jan 14, 2020
@cockroach-teamcity cockroach-teamcity added this to the 20.1 milestone Jan 14, 2020
@andreimatei
Copy link
Contributor

@andreimatei andreimatei assigned dt and unassigned tbg Jan 14, 2020
@andreimatei
Copy link
Contributor

@dt for triage ?

@andreimatei
Copy link
Contributor

@dt wants out. @solongordon can this timeout be caused by the same problem as #43957?
I think the timeout is 10s, and I think that other failure indicates the namespace migration blocking for 10s.

@yuzefovich
Copy link
Member

#42845 is not to blame here. I'm not sure what this failure is about.

@tbg tbg added the branch-master Failures and bugs on the master branch. label Jan 22, 2020
@cockroach-teamcity
Copy link
Member Author

(server).TestListenerFileCreation failed on master@9f6d850389d4912253643912053da196f84e366d:

Fatal error:

F200127 14:38:52.388774 73262 sqlmigrations/migrations.go:565  [n1] not enough time left on migration lease, terminating for safety

Stack:

goroutine 73262 [running]:
github.com/cockroachdb/cockroach/pkg/util/log.getStacks(0x62c3b01, 0xed5c0ecfc, 0x0, 0xc00238dec0)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/get_stacks.go:25 +0xb8
github.com/cockroachdb/cockroach/pkg/util/log.(*loggerT).outputLogEntry(0x62c0c60, 0xc000000004, 0x5c8d2c7, 0x1b, 0x235, 0xc003238960, 0x44)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/clog.go:211 +0xa0c
github.com/cockroachdb/cockroach/pkg/util/log.addStructured(0x415ba60, 0xc0075a9050, 0x4, 0x2, 0x0, 0x0, 0xc00238de28, 0x1, 0x1)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/structured.go:66 +0x2c9
github.com/cockroachdb/cockroach/pkg/util/log.logDepth(0x415ba60, 0xc0075a9050, 0x1, 0xc000000004, 0x0, 0x0, 0xc00238de28, 0x1, 0x1)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/log.go:44 +0x8c
github.com/cockroachdb/cockroach/pkg/util/log.Fatal(...)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/log.go:164
github.com/cockroachdb/cockroach/pkg/sqlmigrations.(*Manager).EnsureMigrations.func2(0x415ba60, 0xc0075a9050)
	/go/src/github.com/cockroachdb/cockroach/pkg/sqlmigrations/migrations.go:565 +0x284
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTask.func1(0xc0010798c0, 0x415ba60, 0xc0075a9050, 0xc0010e2330, 0x29, 0x0, 0x0, 0xc0032156e0)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:322 +0x152
created by github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTask
	/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:317 +0x131

Log preceding fatal error

runtime.doInit
	/usr/local/go/src/runtime/proc.go:5217
runtime.doInit
	/usr/local/go/src/runtime/proc.go:5217
runtime.main
	/usr/local/go/src/runtime/proc.go:190
runtime.goexit
	/usr/local/go/src/runtime/asm_amd64.s:1357
I200127 14:38:20.630824 72931 server/status/runtime.go:498  [n1] runtime stats: 283 MiB RSS, 181 goroutines, 42 MiB/134 MiB/81 MiB GO alloc/idle/total, 17 MiB/53 MiB CGO alloc/total, 12.0 CGO/sec, 0.8/0.1 %(u/s)time, 0.0 %gc (0x), 515 KiB/515 KiB (r/w)net
W200127 14:38:21.023765 72953 storage/node_liveness.go:546  [n1,liveness-hb] slow heartbeat took 4.5s
W200127 14:38:21.023823 72953 storage/node_liveness.go:471  [n1,liveness-hb] failed node liveness heartbeat: operation "node liveness heartbeat" timed out after 4.5s
W200127 14:38:23.272060 72864 storage/store_raft.go:496  [n1,s1,r5/1:/{Systemtse-Table/System…}] handle raft ready: 12.2s [applied=1, batches=1, state_assertions=1]
W200127 14:38:23.326547 72859 storage/store_raft.go:496  [n1,s1,r2/1:/System/NodeLiveness{-Max}] handle raft ready: 12.3s [applied=1, batches=1, state_assertions=0]
W200127 14:38:23.326614 72877 storage/store_raft.go:496  [n1,s1,r1/1:/{Min-System/NodeL…}] handle raft ready: 9.0s [applied=1, batches=1, state_assertions=1]
W200127 14:38:23.326634 74172 storage/node_liveness.go:546  [n1,s1,r4/1:/System{/tsd-tse}] slow heartbeat took 12.3s
W200127 14:38:23.326793 74173 storage/node_liveness.go:546  [n1,s1,r3/1:/System/{NodeLive…-tsd}] slow heartbeat took 12.3s
E200127 14:38:23.326820 74173 storage/replica_range_lease.go:339  [n1,s1,r3/1:/System/{NodeLive…-tsd}] heartbeat failed on epoch increment
W200127 14:38:23.326945 74174 storage/node_liveness.go:546  [n1,s1,r26/1:/NamespaceTable/{30-Max}] slow heartbeat took 12.2s
E200127 14:38:23.326967 74174 storage/replica_range_lease.go:339  [n1,s1,r26/1:/NamespaceTable/{30-Max}] heartbeat failed on epoch increment
W200127 14:38:23.327035 72953 storage/node_liveness.go:546  [n1,liveness-hb] slow heartbeat took 2.3s
I200127 14:38:23.327061 72953 storage/node_liveness.go:461  [n1,liveness-hb] heartbeat failed on epoch increment; retrying
E200127 14:38:23.327113 74178 storage/replica_range_lease.go:339  [n1,s1,r5/1:/{Systemtse-Table/System…}] heartbeat failed on epoch increment
W200127 14:38:25.524127 72953 storage/node_liveness.go:546  [n1,liveness-hb] slow heartbeat took 2.1s
W200127 14:38:25.524186 72953 storage/node_liveness.go:471  [n1,liveness-hb] failed node liveness heartbeat: operation "node liveness heartbeat" timed out after 4.5s
W200127 14:38:27.631670 72859 storage/store_raft.go:496  [n1,s1,r2/1:/System/NodeLiveness{-Max}] handle raft ready: 4.3s [applied=1, batches=1, state_assertions=1]
W200127 14:38:28.187319 72877 storage/store_raft.go:496  [n1,s1,r1/1:/{Min-System/NodeL…}] handle raft ready: 4.8s [applied=1, batches=1, state_assertions=1]
I200127 14:38:29.182271 72828 gossip/gossip.go:1532  [n1] node has connected to cluster via gossip
W200127 14:38:30.024390 72953 storage/node_liveness.go:546  [n1,liveness-hb] slow heartbeat took 4.5s
W200127 14:38:30.024435 72953 storage/node_liveness.go:471  [n1,liveness-hb] failed node liveness heartbeat: operation "node liveness heartbeat" timed out after 4.5s
W200127 14:38:30.535979 72865 storage/store_raft.go:496  [n1,s1,r2/1:/System/NodeLiveness{-Max}] handle raft ready: 2.9s [applied=1, batches=1, state_assertions=0]
W200127 14:38:30.536323 74303 storage/node_liveness.go:546  [n1,s1,r3/1:/System/{NodeLive…-tsd}] slow heartbeat took 7.2s
W200127 14:38:30.536583 74289 storage/node_liveness.go:546  [n1,s1,r4/1:/System{/tsd-tse}] slow heartbeat took 7.2s
W200127 14:38:30.536844 74420 storage/node_liveness.go:546  [n1,s1,r8/1:/Table/1{2-3}] slow heartbeat took 7.2s
W200127 14:38:30.537074 74304 storage/node_liveness.go:546  [n1,s1,r26/1:/NamespaceTable/{30-Max}] slow heartbeat took 7.1s
W200127 14:38:30.597232 72901 storage/store_raft.go:496  [n1,s1,r1/1:/{Min-System/NodeL…}] handle raft ready: 2.4s [applied=1, batches=1, state_assertions=1]
I200127 14:38:30.630921 72931 server/status/runtime.go:498  [n1] runtime stats: 283 MiB RSS, 179 goroutines, 45 MiB/132 MiB/82 MiB GO alloc/idle/total, 17 MiB/53 MiB CGO alloc/total, 40.1 CGO/sec, 0.8/0.2 %(u/s)time, 0.0 %gc (0x), 1.8 MiB/1.8 MiB (r/w)net
W200127 14:38:33.543647 72863 storage/store_raft.go:496  [n1,s1,r3/1:/System/{NodeLive…-tsd}] handle raft ready: 2.9s [applied=2, batches=1, state_assertions=0]
W200127 14:38:33.544640 72885 storage/store_raft.go:496  [n1,s1,r12/1:/Table/1{6-7}] handle raft ready: 2.9s [applied=1, batches=1, state_assertions=1]
W200127 14:38:40.564005 72899 storage/store_raft.go:496  [n1,s1,r24/1:/Table/2{8-9}] handle raft ready: 9.9s [applied=1, batches=1, state_assertions=1]
W200127 14:38:40.565698 72886 storage/store_raft.go:496  [n1,s1,r4/1:/System{/tsd-tse}] handle raft ready: 9.9s [applied=2, batches=1, state_assertions=0]
I200127 14:38:40.630973 72931 server/status/runtime.go:498  [n1] runtime stats: 283 MiB RSS, 181 goroutines, 40 MiB/136 MiB/83 MiB GO alloc/idle/total, 17 MiB/53 MiB CGO alloc/total, 26.5 CGO/sec, 1.6/0.3 %(u/s)time, 0.0 %gc (1x), 49 MiB/49 MiB (r/w)net
W200127 14:38:49.028402 72858 storage/store_raft.go:496  [n1,s1,r11/1:/Table/1{5-6}] handle raft ready: 15.5s [applied=1, batches=1, state_assertions=1]
W200127 14:38:49.028564 72901 storage/store_raft.go:496  [n1,s1,r1/1:/{Min-System/NodeL…}] handle raft ready: 14.4s [applied=1, batches=1, state_assertions=1]
W200127 14:38:49.028848 72863 storage/store_raft.go:496  [n1,s1,r3/1:/System/{NodeLive…-tsd}] handle raft ready: 15.5s [applied=1, batches=1, state_assertions=0]
W200127 14:38:49.029130 72893 storage/store_raft.go:496  [n1,s1,r2/1:/System/NodeLiveness{-Max}] handle raft ready: 14.4s [applied=1, batches=1, state_assertions=0]
I200127 14:38:51.080764 72931 server/status/runtime.go:498  [n1] runtime stats: 283 MiB RSS, 184 goroutines, 44 MiB/132 MiB/84 MiB GO alloc/idle/total, 17 MiB/53 MiB CGO alloc/total, 12.8 CGO/sec, 0.7/0.1 %(u/s)time, 0.0 %gc (0x), 4.5 MiB/4.5 MiB (r/w)net
W200127 14:38:51.141830 72924 storage/store_rebalancer.go:223  [n1,s1,store-rebalancer] StorePool missing descriptor for local store
W200127 14:38:52.300483 72901 storage/store_raft.go:496  [n1,s1,r1/1:/{Min-System/NodeL…}] handle raft ready: 3.3s [applied=1, batches=1, state_assertions=0]
W200127 14:38:52.388393 72863 storage/store_raft.go:496  [n1,s1,r3/1:/System/{NodeLive…-tsd}] handle raft ready: 3.4s [applied=1, batches=1, state_assertions=0]
W200127 14:38:52.388560 72893 storage/store_raft.go:496  [n1,s1,r2/1:/System/NodeLiveness{-Max}] handle raft ready: 3.4s [applied=1, batches=1, state_assertions=0]

More

Parameters:

  • GOFLAGS=-json
make stressrace TESTS=TestListenerFileCreation PKG=./pkg/server TESTTIMEOUT=5m STRESSFLAGS='-timeout 5m' 2>&1

See this test on roachdash
powered by pkg/cmd/internal/issues

@andreimatei
Copy link
Contributor

We fixed #43957 but this is still happening. Another instance: #44352

@andreimatei
Copy link
Contributor

@spaskob are you a good victim here at all? I think the migrations are sometimes not finishing in time. Could be just TC overload...

@spaskob
Copy link
Contributor

spaskob commented Jan 30, 2020

Can you provide more context? I don't understand migrations...

@cockroach-teamcity
Copy link
Member Author

(server).TestListenerFileCreation failed on master@954ff78b4a7193ba795170cd13ba2be180faa4d1:

Fatal error:

F200131 23:31:54.064924 74241 sqlmigrations/migrations.go:565  [n1] not enough time left on migration lease, terminating for safety

Stack:

goroutine 74241 [running]:
github.com/cockroachdb/cockroach/pkg/util/log.getStacks(0x63ced01, 0xed5c6afea, 0x0, 0xc0043e7ec0)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/get_stacks.go:25 +0xb8
github.com/cockroachdb/cockroach/pkg/util/log.(*loggerT).outputLogEntry(0x63cbe40, 0xc000000004, 0x5d97c33, 0x1b, 0x235, 0xc001a66280, 0x44)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/clog.go:211 +0xa0c
github.com/cockroachdb/cockroach/pkg/util/log.addStructured(0x4222da0, 0xc0051d0cf0, 0x4, 0x2, 0x0, 0x0, 0xc0043e7e28, 0x1, 0x1)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/structured.go:66 +0x2c9
github.com/cockroachdb/cockroach/pkg/util/log.logDepth(0x4222da0, 0xc0051d0cf0, 0x1, 0xc000000004, 0x0, 0x0, 0xc0043e7e28, 0x1, 0x1)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/log.go:44 +0x8c
github.com/cockroachdb/cockroach/pkg/util/log.Fatal(...)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/log.go:164
github.com/cockroachdb/cockroach/pkg/sqlmigrations.(*Manager).EnsureMigrations.func2(0x4222da0, 0xc0051d0cf0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sqlmigrations/migrations.go:565 +0x284
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTask.func1(0xc007d1c000, 0x4222da0, 0xc0051d0cf0, 0xc0007ec240, 0x29, 0x0, 0x0, 0xc004c83c60)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:322 +0x152
created by github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTask
	/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:317 +0x131

Log preceding fatal error

W200131 23:31:32.801448 73203 storage/node_liveness.go:546  [n1,liveness-hb] slow heartbeat took 4.5s
W200131 23:31:32.801522 73203 storage/node_liveness.go:471  [n1,liveness-hb] failed node liveness heartbeat: operation "node liveness heartbeat" timed out after 4.5s
I200131 23:31:35.912750 73197 server/status/runtime.go:498  [n1] runtime stats: 272 MiB RSS, 178 goroutines, 42 MiB/135 MiB/90 MiB GO alloc/idle/total, 17 MiB/53 MiB CGO alloc/total, 12.8 CGO/sec, 0.7/0.2 %(u/s)time, 0.0 %gc (0x), 591 KiB/591 KiB (r/w)net
W200131 23:31:37.301823 73203 storage/node_liveness.go:546  [n1,liveness-hb] slow heartbeat took 4.5s
W200131 23:31:37.301895 73203 storage/node_liveness.go:471  [n1,liveness-hb] failed node liveness heartbeat: operation "node liveness heartbeat" timed out after 4.5s
W200131 23:31:38.266665 73281 storage/store_raft.go:496  [n1,s1,r2/1:/System/NodeLiveness{-Max}] handle raft ready: 14.0s [applied=1, batches=1, state_assertions=1]
W200131 23:31:41.374453 74330 server/node_engine_health.go:73  [n1] disk stall detected: unable to write to <no-attributes>=/tmp/TestListenerFileCreation660492334 within 10s 

** Compaction Stats [default] **
Level    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sum      0/0    0.00 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0      0.0      0.00              0.00         0    0.000       0      0
 Int      0/0    0.00 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0      0.0      0.00              0.00         0    0.000       0      0

** Compaction Stats [default] **
Priority    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Uptime(secs): 89.6 total, 25.4 interval
Flush(GB): cumulative 0.000, interval 0.000
AddFile(GB): cumulative 0.000, interval 0.000
AddFile(Total Files): cumulative 0, interval 0
AddFile(L0 Files): cumulative 0, interval 0
AddFile(Keys): cumulative 0, interval 0
Cumulative compaction: 0.00 GB write, 0.00 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.0 seconds
Interval compaction: 0.00 GB write, 0.00 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.0 seconds
Stalls(count): 0 level0_slowdown, 0 level0_slowdown_with_compaction, 0 level0_numfiles, 0 level0_numfiles_with_compaction, 0 stop for pending_compaction_bytes, 0 slowdown for pending_compaction_bytes, 0 memtable_compaction, 0 memtable_slowdown, interval 0 total count
estimated_pending_compaction_bytes: 0 B
W200131 23:31:41.802199 73203 storage/node_liveness.go:546  [n1,liveness-hb] slow heartbeat took 4.5s
W200131 23:31:41.802280 73203 storage/node_liveness.go:471  [n1,liveness-hb] failed node liveness heartbeat: operation "node liveness heartbeat" timed out after 4.5s
W200131 23:31:41.948223 73297 storage/store_raft.go:496  [n1,s1,r1/1:/{Min-System/NodeL…}] handle raft ready: 10.6s [applied=1, batches=1, state_assertions=1]
W200131 23:31:41.986741 73260 storage/store_raft.go:496  [n1,s1,r2/1:/System/NodeLiveness{-Max}] handle raft ready: 3.7s [applied=1, batches=1, state_assertions=1]
W200131 23:31:42.009086 73948 storage/node_liveness.go:546  [n1,s1,r4/1:/System{/tsd-tse}] slow heartbeat took 37.8s
W200131 23:31:42.009437 73932 storage/node_liveness.go:546  [n1,s1,r26/1:/NamespaceTable/{30-Max}] slow heartbeat took 37.8s
E200131 23:31:42.009463 73932 storage/replica_range_lease.go:339  [n1,s1,r26/1:/NamespaceTable/{30-Max}] heartbeat failed on epoch increment
W200131 23:31:42.009615 74039 storage/node_liveness.go:546  [n1,s1,r6/1:/Table/{SystemCon…-11}] slow heartbeat took 26.1s
E200131 23:31:42.009631 74039 storage/replica_range_lease.go:339  [n1,s1,r6/1:/Table/{SystemCon…-11}] heartbeat failed on epoch increment
W200131 23:31:42.009687 73714 storage/node_liveness.go:546  [n1,s1,r8/1:/Table/1{2-3}] slow heartbeat took 25.1s
E200131 23:31:42.009701 73714 storage/replica_range_lease.go:339  [n1,s1,r8/1:/Table/1{2-3}] heartbeat failed on epoch increment
W200131 23:31:42.009763 74207 storage/node_liveness.go:546  [n1,s1,r13/1:/Table/1{7-8}] slow heartbeat took 10.6s
E200131 23:31:42.009776 74207 storage/replica_range_lease.go:339  [n1,s1,r13/1:/Table/1{7-8}] heartbeat failed on epoch increment
W200131 23:31:42.010258 74220 storage/node_liveness.go:546  [n1,s1,r3/1:/System/{NodeLive…-tsd}] slow heartbeat took 10.6s
E200131 23:31:42.010285 74220 storage/replica_range_lease.go:339  [n1,s1,r3/1:/System/{NodeLive…-tsd}] heartbeat failed on epoch increment
I200131 23:31:42.010394 73203 storage/node_liveness.go:461  [n1,liveness-hb] heartbeat failed on epoch increment; retrying
I200131 23:31:42.152499 74244 sql/event_log.go:132  [n1,intExec=optInToDiagnosticsStatReporting] Event: "set_cluster_setting", target: 0, info: {SettingName:diagnostics.reporting.enabled Value:true User:root}
I200131 23:31:42.181607 74406 sql/event_log.go:132  [n1,intExec=set-setting] Event: "set_cluster_setting", target: 0, info: {SettingName:version Value:19.2-11 User:root}
I200131 23:31:42.540888 72321 gossip/gossip.go:1532  [n1] node has connected to cluster via gossip
W200131 23:31:44.394724 73299 storage/store_raft.go:496  [n1,s1,r3/1:/System/{NodeLive…-tsd}] handle raft ready: 2.2s [applied=1, batches=1, state_assertions=0]
W200131 23:31:44.429670 73268 storage/store_raft.go:496  [n1,s1,r6/1:/Table/{SystemCon…-11}] handle raft ready: 1.3s [applied=1, batches=1, state_assertions=0]
I200131 23:31:45.913136 73197 server/status/runtime.go:498  [n1] runtime stats: 272 MiB RSS, 168 goroutines, 52 MiB/128 MiB/90 MiB GO alloc/idle/total, 17 MiB/53 MiB CGO alloc/total, 138.9 CGO/sec, 1.3/0.5 %(u/s)time, 0.0 %gc (0x), 96 MiB/96 MiB (r/w)net
W200131 23:31:54.064851 74241 sqlmigrations/migrations.go:551  [n1] unable to extend ownership of expiration lease: can't extend lease that expired at time 1580513475.993053782,0

More

Parameters:

  • GOFLAGS=-json
make stressrace TESTS=TestListenerFileCreation PKG=./pkg/server TESTTIMEOUT=5m STRESSFLAGS='-timeout 5m' 2>&1

See this test on roachdash
powered by pkg/cmd/internal/issues

@spaskob
Copy link
Contributor

spaskob commented Feb 6, 2020

@andreimatei @solongordon should this be assigned to me?

@andreimatei
Copy link
Contributor

I've unassigned you. But you're closer to pass the hot potato I think than I am...

@ajwerner
Copy link
Contributor

This also failed with

W200131 23:30:31.823659 73494 server/node_engine_health.go:73  [n1] disk stall detected: unable to write to <no-attributes>=/tmp/TestListenerFileCreation660492334 within 10s 

What's going on with that? Did we change something about /tmp and now we're just destroying whatever device that is?

@cockroach-teamcity
Copy link
Member Author

(server).TestListenerFileCreation failed on master@c778c12240e68228a4af5c781d43f38ea139fbcb:

Fatal error:

F200228 23:56:38.724178 82872 sqlmigrations/migrations.go:583  [n1] not enough time left on migration lease, terminating for safety

Stack:

goroutine 82872 [running]:
github.com/cockroachdb/cockroach/pkg/util/log.getStacks(0x64ede01, 0xed5eb9fb6, 0x0, 0xc001651ec0)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/get_stacks.go:25 +0xb8
github.com/cockroachdb/cockroach/pkg/util/log.(*loggerT).outputLogEntry(0x64eaea0, 0xc000000004, 0x5ea87dc, 0x1b, 0x247, 0xc00170cf50, 0x44)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/clog.go:211 +0xa0c
github.com/cockroachdb/cockroach/pkg/util/log.addStructured(0x4311540, 0xc004182240, 0x4, 0x2, 0x0, 0x0, 0xc001651e28, 0x1, 0x1)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/structured.go:66 +0x2c9
github.com/cockroachdb/cockroach/pkg/util/log.logDepth(0x4311540, 0xc004182240, 0x1, 0xc000000004, 0x0, 0x0, 0xc001651e28, 0x1, 0x1)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/log.go:44 +0x8c
github.com/cockroachdb/cockroach/pkg/util/log.Fatal(...)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/log.go:164
github.com/cockroachdb/cockroach/pkg/sqlmigrations.(*Manager).EnsureMigrations.func2(0x4311540, 0xc004182240)
	/go/src/github.com/cockroachdb/cockroach/pkg/sqlmigrations/migrations.go:583 +0x284
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTask.func1(0xc003691560, 0x4311540, 0xc004182240, 0xc0016a8db0, 0x29, 0x0, 0x0, 0xc001912c60)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:322 +0x152
created by github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTask
	/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:317 +0x131

Log preceding fatal error

I200228 23:56:15.713606 82841 server/status/runtime.go:498  [n1] runtime stats: 245 MiB RSS, 182 goroutines, 29 MiB/147 MiB/74 MiB GO alloc/idle/total, 16 MiB/51 MiB CGO alloc/total, 10.4 CGO/sec, 0.6/0.1 %(u/s)time, 0.0 %gc (0x), 4.5 MiB/4.5 MiB (r/w)net
W200228 23:56:16.931515 82762 storage/store_raft.go:496  [n1,s1,r2/1:/System/NodeLiveness{-Max}] handle raft ready: 25.7s [applied=2, batches=2, state_assertions=1]
I200228 23:56:16.931652 82850 storage/node_liveness.go:804  [n1,liveness-hb] retrying liveness update after storage.errRetryLiveness: result is ambiguous (context deadline exceeded)
W200228 23:56:16.931673 82850 storage/node_liveness.go:546  [n1,liveness-hb] slow heartbeat took 36.1s
W200228 23:56:16.931700 82850 storage/node_liveness.go:471  [n1,liveness-hb] failed node liveness heartbeat: operation "node liveness heartbeat" timed out after 4.5s
I200228 23:56:19.932160 83929 internal/client/txn.go:698  [n1] async rollback failed: context deadline exceeded
W200228 23:56:21.432326 82850 storage/node_liveness.go:546  [n1,liveness-hb] slow heartbeat took 4.5s
W200228 23:56:21.432377 82850 storage/node_liveness.go:471  [n1,liveness-hb] failed node liveness heartbeat: operation "node liveness heartbeat" timed out after 4.5s
W200228 23:56:25.316686 82714 storage/closedts/provider/provider.go:152  [ct-closer] unable to move closed timestamp forward: not live
github.com/cockroachdb/cockroach/pkg/storage.init
	/go/src/github.com/cockroachdb/cockroach/pkg/storage/node_liveness.go:57
runtime.doInit
	/usr/local/go/src/runtime/proc.go:5222
runtime.doInit
	/usr/local/go/src/runtime/proc.go:5217
runtime.doInit
	/usr/local/go/src/runtime/proc.go:5217
runtime.main
	/usr/local/go/src/runtime/proc.go:190
runtime.goexit
	/usr/local/go/src/runtime/asm_amd64.s:1357
W200228 23:56:25.932650 82850 storage/node_liveness.go:546  [n1,liveness-hb] slow heartbeat took 4.5s
W200228 23:56:25.932709 82850 storage/node_liveness.go:471  [n1,liveness-hb] failed node liveness heartbeat: operation "node liveness heartbeat" timed out after 4.5s
I200228 23:56:26.666807 82841 server/status/runtime.go:498  [n1] runtime stats: 245 MiB RSS, 181 goroutines, 31 MiB/146 MiB/74 MiB GO alloc/idle/total, 16 MiB/51 MiB CGO alloc/total, 10.8 CGO/sec, 0.5/0.1 %(u/s)time, 0.0 %gc (0x), 2.3 MiB/2.3 MiB (r/w)net
W200228 23:56:29.967673 82751 storage/store_raft.go:496  [n1,s1,r1/1:/{Min-System/NodeL…}] handle raft ready: 30.0s [applied=1, batches=1, state_assertions=0]
W200228 23:56:30.432904 82850 storage/node_liveness.go:546  [n1,liveness-hb] slow heartbeat took 4.5s
W200228 23:56:30.432966 82850 storage/node_liveness.go:471  [n1,liveness-hb] failed node liveness heartbeat: operation "node liveness heartbeat" timed out after 4.5s
W200228 23:56:34.933212 82850 storage/node_liveness.go:546  [n1,liveness-hb] slow heartbeat took 4.5s
W200228 23:56:34.933278 82850 storage/node_liveness.go:471  [n1,liveness-hb] failed node liveness heartbeat: operation "node liveness heartbeat" timed out after 4.5s
W200228 23:56:35.523965 82762 storage/store_raft.go:496  [n1,s1,r2/1:/System/NodeLiveness{-Max}] handle raft ready: 18.6s [applied=1, batches=1, state_assertions=1]
W200228 23:56:35.570274 82751 storage/store_raft.go:496  [n1,s1,r1/1:/{Min-System/NodeL…}] handle raft ready: 5.6s [applied=1, batches=1, state_assertions=1]
I200228 23:56:35.719191 82841 server/status/runtime.go:498  [n1] runtime stats: 245 MiB RSS, 181 goroutines, 32 MiB/144 MiB/74 MiB GO alloc/idle/total, 16 MiB/51 MiB CGO alloc/total, 15.8 CGO/sec, 0.6/0.2 %(u/s)time, 0.0 %gc (0x), 3.3 MiB/3.3 MiB (r/w)net
W200228 23:56:35.948764 83537 storage/node_liveness.go:546  [n1,s1,r3/1:/System/{NodeLive…-tsd}] slow heartbeat took 55.0s
E200228 23:56:35.948811 83537 storage/replica_range_lease.go:339  [n1,s1,r3/1:/System/{NodeLive…-tsd}] heartbeat failed on epoch increment
W200228 23:56:35.950337 83361 storage/node_liveness.go:546  [n1,s1,r6/1:/Table/{SystemCon…-11}] slow heartbeat took 55.0s
E200228 23:56:35.950362 83361 storage/replica_range_lease.go:339  [n1,s1,r6/1:/Table/{SystemCon…-11}] heartbeat failed on epoch increment
W200228 23:56:35.950861 83538 storage/node_liveness.go:546  [n1,s1,r4/1:/System{/tsd-tse}] slow heartbeat took 55.0s
E200228 23:56:35.950883 83538 storage/replica_range_lease.go:339  [n1,s1,r4/1:/System{/tsd-tse}] heartbeat failed on epoch increment
W200228 23:56:35.950957 83610 storage/node_liveness.go:546  [n1,s1,r20/1:/Table/2{4-5}] slow heartbeat took 44.7s
E200228 23:56:35.950970 83610 storage/replica_range_lease.go:339  [n1,s1,r20/1:/Table/2{4-5}] heartbeat failed on epoch increment
W200228 23:56:35.952087 82850 storage/node_liveness.go:546  [n1,liveness-hb] slow heartbeat took 1.0s
I200228 23:56:35.952130 82850 storage/node_liveness.go:461  [n1,liveness-hb] heartbeat failed on epoch increment; retrying
I200228 23:56:36.421514 82728 gossip/gossip.go:1532  [n1] node has connected to cluster via gossip
W200228 23:56:37.799590 82752 storage/store_raft.go:496  [n1,s1,r4/1:/System{/tsd-tse}] handle raft ready: 1.7s [applied=1, batches=1, state_assertions=0]
W200228 23:56:38.394686 82743 storage/store_raft.go:496  [n1,s1,r3/1:/System/{NodeLive…-tsd}] handle raft ready: 2.3s [applied=1, batches=1, state_assertions=0]
W200228 23:56:38.395120 82758 storage/store_raft.go:496  [n1,s1,r6/1:/Table/{SystemCon…-11}] handle raft ready: 2.3s [applied=1, batches=1, state_assertions=0]
W200228 23:56:38.395375 82795 storage/store_raft.go:496  [n1,s1,r2/1:/System/NodeLiveness{-Max}] handle raft ready: 1.7s [applied=1, batches=1, state_assertions=0]
W200228 23:56:38.396788 82740 storage/store_raft.go:496  [n1,s1,r19/1:/Table/2{3-4}] handle raft ready: 2.3s [applied=1, batches=1, state_assertions=1]
W200228 23:56:38.398005 82850 storage/node_liveness.go:546  [n1,liveness-hb] slow heartbeat took 1.7s
W200228 23:56:38.450706 82752 storage/store_raft.go:496  [n1,s1,r4/1:/System{/tsd-tse}] handle raft ready: 0.7s [applied=1, batches=1, state_assertions=0]

More

Parameters:

  • GOFLAGS=-json
make stressrace TESTS=TestListenerFileCreation PKG=./pkg/server TESTTIMEOUT=5m STRESSFLAGS='-timeout 5m' 2>&1

See this test on roachdash
powered by pkg/cmd/internal/issues

@cockroach-teamcity
Copy link
Member Author

(server).TestListenerFileCreation failed on master@c4d3e4e8b83f33890a61199d9612cccbbae01a00:

Fatal error:

F200301 23:05:37.035596 82477 sqlmigrations/migrations.go:583  [n1] not enough time left on migration lease, terminating for safety

Stack:

goroutine 82477 [running]:
github.com/cockroachdb/cockroach/pkg/util/log.getStacks(0x64fbf01, 0xed5ee36c1, 0x0, 0xc001715ec0)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/get_stacks.go:25 +0xb8
github.com/cockroachdb/cockroach/pkg/util/log.(*loggerT).outputLogEntry(0x64f8f00, 0xc000000004, 0x5eb63b8, 0x1b, 0x247, 0xc0034c1bd0, 0x44)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/clog.go:211 +0xa0c
github.com/cockroachdb/cockroach/pkg/util/log.addStructured(0x4319780, 0xc0034b14a0, 0x4, 0x2, 0x0, 0x0, 0xc001715e28, 0x1, 0x1)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/structured.go:66 +0x2c9
github.com/cockroachdb/cockroach/pkg/util/log.logDepth(0x4319780, 0xc0034b14a0, 0x1, 0xc000000004, 0x0, 0x0, 0xc001715e28, 0x1, 0x1)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/log.go:44 +0x8c
github.com/cockroachdb/cockroach/pkg/util/log.Fatal(...)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/log.go:164
github.com/cockroachdb/cockroach/pkg/sqlmigrations.(*Manager).EnsureMigrations.func2(0x4319780, 0xc0034b14a0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sqlmigrations/migrations.go:583 +0x284
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTask.func1(0xc0026a0090, 0x4319780, 0xc0034b14a0, 0xc000679110, 0x29, 0x0, 0x0, 0xc0067d5dc0)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:323 +0x152
created by github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTask
	/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:318 +0x131

Log preceding fatal error

W200301 23:05:27.550796 82828 storage/node_liveness.go:546  [n1,s1,r4/1:/System{/tsd-tse}] slow heartbeat took 49.8s
E200301 23:05:27.550837 82828 storage/replica_range_lease.go:339  [n1,s1,r4/1:/System{/tsd-tse}] heartbeat failed on epoch increment
W200301 23:05:27.550909 82263 storage/node_liveness.go:546  [n1,s1,r3/1:/System/{NodeLive…-tsd}] slow heartbeat took 48.7s
E200301 23:05:27.550928 82263 storage/replica_range_lease.go:339  [n1,s1,r3/1:/System/{NodeLive…-tsd}] heartbeat failed on epoch increment
W200301 23:05:27.551021 82880 storage/node_liveness.go:546  [n1,s1,r6/1:/Table/{SystemCon…-11}] slow heartbeat took 48.7s
E200301 23:05:27.551058 82880 storage/replica_range_lease.go:339  [n1,s1,r6/1:/Table/{SystemCon…-11}] heartbeat failed on epoch increment
W200301 23:05:27.551240 82849 storage/node_liveness.go:546  [n1,s1,r20/1:/Table/2{4-5}] slow heartbeat took 43.1s
E200301 23:05:27.551256 82849 storage/replica_range_lease.go:339  [n1,s1,r20/1:/Table/2{4-5}] heartbeat failed on epoch increment
W200301 23:05:27.551665 82850 storage/node_liveness.go:546  [n1,s1,r5/1:/{Systemtse-Table/System…}] slow heartbeat took 43.1s
E200301 23:05:27.551690 82850 storage/replica_range_lease.go:339  [n1,s1,r5/1:/{Systemtse-Table/System…}] heartbeat failed on epoch increment
I200301 23:05:28.737951 82291 gossip/gossip.go:567  [n1] gossip status (stalled, 1 node)
gossip client (0/3 cur/max conns)
gossip server (0/3 cur/max conns, infos 0/0 sent/received, bytes 0B/0B sent/received)
gossip connectivity
I200301 23:05:28.745161 82439 server/status/runtime.go:498  [n1] runtime stats: 262 MiB RSS, 184 goroutines, 49 MiB/128 MiB/82 MiB GO alloc/idle/total, 16 MiB/53 MiB CGO alloc/total, 16.2 CGO/sec, 0.8/0.1 %(u/s)time, 0.0 %gc (0x), 213 KiB/213 KiB (r/w)net
W200301 23:05:29.451553 82448 storage/node_liveness.go:546  [n1,liveness-hb] slow heartbeat took 4.5s
W200301 23:05:29.451596 82448 storage/node_liveness.go:471  [n1,liveness-hb] failed node liveness heartbeat: operation "node liveness heartbeat" timed out after 4.5s:
  - aborted during DistSender.Send: context deadline exceeded
W200301 23:05:31.489038 83409 server/node_engine_health.go:73  [n1] disk stall detected: unable to write to <no-attributes>=/tmp/TestListenerFileCreation423577670 within 10s 

** Compaction Stats [default] **
Level    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sum      0/0    0.00 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0      0.0      0.00              0.00         0    0.000       0      0
 Int      0/0    0.00 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0      0.0      0.00              0.00         0    0.000       0      0

** Compaction Stats [default] **
Priority    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Uptime(secs): 67.9 total, 23.5 interval
Flush(GB): cumulative 0.000, interval 0.000
AddFile(GB): cumulative 0.000, interval 0.000
AddFile(Total Files): cumulative 0, interval 0
AddFile(L0 Files): cumulative 0, interval 0
AddFile(Keys): cumulative 0, interval 0
Cumulative compaction: 0.00 GB write, 0.00 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.0 seconds
Interval compaction: 0.00 GB write, 0.00 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.0 seconds
Stalls(count): 0 level0_slowdown, 0 level0_slowdown_with_compaction, 0 level0_numfiles, 0 level0_numfiles_with_compaction, 0 stop for pending_compaction_bytes, 0 slowdown for pending_compaction_bytes, 0 memtable_compaction, 0 memtable_slowdown, interval 0 total count
estimated_pending_compaction_bytes: 0 B
W200301 23:05:31.892995 82419 storage/store_rebalancer.go:223  [n1,s1,store-rebalancer] StorePool missing descriptor for local store
I200301 23:05:32.452263 83408 internal/client/txn.go:698  [n1] async rollback failed: context deadline exceeded
W200301 23:05:33.951857 82448 storage/node_liveness.go:546  [n1,liveness-hb] slow heartbeat took 4.5s
W200301 23:05:33.951917 82448 storage/node_liveness.go:471  [n1,liveness-hb] failed node liveness heartbeat: operation "node liveness heartbeat" timed out after 4.5s
W200301 23:05:36.957170 82350 storage/store_raft.go:496  [n1,s1,r1/1:/{Min-System/NodeL…}] handle raft ready: 16.3s [applied=1, batches=1, state_assertions=1]
W200301 23:05:36.975959 82386 storage/store_raft.go:496  [n1,s1,r2/1:/System/NodeLiveness{-Max}] handle raft ready: 9.4s [applied=1, batches=1, state_assertions=1]
W200301 23:05:36.976669 82377 storage/store_raft.go:496  [n1,s1,r17/1:/Table/2{1-2}] handle raft ready: 9.4s [applied=1, batches=1, state_assertions=1]
W200301 23:05:37.000157 83416 storage/node_liveness.go:546  [n1,s1,r6/1:/Table/{SystemCon…-11}] slow heartbeat took 9.4s
W200301 23:05:37.001808 83392 storage/node_liveness.go:546  [n1,s1,r3/1:/System/{NodeLive…-tsd}] slow heartbeat took 9.5s
W200301 23:05:37.002163 83314 storage/node_liveness.go:546  [n1,s1,r4/1:/System{/tsd-tse}] slow heartbeat took 9.5s
W200301 23:05:37.002424 82448 storage/node_liveness.go:546  [n1,liveness-hb] slow heartbeat took 3.1s

More

Parameters:

  • GOFLAGS=-json
make stressrace TESTS=TestListenerFileCreation PKG=./pkg/server TESTTIMEOUT=5m STRESSFLAGS='-timeout 5m' 2>&1

See this test on roachdash
powered by pkg/cmd/internal/issues

@knz
Copy link
Contributor

knz commented Jun 4, 2020

Closing due to inactivity.
We have other issues tracking the sql migration lease already.

@knz knz closed this as completed Jun 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot.
Projects
None yet
Development

No branches or pull requests

10 participants