runtime: ThreadSanitizer CHECK failed #27660
And here's a different tsan failure I just observed with the same command:
And another slightly different presentation:
CC @dvyukov
These instructions fail for me:
This is on commit:
Yes, I’m afraid you’ll need to run the `make test` command I posted at
least once to generate some Go code and build some C/C++ dependencies.
After that, you can run just the `go test` command.
I apologize that the surface area of this reproduction is so large, but I
really don’t know how to whittle it down. We run the race detector on every
other CockroachDB package with no problems. Let me know if there’s anything
I can do to assist in tracking this down!
On Fri, Sep 14, 2018 at 6:00 AM Dmitry Vyukov wrote:
These instructions fail for me:
```
cockroach$ go test -race -tags ' make x86_64_linux_gnu' -ldflags '-X github.com/cockroachdb/cockroach/pkg/build.typ=development -extldflags "" -X "github.com/cockroachdb/cockroach/pkg/build.tag=v2.2.0-alpha.00000000-793-g33c7d27d82-dirty" -X "github.com/cockroachdb/cockroach/pkg/build.rev=33c7d27d8216b543ac77f0fe39d440ebebfa9e70" -X "github.com/cockroachdb/cockroach/pkg/build.cgoTargetTriple=x86_64-linux-gnu" ' -run "TestLogic" -timeout 25m ./pkg/sql/logictest
# github.com/cockroachdb/cockroach/pkg/sql/lex
pkg/sql/lex/predicates.go:60:14: undefined: reservedKeywords
FAIL github.com/cockroachdb/cockroach/pkg/sql/logictest [build failed]
```
This is on commit:
```
cockroach$ git show HEAD
commit 418b9169b547069cd43252055196204af8f18040 (HEAD -> master, origin/staging, origin/master, origin/HEAD)
Merge: a9c7295dd6 5504e3f9ae
Author: craig[bot]
Date:   Fri Sep 14 01:24:32 2018 +0000

    Merge #29163
```
I did that. The error I posted happens during go test.
Are you sure the […]?
Right, it failed somewhere in the beginning:
Oh, yeah, you’ll need autoconf and CMake installed. It looks like you have
CMake but not autoconf.
On Fri, Sep 14, 2018 at 9:43 AM Dmitry Vyukov wrote:
Right, it failed somewhere in the beginning:
```
cockroach$ IGNORE_GOVERS=1 make testrace PKG=./pkg/sql/logictest TESTS=TestLogic IGNORE_GOVERS=1
Running make with -j4
GOPATH set to /go
Detected change in build system. Rebooting Make.
Running make with -j4
GOPATH set to /go
cd /go/src/github.com/cockroachdb/cockroach/c-deps/jemalloc && autoconf
rm -rf /go/native/x86_64-linux-gnu/protobuf
bash: autoconf: command not found
Makefile:511: recipe for target '/go/src/github.com/cockroachdb/cockroach/c-deps/jemalloc/configure' failed
make: *** [/go/src/github.com/cockroachdb/cockroach/c-deps/jemalloc/configure] Error 127
make: *** Waiting for unfinished jobs....
go install -v uptodate
bin/prereqs ./pkg/cmd/uptodate > bin/uptodate.d.tmp
mkdir -p /go/native/x86_64-linux-gnu/protobuf
cd /go/native/x86_64-linux-gnu/protobuf && cmake -DCMAKE_TARGET_MESSAGES=OFF -Dprotobuf_BUILD_TESTS=OFF /go/src/github.com/cockroachdb/cockroach/c-deps/protobuf/cmake \
-DCMAKE_BUILD_TYPE=Release
mv -f bin/uptodate.d.tmp bin/uptodate.dgit.luolix.top/cockroachdb/cockroach/vendor/github.com/MichaelTJones/walk
-- The C compiler identification is GNU 7.3.0git.luolix.top/cockroachdb/cockroach/vendor/github.com/spf13/pflaggit.luolix.top/cockroachdb/cockroach/pkg/cmd/uptodate
-- The CXX compiler identification is GNU 7.3.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Found ZLIB: /usr/lib/x86_64-linux-gnu/libz.so (found version "1.2.8")
-- Configuring done
-- Generating done
-- Build files have been written to: /go/native/x86_64-linux-gnu/protobuf
```
How long does it generally take to crash? I've run it twice and both times got:
It's been reproducing for me in less than a minute. Oddly enough, it's failing to reproduce for me on the latest master. Could you try commit 33c7d27d8216b543ac77f0fe39d440ebebfa9e70? And just to double check, you are running go1.11, right? I'm also having trouble reproducing on go1.10.3. In the meantime I'm going to bisect to see if there's something obviously wrong that we fixed.
Bisecting proved challenging thanks to some actual races that were introduced and fixed within the range of commits I was bisecting. Regardless, I can semi-reliably repro on 33c7d27d8216b543ac77f0fe39d440ebebfa9e70 on a fresh Ubuntu machine with Go 1.11. The problem seems to occur early or not at all. In five runs, this is what I observed:
I could have sworn this reproduced more reliably when I originally filed the bug. Still, I'm hopeful that you might be able to reproduce if you run a few more trials with a short timeout. The magic incantation to thread a timeout through the Makefile, if that's easier, is: `$ make testrace PKG=./pkg/sql/logictest TESTS=TestLogic TESTTIMEOUT=2m`
I have had no luck reproducing this so far:
p.s. stress is: […]
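Assuming the stress tool mentioned above is golang.org/x/tools/cmd/stress, the workflow amounts to re-running the compiled test binary until it fails. Below is a minimal Go sketch of that loop; the test binary name and flags are illustrative placeholders, not the actual reproduction command from this issue.

```go
// Minimal repeat-until-failure runner in the spirit of
// golang.org/x/tools/cmd/stress. The binary name and flags below are
// placeholders for illustration only.
package main

import (
	"fmt"
	"os"
	"os/exec"
)

func main() {
	for i := 1; ; i++ {
		cmd := exec.Command("./logictest.test", "-test.run", "TestLogic", "-test.timeout", "2m")
		out, err := cmd.CombinedOutput()
		if err != nil {
			fmt.Printf("run %d failed: %v\n%s\n", i, err, out)
			os.Exit(1)
		}
		fmt.Printf("run %d passed\n", i)
	}
}
```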
My bets would be on:
Oh, cool, I didn't know about the stress tool. Anyway, thanks again for looking into this. We upgraded our CI environment to go1.11 today and various other tests started producing assertions like this under the race detector. We also started seeing outright segfaults:
I bisected this down to a commit in Go itself: 8994444, titled "runtime/race: update most syso files to compiler-rt". That sure looks like our culprit, though I'm willing to believe we're still doing something wrong (e.g., corrupting memory, like you suggest) and the update to the race detector just happened to tickle this preexisting condition. I'm going to try to get somewhere by bisecting compiler-rt itself. I may need your help in producing those syso files.
Whew, that was an exercise in patience. Bisecting was complicated by the fact that a number of revisions of compiler-rt between 68e15324 and fe2c72c5 did not compile with my GCC toolchain. Regardless, I managed to pinpoint the problem to this commit: llvm-mirror/compiler-rt@12d1690. It's still unclear what the actual problem is; I'm not in a position to understand that tsan code, I'm afraid. By the way, I'm now using this to repro:
The […]. I am once again feeling that this is more likely to be a bug in tsan than in CockroachDB. @dvyukov, let me know if you disagree. My offer of providing you a VM that can reproduce this issue out of the box still stands.
Maybe. Hard to say. What's strange is that nobody reported any corruptions before and that I can't reproduce this on my machine. Can anybody else reproduce this?
I'm going to check whether any of my fellow Linux-using engineers can reproduce this issue on their laptops. Everyone else uses the same VM configuration (Ubuntu 18.04 on Google Compute Engine), so it's not all that interesting that they can reproduce. Here's the output of […]:
Are you willing to share the output of […]?
Ok, @mberhault was able to reproduce on his Linux desktop running Ubuntu 16.04 (kernel 4.4.0-135-generic) using the […].
Running […]:
@benesch, do you know if the native part of cockroachdb was tested with asan/tsan/msan? If not, that would be the first thing I would do.
Ok, great, you can reproduce! Do you know how to read that stack trace you posted?
I don't understand how we go from stopper.go:192 (which launches a goroutine) to […].
It's not tested with any of the sanitizers, unfortunately. Most of our serious testing is done in Go, so we exercise the C/C++ code via cgo. That makes it tricky to use any of the sanitizers. I've tried to get msan to work, but it triggers a number of false positives due to structs with padding: #26167. I suppose it should be possible to use asan and tsan on our native code by plumbing in […]. That said, there are only four C/C++ libraries that we link: snappy, protobuf, RocksDB, and jemalloc. I've verified that this reproduces without jemalloc, so I think we can rule that one out. Snappy and protobuf are both used extensively at Google, so I'd be surprised if they were to blame. RocksDB is extremely complicated, but they test upstream with asan/tsan/ubsan. So that leaves our first-party C/C++ code that glues everything together. It's certainly possible that we've had a lurking memory bug there, but a) the code's not that complicated, and b) I feel like we'd have seen it crop up somewhere before. This is all speculation, of course.
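A rough sketch of what that plumbing could look like, assuming the standard #cgo directives; the -fsanitize flags here are illustrative, and whether sanitizer-instrumented C code then coexists cleanly with Go's own race runtime is exactly the unresolved question above.

```go
// Sketch only: passing sanitizer flags to cgo-built C code via #cgo
// directives. Treat this as a starting point, not a recipe; mixing an
// instrumented C runtime with Go's race runtime has its own pitfalls.
package main

/*
#cgo CFLAGS: -fsanitize=address -fno-omit-frame-pointer
#cgo LDFLAGS: -fsanitize=address

#include <stdlib.h>
#include <string.h>

static char *greeting(void) {
    char *s = malloc(6);
    strcpy(s, "hello");
    return s;
}
*/
import "C"

import (
	"fmt"
	"unsafe"
)

func main() {
	s := C.greeting()
	fmt.Println(C.GoString(s))
	C.free(unsafe.Pointer(s))
}
```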
Hi @dvyukov, any update on this issue? We hit a similar problem in our project after we updated Go to version 1.11, and I just confirmed that this issue exists in 1.11.2 too.
Indeed, this seems to be a straightforward double free. I'm going to try to write up a targeted test case and figure out how to submit a patch to LLVM.
You debugged the root cause? That was brave! You already know how to build it, right? There is probably lots of irrelevant stuff. The main part is to send a Phabricator review.
And a test would be good.
I think I spoke too soon. I thought the bug was a simple reference counting error with shared SyncClocks, but the problem appears to be deeper than that. In the course of my investigation I added a bit of logging to the DenseSlabAllocator. Here's a snapshot of the period before an assertion failure:
Notice how slot 4441 is allocated twice without an intervening free! This is a guaranteed disaster. The problem seems to be that the proc struct is being shared across multiple threads. So is it possible that the Go runtime is sometimes failing to update the proc struct when it moves a goroutine between pthreads? That's my best guess as to what's happening. It explains why this badness has only been observed in Go programs, and not any C++ programs. Disclaimer: I'm not 100% confident in my instrumentation, so I might be a little bit off here.
I gathered some more evidence for the theory that the Go runtime is sharing processor data across multiple threads. I started by annotating several of the calls into tsan. Here's a snippet:
```
diff --git a/lib/tsan/rtl/tsan_rtl.h b/lib/tsan/rtl/tsan_rtl.h
index 5e2a745c9..130b33609 100644
--- a/lib/tsan/rtl/tsan_rtl.h
+++ b/lib/tsan/rtl/tsan_rtl.h
@@ -356,6 +356,7 @@ struct Processor {
DenseSlabAllocCache sync_cache;
DenseSlabAllocCache clock_cache;
DDPhysicalThread *dd_pt;
+ bool inuse;
};
#if !SANITIZER_GO
diff --git a/lib/tsan/go/tsan_go.cc b/lib/tsan/go/tsan_go.cc
index 71a660683..ba0dd4a08 100644
--- a/lib/tsan/go/tsan_go.cc
+++ b/lib/tsan/go/tsan_go.cc
@@ -231,48 +231,84 @@ void __tsan_proc_destroy(Processor *proc) {
}
void __tsan_acquire(ThreadState *thr, void *addr) {
+ CHECK(!thr->proc()->inuse);
+ thr->proc()->inuse = true;
Acquire(thr, 0, (uptr)addr);
+ CHECK(thr->proc()->inuse);
+ thr->proc()->inuse = false;
}
```
The full diff is here: https://gist.github.com/benesch/cb3258a3eb4b573b5a3b7891db000e66. The reproductions of the original bug now fail with […] instead of the various assertion errors they would previously fail with. That strongly implies that the root cause of all the assertion failures is the incorrect sharing of Processors across threads.
I've started poking around the Go scheduler but I'm way out of my depth here. I'll note that this diff
```
diff --git a/src/runtime/proc.go b/src/runtime/proc.go
index f82014eb92..5993e2a6da 100644
--- a/src/runtime/proc.go
+++ b/src/runtime/proc.go
@@ -1081,7 +1081,7 @@ func stopTheWorldWithSema() {
// try to retake all P's in Psyscall status
for _, p := range allp {
s := p.status
- if s == _Psyscall && atomic.Cas(&p.status, s, _Pgcstop) {
+ if s == _Psyscall && false && atomic.Cas(&p.status, s, _Pgcstop) {
if trace.enabled {
traceGoSysBlock(p)
traceProcStop(p)
@@ -1455,7 +1455,7 @@ func forEachP(fn func(*p)) {
// off to induce safe point function execution.
for _, p := range allp {
s := p.status
- if s == _Psyscall && p.runSafePointFn == 1 && atomic.Cas(&p.status, s, _Pidle) {
+ if s == _Psyscall && false && p.runSafePointFn == 1 && atomic.Cas(&p.status, s, _Pidle) {
if trace.enabled {
traceGoSysBlock(p)
traceProcStop(p)
@@ -4457,7 +4457,7 @@ func retake(now int64) uint32 {
}
pd := &_p_.sysmontick
s := _p_.status
- if s == _Psyscall {
+ if s == _Psyscall && false {
// Retake P from syscall if it's there for more than 1 sysmon tick (at least 20us).
t := int64(_p_.syscalltick)
if int64(pd.syscalltick) != t {
```
vastly decreases the probability that xenomint/sqlite will hit the tsan assertion. (I initially thought it had fixed the problem, but alas.) With the tsan patch but not the Go scheduler patch, running xenomint/sqlite's tests under race reliably hits the tsan assertion on every run. With both patches, I managed to get 187 runs before it failed. I'm not sure what to make of that.
If 2 threads use the same Processor, it would explain all kinds of weird memory corruptions in the tsan runtime. thr->proc() obtains the current Processor from the Go runtime as g.m.p.racectx. The firing CHECK means that either 2 goroutines run with the same g.m, or 2 m's run with the same m.p, both of which look badly wrong. I am thinking of some way to check this condition reliably in the Go runtime and catch it as early as possible, but so far I have not come up with anything.
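The inuse flag from the earlier patch is, in effect, an exclusive-ownership assertion. As a standalone illustration of that technique (unrelated to the runtime's real data structures, and using an atomic compare-and-swap where the C++ patch used a plain flag), a Go sketch might look like this:

```go
// Sketch: each worker owns its own resource, and acquire/release panic
// if a resource is ever touched by two owners at once. This mirrors the
// idea of the inuse check added to tsan's Processor above.
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

type exclusive struct {
	owner int64 // 0 = unowned, otherwise the id of the current owner
}

func (e *exclusive) acquire(id int64) {
	if !atomic.CompareAndSwapInt64(&e.owner, 0, id) {
		panic(fmt.Sprintf("owner %d found resource already in use", id))
	}
}

func (e *exclusive) release(id int64) {
	if !atomic.CompareAndSwapInt64(&e.owner, id, 0) {
		panic(fmt.Sprintf("owner %d released a resource it did not hold", id))
	}
}

func main() {
	const workers = 4
	resources := make([]exclusive, workers) // one per worker, like one Processor per thread
	var wg sync.WaitGroup
	for id := int64(1); id <= workers; id++ {
		wg.Add(1)
		go func(id int64, res *exclusive) {
			defer wg.Done()
			for i := 0; i < 1000; i++ {
				res.acquire(id)
				// simulated critical section: only the owner runs here
				res.release(id)
			}
		}(id, &resources[id-1])
	}
	wg.Wait()
	fmt.Println("no sharing detected")
}
```

If two goroutines were ever handed the same resource, the CAS would fail and panic immediately, which is the condition the tsan patch is designed to surface.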
This does not happen with Go1.10, right? Then one option is to bisect Go from Go1.10 to Go1.11 to find the root cause.
Actually, I think this does still happen with Go 1.10, just far, far less frequently. I checked my recollection with @petermattis today, and we remembered seeing spurious failures from race-enabled CockroachDB builds at least as far back as early 2017. The difference was that those failures would only show up once every few weeks, back when we used to run a several-node CockroachDB cluster with race binaries. As far as I can remember, the symptoms were the same: various assertion failures inside tsan that were rather inexplicable. I'll double-check now.
Yep, xenomint/sqlite blows up on go1.10 if you add the additional assertions to the race detector. Looks like this has been a latent bug in the Go runtime.
Perhaps this isn't so bad. On a hunch, I added a bit more information to the new assertions:
```
diff --git a/lib/tsan/go/tsan_go.cc b/lib/tsan/go/tsan_go.cc
index 5f2507b7d..d3345c6b7 100644
--- a/lib/tsan/go/tsan_go.cc
+++ b/lib/tsan/go/tsan_go.cc
void __tsan_acquire(ThreadState *thr, void *addr) {
+ auto inuse_goid = thr->proc()->inuse_goid;
+ if (inuse_goid != 0) {
+ __sanitizer::Printf("tsan_acquire(%p): proc in use by %d (this goid: %d)\n", addr, inuse_goid, thr->goid);
+ __sanitizer::Die();
+ }
+ thr->proc()->inuse_goid = thr->goid;
Acquire(thr, 0, (uptr)addr);
+ CHECK_EQ(thr->proc()->inuse_goid, thr->goid);
+ thr->proc()->inuse_goid = 0;
}
```
Then I noticed that all of the failures seemed to be triggered by a call to __tsan_acquire:
Even more suspiciously, every failure was an acquisition of exactly the same memory address: […]
Looks like cgocall is just calling […]. I'm hopeful that I've actually found the root cause this time.
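For context, the code path in question is exercised by workloads with many concurrent, slow-ish cgo calls, during which sysmon can retake a P while the call is still in flight (see the retake hunk in the scheduler diff above). Here is a hedged sketch of such a workload, purely illustrative and not the reproduction used in this thread; it is not guaranteed to hit the assertion.

```go
// Sketch: lots of concurrent cgo calls that outlast a sysmon tick, so Ps
// are routinely retaken while calls are in progress. Build with -race to
// exercise the cgocall -> entersyscall -> C -> exitsyscall -> race-acquire
// path discussed above.
package main

/*
#include <unistd.h>
static void slowcall(void) { usleep(100); }
*/
import "C"

import "sync"

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 32; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 5000; j++ {
				C.slowcall()
			}
		}()
	}
	wg.Wait()
}
```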
Just a minute ago I started looking at:
:) |
It needs to do raceacquire after exitsyscall. I think we are being bitten by the dangling g.m.p pointer left in reentersyscall.
cgocallbackg is also badly broken.
@benesch, do you want to fix it yourself? Since you've already done so much work on this, I don't want to steal it from you. I think we also need to change reentersyscall to leave the dangling p pointer in some shadow variable instead of in g.m.p; having g.m.p set gives the impression that the m and p are wired when they are actually not. Then only exitsyscallfast and retake would use the shadow p variable. That would have caught this bug.
Whoops, sorry, missed your messages! About to send a CL. I like your idea of a shadow variable but I didn't want to embark on that refactor myself, so I took a less intrusive approach. Now that I have your blessing for the refactor I'm happy to do that instead since it will prevent future mistakes of this kind—I agree that it is extremely confusing that an M can point to a P that it doesn't currently own. But I'll send the CL for what I have in the meantime to get the discussion started.
Change https://golang.org/cl/148717 mentions this issue.
Ok, updated the CL to take this approach.
Nice sleuthing! Gentle plea for this to be back-ported to go1.11, as this bug is preventing CockroachDB from upgrading to go1.11 (yes, it only affects race builds, but we run […]).
@gopherbot Please open an issue to backport to 1.11
Backport issue(s) opened: #28690 (for 1.11). Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://golang.org/wiki/MinorReleases.
Change https://golang.org/cl/148899 mentions this issue.
Change https://golang.org/cl/148823 mentions this issue.
When a goroutine enters a syscall, its M unwires from its P to allow the P to be retaken by another M if the syscall is slow. The M retains a reference to its old P, however, so that if its old P has not been retaken when the syscall returns, it can quickly reacquire that P. The implementation, however, was confusing, as it left the reference to the potentially-retaken P in m.p, which implied that the P was still wired.

Make the code clearer by enforcing the invariant that m.p is never stale. entersyscall now moves m.p to m.oldp and sets m.p to 0; exitsyscall does the reverse, provided m.oldp has not been retaken.

With this scheme in place, the issue described in #27660 (assertion failures in the race detector) would have resulted in a clean segfault instead of silently corrupting memory.

Change-Id: Ib3e03623ebed4f410e852a716919fe4538858f0a
Reviewed-on: https://go-review.googlesource.com/c/148899
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
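The invariant described in that commit message can be pictured with a toy model; the types and methods below are illustrative stand-ins, not the actual runtime code, and the slow path is omitted.

```go
// Toy model of the entersyscall/exitsyscall handoff described above.
// These are plain illustrative structs, not the runtime's real m and p.
package main

import "fmt"

type p struct{ id int }

type m struct {
	p    *p // the P this M currently owns; nil while in a syscall
	oldp *p // the P parked on syscall entry, which another M may retake
}

func (mp *m) entersyscall() {
	mp.oldp = mp.p
	mp.p = nil // invariant: m.p never points at a P this M does not own
}

// exitsyscall reclaims the parked P on the fast path; retaken reports
// whether sysmon (or a GC stop) already handed that P to someone else.
func (mp *m) exitsyscall(retaken bool) {
	if !retaken {
		mp.p = mp.oldp // fast path: re-wire the same P
	}
	mp.oldp = nil
	// slow path (acquire some other idle P, or park the M) omitted here
}

func main() {
	mp := &m{p: &p{id: 1}}
	mp.entersyscall()
	fmt.Println("in syscall, wired P:", mp.p) // <nil>: not wired to any P
	mp.exitsyscall(false)
	fmt.Println("after syscall, wired P id:", mp.p.id) // back on P 1
}
```

The point of the change is that any code inspecting m.p can trust it: a P parked for the duration of a syscall lives only in m.oldp, so it can no longer be mistaken for a P the M still owns.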
related issues: golang/go#23899, golang/go#28458, golang/go#27660 Update #3
Please answer these questions before submitting your issue. Thanks!
What version of Go are you using (`go version`)?

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (`go env`)?

What did you do?
This builds some C and C++ dependencies and ultimately runs:
What did you expect to see?
Either a successful test run or an actual race.
What did you see instead?
I tried to dig into this, but gotsan.cc is apparently generated deep inside the build process for the race detector and it wasn't clear to me how to generate a gotsan.cc that matched my Go version.
Totally possible that this is our (CockroachDB's) bug, but there's not a lot here for me to go on, so I figured I'd file here and see if someone with more expertise in the race detector had any insight.