unique: fatal error: found pointer to free object #69210

Closed
anacrolix opened this issue Sep 2, 2024 · 13 comments
Labels
FixPending Issues that have a fix which has not yet been reviewed or submitted. NeedsFix The path to resolution is known, but the work has not been done.

@anacrolix
Contributor

anacrolix commented Sep 2, 2024

Go version

go version go1.23.0 darwin/arm64

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='arm64'
GOBIN=''
GOCACHE='/Users/anacrolix/Library/Caches/go-build'
GOENV='/Users/anacrolix/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT='aliastypeparams'
GOFLAGS=''
GOHOSTARCH='arm64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMODCACHE='/Users/anacrolix/go/pkg/mod'
GOOS='darwin'
GOPATH='/Users/anacrolix/go'
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/Users/anacrolix/src/go1.23'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='local'
GOTOOLDIR='/Users/anacrolix/src/go1.23/pkg/tool/darwin_arm64'
GOVCS=''
GOVERSION='go1.23.0'
GODEBUG=''
GOTELEMETRY='on'
GOTELEMETRYDIR='/Users/anacrolix/Library/Application Support/go/telemetry'
GCCGO='gccgo'
GOARM64='v8.0'
AR='ar'
CC='clang'
CXX='clang++'
CGO_ENABLED='1'
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/m4/f67w9zfx1pl386_2yjs7xtkm0000gn/T/go-build2057168856=/tmp/go-build -gno-record-gcc-switches -fno-common'

What did you do?

Using unique.Handle very heavily.

What did you see happen?

runtime: marked free object in span 0x11e576378, elemsize=24 freeindex=0 (bad use of unsafe.Pointer? try -d=checkptr)
0xc00d5e2000 free  unmarked
0xc00d5e2018 free  unmarked
0xc00d5e2030 alloc marked
0xc00d5e2048 alloc marked
0xc00d5e2060 free  unmarked
0xc00d5e2078 free  unmarked
0xc00d5e2090 free  unmarked
0xc00d5e20a8 free  unmarked
0xc00d5e20c0 free  unmarked
0xc00d5e20d8 free  unmarked
0xc00d5e20f0 free  unmarked
0xc00d5e2108 free  unmarked
0xc00d5e2120 free  unmarked
0xc00d5e2138 alloc marked
0xc00d5e2150 alloc marked
0xc00d5e2168 alloc marked
0xc00d5e2180 free  unmarked
0xc00d5e2198 alloc marked
0xc00d5e21b0 alloc marked
0xc00d5e21c8 free  unmarked
0xc00d5e21e0 free  unmarked
0xc00d5e21f8 free  unmarked
0xc00d5e2210 free  unmarked
0xc00d5e2228 free  unmarked
0xc00d5e2240 free  unmarked
0xc00d5e2258 free  unmarked
0xc00d5e2270 alloc marked
0xc00d5e2288 free  unmarked
0xc00d5e22a0 alloc marked
0xc00d5e22b8 free  unmarked
0xc00d5e22d0 free  unmarked
0xc00d5e22e8 free  unmarked
0xc00d5e2300 free  unmarked
0xc00d5e2318 free  unmarked
0xc00d5e2330 free  unmarked
0xc00d5e2348 free  unmarked
0xc00d5e2360 alloc marked
0xc00d5e2378 free  unmarked
0xc00d5e2390 free  unmarked
0xc00d5e23a8 free  unmarked
0xc00d5e23c0 free  unmarked
0xc00d5e23d8 free  unmarked
0xc00d5e23f0 free  unmarked
0xc00d5e2408 free  unmarked
0xc00d5e2420 free  unmarked
0xc00d5e2438 free  unmarked
0xc00d5e2450 free  unmarked
0xc00d5e2468 free  unmarked
0xc00d5e2480 free  unmarked
0xc00d5e2498 free  unmarked
0xc00d5e24b0 free  unmarked
0xc00d5e24c8 free  unmarked
0xc00d5e24e0 free  unmarked
0xc00d5e24f8 free  unmarked
0xc00d5e2510 free  unmarked
0xc00d5e2528 free  unmarked
0xc00d5e2540 free  unmarked
0xc00d5e2558 free  unmarked
0xc00d5e2570 free  unmarked
0xc00d5e2588 free  unmarked
0xc00d5e25a0 free  unmarked
0xc00d5e25b8 free  unmarked
0xc00d5e25d0 free  unmarked
0xc00d5e25e8 free  unmarked
0xc00d5e2600 free  unmarked
0xc00d5e2618 free  unmarked
0xc00d5e2630 free  unmarked
0xc00d5e2648 free  unmarked
0xc00d5e2660 free  unmarked
0xc00d5e2678 free  unmarked
0xc00d5e2690 free  unmarked
0xc00d5e26a8 free  unmarked
0xc00d5e26c0 free  unmarked
0xc00d5e26d8 free  unmarked
0xc00d5e26f0 free  unmarked
0xc00d5e2708 free  unmarked
0xc00d5e2720 free  unmarked
0xc00d5e2738 free  unmarked
0xc00d5e2750 free  unmarked
0xc00d5e2768 free  unmarked
0xc00d5e2780 free  unmarked
0xc00d5e2798 free  unmarked
0xc00d5e27b0 free  unmarked
0xc00d5e27c8 free  unmarked
0xc00d5e27e0 free  unmarked
0xc00d5e27f8 free  unmarked
0xc00d5e2810 free  unmarked
0xc00d5e2828 free  unmarked
0xc00d5e2840 free  unmarked
0xc00d5e2858 free  unmarked
0xc00d5e2870 free  unmarked
0xc00d5e2888 free  unmarked
0xc00d5e28a0 free  unmarked
0xc00d5e28b8 free  unmarked
0xc00d5e28d0 free  unmarked
0xc00d5e28e8 free  unmarked
0xc00d5e2900 free  unmarked
0xc00d5e2918 free  unmarked
0xc00d5e2930 free  unmarked
0xc00d5e2948 free  unmarked
0xc00d5e2960 free  unmarked
0xc00d5e2978 free  unmarked
0xc00d5e2990 free  unmarked
0xc00d5e29a8 free  unmarked
0xc00d5e29c0 free  unmarked
0xc00d5e29d8 free  unmarked
0xc00d5e29f0 free  unmarked
0xc00d5e2a08 free  unmarked
0xc00d5e2a20 alloc marked
0xc00d5e2a38 free  unmarked
0xc00d5e2a50 free  unmarked
0xc00d5e2a68 free  unmarked
0xc00d5e2a80 free  unmarked
0xc00d5e2a98 alloc marked
0xc00d5e2ab0 alloc marked
0xc00d5e2ac8 free  unmarked
0xc00d5e2ae0 free  unmarked
0xc00d5e2af8 alloc marked
0xc00d5e2b10 free  unmarked
0xc00d5e2b28 alloc marked
0xc00d5e2b40 free  unmarked
0xc00d5e2b58 alloc marked
0xc00d5e2b70 alloc marked
0xc00d5e2b88 free  unmarked
0xc00d5e2ba0 free  unmarked
0xc00d5e2bb8 free  unmarked
0xc00d5e2bd0 free  unmarked
0xc00d5e2be8 free  unmarked
0xc00d5e2c00 free  unmarked
0xc00d5e2c18 free  unmarked
0xc00d5e2c30 free  unmarked
0xc00d5e2c48 alloc marked
0xc00d5e2c60 alloc marked
0xc00d5e2c78 free  unmarked
0xc00d5e2c90 free  unmarked
0xc00d5e2ca8 free  unmarked
0xc00d5e2cc0 free  unmarked
0xc00d5e2cd8 free  unmarked
0xc00d5e2cf0 free  unmarked
0xc00d5e2d08 alloc marked
0xc00d5e2d20 free  unmarked
0xc00d5e2d38 free  unmarked
0xc00d5e2d50 free  unmarked
0xc00d5e2d68 free  unmarked
0xc00d5e2d80 alloc marked
0xc00d5e2d98 free  unmarked
0xc00d5e2db0 free  unmarked
0xc00d5e2dc8 free  unmarked
0xc00d5e2de0 free  unmarked
0xc00d5e2df8 alloc marked
0xc00d5e2e10 free  unmarked
0xc00d5e2e28 free  unmarked
0xc00d5e2e40 alloc marked
0xc00d5e2e58 free  unmarked
0xc00d5e2e70 free  unmarked
0xc00d5e2e88 free  unmarked
0xc00d5e2ea0 free  unmarked
0xc00d5e2eb8 free  unmarked
0xc00d5e2ed0 free  unmarked
0xc00d5e2ee8 free  unmarked
0xc00d5e2f00 alloc marked
0xc00d5e2f18 alloc marked
0xc00d5e2f30 alloc marked
0xc00d5e2f48 alloc marked
0xc00d5e2f60 alloc marked
0xc00d5e2f78 alloc marked
0xc00d5e2f90 free  unmarked
0xc00d5e2fa8 free  unmarked
0xc00d5e2fc0 free  unmarked
0xc00d5e2fd8 alloc marked
0xc00d5e2ff0 free  unmarked
0xc00d5e3008 free  unmarked
0xc00d5e3020 free  unmarked
0xc00d5e3038 free  unmarked
0xc00d5e3050 alloc marked
0xc00d5e3068 free  unmarked
0xc00d5e3080 free  unmarked
0xc00d5e3098 free  unmarked
0xc00d5e30b0 alloc marked
0xc00d5e30c8 free  unmarked
0xc00d5e30e0 free  unmarked
0xc00d5e30f8 free  unmarked
0xc00d5e3110 free  unmarked
0xc00d5e3128 free  unmarked
0xc00d5e3140 alloc marked
0xc00d5e3158 free  unmarked
0xc00d5e3170 free  unmarked
0xc00d5e3188 alloc marked
0xc00d5e31a0 alloc marked
0xc00d5e31b8 free  unmarked
0xc00d5e31d0 free  unmarked
0xc00d5e31e8 free  unmarked
0xc00d5e3200 alloc marked
0xc00d5e3218 free  unmarked
0xc00d5e3230 free  unmarked
0xc00d5e3248 alloc marked
0xc00d5e3260 alloc marked
0xc00d5e3278 alloc marked
0xc00d5e3290 alloc marked
0xc00d5e32a8 free  unmarked
0xc00d5e32c0 free  unmarked
0xc00d5e32d8 alloc marked
0xc00d5e32f0 free  unmarked
0xc00d5e3308 free  unmarked
0xc00d5e3320 free  unmarked
0xc00d5e3338 free  unmarked
0xc00d5e3350 free  unmarked
0xc00d5e3368 alloc marked
0xc00d5e3380 free  unmarked
0xc00d5e3398 free  unmarked
0xc00d5e33b0 free  unmarked
0xc00d5e33c8 alloc marked
0xc00d5e33e0 free  unmarked
0xc00d5e33f8 free  unmarked
0xc00d5e3410 free  unmarked
0xc00d5e3428 free  unmarked
0xc00d5e3440 free  unmarked
0xc00d5e3458 free  unmarked
0xc00d5e3470 free  unmarked
0xc00d5e3488 alloc marked
0xc00d5e34a0 free  unmarked
0xc00d5e34b8 free  marked   zombie
0x000000c00d5e34b8:  0x8e427c5450f04ce1  0x86f08f55d672acf1
0x000000c00d5e34c8:  0x000000001575a5fa
0xc00d5e34d0 free  unmarked
0xc00d5e34e8 free  unmarked
0xc00d5e3500 free  unmarked
0xc00d5e3518 free  unmarked
0xc00d5e3530 free  unmarked
0xc00d5e3548 free  unmarked
0xc00d5e3560 free  unmarked
0xc00d5e3578 free  unmarked
0xc00d5e3590 free  unmarked
0xc00d5e35a8 free  unmarked
0xc00d5e35c0 alloc marked
0xc00d5e35d8 free  unmarked
0xc00d5e35f0 free  unmarked
0xc00d5e3608 free  unmarked
0xc00d5e3620 alloc marked
0xc00d5e3638 free  unmarked
0xc00d5e3650 alloc marked
0xc00d5e3668 free  unmarked
0xc00d5e3680 free  unmarked
0xc00d5e3698 free  unmarked
0xc00d5e36b0 free  unmarked
0xc00d5e36c8 free  unmarked
0xc00d5e36e0 alloc marked
0xc00d5e36f8 free  unmarked
0xc00d5e3710 free  unmarked
0xc00d5e3728 free  unmarked
0xc00d5e3740 free  unmarked
0xc00d5e3758 free  unmarked
0xc00d5e3770 alloc marked
0xc00d5e3788 free  unmarked
0xc00d5e37a0 free  unmarked
0xc00d5e37b8 free  unmarked
0xc00d5e37d0 alloc marked
0xc00d5e37e8 free  unmarked
0xc00d5e3800 free  unmarked
0xc00d5e3818 alloc marked
0xc00d5e3830 free  unmarked
0xc00d5e3848 free  unmarked
0xc00d5e3860 alloc marked
0xc00d5e3878 alloc marked
0xc00d5e3890 free  unmarked
0xc00d5e38a8 free  unmarked
0xc00d5e38c0 alloc marked
0xc00d5e38d8 free  unmarked
0xc00d5e38f0 alloc marked
0xc00d5e3908 alloc marked
0xc00d5e3920 alloc marked
0xc00d5e3938 free  unmarked
0xc00d5e3950 free  unmarked
0xc00d5e3968 free  unmarked
0xc00d5e3980 free  unmarked
0xc00d5e3998 free  unmarked
0xc00d5e39b0 free  unmarked
0xc00d5e39c8 alloc unmarked
0xc00d5e39e0 free  unmarked
0xc00d5e39f8 free  unmarked
0xc00d5e3a10 free  unmarked
0xc00d5e3a28 alloc marked
0xc00d5e3a40 free  unmarked
0xc00d5e3a58 free  unmarked
0xc00d5e3a70 free  unmarked
0xc00d5e3a88 free  unmarked
0xc00d5e3aa0 free  unmarked
0xc00d5e3ab8 free  unmarked
0xc00d5e3ad0 free  unmarked
0xc00d5e3ae8 alloc marked
0xc00d5e3b00 free  unmarked
0xc00d5e3b18 free  unmarked
0xc00d5e3b30 free  unmarked
0xc00d5e3b48 free  unmarked
0xc00d5e3b60 free  unmarked
0xc00d5e3b78 free  unmarked
0xc00d5e3b90 free  unmarked
0xc00d5e3ba8 free  unmarked
0xc00d5e3bc0 free  unmarked
0xc00d5e3bd8 alloc marked
0xc00d5e3bf0 alloc marked
0xc00d5e3c08 alloc marked
0xc00d5e3c20 free  unmarked
0xc00d5e3c38 alloc marked
0xc00d5e3c50 alloc marked
0xc00d5e3c68 free  unmarked
0xc00d5e3c80 alloc marked
0xc00d5e3c98 free  unmarked
0xc00d5e3cb0 alloc marked
0xc00d5e3cc8 free  unmarked
0xc00d5e3ce0 free  unmarked
0xc00d5e3cf8 free  unmarked
0xc00d5e3d10 free  unmarked
0xc00d5e3d28 alloc marked
0xc00d5e3d40 free  unmarked
0xc00d5e3d58 alloc marked
0xc00d5e3d70 free  unmarked
0xc00d5e3d88 free  unmarked
0xc00d5e3da0 alloc marked
0xc00d5e3db8 alloc marked
0xc00d5e3dd0 alloc marked
0xc00d5e3de8 free  unmarked
0xc00d5e3e00 free  unmarked
0xc00d5e3e18 alloc marked
0xc00d5e3e30 free  unmarked
0xc00d5e3e48 alloc marked
0xc00d5e3e60 alloc marked
0xc00d5e3e78 free  unmarked
0xc00d5e3e90 alloc marked
0xc00d5e3ea8 free  unmarked
0xc00d5e3ec0 free  unmarked
0xc00d5e3ed8 free  unmarked
0xc00d5e3ef0 alloc marked
0xc00d5e3f08 alloc marked
0xc00d5e3f20 free  unmarked
0xc00d5e3f38 free  unmarked
0xc00d5e3f50 free  unmarked
0xc00d5e3f68 free  unmarked
0xc00d5e3f80 alloc marked
0xc00d5e3f98 free  unmarked
0xc00d5e3fb0 free  unmarked
0xc00d5e3fc8 free  unmarked
0xc00d5e3fe0 free  unmarked
fatal error: found pointer to free object

goroutine 6 gp=0xc000622540 m=14 mp=0xc000101808 [running]:
runtime.throw({0x104a33cd7?, 0xc00d5e34d0?})
	/Users/anacrolix/src/go1.23/src/runtime/panic.go:1067 +0x38 fp=0xc000092930 sp=0xc000092900 pc=0x102c86b58
runtime.(*mspan).reportZombies(0x11e576378)
	/Users/anacrolix/src/go1.23/src/runtime/mgcsweep.go:890 +0x2f0 fp=0xc0000929b0 sp=0xc000092930 pc=0x102c2dc10
runtime.(*sweepLocked).sweep(0xc000092b18?, 0x0)
	/Users/anacrolix/src/go1.23/src/runtime/mgcsweep.go:658 +0xd14 fp=0xc000092af0 sp=0xc0000929b0 pc=0x102c2d434
runtime.(*mspan).ensureSwept(0x11e576378)
	/Users/anacrolix/src/go1.23/src/runtime/mgcsweep.go:474 +0xf4 fp=0xc000092b20 sp=0xc000092af0 pc=0x102c2c694
internal/weak.runtime_makeStrongFromWeak(0xc002a24768)
	/Users/anacrolix/src/go1.23/src/runtime/mheap.go:2069 +0xc8 fp=0xc000092b40 sp=0xc000092b20 pc=0x102c855e8
internal/weak.Pointer[...].Strong(...)
	/Users/anacrolix/src/go1.23/src/internal/weak/pointer.go:74
unique.addUniqueMap[...].func1.1({0x102c0d6f0?})
	/Users/anacrolix/src/go1.23/src/unique/handle.go:130 +0x54 fp=0xc000092c90 sp=0xc000092b40 pc=0x103bac494
internal/concurrent.(*HashTrieMap[...]).iter(0x1056ff1a0, 0xc011644780, 0xc000092f10)
	/Users/anacrolix/src/go1.23/src/internal/concurrent/hashtriemap.go:298 +0x1f8 fp=0xc000092d00 sp=0xc000092c90 pc=0x103ba9518
internal/concurrent.(*HashTrieMap[...]).iter(0x1056ff1a0, 0xc00c5de8c0, 0xc000092f10)
	/Users/anacrolix/src/go1.23/src/internal/concurrent/hashtriemap.go:291 +0x12c fp=0xc000092d70 sp=0xc000092d00 pc=0x103ba944c
internal/concurrent.(*HashTrieMap[...]).iter(0x1056ff1a0, 0xc001d47540, 0xc000092f10)
	/Users/anacrolix/src/go1.23/src/internal/concurrent/hashtriemap.go:291 +0x12c fp=0xc000092de0 sp=0xc000092d70 pc=0x103ba944c
internal/concurrent.(*HashTrieMap[...]).iter(0x1056ff1a0, 0xc002b84500, 0xc000092f10)
	/Users/anacrolix/src/go1.23/src/internal/concurrent/hashtriemap.go:291 +0x12c fp=0xc000092e50 sp=0xc000092de0 pc=0x103ba944c
internal/concurrent.(*HashTrieMap[...]).iter(0x1056ff1a0, 0xc001ef0f00, 0xc000092f10)
	/Users/anacrolix/src/go1.23/src/internal/concurrent/hashtriemap.go:291 +0x12c fp=0xc000092ec0 sp=0xc000092e50 pc=0x103ba944c
unique.addUniqueMap[...]).All.2(...)
	/Users/anacrolix/src/go1.23/src/internal/concurrent/hashtriemap.go:280
unique.addUniqueMap[...].func1()
	/Users/anacrolix/src/go1.23/src/unique/handle.go:129 +0x11c fp=0xc000092f50 sp=0xc000092ec0 pc=0x103bac41c
unique.registerCleanup.func1()
	/Users/anacrolix/src/go1.23/src/unique/handle.go:157 +0x98 fp=0xc000092fa0 sp=0xc000092f50 pc=0x102e748b8
runtime.unique_runtime_registerUniqueMapCleanup.func1(...)
	/Users/anacrolix/src/go1.23/src/runtime/mgc.go:1733
runtime.unique_runtime_registerUniqueMapCleanup.gowrap1()
	/Users/anacrolix/src/go1.23/src/runtime/mgc.go:1735 +0x48 fp=0xc000092fd0 sp=0xc000092fa0 pc=0x102c21f38
runtime.goexit({})
	/Users/anacrolix/src/go1.23/src/runtime/asm_arm64.s:1223 +0x4 fp=0xc000092fd0 sp=0xc000092fd0 pc=0x102c913b4
created by unique.runtime_registerUniqueMapCleanup in goroutine 1
	/Users/anacrolix/src/go1.23/src/runtime/mgc.go:1730 +0xa0

What did you expect to see?

No crash.

@MikeMitchellWebDev
Contributor

Do you have a test program to share?

@seankhliao added the NeedsInvestigation label Sep 2, 2024
@ianlancetaylor
Contributor

CC @mknyszek

@mknyszek self-assigned this Sep 3, 2024
@mknyszek added this to the Backlog milestone Sep 3, 2024
@mknyszek
Contributor

mknyszek commented Sep 3, 2024

As @MikeMitchellWebDev states, a reproducer would be helpful.

A few questions:

  • As the error message suggests, is there any chance this is related to uses of unsafe.Pointer or cgo in your code? Can you reproduce without it? I can believe this is a real issue with unique, but I want to cover my bases.
  • How easily can you reproduce? If it's not too hard to reproduce but difficult to share a reproducer, try running with -d=checkptr. There may be something "obviously wrong" that unique is doing, and I believe it'll get instrumented appropriately.

I've walked over the weak pointer implementation again for some possible reasons this could happen (like, accidentally having a preemption point while a pointer that shouldn't be visible is visible on the stack), but don't see anything there. I'll look at the unique package again tomorrow. I can also try to run a stress test myself and see, but I've benchmarked the package quite a bit and didn't run into anything.

At the very least, is it possible for you to share what types of things you're putting into unique.Make, approximately?

Thanks for reporting.

@anacrolix
Contributor Author

I believe it has to do with use in sync.Map. The unique.Handle inner type is a wrapper type of the form type Blah [20]byte. I'll try to narrow it down, it only seems to happen when my system is under load or has memory pressure.

@mknyszek
Contributor

mknyszek commented Sep 4, 2024

> The unique.Handle inner type is a wrapper type of the form type Blah [20]byte.

Got it, thanks. That aligns with the allocation size class (24 bytes). Together with the stack trace, that suggests that either one of these [20]byte things is getting freed erroneously (the GC is failing to keep it alive when there is still a valid pointer for it out there) or a stale pointer is visible to the GC erroneously. I think I've ruled out the latter, but not the former.

> I'll try to narrow it down, it only seems to happen when my system is under load or has memory pressure.

Thanks, I'd appreciate a narrowed-down example. If it's happening only under load or memory pressure, try increasing GOGC. This is starting to smell like an erroneous free, which suggests maybe a bad preemption point during weak->strong conversion.
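GOGC is normally set through the environment (for example `GOGC=400`), but the same knob can be adjusted from inside the program with `runtime/debug.SetGCPercent`. The sketch below shows that equivalent; the helper name `setGOGC` and the value 400 are illustrative assumptions, not a recommendation from this thread.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// setGOGC is a hypothetical helper that applies a new GC percent and
// returns the previous setting so it can be restored later.
func setGOGC(percent int) (previous int) {
	return debug.SetGCPercent(percent)
}

func main() {
	prev := setGOGC(400) // collect less often than the default of 100
	fmt.Println("previous GC percent:", prev)
	defer setGOGC(prev) // restore on exit
}
```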

That all being said, I think I have a lead on a weak pointer bug. Specifically, consider the following scenario:

  • A weak pointer is created to some object P.
  • P is no longer referenced.
  • A GC begins, and continues until it is almost complete. All stacks are black (per the tri-color GC abstraction) at this point.
  • We perform a weak-to-strong conversion, effectively creating a valid pointer out of nowhere. It gets written only to an already-blackened stack.
  • The GC completes, but P was never marked, so its storage is freed.

The fix is straightforward. If the GC mark phase is active, then the weak-to-strong conversion needs to mark and scan the object whose strong pointer was just created.

This can't happen outside the mark phase, because we're careful about making sure that, when we observe a weak pointer, it's always been swept already. So it's either definitely gone, or definitely still around.
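The scenario and the proposed fix can be illustrated with a toy model. This is only a sketch of the reasoning, not the runtime's implementation: `object`, `toyGC`, `weakPtr`, `strong`, and `runCycle` are all hypothetical names, and the "GC" here is just a mark bit plus a phase flag.

```go
package main

import "fmt"

// object is a toy heap object; marked is its mark bit for this GC cycle.
type object struct {
	marked bool
}

// toyGC tracks only whether the mark phase is active.
type toyGC struct {
	marking bool
}

// weakPtr is a toy weak reference: holding it does not mark the target.
type weakPtr struct {
	target *object
}

// strong converts the weak pointer into a strong one. With shade set,
// it marks the object if the mark phase is active, which is the fix
// described above. Without it, the object stays unmarked (white).
func (w weakPtr) strong(gc *toyGC, shade bool) *object {
	if w.target == nil {
		return nil
	}
	if shade && gc.marking {
		w.target.marked = true
	}
	return w.target
}

// runCycle replays the five steps and reports whether P would survive
// the sweep while a strong pointer to it still exists.
func runCycle(shade bool) bool {
	p := &object{}
	w := weakPtr{target: p} // 1. a weak pointer to P is created
	p = nil                 // 2. P is no longer strongly referenced

	gc := &toyGC{marking: true} // 3. mark phase is nearly done; stacks are black

	held := w.strong(gc, shade) // 4. the new pointer lands on a black stack,
	//                                so nothing will scan it again

	gc.marking = false // 5. mark phase ends; unmarked objects get swept
	return held.marked
}

func main() {
	fmt.Println("without shading, P survives:", runCycle(false))
	fmt.Println("with shading, P survives:", runCycle(true))
}
```

In the unshaded run, `held` still points at P after the sweep would have freed it, which corresponds to the "free marked zombie" line in the crash output once a later cycle marks through that pointer.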

Even if this somehow isn't the problem, it's definitely a problem that can manifest as the issue you observed. I will send a patch. If you're able to, please give it a try. I can provide instructions.

Thanks!

@mknyszek
Contributor

mknyszek commented Sep 4, 2024

I was able to create my own reproducer, once I had a suspicion of the problem. It might still be worthwhile trying out the patch, but https://go.dev/cl/610396 should fix it. Let me know if you need help applying it and building a custom Go toolchain.
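The reproducer itself is not shown in this thread. As a rough idea of what a stress test in this area might look like, the sketch below interns short-lived values from several goroutines while another goroutine forces back-to-back collections. It is an assumption-laden illustration, not the test from the CL: `Blah`, `churn`, and the worker and iteration counts are hypothetical, and it is not guaranteed to trigger the crash on an affected toolchain.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
	"unique"
)

// Blah is a 20-byte value type like the one discussed in this issue.
type Blah [20]byte

// churn interns values and drops the handles right away, so canonical
// entries keep becoming weakly held while GC cycles are in progress.
// It returns how many handles read back a value other than the input.
func churn(workers, iters int) int64 {
	var mismatches atomic.Int64
	done := make(chan struct{})
	gcStopped := make(chan struct{})

	// Force garbage collections continuously until the workers finish.
	go func() {
		defer close(gcStopped)
		for {
			select {
			case <-done:
				return
			default:
				runtime.GC()
			}
		}
	}()

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < iters; i++ {
				v := Blah{byte(i), byte(i >> 8)}
				h := unique.Make(v) // may do a weak-to-strong conversion
				if h.Value() != v {
					mismatches.Add(1)
				}
			}
		}()
	}
	wg.Wait()
	close(done)
	<-gcStopped
	return mismatches.Load()
}

func main() {
	fmt.Println("mismatches:", churn(4, 10000))
}
```

On a toolchain that includes the fix, this should finish without a fatal error and report no mismatches.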

@mknyszek
Contributor

mknyszek commented Sep 4, 2024

@gopherbot Please open a backport issue for Go 1.23.

This problem causes rare crashes in the GC with no workaround when using the unique package, which was introduced in the Go 1.23 cycle.

@gopherbot
Contributor

Backport issue(s) opened: #69240 (for 1.23).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://go.dev/wiki/MinorReleases.

@mknyszek added the NeedsFix label and removed the NeedsInvestigation label Sep 4, 2024
@gopherbot
Contributor

Change https://go.dev/cl/610396 mentions this issue: internal/weak: shade pointer in weak-to-strong conversion

@anacrolix
Contributor Author

It took me a while to de-Googlize and find the git part of Gerrit. The test triggers the issue, and the fix works for go1.23. I'll now run this on my original system with the fix and see if the issue comes up again.

@dmitshur modified the milestones: Backlog, Go1.24 Sep 4, 2024
@dmitshur added the FixPending label Sep 4, 2024
@gopherbot
Contributor

Change https://go.dev/cl/610696 mentions this issue: [release-branch.go1.23] internal/weak: shade pointer in weak-to-strong conversion

gopherbot pushed a commit that referenced this issue Sep 6, 2024
…g conversion

There's a bug in the weak-to-strong conversion in that creating the
*only* strong pointer to some weakly-held object during the mark phase
may result in that object not being properly marked.

The exact mechanism for this is that the new strong pointer will always
point to a white object (because it was only weakly referenced up until
this point) and it can then be stored in a blackened stack, hiding it
from the garbage collector.

This "hide a white pointer in the stack" problem is pretty much exactly
what the Yuasa part of the hybrid write barrier is trying to catch, so
we need to do the same thing the write barrier would do: shade the
pointer.

Added a test and confirmed that it fails with high probability if the
pointer shading is missing.

For #69210.
Fixes #69240.

Change-Id: Iaae64ae95ea7e975c2f2c3d4d1960e74e1bd1c3f
Reviewed-on: https://go-review.googlesource.com/c/go/+/610396
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
(cherry picked from commit 79fd633)
Reviewed-on: https://go-review.googlesource.com/c/go/+/610696
Auto-Submit: Dmitri Shuralyov <dmitshur@google.com>
@anacrolix
Contributor Author

Completely fixed, thank you @mknyszek!
