net: slice bounds out of range #61060

Closed
anacrolix opened this issue Jun 29, 2023 · 28 comments
Labels
arch-arm64, compiler/runtime, NeedsInvestigation, OS-Darwin
Milestone

Comments

@anacrolix
Contributor

What version of Go are you using (go version)?

$ go version
go version go1.21rc2 darwin/arm64

Does this issue reproduce with the latest release?

It occurs with 1.21 and 1.20, and probably with earlier versions too (untested).

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
darwin/amd64

What did you do?

Probably tried to resolve an IP address.

What did you expect to see?

It resolves.

What did you see instead?

panic: runtime error: slice bounds out of range [54:45]

goroutine 65 [running]:
internal/poll.(*FD).Write(0xc0001f6080, {0xc00015e002, 0x2d, 0x200})
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/internal/poll/fd_unix.go:383 +0x49c
net.(*netFD).Write(0xc0001f6080, {0xc00015e002, 0x2d, 0x200})
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/fd_posix.go:96 +0x48
net.(*conn).Write(0xc0001200e0, {0xc00015e002, 0x2d, 0x200})
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/net.go:195 +0x88
net.dnsPacketRoundTrip({_, _}, _, {{{0x6d, 0x79, 0x69, 0x70, 0x2e, 0x6f, 0x70, ...}, ...}, ...}, ...)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/dnsclient_unix.go:102 +0x88
net.(*Resolver).exchange(_, {_, _}, {_, _}, {{{0x6d, 0x79, 0x69, 0x70, 0x2e, ...}, ...}, ...}, ...)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/dnsclient_unix.go:187 +0x3ec
net.(*Resolver).tryOneName(_, {_, _}, _, {_, _}, _)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/dnsclient_unix.go:277 +0x40c
net.(*Resolver).goLookupIPCNAMEOrder.func3.1(0x1c?)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/dnsclient_unix.go:653 +0xa0
created by net.(*Resolver).goLookupIPCNAMEOrder.func3
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/dnsclient_unix.go:652 +0x244
@anacrolix
Contributor Author

I believe it has something to do with DNS over UDP over IPv6. I can reproduce it by running:

go install github.com/anacrolix/publicip/cmd/publicip
publicip -6

I see this while using the Mullvad VPN. Even if it's an edge case and the packet is being intercepted, I suspect it shouldn't be panicking my application.

@mengzhuo
Contributor

I believe it has something to do with DNS over UDP over IPv6. I can reproduce it by running:

go install github.com/anacrolix/publicip/cmd/publicip
publicip -6

I see this while using the Mullvad VPN. Even if it's an edge case and the packet is being intercepted, I suspect it shouldn't be panicking my application.

I can run this command without error on linux/amd64. Is this a platform-related issue?

@mengzhuo mengzhuo changed the title affected/package: net net: slice bounds out of range Jun 29, 2023
@bcmills
Contributor

bcmills commented Jun 29, 2023

panic: runtime error: slice bounds out of range [54:45]

goroutine 65 [running]:
internal/poll.(*FD).Write(0xc0001f6080, {0xc00015e002, 0x2d, 0x200})
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/internal/poll/fd_unix.go:383 +0x49c

The relevant block of code (in go1.20.3) is here:
https://cs.opensource.google/go/go/+/refs/tags/go1.20.3:src/internal/poll/fd_unix.go;l=379-386;drc=a2baae6851a157d662dff7cc508659f66249698a

That would seem to imply that at that point nn is 54 and max is 45.

  • The upper bound on max is len(p) or nn + maxRW, whichever is smaller. (maxRW is 1 << 30, so in this case it must be len(p).)
  • nn starts at 0 and is only incremented after a call to syscall.Write with a slice of length max - nn.
  • The loop increments nn by the number of bytes reported by syscall.Write and terminates when nn == len(p).
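To make those invariants concrete, here is a minimal sketch of a write loop with the same shape. It is not the actual internal/poll source: writeAll, demo, and the write callback are hypothetical names used only for illustration, and the maxRW clamp is left out. If the callback reports more bytes than it was handed, nn overshoots max and the next p[nn:max] slice expression panics.

```go
package main

import (
	"fmt"
	"io"
)

// writeAll is an illustrative stand-in for the loop described above.
// nn counts the bytes written so far; max bounds the next chunk.
func writeAll(p []byte, write func([]byte) (int, error)) (int, error) {
	var nn int
	for {
		max := len(p)
		n, err := write(p[nn:max]) // panics once nn > max
		if n > 0 {
			nn += n
		}
		if nn == len(p) {
			return nn, err
		}
		if err != nil {
			return nn, err
		}
		if n == 0 {
			return nn, io.ErrUnexpectedEOF
		}
	}
}

// demo feeds writeAll a callback that claims to have written 54 bytes
// of a 45-byte buffer and returns the recovered panic message.
func demo() (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	p := make([]byte, 45)
	_, _ = writeAll(p, func(b []byte) (int, error) {
		return 54, nil // bogus: larger than len(b)
	})
	return "no panic"
}

func main() {
	fmt.Println(demo())
}
```

With a 45-byte buffer and a reported count of 54, the second iteration evaluates p[54:45], which is the same bounds pair as in the first trace.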

Unfortunately, the most plausible explanations both seem unlikely:

  • either the previous syscall.Write returned an n larger than len(p[nn:max]),
  • or something in the program (cgo, or unsafe, or a bug in runtime or syscall, or a kernel or libc bug?) corrupted some local variable in (*FD).Write or syscall.Write or syscall.write.

The latter possibility makes me think of #60449, but note that that is for amd64 whereas this report is for arm64.

But the fact that this reproduces for you “[w]hile using the Mullvad VPN” makes me wonder if something about the VPN is causing the libc write call to return an incorrect count. Perhaps (*FD).Write should check for that explicitly and return an error for it?

(CC @ianlancetaylor, @golang/runtime)

@bcmills bcmills added OS-Darwin NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. arch-arm64 labels Jun 29, 2023
@bcmills bcmills added this to the Backlog milestone Jun 29, 2023
@bcmills
Contributor

bcmills commented Jun 29, 2023

Wait, no. In that stack trace goroutine 65 is running, not panicking. Maybe that goroutine stack is a red herring.

@anacrolix, can you post the complete goroutine dump from a failure?

@bcmills bcmills added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Jun 29, 2023
@anacrolix
Contributor Author

Thank you for looking into this, @bcmills. That's actually the only thing it outputs when it crashes. Here it is with GOTRACEBACK=system:

% GOTRACEBACK=system godo -v -- ./cmd/publicip -6
godo: starting publicip
panic: runtime error: slice bounds out of range [48:45]

goroutine 34 [running]:
panic({0x104359ca0, 0x14000106138})
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/panic.go:987 +0x3c0 fp=0x14000151e30 sp=0x14000151d70 pc=0x10409a2f0
runtime.goPanicSliceB(0x30, 0x2d)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/panic.go:153 +0x7c fp=0x14000151e70 sp=0x14000151e30 pc=0x1040986cc
internal/poll.(*FD).Write(0x14000158000, {0x14000154002, 0x2d, 0x200})
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/internal/poll/fd_unix.go:383 +0x3ac fp=0x14000151f20 sp=0x14000151e70 pc=0x104121cac
net.(*netFD).Write(0x14000158000, {0x14000154002?, 0x0?, 0x104375198?})
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/fd_posix.go:96 +0x28 fp=0x14000151f70 sp=0x14000151f20 pc=0x10414d0b8
net.(*conn).Write(0x1400010e038, {0x14000154002?, 0x14000151fe8?, 0x1041476a4?})
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/net.go:195 +0x34 fp=0x14000151fc0 sp=0x14000151f70 pc=0x104157a04
net.(*UDPConn).Write(0x0?, {0x14000154002?, 0x14000152048?, 0x104145d64?})
	<autogenerated>:1 +0x2c fp=0x14000151ff0 sp=0x14000151fc0 pc=0x10416286c
net.dnsPacketRoundTrip({_, _}, _, {{{0x6d, 0x79, 0x69, 0x70, 0x2e, 0x6f, 0x70, ...}, ...}, ...}, ...)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/dnsclient_unix.go:102 +0x74 fp=0x140001525a0 sp=0x14000151ff0 pc=0x1041476d4
net.(*Resolver).exchange(_, {_, _}, {_, _}, {{{0x6d, 0x79, 0x69, 0x70, 0x2e, ...}, ...}, ...}, ...)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/dnsclient_unix.go:187 +0x38c fp=0x14000152ef0 sp=0x140001525a0 pc=0x1041483ec
net.(*Resolver).tryOneName(_, {_, _}, _, {_, _}, _)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/dnsclient_unix.go:277 +0x368 fp=0x14000153a20 sp=0x14000152ef0 pc=0x1041491e8
net.(*Resolver).goLookupIPCNAMEOrder.func3.1(0x1c?)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/dnsclient_unix.go:653 +0x68 fp=0x14000153fb0 sp=0x14000153a20 pc=0x10414ba68
net.(*Resolver).goLookupIPCNAMEOrder.func3.2()
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/dnsclient_unix.go:656 +0x30 fp=0x14000153fd0 sp=0x14000153fb0 pc=0x10414b9c0
runtime.goexit()
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/asm_arm64.s:1172 +0x4 fp=0x14000153fd0 sp=0x14000153fd0 pc=0x1040cb5d4
created by net.(*Resolver).goLookupIPCNAMEOrder.func3
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/dnsclient_unix.go:652 +0x180

goroutine 1 [select]:
runtime.gopark(0x140000c1d40?, 0x3?, 0x78?, 0x1b?, 0x140000c1cba?)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/proc.go:381 +0xe4 fp=0x140000c1b20 sp=0x140000c1b00 pc=0x10409d0d4
runtime.selectgo(0x140000c1d40, 0x140000c1cb4, 0x140000c1c98?, 0x0, 0x140000c1c88?, 0x1)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/select.go:327 +0x690 fp=0x140000c1c40 sp=0x140000c1b20 pc=0x1040ad3b0
github.com/anacrolix/publicip.race[...]({0x104375ed8, 0x140000a2e40}, {0x140000c1e28, 0x2, 0x0})
	/Users/anacrolix/ags/publicip/async.go:31 +0x27c fp=0x140000c1d80 sp=0x140000c1c40 pc=0x1042692bc
github.com/anacrolix/publicip.Get({0x104375ed8, 0x140000a2e40}, {0x104286e23, 0x3})
	/Users/anacrolix/ags/publicip/publicip.go:80 +0xa8 fp=0x140000c1e40 sp=0x140000c1d80 pc=0x1042682b8
github.com/anacrolix/publicip.Get6({0x104375ea0, 0x140000a2e10})
	/Users/anacrolix/ags/publicip/publicip.go:111 +0x6c fp=0x140000c1e80 sp=0x140000c1e40 pc=0x1042686dc
main.main()
	/Users/anacrolix/ags/publicip/cmd/publicip/main.go:24 +0x128 fp=0x140000c1f70 sp=0x140000c1e80 pc=0x104286698
runtime.main()
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/proc.go:250 +0x248 fp=0x140000c1fd0 sp=0x140000c1f70 pc=0x10409cca8
runtime.goexit()
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/asm_arm64.s:1172 +0x4 fp=0x140000c1fd0 sp=0x140000c1fd0 pc=0x1040cb5d4

goroutine 2 [force gc (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/proc.go:381 +0xe4 fp=0x14000044fa0 sp=0x14000044f80 pc=0x10409d0d4
runtime.goparkunlock(...)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/proc.go:387
runtime.forcegchelper()
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/proc.go:305 +0xb8 fp=0x14000044fd0 sp=0x14000044fa0 pc=0x10409cf18
runtime.goexit()
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/asm_arm64.s:1172 +0x4 fp=0x14000044fd0 sp=0x14000044fd0 pc=0x1040cb5d4
created by runtime.init.6
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/proc.go:293 +0x24

goroutine 17 [GC sweep wait]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/proc.go:381 +0xe4 fp=0x14000040760 sp=0x14000040740 pc=0x10409d0d4
runtime.goparkunlock(...)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/proc.go:387
runtime.bgsweep(0x0?)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/mgcsweep.go:278 +0xa4 fp=0x140000407b0 sp=0x14000040760 pc=0x10408a354
runtime.gcenable.func1()
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/mgc.go:178 +0x28 fp=0x140000407d0 sp=0x140000407b0 pc=0x10407f108
runtime.goexit()
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/asm_arm64.s:1172 +0x4 fp=0x140000407d0 sp=0x140000407d0 pc=0x1040cb5d4
created by runtime.gcenable
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/mgc.go:178 +0x74

goroutine 18 [GC scavenge wait]:
runtime.gopark(0x1400008c000?, 0x1042fb4d0?, 0x1?, 0x0?, 0x0?)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/proc.go:381 +0xe4 fp=0x14000040f50 sp=0x14000040f30 pc=0x10409d0d4
runtime.goparkunlock(...)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/proc.go:387
runtime.(*scavengerState).park(0x10450a7a0)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/mgcscavenge.go:400 +0x5c fp=0x14000040f80 sp=0x14000040f50 pc=0x1040881dc
runtime.bgscavenge(0x0?)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/mgcscavenge.go:628 +0x44 fp=0x14000040fb0 sp=0x14000040f80 pc=0x104088754
runtime.gcenable.func2()
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/mgc.go:179 +0x28 fp=0x14000040fd0 sp=0x14000040fb0 pc=0x10407f0a8
runtime.goexit()
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/asm_arm64.s:1172 +0x4 fp=0x14000040fd0 sp=0x14000040fd0 pc=0x1040cb5d4
created by runtime.gcenable
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/mgc.go:179 +0xb8

goroutine 19 [finalizer wait]:
runtime.gopark(0x140000445a8?, 0x6000010407cff8?, 0x48?, 0x15?, 0x1?)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/proc.go:381 +0xe4 fp=0x14000044580 sp=0x14000044560 pc=0x10409d0d4
runtime.runfinq()
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/mfinal.go:193 +0x10c fp=0x140000447d0 sp=0x14000044580 pc=0x10407e19c
runtime.goexit()
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/asm_arm64.s:1172 +0x4 fp=0x140000447d0 sp=0x140000447d0 pc=0x1040cb5d4
created by runtime.createfing
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/mfinal.go:163 +0x84

goroutine 20 [select]:
runtime.gopark(0x14000098c60?, 0x2?, 0x60?, 0xc0?, 0x14000098ba4?)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/proc.go:381 +0xe4 fp=0x140000989f0 sp=0x140000989d0 pc=0x10409d0d4
runtime.selectgo(0x14000098c60, 0x14000098ba0, 0x14?, 0x0, 0x0?, 0x1)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/select.go:327 +0x690 fp=0x14000098b10 sp=0x140000989f0 pc=0x1040ad3b0
net.(*Resolver).lookupIPAddr(0x1044ffae0, {0x104375e30?, 0x14000094280}, {0x104286e23, 0x3}, {0x10428a2a1, 0x10})
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/lookup.go:334 +0x40c fp=0x14000098d00 sp=0x14000098b10 pc=0x104155dcc
net.(*Resolver).internetAddrList(0x104375e30?, {0x104375e30?, 0x14000094280?}, {0x104286e23, 0x3}, {0x10428a2a1?, 0x0?})
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/ipsock.go:288 +0x4e8 fp=0x14000098e30 sp=0x14000098d00 pc=0x104153ef8
net.(*Resolver).LookupIP(0x0?, {0x104375e30, 0x14000094280}, {0x104286e23, 0x3}, {0x10428a2a1, 0x10})
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/lookup.go:232 +0x17c fp=0x14000098ed0 sp=0x14000098e30 pc=0x10415571c
github.com/anacrolix/publicip.Get.func1({0x104375e30?, 0x14000094280?})
	/Users/anacrolix/ags/publicip/publicip.go:83 +0x50 fp=0x14000098f30 sp=0x14000098ed0 pc=0x104268540
github.com/anacrolix/publicip.race[...].func1()
	/Users/anacrolix/ags/publicip/async.go:18 +0x4c fp=0x14000098fd0 sp=0x14000098f30 pc=0x10426886c
runtime.goexit()
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/asm_arm64.s:1172 +0x4 fp=0x14000098fd0 sp=0x14000098fd0 pc=0x1040cb5d4
created by github.com/anacrolix/publicip.race[...]
	/Users/anacrolix/ags/publicip/async.go:17 +0x114

goroutine 21 [select]:
runtime.gopark(0x140000c36b0?, 0x4?, 0xe8?, 0x33?, 0x140000c3558?)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/proc.go:381 +0xe4 fp=0x140000c33c0 sp=0x140000c33a0 pc=0x10409d0d4
runtime.selectgo(0x140000c36b0, 0x140000c3550, 0x1400008ef80?, 0x0, 0x140000c3538?, 0x1)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/select.go:327 +0x690 fp=0x140000c34e0 sp=0x140000c33c0 pc=0x1040ad3b0
net/http.(*Transport).getConn(0x104504ae0, 0x140000b2200, {{}, 0x0, {0x140000ba120, 0x5}, {0x140000ba138, 0x11}, 0x0})
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/http/transport.go:1382 +0x458 fp=0x140000c3700 sp=0x140000c34e0 pc=0x104256798
net/http.(*Transport).roundTrip(0x104504ae0, 0x140000fc200)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/http/transport.go:590 +0x690 fp=0x140000c3930 sp=0x140000c3700 pc=0x104252cb0
net/http.(*Transport).RoundTrip(0x1400009c978?, 0x104373f08?)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/http/roundtrip.go:17 +0x1c fp=0x140000c3950 sp=0x140000c3930 pc=0x10424a53c
net/http.send(0x140000fc200, {0x104373f08, 0x104504ae0}, {0x10422b550?, 0x8?, 0x0?})
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/http/client.go:252 +0x520 fp=0x140000c3b50 sp=0x140000c3950 pc=0x104229f10
net/http.(*Client).send(0x1044ffea0, 0x140000fc200, {0x1400009cc18?, 0x104073778?, 0x0?})
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/http/client.go:176 +0x98 fp=0x140000c3bd0 sp=0x140000c3b50 pc=0x104229888
net/http.(*Client).do(0x1044ffea0, 0x140000fc200)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/http/client.go:716 +0x6f4 fp=0x140000c3de0 sp=0x140000c3bd0 pc=0x10422b5a4
net/http.(*Client).Do(...)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/http/client.go:582
github.com/anacrolix/publicip.fromHttp({0x104375e30, 0x14000094280}, {0x0?, 0x0?})
	/Users/anacrolix/ags/publicip/http.go:30 +0xb0 fp=0x140000c3ed0 sp=0x140000c3de0 pc=0x104267bd0
github.com/anacrolix/publicip.Get.func2({0x104375e30?, 0x14000094280?})
	/Users/anacrolix/ags/publicip/publicip.go:86 +0x2c fp=0x140000c3f30 sp=0x140000c3ed0 pc=0x10426877c
github.com/anacrolix/publicip.race[...].func1()
	/Users/anacrolix/ags/publicip/async.go:18 +0x4c fp=0x140000c3fd0 sp=0x140000c3f30 pc=0x10426886c
runtime.goexit()
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/asm_arm64.s:1172 +0x4 fp=0x140000c3fd0 sp=0x140000c3fd0 pc=0x1040cb5d4
created by github.com/anacrolix/publicip.race[...]
	/Users/anacrolix/ags/publicip/async.go:17 +0x114

goroutine 33 [chan receive]:
runtime.gopark(0x140000bc9f8?, 0x10407d044?, 0xf8?, 0xc9?, 0x1800010407cff8?)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/proc.go:381 +0xe4 fp=0x140000bc9c0 sp=0x140000bc9a0 pc=0x10409d0d4
runtime.chanrecv(0x1400010a240, 0x140000bcaa0, 0x1)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/chan.go:583 +0x45c fp=0x140000bca50 sp=0x140000bc9c0 pc=0x10406d76c
runtime.chanrecv1(0x140000bcaa8?, 0x104087700?)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/chan.go:442 +0x14 fp=0x140000bca80 sp=0x140000bca50 pc=0x10406d2d4
net.(*Resolver).goLookupIPCNAMEOrder.func4({_, _}, _)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/dnsclient_unix.go:659 +0x64 fp=0x140000bcc10 sp=0x140000bca80 pc=0x10414b7a4
net.(*Resolver).goLookupIPCNAMEOrder(_, {_, _}, {_, _}, {_, _}, _, _)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/dnsclient_unix.go:669 +0x9ac fp=0x140000bd970 sp=0x140000bcc10 pc=0x10414addc
net.(*Resolver).goLookupIP(0x1044ffae0?, {0x104375e30, 0x14000104000}, {0x104286e23, 0x3}, {0x10428a2a1, 0x10})
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/dnsclient_unix.go:591 +0xb8 fp=0x140000bdbd0 sp=0x140000bd970 pc=0x10414a3b8
net.(*Resolver).lookupIP(0x104375e30?, {0x104375e30?, 0x14000104000?}, {0x104286e23?, 0x3?}, {0x10428a2a1?, 0x0?})
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/lookup_unix.go:70 +0x40 fp=0x140000bde30 sp=0x140000bdbd0 pc=0x104156f20
net.(*Resolver).lookupIP-fm({0x104375e30?, 0x14000104000?}, {0x104286e23?, 0x0?}, {0x10428a2a1?, 0x0?})
	<autogenerated>:1 +0x54 fp=0x140000bde80 sp=0x140000bde30 pc=0x1041635c4
net.glob..func1({0x104375e30?, 0x14000104000?}, 0x0?, {0x104286e23?, 0x14000094280?}, {0x10428a2a1?, 0x3?})
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/hook.go:23 +0x44 fp=0x140000bdec0 sp=0x140000bde80 pc=0x10414de14
net.(*Resolver).lookupIPAddr.func1()
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/lookup.go:326 +0x48 fp=0x140000bdf20 sp=0x140000bdec0 pc=0x1041565b8
internal/singleflight.(*Group).doCall(0x1044ffaf0, 0x14000104050, {0x14000106000, 0x14}, 0x0?)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/internal/singleflight/singleflight.go:93 +0x34 fp=0x140000bdf90 sp=0x140000bdf20 pc=0x10413fc64
internal/singleflight.(*Group).DoChan.func1()
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/internal/singleflight/singleflight.go:86 +0x38 fp=0x140000bdfd0 sp=0x140000bdf90 pc=0x10413fbf8
runtime.goexit()
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/asm_arm64.s:1172 +0x4 fp=0x140000bdfd0 sp=0x140000bdfd0 pc=0x1040cb5d4
created by internal/singleflight.(*Group).DoChan
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/internal/singleflight/singleflight.go:86 +0x3b4

goroutine 22 [IO wait]:
runtime.gopark(0x140000b22c0?, 0x40?, 0x38?, 0x0?, 0x104359680?)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/proc.go:381 +0xe4 fp=0x14000184d80 sp=0x14000184d60 pc=0x10409d0d4
runtime.netpollblock(0x14000184dd8?, 0x40a5d18?, 0x1?)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/netpoll.go:527 +0x158 fp=0x14000184dc0 sp=0x14000184d80 pc=0x104096828
internal/poll.runtime_pollWait(0x12ba76628, 0x77)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/netpoll.go:306 +0xa0 fp=0x14000184df0 sp=0x14000184dc0 pc=0x1040c5b60
internal/poll.(*pollDesc).wait(0x140000f0300?, 0x0?, 0x0)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/internal/poll/fd_poll_runtime.go:84 +0x28 fp=0x14000184e20 sp=0x14000184df0 pc=0x104120e88
internal/poll.(*pollDesc).waitWrite(...)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/internal/poll/fd_poll_runtime.go:93
internal/poll.(*FD).WaitWrite(...)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/internal/poll/fd_unix.go:741
net.(*netFD).connect(0x140000f0300, {0x104375ea0?, 0x140000a3170}, {0x0?, 0x14000185058?}, {0x104374208?, 0x140000b81c0?})
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/fd_unix.go:141 +0x5a0 fp=0x14000184fc0 sp=0x14000184e20 pc=0x10414d7e0
net.(*netFD).dial(0x140000f0300, {0x104375ea0, 0x140000a3170}, {0x1043767f8?, 0x0?}, {0x1043767f8?, 0x140000a3110}, 0x104073c24?)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/sock_posix.go:151 +0x324 fp=0x14000185090 sp=0x14000184fc0 pc=0x10415bd24
net.socket({0x104375ea0, 0x140000a3170}, {0x140000a6278, 0x4}, 0x1e, 0x1, 0x104286f9a?, 0xa0?, {0x1043767f8?, 0x0}, ...)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/sock_posix.go:70 +0x220 fp=0x14000185150 sp=0x14000185090 pc=0x10415b900
net.internetSocket({0x104375ea0, 0x140000a3170}, {0x140000a6278, 0x4}, {0x1043767f8, 0x0}, {0x1043767f8, 0x140000a3110}, 0x14000185258?, 0x104073778?, ...)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/ipsock_posix.go:142 +0xa8 fp=0x140001851e0 sp=0x14000185150 pc=0x104154b78
net.(*sysDialer).doDialTCP(0x140000fe1b0, {0x104375ea0, 0x140000a3170}, 0x0, 0x140001852f8?)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/tcpsock_posix.go:74 +0xac fp=0x140001852a0 sp=0x140001851e0 pc=0x10415e0ac
net.(*sysDialer).dialTCP(0x27dfd8da34c01?, {0x104375ea0?, 0x140000a3170?}, 0x104098cb4?, 0x104375e30?)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/tcpsock_posix.go:64 +0x70 fp=0x140001852e0 sp=0x140001852a0 pc=0x10415df80
net.(*sysDialer).dialSingle(0x140000fe1b0, {0x104375ea0, 0x140000a3170}, {0x104375170?, 0x140000a3110})
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/dial.go:580 +0x188 fp=0x140001853b0 sp=0x140001852e0 pc=0x104146a68
net.(*sysDialer).dialSerial(0x140000fe1b0, {0x104375e30, 0x14000094280}, {0x14000090280?, 0x2, 0x14000094370?})
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/dial.go:548 +0x190 fp=0x140001854c0 sp=0x140001853b0 pc=0x104146490
net.(*sysDialer).dialParallel(0xc11fc724476a02a0?, {0x104375e30?, 0x14000094280?}, {0x14000090280?, 0x4?, 0x140000a6278?}, {0x0?, 0x140000ba138?, 0x11?})
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/dial.go:449 +0x2f0 fp=0x14000185720 sp=0x140001854c0 pc=0x104145d50
net.(*Dialer).DialContext(0x140001858a0, {0x104375e30, 0x14000094280}, {0x140000a6278, 0x4}, {0x140000ba138, 0x11})
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/dial.go:440 +0x5ac fp=0x14000185860 sp=0x14000185720 pc=0x1041458ec
github.com/anacrolix/publicip.dialContext({0x104375e30, 0x14000094280}, {0x104286e44, 0x3}, {0x140000ba138, 0x11})
	/Users/anacrolix/ags/publicip/publicip.go:25 +0xe0 fp=0x14000185910 sp=0x14000185860 pc=0x104268040
net/http.(*Transport).dial(0x140000a2f30?, {0x104375e30?, 0x14000094280?}, {0x104286e44?, 0x1400009ca08?}, {0x140000ba138?, 0x1400009c9c8?})
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/http/transport.go:1176 +0xdc fp=0x14000185970 sp=0x14000185910 pc=0x104255bdc
net/http.(*Transport).dialConn(0x104504ae0, {0x104375e30, 0x14000094280}, {{}, 0x0, {0x140000ba120, 0x5}, {0x140000ba138, 0x11}, 0x0})
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/http/transport.go:1614 +0x654 fp=0x14000185ec0 sp=0x14000185970 pc=0x104258414
net/http.(*Transport).dialConnFor(0x0?, 0x140000d62c0)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/http/transport.go:1456 +0x7c fp=0x14000185fb0 sp=0x14000185ec0 pc=0x1042571dc
net/http.(*Transport).queueForDial.func1()
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/http/transport.go:1425 +0x2c fp=0x14000185fd0 sp=0x14000185fb0 pc=0x10425712c
runtime.goexit()
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/asm_arm64.s:1172 +0x4 fp=0x14000185fd0 sp=0x14000185fd0 pc=0x1040cb5d4
created by net/http.(*Transport).queueForDial
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/http/transport.go:1425 +0x398

goroutine 25 [select]:
runtime.gopark(0x140000427a8?, 0x2?, 0x0?, 0x0?, 0x1400004277c?)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/proc.go:381 +0xe4 fp=0x14000042610 sp=0x140000425f0 pc=0x10409d0d4
runtime.selectgo(0x140000427a8, 0x14000042778, 0x0?, 0x0, 0x0?, 0x1)
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/select.go:327 +0x690 fp=0x14000042730 sp=0x14000042610 pc=0x1040ad3b0
net.(*netFD).connect.func2()
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/fd_unix.go:118 +0x70 fp=0x140000427d0 sp=0x14000042730 pc=0x10414db80
runtime.goexit()
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/asm_arm64.s:1172 +0x4 fp=0x140000427d0 sp=0x140000427d0 pc=0x1040cb5d4
created by net.(*netFD).connect
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/net/fd_unix.go:117 +0x2f0

@n-canter

n-canter commented Jul 14, 2023

We actually have a similar issue on Intel Macs (go1.20.3, go1.20.5): the program panics while connecting to a VPN using Cisco AnyConnect.
arm64 Macs running the same Go code compiled for Apple Silicon don't panic.

It seems that at some point syscall.Write returns 33554436.

panic: runtime error: slice bounds out of range [33554436:43]
goroutine 7738 [running]:
internal/poll.(*FD).Write(0xc000842100, {0xc0005f01e0, 0x2b, 0x1c5})
  /usr/local/go/src/internal/poll/fd_unix.go:383 +0x4aa
net.(*netFD).Write(0xc000842100, {0xc0005f01e0?, 0xc000837890?, 0x100ed61e0?})
  /usr/local/go/src/net/fd_posix.go:96 +0x29
net.(*conn).Write(0xc000506000, {0xc0005f01e0?, 0xc0008378f0?, 0xc0005f01e0?})
  /usr/local/go/src/net/net.go:195 +0x45
crypto/tls.(*Conn).write(0xc0007dd500, {0xc0005f01e0?, 0x5?, 0x1c5?})
  /usr/local/go/src/crypto/tls/conn.go:923 +0x10d
crypto/tls.(*Conn).writeRecordLocked(0xc0007dd500, 0x17, {0xc000516000, 0x15, 0x1000})
  /usr/local/go/src/crypto/tls/conn.go:991 +0x354
crypto/tls.(*Conn).Write(0x0?, {0xc000516000, 0x15, 0x1000})
  /usr/local/go/src/crypto/tls/conn.go:1186 +0x411
net/http.http2stickyErrWriter.Write({{0x101381f18?, 0xc0007dd500?}, 0xc0004a2460?, 0xc0006602c0?}, {0xc000516000, 0x15, 0x1000})
  /usr/local/go/src/net/http/h2_bundle.go:7429 +0x149
bufio.(*Writer).Flush(0xc000596340)
  /usr/local/go/src/bufio/bufio.go:628 +0x62
net/http.(*http2ClientConn).writeHeaders(0xc000660180, 0x9, 0x0, 0x4000, {0xc00066db00?, 0x0?, 0x2403?})
  /usr/local/go/src/net/http/h2_bundle.go:8579 +0x195
net/http.(*http2clientStream).encodeAndWriteHeaders(0xc000acb380, 0xc0007f3900)
  /usr/local/go/src/net/http/h2_bundle.go:8455 +0x38e
net/http.(*http2clientStream).writeRequest(0xc000acb380, 0xc0007f3900)
  /usr/local/go/src/net/http/h2_bundle.go:8343 +0x528
net/http.(*http2clientStream).doRequest(0xc000ac34f0?, 0xc0005bc701?)
  /usr/local/go/src/net/http/h2_bundle.go:8261 +0x1e
created by net/http.(*http2ClientConn).RoundTrip
  /usr/local/go/src/net/http/h2_bundle.go:8190 +0x34a

What's interesting is that the left index is always the same: 33554436. Converted to hex, that is 0x02000004, which corresponds to the write syscall: (SYSCALL_CLASS_UNIX << SYSCALL_CLASS_SHIFT) + 4
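The arithmetic can be checked with a few lines of Go. The constant values below are an assumption taken from the XNU headers (SYSCALL_CLASS_SHIFT is 24, SYSCALL_CLASS_UNIX is 2) and from the BSD syscall table (write is number 4); they are not stated in this thread, and the Go identifiers are made up for the sketch.

```go
package main

import "fmt"

// Assumed values from the Darwin/XNU headers, not from this thread.
const (
	syscallClassShift = 24 // SYSCALL_CLASS_SHIFT
	syscallClassUnix  = 2  // SYSCALL_CLASS_UNIX
	sysWrite          = 4  // BSD syscall number of write
)

// writeSyscallID builds the class-encoded syscall number for write.
func writeSyscallID() int {
	return (syscallClassUnix << syscallClassShift) + sysWrite
}

func main() {
	id := writeSyscallID()
	fmt.Printf("%d = %#x\n", id, id)
}
```

Under those assumptions the result is 33554436, matching the left index in the panic message.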

Could this be a trampoline messing with the memory layout, or some kind of alignment issue?

UPD: Apple Silicon Macs panic as well, but with an arbitrary left index.

UPD2:
I managed to build a minimal repro script. It needs to be run while the VPN client is establishing a connection:

package main

import (
	"log"
	"net/http"
	"time"
)

func main() {
	for {
		time.Sleep(500 * time.Millisecond)
		log.Println("hello")
		resp, err := http.Get("https://google.com")
		if err != nil {
			continue
		}
		resp.Body.Close()
	}
}

Disabling HTTP keep-alives "fixes" the issue.
It seems that the VPN client rewrites some network routes, and when Go then tries to reuse a connection from the pool, something in libc or the kernel breaks and the write() syscall returns an incorrect value.

@bcmills bcmills removed the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Jul 14, 2023
@bcmills
Contributor

bcmills commented Jul 14, 2023

Huh. At the very least (*poll.FD).Write should probably validate that n <= max - nn and otherwise treat the connection as broken.

Even so, we should narrow down whether it is a Go bug or a macOS bug.

  • Can you reproduce this same behavior with a program written in C?
  • Does the Go reproducer also reproduce the bug in a darwin/amd64 binary?

@bcmills bcmills added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 14, 2023
@ianlancetaylor
Contributor

Personally I don't think that the Go standard library should have to double-check that the write system call behaves as expected.

It should be possible to use dtruss to verify that write is returning an impossible length. That would be a useful way to determine whether it is indeed a kernel problem.

@bcmills
Contributor

bcmills commented Jul 14, 2023

Personally I don't think that the Go standard library should have to double-check that the write system call behaves as expected.

I agree with that in principle, but I also think that if we have reason to believe that a particular system call may be broken, it benefits our users to make the problem easier to diagnose — and the run-time cost of an else if nn > len(p) here should be negligible compared to the cost of the syscall.

(I don't think we need to try to rush a check into 1.21 or backport it to older releases, but I do think we should consider it for 1.22 so that if this happens for other users they will be able to figure out what's going on more easily.)
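As a rough illustration of what such a check could look like, here is a sketch of a write loop that rejects an impossible count instead of letting it turn into a slice-bounds panic. It is not a proposed patch to internal/poll: checkedWriteAll, errBogusCount, and the write callback are hypothetical names, and returning an error is just one option (a more descriptive panic would be another).

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// errBogusCount is a made-up sentinel for this sketch.
var errBogusCount = errors.New("write returned more bytes than requested")

// checkedWriteAll writes p through write, validating every reported count.
func checkedWriteAll(p []byte, write func([]byte) (int, error)) (int, error) {
	var nn int
	for {
		chunk := p[nn:]
		n, err := write(chunk)
		if n > len(chunk) {
			// Impossible result: stop before nn can overshoot len(p).
			return nn, errBogusCount
		}
		if n > 0 {
			nn += n
		}
		if nn == len(p) {
			return nn, err
		}
		if err != nil {
			return nn, err
		}
		if n == 0 {
			return nn, io.ErrUnexpectedEOF
		}
	}
}

func main() {
	p := make([]byte, 43)
	_, err := checkedWriteAll(p, func(b []byte) (int, error) {
		return 33554436, nil // the bogus count from the amd64 report
	})
	fmt.Println(err)
}
```

The extra comparison runs once per write call, so its cost is small next to the syscall itself.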

@n-canter

n-canter commented Jul 19, 2023

  • Can you reproduce this same behavior with a program written in C?

I tried, but no luck so far.

When I run the Go reproducer under dtruss, I can't see any write calls returning more bytes than the number of bytes passed in. So I think it's not a kernel bug but a libc or Go runtime bug.

  • Does the Go reproducer also reproduce the bug in a darwin/amd64 binary?

Will try next week when I get access to my Intel Mac.

Disabling IPv6 also "fixes" this issue.

@anacrolix
Contributor Author

I can run publicip -6 with dtruss and Mullvad if it helps

@bcmills
Contributor

bcmills commented Jul 19, 2023

So, I think it's not kernel but libc or go runtime bug.

I'm inclined to suspect libc because of the involvement of VPNs. (I wouldn't be surprised if the VPN layer works by injecting itself into libc at link time.)

I also wouldn't be terribly surprised if the reproducer requires multiple threads. If there is a synchronization bug in the VPN layer, the fact that the Go runtime may make its Write calls from arbitrarily different threads may be important.

@anacrolix
Contributor Author

I couldn't determine much from the output. Without a VPN I see:

write(0x6, "\347\017\001\0", 0x2D) = 45 0

and with the VPN I see:

write(0x6, "\0", 0x2D) = 0 Err#-2

AFAICT, this is the UDP socket it's sending DNS queries from. Let me know if the full dtruss output is useful.

@ianlancetaylor
Contributor

Thanks. Both of those system calls look OK (although error -2 is weird). What we suspect is a case where the system call returns a number that is larger than the third argument. If there are no such calls in dtruss then it could conceivably be a C library problem somehow, as suggested earlier. How is the VPN implemented? Does it somehow intercept calls to the C write function?
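Scanning a long dtruss log for that case by eye is tedious, so here is a small sketch that flags write lines whose return value exceeds the third argument. It assumes the line format shown in the two samples above; the regular expression and the suspicious helper are illustrative, and real dtruss output may need a different pattern depending on the flags used.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// writeLine matches lines such as: write(0x6, "\0", 0x2D) = 45 0
// It captures the length argument (hex) and the return value (decimal).
var writeLine = regexp.MustCompile(`\bwrite\(0x[0-9A-Fa-f]+, ".*", (0x[0-9A-Fa-f]+)\)\s*=\s*(-?\d+)`)

// suspicious reports whether line is a write call that returned more
// bytes than it was asked to write.
func suspicious(line string) bool {
	m := writeLine.FindStringSubmatch(line)
	if m == nil {
		return false
	}
	want, err := strconv.ParseInt(m[1], 0, 64) // base 0 accepts the 0x prefix
	if err != nil {
		return false
	}
	got, err := strconv.ParseInt(m[2], 10, 64)
	if err != nil {
		return false
	}
	return got > want
}

func main() {
	lines := []string{
		`write(0x6, "\347\017\001\0", 0x2D) = 45 0`,
		`write(0x6, "\0", 0x2D) = 0 Err#-2`,
		`write(0x6, "\0", 0x2D) = 54 0`, // hypothetical bad line, not from the thread
	}
	for _, l := range lines {
		fmt.Printf("%-5v %s\n", suspicious(l), l)
	}
}
```

Only the third, made-up line is flagged; both lines quoted from the real output pass.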

@n-canter

Does the Go reproducer also reproduce the bug in a darwin/amd64 binary?

I just tried the Go reproducer on darwin/amd64, and it reproduces the bug.
The behaviour is a bit different, though: the left index is always 33554436.

if you convert it into hex you will get 0x02000004, which corresponds to write syscall: (SYSCALL_CLASS_UNIX << SYSCALL_CLASS_SHIFT) + 4

@anacrolix
Contributor Author

The VPN is Mullvad. I only know that it's using wireguard underneath.

@anacrolix
Contributor Author

I also get the same panic when using the same program under wine64 on Darwin. I'm inclined to think this is a Go resolver issue from the original stack trace I provided, and not Darwin specific.

@bcmills
Contributor

bcmills commented Aug 15, 2023

@anacrolix, if this is a kernel (or perhaps even a libc) bug, wouldn't wine64 on Darwin be subject to that bug too?

Wine “translates Windows API calls into POSIX calls on-the-fly”, so presumably it would end up in the same POSIX write(2) call under the hood, and if Wine itself doesn't check the result it could very easily write the bogus value obtained from libc back through the lpNumberOfBytesWritten argument of the translated WriteFile call.

@anacrolix
Contributor Author

Sorry, I forgot to follow up. I tested on Windows directly and, as far as I can recall, there was no issue.

@n-canter

n-canter commented Sep 8, 2023

Can't reproduce this bug on macOS Sonoma Beta 7 release, probably Apple fixed it.

@anacrolix
Contributor Author

anacrolix commented Sep 11, 2023

I'm also on Sonoma and can't reproduce it there. Yes, my original app, and the publicip -6 reproduction above both work.

@anacrolix
Contributor Author

https://mullvad.net/en/blog/2023/9/13/bug-in-macos-14-sonoma-prevents-our-app-from-working/

@n-canter

n-canter commented Oct 6, 2023

Can't reproduce this bug on macOS Sonoma Beta 7 release, probably Apple fixed it.

Apple "un-fixed" it in Sonoma Stable :( Most likely related to pf bug fix @anacrolix mentioned.

I can't reproduce the panic while running the Go repro script together with: while true; do sudo pfctl -d; done

@anacrolix
Contributor Author

I'll try again with Sonoma stable and report back.

@anacrolix
Contributor Author

Yes, it crashes again on Sonoma stable (14.0 (23A344)).

panic: runtime error: slice bounds out of range [48:45]

goroutine 19 [running]:
internal/poll.(*FD).Write(0x1400018a000, {0x140000e2002, 0x2d, 0x200})
	/Users/anacrolix/src/go1.21/src/internal/poll/fd_unix.go:380 +0x3ac
net.(*netFD).Write(0x1400018a000, {0x140000e2002?, 0x140000e0048?, 0x104342a0c?})
	/Users/anacrolix/src/go1.21/src/net/fd_posix.go:96 +0x28
net.(*conn).Write(0x1400018e000, {0x140000e2002?, 0x140000dffe8?, 0x1043372e4?})
	/Users/anacrolix/src/go1.21/src/net/net.go:191 +0x34
net.dnsPacketRoundTrip({_, _}, _, {{{0x6d, 0x79, 0x69, 0x70, 0x2e, 0x6f, 0x70, ...}, ...}, ...}, ...)
	/Users/anacrolix/src/go1.21/src/net/dnsclient_unix.go:102 +0x74
net.(*Resolver).exchange(_, {_, _}, {_, _}, {{{0x6d, 0x79, 0x69, 0x70, 0x2e, ...}, ...}, ...}, ...)
	/Users/anacrolix/src/go1.21/src/net/dnsclient_unix.go:187 +0x394
net.(*Resolver).tryOneName(_, {_, _}, _, {_, _}, _)
	/Users/anacrolix/src/go1.21/src/net/dnsclient_unix.go:277 +0x370
net.(*Resolver).goLookupIPCNAMEOrder.func3.1(0x1c?)
	/Users/anacrolix/src/go1.21/src/net/dnsclient_unix.go:653 +0x68
created by net.(*Resolver).goLookupIPCNAMEOrder.func3 in goroutine 18
	/Users/anacrolix/src/go1.21/src/net/dnsclient_unix.go:652 +0x15c

@stratg5

stratg5 commented Nov 25, 2023

I'm having the same issue, running Sonoma 14.1.1 and Go 1.21.3

This happens when my device is cut off from the network with firewall rules, but DNS IPs are whitelisted and I try to hit a host.

Without going into too much detail I'm setting up some pfctl rules, starting with block all and then allowing some select traffic to pass through.

This had been working fine for a while; I only started to encounter it after either a Go update or a macOS update, I think.

I also see the exact same error as above: panic: runtime error: slice bounds out of range [33554436:69]

@gopherbot
Contributor

Change https://go.dev/cl/577955 mentions this issue: internal/poll: better panic for invalid write return value

gopherbot pushed a commit that referenced this issue Apr 11, 2024
For #61060

Change-Id: I13cd73b4062cb7bd248d2a4afae06dfa29ac0203
Reviewed-on: https://go-review.googlesource.com/c/go/+/577955
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Commit-Queue: Ian Lance Taylor <iant@google.com>
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
@ianlancetaylor
Contributor

I've improved the panic message so that it clearly indicates the problem. I don't think there is anything else to do here. If the write system call reports that it wrote more bytes than we asked it to write, there is no reasonable way for us to continue. We have no idea how many bytes were actually written.
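As a rough sketch of the kind of check described here (not the code from the CL itself), a caller can compare the reported count against the requested length and fail with an explicit message instead of an opaque slice-bounds panic. describeBadCount and mustBePlausible are hypothetical names, and the message wording is illustrative only.

```go
package main

import "fmt"

// describeBadCount returns a diagnostic when n cannot be a valid result
// for a write of requested bytes, or "" when the count is plausible.
func describeBadCount(n, requested int) string {
	if n > requested {
		return fmt.Sprintf("invalid return from write: got %d from a write of %d", n, requested)
	}
	return ""
}

// mustBePlausible panics with a clear message on an impossible count.
// There is no way to recover: the true number of bytes written is unknown.
func mustBePlausible(n, requested int) {
	if msg := describeBadCount(n, requested); msg != "" {
		panic(msg)
	}
}

func main() {
	mustBePlausible(45, 45) // a normal full write passes silently

	// The counts from the original report: 54 bytes claimed for a 45-byte write.
	fmt.Println(describeBadCount(54, 45))
}
```

Panicking rather than returning an error matches the reasoning above: once the count is untrustworthy, the caller cannot know how much data actually went out.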
