Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: can't atomic access of first word of tiny-allocated struct on 32-bit architecture #37262

Closed
NewbMiao opened this issue Feb 17, 2020 · 14 comments
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. release-blocker
Milestone

Comments

@NewbMiao
Copy link

NewbMiao commented Feb 17, 2020

What version of Go are you using (go version)?

$ go version
go version go1.13.7 darwin/amd64
$ uname -a
Darwin newbmiaodeMacBook-Pro.local 17.7.0 Darwin Kernel Version 17.7.0: Sun Jun  2 20:31:42 PDT 2019; root:xnu-4570.71.46~1/RELEASE_X86_64 x86_64

Does this issue reproduce with the latest release?

Not sure

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/newbmiao/Library/Caches/go-build"
GOENV="/Users/newbmiao/Library/Application Support/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/newbmiao/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/Cellar/go/1.13.7/libexec"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/Cellar/go/1.13.7/libexec/pkg/tool/darwin_amd64"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/m_/hlhv5wl17vg368420bbxszhc0000gn/T/go-build908799930=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

This issue about test the spec below, but got panic:

https://golang.org/pkg/sync/atomic/#pkg-note-BUG
The first word in a variable or in an allocated struct, array, or slice can be relied upon to be 64-bit aligned.

I use code below to test, and will panic occasionally
(Test it on my 64-bit mac: GOARCH=386 go build -ldflags=-compressdwarf=false -o test test.go)

But if i remove the fmt.Println() inside of the test function, it will pass.

I also debug it use gdb (append info below), x escape to heap, and use tinyallocs to alloc it

And i notice thatc.tinyoffset is 2.

But when no fmt.Println() it will be 0 or 16, Or just nextFreeFast to get a new object.

So i am confused, is that spec work in this situation?
Or maybe i misunderstand with this spec?
Or a issue cross compile under macOS?

May related with this issue: #599

Code:

package main
import (
	"fmt"
	"sync/atomic"
)

// On ARM, x86-32, and 32-bit MIPS, it is the caller's responsibility to arrange for 64-bit alignment of 64-bit words accessed atomically.
// The first word in a variable or in an allocated struct, array, or slice can be relied upon to be 64-bit aligned.
func test(){
	x := struct {
		v uint64
		y uint32
	}{}
	
	fmt.Println()  //if comment this line, it will pass on 32-bit arch
	atomic.AddUint64(&x.v, 1)
}
func main() {
	test()
}

gdb info

$ GOARCH=386 go build -ldflags=-compressdwarf=false -o test test.go
$ gdb ./hello
(gdb) b main.main
Thread 2 hit Breakpoint 1, main.main ()
    at /Dig101-Go/test.go:19
19      func main() {
(gdb) s
20              test()
(gdb) 
main.test () at /Dig101-Go/test.go:10
10      func test(){
(gdb) 
11              x := struct {
(gdb) 
runtime.newobject (typ=0x9d280 <type.*+93280>, ~r1=<optimized out>)
    at /usr/local/Cellar/go/1.13.7/libexec/src/runtime/malloc.go:1150
1150    func newobject(typ *_type) unsafe.Pointer {
(gdb) 
1151            return mallocgc(typ.size, typ, true)
(gdb) 
runtime.mallocgc (size=<optimized out>, typ=0x9d280 <type.*+93280>, needzero=true, 
    ~r3=<optimized out>)
    at /usr/local/Cellar/go/1.13.7/libexec/src/runtime/malloc.go:877
877     func mallocgc(size uintptr, typ *_type, needzero bool) unsafe.Pointer {
(gdb) 
878             if gcphase == _GCmarktermination {
(gdb) until 945
runtime.mallocgc (size=<optimized out>, typ=0x9d280 <type.*+93280>, needzero=true, ~r3=<optimized out>)
    at /usr/local/Cellar/go/1.13.7/libexec/src/runtime/malloc.go:945
945             if size <= maxSmallSize {
(gdb) n
946                     if noscan && size < maxTinySize {
(gdb) 
976                             off := c.tinyoffset
(gdb) 
978                             if size&7 == 0 {
(gdb) 
980                             } else if size&3 == 0 {
(gdb) 
981                                     off = round(off, 4)
(gdb) 
985                             if off+size <= maxTinySize && c.tiny != 0 {
(gdb) p off
$1 = 4
(gdb) p c.tinyoffset  <========= notice this line
$2 = 2
(gdb) n
987                                     x = unsafe.Pointer(c.tiny + off)
(gdb) 
988                                     c.tinyoffset = off + size
(gdb) 
989                                     c.local_tinyallocs++
(gdb) 
990                                     mp.mallocing = 0
(gdb) 
991                                     releasem(mp)
(gdb) 
992                                     return x
(gdb) 
main.test () at /Dig101-Go/test.go:16
16              fmt.Println() 
(gdb) 

17              atomic.AddUint64(&x.v, 1)
(gdb) 

Thread 2 received signal SIGSEGV, Segmentation fault.
runtime/internal/atomic.Xadd64 () at /usr/local/Cellar/go/1.13.7/libexec/src/runtime/internal/atomic/asm_386.s:105
105             MOVL    0, AX // crash when unaligned
(gdb) 
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x2f7c]

goroutine 1 [running]:
runtime/internal/atomic.Xadd64(0x1141c0b4, 0x1, 0x0, 0x0, 0x0)
        /usr/local/Cellar/go/1.13.7/libexec/src/runtime/internal/atomic/asm_386.s:105 +0xc
main.test()
        /Dig101-Go/test.go:17 +0x72
main.main()
        /Dig101-Go/test.go:20 +0x11
[Inferior 1 (process 68214) exited with code 02]

What did you expect to see?

go run successfully

What did you see instead?

Panic sometimes:

$ ./test  # success

$ ./test # failed

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x2f7c]

goroutine 1 [running]:
runtime/internal/atomic.Xadd64(0x1147e004, 0x1, 0x0, 0x0, 0x0)
        /usr/local/Cellar/go/1.13.7/libexec/src/runtime/internal/atomic/asm_386.s:105 +0xc
main.test()
        /Dig101-Go/test.go:17 +0x72
main.main()
        /Dig101-Go/test.go:20 +0x11
@ALTree ALTree added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Feb 17, 2020
@ianlancetaylor
Copy link
Contributor

CC @randall77 @josharian

@ianlancetaylor ianlancetaylor added this to the Go1.15 milestone Feb 17, 2020
@josharian
Copy link
Contributor

Maybe I'm missing something, but this looks like a malloc issue, not a compiler issue.

cc @mknyszek @aclements

@mknyszek
Copy link
Contributor

@aclements Looks like the tiny allocator doesn't uphold the guarantee in the spec. The lines are:

			// Align tiny pointer for required (conservative) alignment.
			if size&7 == 0 {
				off = round(off, 8)
			} else if size&3 == 0 {
				off = round(off, 4)
			} else if size&1 == 0 {
				off = round(off, 2)
			}

In this case size is 12, and so we align to 4 (as the trace in the original post at the top shows), but it should be 8, because it contains an 8-byte field (right?). I don't think this is even just a 386 problem; we could end up misaligning in any case.

Maybe the fix is to just align to typ.align?

@josharian
Copy link
Contributor

@mknyszek would that make the tiny allocator much less efficient, by increasing the number and size of gaps? And if so, are there mitigations?

@mknyszek
Copy link
Contributor

@mknyszek would that make the tiny allocator much less efficient, by increasing the number and size of gaps? And if so, are there mitigations?

Maybe, but given that we don't have atomic types and instead have atomic operations, it's impossible to know whether we really need the full alignment, and so we must conservatively always assume we need it. In that sense, I can't think of any mitigations besides waiting for #36606 which may allow users to set a custom alignment (thereby signaling the alignment they actually need).

It might also be slower because we have to touch the type whereas we didn't have to before.

@cherrymui
Copy link
Member

typ.align doesn't help in the current way. On 32-bit architectures it is 4 for int64. This is what #36606 is for.

@cherrymui
Copy link
Member

I think this can only happen for a size 9-15 object on a 32-bit architecture. Maybe we round up for only this case. I don't think it is too bad.

@cherrymui cherrymui changed the title cmd/compile: cant atomic access of first word of allocated struct occasionally? runtime: can't atomic access of first word of tiny-allocated struct Feb 18, 2020
@cherrymui cherrymui changed the title runtime: can't atomic access of first word of tiny-allocated struct runtime: can't atomic access of first word of tiny-allocated struct on 32-bit architecture Feb 18, 2020
@aclements
Copy link
Member

Since a type's size is rounded up to its alignment, I think this specific case with a 12 byte struct on 32-bit is really the only way that this can happen, and it's a direct consequence of the underalignment of uint64.

/cc @danscales for an interesting consequence of the current alignment rules

@mknyszek
Copy link
Contributor

@aclements @cherrymui Got it. I missed the fact that the size is rounded up. That makes sense, and I was surprised that this wasn't more broken...

@josharian It looks like what I was proposing is already there, so there shouldn't be any packing problems.

It looks like this problem is wholly #36606 as @cherrymui mentioned, so we might want to just close this as a duplicate?

@cherrymui
Copy link
Member

It looks like this problem is wholly #36606 as @cherrymui mentioned, so we might want to just close this as a duplicate?

Not exactly. Fixing #36606 will make this go away, but we can also fix this separately. This issue is specific to the first word of an allocation, which we try to make it still 8-byte aligned, as mentioned in https://golang.org/pkg/sync/atomic/#pkg-note-BUG . Currently it is mostly aligned, except for tinyalloc (and sbrk mode).

@ianlancetaylor
Copy link
Contributor

This didn't get fixed. Rolling forward to 1.16, and marking as release blocker since it does seem like a serious bug, albeit one that is hard to hit.

@ianlancetaylor ianlancetaylor modified the milestones: Go1.15, Go1.16 Jun 16, 2020
@ianlancetaylor ianlancetaylor added NeedsFix The path to resolution is known, but the work has not been done. release-blocker labels Jun 16, 2020
@gopherbot gopherbot removed the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jun 16, 2020
@aclements
Copy link
Member

Given that #36606 is stalled, I think we need to revisit this. It seems like we need to special-case 12 byte allocations on 32-bit to by 8-byte aligned. Are there any other solutions on the table?

@mknyszek mknyszek self-assigned this Sep 10, 2020
@mknyszek
Copy link
Contributor

I think we should just special case these 12-byte allocations as you and @cherrymui suggested. I can work on this now.

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/254057 mentions this issue: runtime: align 12-byte objects to 8 bytes on 32-bit systems

@golang golang locked and limited conversation to collaborators Oct 1, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. release-blocker
Projects
None yet
Development

No branches or pull requests

8 participants