runtime: CPU instruction "NOP" utilizes ~50% CPU and pprof doesn't explain that #30708
Comments
Just in case:
I have a similar situation in os.openFileNolog, where it shows 1.92s in a NOPL.
Speculation: those NOPs are inlining marks, the subroutine is growing the stack (slow), and the newstack call attribution isn't working in conjunction with the inlining marks.
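To illustrate what inlining bookkeeping is for, here is a minimal sketch, assuming the usual understanding that the compiler keeps position information for inlined calls so that tracebacks can still report them as separate logical frames. The function names leaf, caller, and stackNames are hypothetical and are not taken from the atomicmap code. Whether the compiler actually inlines leaf and caller depends on the Go version and build flags; the logical frames should be reported either way.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// leaf is small enough that the compiler may inline it (not guaranteed).
// It records the program counters of the current call stack.
func leaf(pcs []uintptr) int {
	// skip=1 starts the recording at the caller of runtime.Callers, i.e. leaf.
	return runtime.Callers(1, pcs)
}

// caller is a thin wrapper that may itself be inlined into its caller.
func caller(pcs []uintptr) int {
	return leaf(pcs)
}

// stackNames resolves the recorded program counters into function names.
// runtime.CallersFrames expands inlined calls into separate logical frames.
func stackNames() []string {
	pcs := make([]uintptr, 16)
	n := caller(pcs)
	frames := runtime.CallersFrames(pcs[:n])
	var names []string
	for {
		f, more := frames.Next()
		if f.Function != "" {
			names = append(names, f.Function)
		}
		if !more {
			break
		}
	}
	return names
}

// hasSuffixFrame reports whether any frame name ends with the given suffix.
func hasSuffixFrame(names []string, suffix string) bool {
	for _, name := range names {
		if strings.HasSuffix(name, suffix) {
			return true
		}
	}
	return false
}

func main() {
	for _, name := range stackNames() {
		fmt.Println(name)
	}
}
```

Building with `go build -gcflags=-m` prints the compiler's inlining decisions, which is one way to check whether leaf and caller were inlined in a given build.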
Due to its use of ITIMER_PROF, pprof results have a skid of up to several instructions. We cannot correct the skid in Go because the amount of skid is not deterministic. We could perhaps document the limitation better. In #30708 (comment), the usage is almost certainly actually on the
Duplicate of #41338.
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
Yes.
What operating system and processor architecture are you using (go env)?
What did you do?
go get github.com/xaionaro-go/atomicmap
cd "$(go env GOPATH)"/src/github.com/xaionaro-go/atomicmap
git pull # just in case
git checkout performance_experiments
go get ./...
go test ./ -bench=Benchmark_atomicmap_Get_intKeyType_blockSize16777216_keyAmount1048576_trueThreadSafety -benchmem -benchtime 5s -timeout 60s -cpuprofile /tmp/cpu.prof
go tool pprof /tmp/cpu.prof
(pprof) web
(pprof) web increaseReadersStage0Sub0
(pprof) web increaseReadersStage0Sub0Sub0
(pprof) web increaseReadersStage0Sub0Sub1
What did you expect to see?
I expected to see some information about what is using the CPU.
What did you see instead?
I see an empty function (one that does nothing by itself) that uses ~50% CPU. To be more specific, the instruction "NOPL" is shown as using the CPU. It doesn't make any sense.
The method:
According to pprof (see screenshots), neither of these sub-calls uses anything significant, yet the method itself accounts for about 50% of CPU.
I separated the functions this way intentionally to demonstrate the problem. The problem also persists if I remove these extra call levels, even if I manually inline that code into the method getByHashValue.
Another attempt (I split the return line in two and removed the type isSet) gave the same result:
disasm