-
-
Notifications
You must be signed in to change notification settings - Fork 261
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf: reduce allocs when creating $_SERVER #540
Conversation
I did some more benchmarks using the k6 script we provide, and they are non-conclusive (the gap is small and the winner varies on each run...). @withinboredom, would you mind checking if you notice any improvement or deterioration using your benchmarks? In the meantime, let's mark this patch as a draft. |
I also looked into optimizing this back when I was digging into the memory leak. Looking at flame graphs, most our overhead these days is in go -> c -> go stack switches and optimizing anything else will be negligible. Any way we could reduce that would speed up frankenphp by quite a bit. I was using tinygo for a bit there, which can compile the entire stack onto the llvm and it sped things up quite a bit. However, we're using some incompatible cgo features now, so it won't compile. |
That's weird because according to recent benchmarks, cgo has now a negligible overhead as long as we batch calls (and we do). |
f6aea4d
to
9b05728
Compare
Oh yeah, that's to say the stack switching is still ridiculously fast, not that it's slow. But back then that was literally the slowest part (which is a good thing). |
cfc4cb9
to
3dfc3e2
Compare
With the latest changes, the gains are more significant: 32% fewer allocations and slightly less memory used in Go (probably much less in C but it's hard to measure). Before:
After:
|
45a1d3b
to
aeac7b1
Compare
K6 benchmark on a Macbook Pro (M1 Pro): Before:
After:
(0,5% improvement). Memory usage seems improved too. It would be nice if someone could try the benchmark on Linux. I will also try to use some sync.Pool to prevent memory allocations. |
Where about is the k6 file located to run the same benchmark but on linux? |
In test-data/load-test.js is usually the one I use. I won't be able to test it out until after I get back from vacation at the end of the week. |
814c82b
to
01ad3e5
Compare
@maypok86 sorry to bother you again (and I hope that it's not for nothing) but it looks like the latest failure is also related to Otter. It may be this known Go bug: https://pkg.go.dev/sync/atomic#pkg-note-BUG |
@dunglas Damn, yeah, I completely forgot about that when refactoring and there were no tests on 32-bit archs. Try the |
@maypok86 thanks for this swift fix!! I bumped Otter, let's see if the tests are green. |
5639a8a
to
1e28fc6
Compare
The tests seem to have passed, then I'll create a release with the bug fix now. |
Done. |
1e28fc6
to
5de011e
Compare
5de011e
to
a80bc95
Compare
The main idea is to allocate only one time (Go-side) the memory needed to populate
$_SERVER
environment variables.All the
strings.Clone()
calls are a workaround for golang/go#65286 (comment), they will not be necessary anymore in Go 1.22.On my machine, this implementation is around 5% faster than the current one (
BenchmarkServerSuperGlobal
, introduced in #539).Please note that when running Go benchmarks, the alloc counter doesn't take into account C allocations.