Remove/improve the per-VU buffer pool? #794
This needs more investigation, but as part of my current fixes for #1044 and #1041, I decided to benchmark a few things; specifically, how the removal of the current BPools, or their replacement with `sync.Pool`, would affect performance. Here's the k6 script:

```js
import http from "k6/http";

export let options = {
    vus: 50,
    iterations: 10000,
    batchPerHost: 25,
};

var testServer = "http://127.0.0.1:8080";

export default function () {
    http.get(`${testServer}/`);
    http.post(`${testServer}/post`);
    http.batch([
        ["GET", `${testServer}/gzip`],
        ["GET", `${testServer}/gzip`],
        ["GET", `${testServer}/deflate`],
        ["GET", `${testServer}/deflate`],
        ["GET", `${testServer}/redirect/5`],
        ["GET", `${testServer}/get`],
        ["GET", `${testServer}/html`],
        ["GET", `${testServer}/bytes/100000`],
        ["GET", `${testServer}/image/png`],
        ["GET", `${testServer}/image/jpeg`],
        ["GET", `${testServer}/image/jpeg`],
        ["GET", `${testServer}/image/webp`],
        ["GET", `${testServer}/image/svg`],
        ["GET", `${testServer}/forms/post`],
        ["GET", `${testServer}/bytes/100000`],
        ["GET", `${testServer}/stream-bytes/100000`],
    ]);
    http.get(`${testServer}/get`);
    http.get(`${testServer}/get`);
    http.get(`${testServer}/1mbdata`);
}
```

Most of the endpoints are the ones supplied by go-httpbin, plus a custom `/1mbdata` handler. The results are as follows:

k6 master, no changes (i.e. with the current buffer pools):
k6 master, with a global sync.Pool buffer pool:
k6 master, with a per-VU sync.Pool buffer pool:
k6 master, with no buffer pools (i.e. a fresh buffer allocated for every response):
So, this needs more investigation and probably more realistic benchmarking, but for now it seems like we shouldn't replace the current buffer pools with `sync.Pool`. (Edit: for posterity, the …)
@na--, as you describe in this issue, memory usage could be optimized by using a shared `sync.Pool`. We have validated k6 using `sync.Pool` in our own tests.
I see you opened a PR, so I'll try to take a look at it later today and share specific concerns there. In general, we'd need some benchmarks before we can make this change. I remember …
Thinking about this some more, there are a few further optimizations we might be able to make here 🤔 Instead of using a distinct … I see a few potential problems that can be addressed by this:
One potential danger with 1. is that we shouldn't keep references into the middle of the internal byte slice, but that should be relatively easy to ensure with proper docs and a more limited API than `bytes.Buffer`'s. And maybe these two things don't matter all that much when we have a "hot" …
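The "more limited API" idea could look something like the sketch below: a deliberately narrow wrapper that, unlike `bytes.Buffer`, never hands out a view of its internal slice, so no caller can accidentally hold a reference into pooled memory. The type and method names here are hypothetical:

```go
package main

import (
	"fmt"
	"sync"
)

// pooledBuffer exposes only Write, Reset, and a copying Bytes method,
// so references into its internal slice can't escape to callers.
type pooledBuffer struct {
	data []byte
}

func (b *pooledBuffer) Write(p []byte) (int, error) {
	b.data = append(b.data, p...)
	return len(p), nil
}

func (b *pooledBuffer) Reset() { b.data = b.data[:0] }

// Bytes always returns a copy, never the internal slice itself.
func (b *pooledBuffer) Bytes() []byte {
	out := make([]byte, len(b.data))
	copy(out, b.data)
	return out
}

var pool = sync.Pool{New: func() interface{} { return &pooledBuffer{} }}

func main() {
	buf := pool.Get().(*pooledBuffer)
	buf.Reset()
	buf.Write([]byte("response body"))
	snapshot := buf.Bytes() // safe to keep even after the buffer is recycled
	pool.Put(buf)
	fmt.Println(string(snapshot))
}
```

The copy in `Bytes` is the price of safety; whether that overhead is acceptable is exactly what the benchmarks discussed above would need to show.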
Regarding how to measure this optimization: it's true that it is difficult to validate with Go benchmarks; we tried, but we didn't get a clear result. Of course, we have validated it with our test scripts, and the improvement is considerable. Attached are some metrics from two executions, one with the current version and the other using `sync.Pool`.
I've provisionally added this to the v0.44.0 milestone. It's likely not going to make it in time for v0.43.0, since the merge window for that closes at the end of the week, and we already have quite a lot of work-in-progress things to finish up before then. We'd need some time to benchmark this ourselves, and maybe also investigate some of the potential improvements from #794 (comment).
Thanks @na--. We are also thinking about how to benchmark this modification (and the improvements you mentioned) with Go benchmarks; if we find a way to automate it and show the improvements, we will add it to the PR.
FWIW, we now have quite reliable ~end-to-end tests in https://github.com/grafana/k6/tree/master/cmd/tests. This one, for example, spins up a server and runs some checks against it: Lines 488 to 538 in 582ec4d
Potentially, with a few …
I didn't know about this kind of test in k6. I added a new one in the same package that allows us to validate the performance of … I ran this test with 500 VUs for 2 minutes and got the same difference between both versions.
You can validate different configurations for VUs/duration and then run:
After that just run pprof as usual:
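The actual commands were lost from this capture, but a typical workflow with standard Go tooling would look roughly like this (the test name and output file name are assumptions):

```shell
# Run the end-to-end test with a memory profile
# (test name and package path are illustrative).
go test -run TestHTTPRequestMemoryUsage -memprofile mem.pprof ./cmd/tests

# Inspect the profile as usual:
go tool pprof -top mem.pprof
```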
Currently each VU has its own buffer pool that's used for storing HTTP response bodies. The current approach has a few limitations (for example, around `http.batch()`), so why not simply have a global one? Like I mention in this TODO, using something like `sync.Pool` may be a better idea, pending some benchmarks of course. `sync.Pool` resizes dynamically and allows objects to be automatically garbage collected when needed, which seems useful in our case. More details here and here.