Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Not fair to zig, need to use sendfile. #16

Open
EvanCarroll opened this issue Sep 18, 2023 · 4 comments
Open

Not fair to zig, need to use sendfile. #16

EvanCarroll opened this issue Sep 18, 2023 · 4 comments
Labels
enhancement New feature or request

Comments

@EvanCarroll
Copy link

EvanCarroll commented Sep 18, 2023

The reason for the performance difference is likely the use of sendfile in the Rust implementation by way of std::io::copy when you do

std::io::copy(&mut body.reader(), &mut stdout)?;

On Linux and Android that calls sys::kernel_copy::copy_spec which has sendfile optimizations. In the zig implementation you do,

const body = request.reader().readAllAlloc(allocator, 8192) catch unreachable;
defer allocator.free(body);
const stdout = std.io.getStdOut().writer();
try stdout.print("{} {s}\n", .{i, body});

When you should probably be doing something like std.os.sendfile or something like that. I don't write Zig but the patch for sendfile optimizations landed there. Zig has had these optimization since 2020

The lack of sendfile means you made an additional double-copy, moving all the data from kernel space to user space and back.

Good luck!


Follow up questions from me

@truemedian
Copy link

truemedian commented Sep 18, 2023

Sendfile is cool! however: it hadn't been implemented in std.http yet because it has some rather drastic constraints.

  • It simply doesn't work for TLS streams, as TLS is handled in userspace, the plaintext stream simply doesn't exist in the kernel.
  • It doesn't work for compressed streams, as compression is also handled in userspace, the decompressed stream doesn't exist in the kernel.
  • It doesn't work on chunked coded HTTP messages, as the framing is handled in userspace, the raw stream doesn't exist in the kernel.

These constraints mean that this optimization simply doesn't apply in most cases (yes, I'm aware it does apply here), so it hasn't been a priority.

If I were to guess the zig benchmark would gain more performance out of switching to a faster allocator (like std.heap.c_allocator), as GPA is known to be slow, and there happen to be a decent amount of allocations on the hot path for std.http.Client with the default options.

Also, to answer your questions:

Zig has fifo.pump() which takes a reader and a writer? Does this support sendfile optimizations on Linux in any case?

Unfortunately not, pump would need to have a special case for handling the combination of std.fs.File.Reader/std.net.Stream.Reader and std.fs.File.Writer/std.net.Stream.Writer and it simply doesn't.

It's not clear if Zig exposes the filehandle descriptor for the socket in client.request.reader()? Does it? And if so how do you get access to it?

Short answer: no
Long answer: yes, if you're willing to depend on implementation details client.connection.?.data.stream.handle.

@orhun orhun added the enhancement New feature or request label Sep 19, 2023
@EvanCarroll
Copy link
Author

EvanCarroll commented Sep 19, 2023

Sendfile should work in a kernel with TLS using the kTLS optimization which should exist on all modern kernels. But the rest is true. Not sure how much an issue Chunked HTTP Encoding is, http2 drops it entirely. I still think any allocation is going to be an insurmountable problem for any micro benchmark like this against a sendfile optimization.

And, it's not likely that this response would benefit from compression. But I agree at the point you're using deflate/brotli you generally lose sendfile. That may be one way to stop the optimization here if we don't want to benchmark sendfile -- force brotli on the response. It's a clever trick.

My assumption is for this benchmark you either got sendfile or you don't. But you sound like a Zig guy @truemedian, so maybe send a patch over for a new allocator and test that assumption. @orhun would probably update the results if it makes a difference. (or maybe his Zig is good enough to get that done).

@EvanCarroll
Copy link
Author

As a last question @truemedian is it possible to use the internal client.connection.?.data.stream.handle and std.os.sendfile to get the same effect as Rust above?

@truemedian
Copy link

It's possible, but bound to break at some point.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants