Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std.io: add GenericStream and AnyStream #19376

Closed

Conversation

clickingbuttons
Copy link
Contributor

@clickingbuttons clickingbuttons commented Mar 21, 2024

  • Rename std.posix.iovec and std.posix.iovec_const members from iov_base and iov_len to ptr and len (matches slice).
  • Add std.io.ReadBuffers = std.posix.iovec; and std.io.WriteBuffers = std.posix.iovec_const;.
  • Replace std.io.[GenericReader|AnyReader]'s fn read(ctx, buffer: []u8) with readv(ctx, iovecs: []std.io.ReadBuffers).
  • Replace std.io.[GenericWriter|AnyWriter]'s fn write(ctx, buffer: []const u8) with writev(ctx, iovecs: []std.io.WriteBuffers).
  • Add GenericStream and AnyStream that has readv, writev, and a new close(ctx): void parameter. Use it in std.net, std.http, and std.crypto.tls to abstract over TLS/plain socket streams and remove StreamInterface.

This change is intended to promote generic reader syscall efficiency with readv and writev (for supported platforms) and provide a type erased stream type to reduce future code bloat from instantiating many generic stream types in #19308.

Existing Readers/Writers have been changed to read/write all buffers where possible. When significant logic would have to change, I've opted to simply extract a buffer from the first entry in std.io.ReadBuffers or std.io.WriteBuffers and leave the function alone.

I chose to not make the slices const so that implementers like writevAll can cleverly mutate the vector fields without having to copy them. This seems to outweigh the cost of forcing callers to duplicate std.io.ReadBuffers or std.io.WriteBuffers when sending to multiple readers/writers. A stack cloneable std.io.ReadBuffers and std.io.WriteBuffers type may remove this limitation in the future.

@clickingbuttons clickingbuttons changed the title std.io: add AnyStream std.io: add GenericStream and AnyStream Mar 21, 2024
- replace Reader `fn read(ctx, buffer: []u8)` with `readv(ctx, iovecs: []std.posix.iovec)`
- replace Writer `fn write(ctx, buffer: []const u8)` with `writev(ctx, iovecs: []std.posix.iovec_const)`
@clickingbuttons
Copy link
Contributor Author

clickingbuttons commented Mar 21, 2024

Looks like the status quo HTTP client with TLS already crashes for https://www.google.com without the std.debug.print("", .{});s before P.AEAD.encrypt and P.AEAD.decrypt. Prepending a usize on the stack also works for P.AEAD.decrypt, so my best guess is a stack alignment or inline switch problem. It changes where it crashes between Release/Debug builds as well.

Since it's fixed in #19308 I'm tempted to leave this strange fix in unless someone else can figure out the issue. I spent 3 hours to get this far. I really thought it had to do with inner_stream.context corruption, but whenever I print it it's fine, and even when I print nothing it's fine.

Here's the stack trace
~/src/zig-test(master)$ zig build run-client
fetching https://www.google.com/
General protection exception (no address available)
/home/thesm/.local/share/zvm/local/lib/std/crypto/tls/Client.zig:741:31: 0x831b0e6 in prepareCiphertextRecord (client.zig)
                P.AEAD.encrypt(ciphertext, auth_tag, cleartext, ad, nonce, p.client_key);
                              ^
Unwind error at address `exe:0x831b0e6` (error.AddressOutOfRange), trace may be incomplete

/home/thesm/.local/share/zvm/local/lib/std/crypto/tls/Client.zig:646:43: 0x8232962 in writeEnd (client.zig)
    var prepared = prepareCiphertextRecord(c, &iovecs_buf, &ciphertext_buf, bytes, .application_data);
                                          ^
/home/thesm/.local/share/zvm/local/lib/std/crypto/tls/Client.zig:671:26: 0x8233118 in writev (client.zig)
    return try c.writeEnd(bytes, false);
                         ^
/home/thesm/.local/share/zvm/local/lib/std/io.zig:374:28: 0x81aa955 in typeErasedWritevFn (client.zig)
            return writevFn(ptr.*, iov);
                           ^
/home/thesm/.local/share/zvm/local/lib/std/io/Writer.zig:13:25: 0x81ab929 in writev (client.zig)
    return self.writevFn(self.context, iov);
                        ^
/home/thesm/.local/share/zvm/local/lib/std/io/Writer.zig:21:34: 0x801d4f5 in writevAll (client.zig)
        var amt = try self.writev(iovecs[i..]);
                                 ^
/home/thesm/.local/share/zvm/local/lib/std/io/Writer.zig:39:26: 0x80081fd in writeAll (client.zig)
    return self.writevAll(&iov);
                         ^
/home/thesm/.local/share/zvm/local/lib/std/http/Client.zig:357:41: 0x801bd8a in flush (client.zig)
        try conn.any().writer().writeAll(conn.write_buf[0..conn.write_end]);
                                        ^
/home/thesm/.local/share/zvm/local/lib/std/http/Client.zig:888:29: 0x8007286 in send (client.zig)
        try connection.flush();
                            ^
/home/thesm/.local/share/zvm/local/lib/std/http/Client.zig:1740:17: 0x8000490 in fetch (client.zig)
    try req.send(.{ .raw_uri = options.raw_uri });
                ^
/home/thesm/src/zig-test/src/client.zig:25:33: 0x800150e in fetch_old (client.zig)
    const req = try client.fetch(.{

@clickingbuttons clickingbuttons mentioned this pull request Mar 21, 2024
19 tasks
@jayschwa
Copy link
Contributor

Introducing Stream and changing Generic[Reader|Writer] parameters should probably be two separate pull requests.

The std.io abstractions are supposed to be broadly cross-platform, so it seems fishy that something from the std.posix namespace would be part of the contract versus an implementation detail. It's also not clear to me why the iovec parameter change should only apply to Generic[Reader|Writer] and not Any[Reader|Writer] as well.

@clickingbuttons
Copy link
Contributor Author

clickingbuttons commented Mar 21, 2024

Introducing Stream and changing Generic[Reader|Writer] parameters should probably be two separate pull requests.

I'm already a bit tired from splitting this PR out from #19308. I know the changeset is large, but I think changing read to readv and write to writev everywhere is trivial. The Stream change is the real change to review and isn't possible without changing Generic[Reader|Writer] parameters.

it seems fishy that something from the std.posix namespace would be part of the contract versus an implementation detail

That's a good point. iovecs were recently moved from std.os to std.posix in #19360. I'll re-export them in std.io to abstract better in case of someone solving #7699.

It's also not clear to me why the iovec parameter change should only apply to Generic[Reader|Writer] and not Any[Reader|Writer] as well.

It does apply. I'll update the OP.

@clickingbuttons
Copy link
Contributor Author

I've thought more about the lack of readv/writev support on Windows.

If the only goal is to save syscalls, I think integrating cross-platform io event loops like libxev into Readers/Writers/Sockets is a better path to pursue.

If the goal is convenience when parsing/serializing I believe developers ought to think more carefully about buffering (which also improves cache coherency) than lean on platform-dependent readv/writev support and be surprised their syscalls multiply on certain operating systems. If even @andrewrk thought there was a way to support readv/writev on Windows in #7699, and @alexcrichton did for sockets in Rust that's a red flag that others will too.

I don't think Reader/Writer types should support more than a simple read and write:

pub fn read(fd: fd_t, buf: []u8) ReadError!usize {}
pub fn write(fd: fd_t, bytes: []const u8) WriteError!usize {}

So after adding them everywhere in this PR, I've swung in favor of removing readv* and writev* functions from std.io entirely. To match the posix API of reading/writing from contiguous entries they'd have to just read/write the first iovec member and would be less optimal and less readable than multiple read or write calls.

Most usages are in http/tls code for parsing and serializing that can be refactored to use a buffered reader or writer in #19308 without having to heap allocate (I'm looking at http.Server.flush_chunked and tls.Client.init).

Also, I think the Stream interface should just become a Socket interface. Streams like ArrayList aren't necessarily closable, and using TLS on anything except a Socket (what the manual page calls "an endpoint for communication") is silly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants