Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change sys::aio::lio_listio to sys::aio::LioCb::listio #872

Merged
merged 4 commits into from
Apr 7, 2018

Conversation

asomers
Copy link
Member

@asomers asomers commented Mar 5, 2018

The new LioCb structure allows us to control the exact arguments passed
to lio_listio, guaranteeing that each call gets a unique storage
location for the list argument. This prevents clients from misusing
lio_listio in a way that causes events to get dropped from a kqueue

Fixes #870

@asomers
Copy link
Member Author

asomers commented Mar 5, 2018

I found the copy/paste error that was causing the tests to fail on armv7 and i686-musl

@asomers
Copy link
Member Author

asomers commented Mar 21, 2018

I discovered a problem with this PR: it makes it difficult to handle EIO. I'll fix it before merging.

@asomers asomers force-pushed the issue_870 branch 2 times, most recently from c458f49 to 022790e Compare March 22, 2018 19:33
The new LioCb structure allows us to control the exact arguments passed
to lio_listio, guaranteeing that each call gets a unique storage
location for the list argument.  This prevents clients from misusing
lio_listio in a way that causes events to get dropped from a kqueue

Fixes nix-rust#870
A double panic can screw up the first panic's stack trace.  Better not
to assert! anything when the thread is already panicing.
Supporting the bytes crate was unnecessarily specific.  This change
replaces from_bytes and from_bytes_mut with from_boxed_slice and
from_boxed_mut_slice, which can work with anything that implements
Borrow<[u8]> and BorrowMut<[u8]>, respectively.
@asomers asomers force-pushed the issue_870 branch 2 times, most recently from aac3bbc to fef70ad Compare March 23, 2018 19:25
@Susurrus
Copy link
Contributor

Susurrus commented Apr 4, 2018

from_bytes_mut is mentioned on 269, and it doesn't exist anymore, so that line of documentation should be updated.

Copy link
Contributor

@Susurrus Susurrus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was there a test you could add that would fail with the old API and works with the new one? I didn't see any tests added to check that the new API doesn't have a similar failure mode.

/// The returned value will be the argument that was passed to
/// `from_boxed_slice` when this `AioCb` was created.
///
/// It is an error to call this method while the `AioCb` is still in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to actually make this a state machine so this error condition can be caught at compile time? So use a different type for in-progress AioCbs and not-in-progress ones?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. It's conceivable. I would have to change the signature of methods that change the in_progress state to consume the object and return a new one, like this:

pub fn read(self) -> result::Result<AioCbInProgress, (Error, AioCb)> {
let p: *mut libc::aiocb = &mut self.aiocb;
    Errno::result(unsafe {
        libc::aio_read(p)
    }).map(|_| {
        AioCbInProgress::from_aiocb(self)
    }).map_err(|e| {
        (self, e)
    })
}

I would also have to put a Box around AioCb::aiocb, because the libc::aiocb must have a stable address once it's been passed to the kernel.

But I can't guarantee that the compiler will always catch the problem, because if lio_listio returns EIO, then some operations may be in-progress but not all. Which means that LioCb::aiocbs would have to become a Vec of an Enum type. So you wouldn't be able to catch misuse of an LioCb at compile time.

I think I could make such an arrangement work with the downsteam mio-aio and tokio-file crates, but they would likewise have to use Enum types, so they wouldn't gain anything. tokio-file already has a totally different way of ensuring that you don't abuse an in-progress aiocb (based on runtime checks).

Do you think it's worth it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't particularly think either way, the idea crossed my mind so I asked. I really don't understand this API very well and I don't plan to spend the time to do so. But I wanted to ask in case you hadn't thought about it and it would be a good idea. Current API LGTM.

src/sys/aio.rs Outdated
/// operations that were restarted by [`LioCb::listio_resubmit`]
///
/// [`AioCb::error`] (struct.AioCb.html#method.error)
/// [`LioCb::listio_resubmit`] #method.listio_resubmit
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This link is incorrect when rendered using rustdoc.

src/sys/aio.rs Outdated
///
/// [`AioCb::aio_return`] (struct.AioCb.html#method.aio_return)
/// [`AioCb::error`] (struct.AioCb.html#method.error)
/// [`LioCb::aio_return`] #method.aio_return
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incorrect link format.

src/sys/aio.rs Outdated
///
/// Use `listio` to submit an aio operation and wait for its completion. In
/// this case, there is no need to use `aio_suspend` to wait or
/// `AioCb#error` to poll.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incorrect doc link here also.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two bugs fixed with the new API. #870 could cause kqueue to deliver too few notifications if you issued multiple lio_listio calls with the list argument having the same stack location. The new API makes that impossible, because LioCb owns the list.

The other bug dealt with lio_listio returning EAGAIN or EIO because not all operations were enqueued because of resource limitations. I actually have written a test for this in the tokio-file crate. However, reliably hitting that resource limit requires knowing more information about the system, which I got with the sysrc crate. I didn't want to import that crate into nix as a dev dependency, so I didn't add the test here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you think that dependency is too much for our testing? I'm always wary of fixing subtle bugs but not having a test to make sure they don't re-occur.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I pushed a test for it.

src/sys/aio.rs Outdated
/// [`AioCb::aio_return`] (struct.AioCb.html#method.aio_return)
/// [`AioCb::error`] (struct.AioCb.html#method.error)
/// [`LioCb::aio_return`] #method.aio_return
/// [`lio_listio`]: http://pubs.opengroup.org/onlinepubs/9699919799/functions/lio_listio.html
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why doesn't this use the parens format for external URL like on line 1188? Is there a reason to prefer one syntax over the other and/or is this actually incorrect?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this rare case, I actually got it right. Line 1188 is an inline link. Lines 1190-1191 are footnotes.

@Susurrus
Copy link
Contributor

Susurrus commented Apr 6, 2018

I think just that last comment, then squash everything and this is GTM.

@Susurrus
Copy link
Contributor

Susurrus commented Apr 7, 2018

Looks pretty good I think, though I didn't dig too deep into AIO/LIO, but it seems reasonable. I'd resolve the one last issue I raised, then squash and I'm happy!

// Deliberately exceed system resource limits, causing lio_listio to return EIO.
// This test must run in its own process since it deliberately uses all AIO
// resources. ATM it is only enabled on FreeBSD, because I don't know how to
// check system AIO limits on other operating systems.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this theoretically possible on other platforms? Maybe a FIXME or TODO should be added?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably, but these limits are OS-specific, and I don't know the rules on anything but FreeBSD. The errors, however, are standardized. So as long as listio_resubmit does the right thing when it gets the error on FreeBSD, then it should do the right thing everywhere.

@asomers
Copy link
Member Author

asomers commented Apr 7, 2018

Which "one last issue" is that?

@Susurrus
Copy link
Contributor

Susurrus commented Apr 7, 2018

To add a TODO or FIXME to add an implementation for other platforms. But it sounds like you don't think they're necessary.

@asomers
Copy link
Member Author

asomers commented Apr 7, 2018

I really don't. And I don't know how to do it, either. I can't find any documentation about limits for Linux, DragonFly or NetBSD.

@Susurrus
Copy link
Contributor

Susurrus commented Apr 7, 2018

Please squash them r+ me.

It helps deal with errors like EAGAIN, which can result in a subset of
an LioCb's operations being queued.  The test is only enabled on
FreeBSD, because it requires intimate knowledge of AIO system limits.
@asomers
Copy link
Member Author

asomers commented Apr 7, 2018

bors r+ susurrus

bors bot added a commit that referenced this pull request Apr 7, 2018
872: Change sys::aio::lio_listio to sys::aio::LioCb::listio r=asomers a=asomers

The new LioCb structure allows us to control the exact arguments passed
to lio_listio, guaranteeing that each call gets a unique storage
location for the list argument.  This prevents clients from misusing
lio_listio in a way that causes events to get dropped from a kqueue

Fixes #870
@bors
Copy link
Contributor

bors bot commented Apr 7, 2018

@bors bors bot merged commit 4b769a4 into nix-rust:master Apr 7, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants