Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix memory ordering in needs_wake #281

Merged
merged 3 commits into from
Apr 28, 2024
Merged

Conversation

HeroicKatora
Copy link
Contributor

A memory fence is required to ensure the decision of the kernel needing waking is consistent between user-space and kernel. The kernel must not go to sleep without us reliably observing the wake flag, respectively it must see updates to the submission head if our load does not contain updated wake flag. The fence is paired with one in the kernel which ensures that one of the two stores is observed by the respective load of the other party.

Fix: #197

A memory fence is required to ensure the decision of the kernel needing
waking is consistent between user-space and kernel. The kernel must not
go to sleep without us reliably observing the wake flag, respectively it
must see updates to the submission head if our load does not contain
updated wake flag. The fence is paired with one in the kernel which
ensures that one of the two stores is observed by the respective load of
the other party.
There is no other state that is acquired. The barrier which the library
or callers are required to ensure provides the necessary synchronization
with regards to kernel state.
@almogkh
Copy link

almogkh commented Apr 27, 2024

Like I mentioned in the linked issue, I think the best place to put the SeqCst fence is in the submit[_...] functions right before calling sq_need_wakeup in submit.rs. I don't think the after_intermittent_seqcst function is necessary. We only need the full barrier when submitting new work to the SQ with SQPOLL enabled so executing the full barrier on submit makes the most sense.

@HeroicKatora
Copy link
Contributor Author

HeroicKatora commented Apr 27, 2024

I do disagree with that, the library exposes access to the SubmissionQueue directly along with a raw Submitter::enter. If we're concerned with incorrect usage, this interface should itself already provide the sensible defaults instead of only fixing the high-level usage through the library. (Besides there are already two paths with sq_need_wakeup so this is also a code-duplication and evolution issue).

@HeroicKatora
Copy link
Contributor Author

I should put the fix in the right location though and add a barrier to the Submitter::sq_need_wakeup 😩

@almogkh
Copy link

almogkh commented Apr 27, 2024

The high-level interface in the Submitter doesn't call the need_wakeup function you modified, it uses its own function. I think at a minimum the safe high-level interface should behave correctly by default so regardless of the changes you made here, I think my suggested change should also probably be implemented.

I honestly don't know why need_wakeup is even exposed as a separate function in SubmissionQueue, I don't ever see a situation where you would want to query that separately from SQE submission. The equivalent function in liburing is not exposed. Submitter::enter is an unsafe function and the documentation for it says you should probably use submit instead. I think if you want to do all of this manually, it's fine to require users to insert the SeqCst fence manually, maybe with some documentation mentioning that it's necessary. I think forcing a SeqCst every time you want to check need_wakeup even if it's not needed isn't the best solution.

@HeroicKatora
Copy link
Contributor Author

HeroicKatora commented Apr 27, 2024

Sure, the fix on Submitter can use separate barriers. There seems to be an existing open issue within the code about the duplicate read from flags that'll complicate the question further. If consolidated, the library definitely doesn't want to perform the barrier without sqpoll, maybe it's more straightforward to refactor.

I believe SubmissionQueue seems natural to me, the constraint is on the order of updating head with respect to reading the flag. As an external caller I do appreciate doing both through the same struct.

I'll ultimately leave this to maintainer's approach to the library interface. Commits are structure to include sufficient relevant documentation for each individual change (where it concerns public vs. private API).

@quininer
Copy link
Member

Thank you! Anyway I agree align to liburing.

@quininer quininer merged commit 4e52bca into tokio-rs:master Apr 28, 2024
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

open question about race with kernel that liburing had to fix
3 participants