Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cancel queue + async usercall interface #404

Closed
wants to merge 11 commits into from

Conversation

mzohreva
Copy link
Contributor

Combined and rebased two long open PRs (#286 and #291) in a separate branch (some commits are quashed).

@mzohreva mzohreva force-pushed the mz/new-async-usercalls branch 2 times, most recently from 78e5b39 to f0f6beb Compare August 1, 2022 18:14
pub fn async_queues(
usercall_queue: *mut FifoDescriptor<Usercall>,
return_queue: *mut FifoDescriptor<Return>,
cancel_queue: *mut FifoDescriptor<Cancel>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could lead to issues like:

thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `94410652213776`,
 right: `0`: Usercall async_queues: expected 3rd argument to be 0', intel-sgx/enclave-runner/src/usercalls/abi.rs:311:1
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

This happens when on older enclave runner is used to run a new async enclave that provides a cancel_queue to the async_queues usercall. I'm wondering if we can provide a nicer error message as users has ran into this issue and probably will again when this PR is merged an published.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably, but we'll need to update the crates in a separate PR and publish enclave-runner before this PR lands and hope that users use that version before switching to this one (older ones will still panic with the same error). I'm not sure it's worth the effort. I don't think we can reliably detect the enclave-runner version in the enclave to produce a better error in the enclave.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we have some explicit versioning in the runner/enclave interface so the runner and the enclave can negotiate whether they can successful run? Something like an API version that the runner can pass to the enclave, and the enclave can return an error if that API version isn't supported?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which won't help us right now, since we don't have the versioning mechanism in place yet. But I think we should assume that we need to make more breaking changes in the future at some point. Having a mechanism in place that lets us get a clear error message when there's incompatibility between the runner and the enclave will make it easier in the future when such transitions happen again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created a separate issue tracking this: #443

intel-sgx/enclave-runner/src/usercalls/mod.rs Show resolved Hide resolved
@@ -515,7 +516,7 @@ struct PendingEvents {

impl PendingEvents {
// Will error if it doesn't fit in a `u64`
const EV_MAX_U64: u64 = (EV_USERCALLQ_NOT_FULL | EV_RETURNQ_NOT_EMPTY | EV_UNPARK) + 1;
const EV_MAX_U64: u64 = (EV_USERCALLQ_NOT_FULL | EV_RETURNQ_NOT_EMPTY | EV_UNPARK | EV_CANCELQ_NOT_FULL) + 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't EV_ABORT be added here as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this type handles abort separately, see line 512 and associated functions of PendingEvents

async-usercalls/src/queues.rs Outdated Show resolved Hide resolved
async-usercalls/src/queues.rs Outdated Show resolved Hide resolved
async-usercalls/src/queues.rs Outdated Show resolved Hide resolved
async-usercalls/src/duplicated.rs Outdated Show resolved Hide resolved
intel-sgx/async-usercalls/src/duplicated.rs Outdated Show resolved Hide resolved
intel-sgx/async-usercalls/src/provider_api.rs Show resolved Hide resolved
intel-sgx/async-usercalls/src/lib.rs Show resolved Hide resolved
bors bot added a commit that referenced this pull request Aug 16, 2022
405: Add cancel queue to SGX ABI r=raoulstrackx a=mzohreva

We need to publish `fortanix-sgx-abi` with this change and then update libstd's dependency on `fortanix-sgx-abi` so we can remove the hacks in #404.

Co-authored-by: Mohsen Zohrevandi <mohsen.zohrevandi@fortanix.com>
matthiaskrgr pushed a commit to matthiaskrgr/rust that referenced this pull request Aug 20, 2022
Update fortanix-sgx-abi to 0.5.0 to add support for cancel queue (see
fortanix/rust-sgx#405 and
fortanix/rust-sgx#404).

Export some useful traits for processing SGX usercall. This is needed
for fortanix/rust-sgx#404 to avoid duplication.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Aug 20, 2022
…queue, r=Mark-Simulacrum

Update fortanix-sgx-abi and export some useful SGX usercall traits

Update `fortanix-sgx-abi` to 0.5.0 to add support for cancel queue (see fortanix/rust-sgx#405 and fortanix/rust-sgx#404).

Export some useful traits for processing SGX usercall. This is needed for fortanix/rust-sgx#404 to avoid duplication.

cc `@raoulstrackx` and `@jethrogb`
workingjubilee pushed a commit to tcdi/postgrestd that referenced this pull request Sep 15, 2022
Update fortanix-sgx-abi to 0.5.0 to add support for cancel queue (see
fortanix/rust-sgx#405 and
fortanix/rust-sgx#404).

Export some useful traits for processing SGX usercall. This is needed
for fortanix/rust-sgx#404 to avoid duplication.
workingjubilee pushed a commit to tcdi/postgrestd that referenced this pull request Sep 15, 2022
…Mark-Simulacrum

Update fortanix-sgx-abi and export some useful SGX usercall traits

Update `fortanix-sgx-abi` to 0.5.0 to add support for cancel queue (see fortanix/rust-sgx#405 and fortanix/rust-sgx#404).

Export some useful traits for processing SGX usercall. This is needed for fortanix/rust-sgx#404 to avoid duplication.

cc `@raoulstrackx` and `@jethrogb`
@Taowyoo Taowyoo requested review from raoulstrackx and Pagten and removed request for aandyl April 26, 2023 20:41
@Taowyoo
Copy link
Collaborator

Taowyoo commented Apr 26, 2023

I cherry pick the 6b2a2cb to fix the CI, need to remove it once the #433 is addressed

intel-sgx/async-usercalls/src/lib.rs Show resolved Hide resolved
#!/bin/bash

# Run this in parallel with:
# $ cargo test --target x86_64-fortanix-unknown-sgx --release -- --nocapture --ignored echo
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we avoid this script?

/// arranged as a ring buffer, we can assign a position to each value written/
/// read to/from the queue. This is useful in case we want to know whether or
/// not a particular value written to the queue has been read.
pub struct PositionMonitor<T: 'static> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation of PositionMonitor is in position.rs. That's confusing. Can you move it to there as well, and export it here?

/// read to/from the queue. This is useful in case we want to know whether or
/// not a particular value written to the queue has been read.
pub struct PositionMonitor<T: 'static> {
read_epoch: Arc<AtomicU64>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose of read_epoch?

pub fn read_position(&self) -> ReadPosition {
let current = self.fifo.current_offsets(Ordering::Relaxed);
let read_epoch = self.read_epoch.load(Ordering::Relaxed);
ReadPosition(((read_epoch as u64) << 32) | (current.read_offset() as u64))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

read_epoch itself is a u64. This assumes it will never grow larger than 2^32 before things start wrapping around. How reasonable is this? Can we make read_epoch an AtomicU32 instead?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: check if that is actually reachable

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, panic instead of silently do the wrong stuff.

let return_queue = unsafe { User::<FifoDescriptor<Return>>::from_raw(return_q) }.to_enclave();

// FIXME: once `WithId` is exported from `std::os::fortanix_sgx::usercalls::raw`, we can remove
// `transmute` calls here and use FifoDescriptor/WithId from std everywhere including in ipc-queue.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the fortanix-sgx-abi crate has been update in the compiler and we can get rid of these transmutes

}

pub fn is_empty(&self) -> bool {
self.read == self.write
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the modulo of these offsets be taken here as well?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using len() would keep most of these details in the len() and offsets() functions.


fn is_full(&self) -> bool {
let (read_offset, write_offset) = self.offsets();
read_offset == write_offset && self.read != self.write
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using len() would keep most of these details in the len() and offsets() functions.

mzohreva and others added 6 commits June 8, 2023 13:16
`UsercallEvent::Start` was being sent in `fn handle_usercall`,
which is too late. It needs to be sent before we receive the next
usercall from the enclave so we can maintain the invariant that
"we only need to keep track of cancels received before the actual
usercall if the read position has not moved past the write position
when cancel was received."
- Move async-usercalls to intel-sgx directory
- Remove hacks/unsafe_typecasts.rs
- Fix some typos in docs
- Use marker trait for MakeSend to avoid warnings about issue #93367
- Update crossbeam and crossbeam-channel dependencies
- Use nightly Rust in CI
- Use SGX target for generating docs when crate has `feature(sgx_platform)`
@BelalH BelalH mentioned this pull request Jul 21, 2023
vn971 pushed a commit to vn971/rust-sgx that referenced this pull request Aug 28, 2023
Credits for this commit go to:
Mohsen: fortanix#404
YxC: fortanix#441

This commit is an attempt to have the async-usercalls finally merged
into the main codebase (master branch).
vn971 pushed a commit to vn971/rust-sgx that referenced this pull request Aug 29, 2023
Credits for this commit go to:
Mohsen: fortanix#404
YxC: fortanix#441

This commit is an attempt to have the async-usercalls finally merged
into the main codebase (master branch).
raoulstrackx pushed a commit that referenced this pull request Jan 4, 2024
Credits for this commit go to:
Mohsen: #404
YxC: #441

This commit is an attempt to have the async-usercalls finally merged
into the main codebase (master branch).
vn971 pushed a commit that referenced this pull request Jan 4, 2024
Credits for this commit go to:
Mohsen: #404
YxC: #441

This commit is an attempt to have the async-usercalls finally merged
into the main codebase (master branch).
vn971 pushed a commit to vn971/rust-sgx that referenced this pull request Jan 17, 2024
Credits for this commit go to:
Mohsen: fortanix#404
YxC: fortanix#441

This commit is an attempt to have the async-usercalls finally merged
into the main codebase (master branch).
@Taowyoo
Copy link
Collaborator

Taowyoo commented Jan 27, 2024

I am going to close this since merged #515 covered code change in this PR.
Please be free to reopen this if needed!

@Taowyoo Taowyoo closed this Jan 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants