Subscriptions and shared memory have a loophole which can lead to UB #143
Comments
I like the closure-based approach; that is what I have seen in other Rust libraries, and it lines up with general Rust-iness.
Does that actually work? I thought Rust would still throw an error if you wrap something that isn't |
Yes, it works. With |
Cool 👍 |
Could that be mitigated by marking the app as panicked in the kernel and disabling calling of callbacks? |
I have concerns about the usability of the closure-based approach to subscriptions. I already find it hard to combine results from different subscriptions in the drop-based approach, and the closure-based approach will make this even harder. Although the sync API of the library will then feel very C-ish, I'm in favour of the static subscription model because I believe it to be easier to use (and probably more suitable for internal driver implementations). |
I'm not sure static subscriptions are the way to go.
Is it a "can" or a "must"? A concern I have is how can you subscribe for a limited amount of time only? For example, subscribe to button touches only when your app is waiting for the user to touch a button. Although this may be simulated with static objects containing some subscription status, this could become contrived quite quickly.
I fear that limiting interactions to static objects may be too limiting. From my experience with https://github.com/google/OpenSK, we capture local variables all the time. On the other hand, it seems that Tock's syscalls were mostly designed with static callbacks and global variables in mind - at least that's what's happening all the time with
In any case, as a revamp of Tock syscalls is under way with Tock 2.0 discussions (tock/tock#1607), I'd definitely weigh in the ability to create usable and safe abstractions with
My take on this is that the current
I don't see how we could get rid of |
In Tock's current (pre-2.0) implementation, a shared
Unfortunately, this means we cannot share buffers on the stack, and further that all accesses to the buffers must be done in a single instruction (to avoid race conditions with the kernel). I'm expecting to rework the system call API, so I'll address that as part of the rework. Unfortunately, it won't be nearly as nice to work with as the current setup. This may be worth addressing in Tock 2.0. |
How is that? I thought that in the Tock API, allowing a null pointer was supposed to revoke the old buffer. Or is that driver-specific? Or is there a bug in the kernel implementation of this API?
Although implementing that properly is likely driver-specific, shouldn't the command & callback syscalls delimit sections where a buffer is "owned" by the kernel vs. by the application? For example, the driver is not supposed to use an allowed buffer before a command is issued to take some action, nor after it has issued a "data transferred" callback? |
The documentation was incorrect, I fixed it in tock/tock#1831. Currently, an app cannot revoke a capsule's access to an
There's a distinction between typical use case patterns, and what's possible. Ideally, yes. However, recall that in Tock's threat model applications do not completely trust capsules. If the application were to treat the buffer in a way that allows a capsule to invoke undefined behavior, that would be a vulnerability in the application. |
I disagree that the documentation was incorrect and that changing it fixed the problem. I'd rather say that the code was incorrect in that it didn't implement the documented API.
I don't see how this can possibly work. This means that once a userspace application allows a region to a capsule, the capsule has read-write access to it forever, and therefore the application cannot use this memory anymore (memory aliasing rules would otherwise be violated). The allow-subscribe-command-yield pattern to transmit data back from the kernel to a userspace buffer has been a common pattern for a while, e.g. to fill in a buffer with rng data or to receive a USB packet. But these changes in the API documentation mean that any kernel -> userspace data transfer cannot happen with allowed buffers - the only option left is to transmit this data one word at a time via the return value of a syscall. Requiring that allowed buffers are |
Maybe I have a wrong understanding of the aliasing rules to apply. As far as I understand, they should apply inside the same application, mainly to give the compiler the possibility to perform only admissible optimizations. However, allowing a buffer to an external source is not affected by that (disclaimer below). There are many use cases where an application allows a buffer to an external source (such as DMA, MMIO, shared memory via IPC, ...), and a reasonable set of rules for memory safety should allow for these use cases. As a
Making these buffers
Disclaimer:
|
I don't think that context switches are strong enough as a boundary. The Tock kernel can preempt an application at any point, e.g. due to an interrupt. Consider the case of a "receive packet" driver (USB, TCP, etc.). Upon interrupt due to the hardware receiving a packet, Tock will preempt the application, then handle the interrupt which in turn likely invokes the corresponding capsule, which is allowed to modify the previously allowed buffer. If this preemption happens while the userspace app was copying data to/from this allowed buffer, then the copied buffer will contain half of old data and half of new data. Another example is the RNG driver. The userspace app wants to wait for the driver to have filled the buffer with random data to read back the buffer. Otherwise, it could be that half of the buffer contains non-random data, which then breaks security properties. If the driver is allowed to write data on the allowed buffer at any time, then it could for example clear the buffer with zeros, which breaks the security properties expected by the app.
In terms of protection, I think that a memory barrier on syscalls to delimit a section between
I don't see how
To me, the only reasonable solution is to fix the Tock kernel, so that an |
Actually, I remember thinking about this problem, and I found a related side comment in tock/tock#1761 (comment)
I think this should be addressed on the kernel side, by changing the API around |
I agree, but this is something to be solved in upstream Tock. Until then, |
As I explained, I don't think there's any reasonable way to work with the kernel's currently documented API, apart from creating trivial programs like
In particular, filling the following buffer from the kernel in a loop.
libtock-rs/examples/blink_random.rs Lines 18 to 20 in eee2028
However, speaking of behavior, we can note that the "current kernel behavior" actually provides stronger guarantees than what is offered in the documentation, in the sense that most drivers (LEDs, buttons, rng, etc.) will release any callback upon
My take on this is that we should rather focus efforts on making sure the kernel can provide these stronger guarantees for all drivers (even with a "malicious" capsule), than trying to cope with a broken API in userspace. |
I think you understand this, but with the current kernel design/implementation, we cannot rely on that behavior (as under the Tock threat model that would be a vulnerability in
I agree this should be changed. I'm not sufficiently familiar with kernel internals to make a suggestion on how to do so, and I don't have time to look into it right now. I'm unsure whether this is a Tock 2.0 change or not. I think this discussion should continue in a non- |
I definitely agree, I'll create an issue in tock/tock. I think that this would qualify as a Tock 2.0 change (or even earlier if any other release happens in 1.x). |
Allowed is the type that will be returned by Platform::allow() in the successful case. It represents a memory buffer shared with the kernel. It uses raw pointers and volatile writes to perform read/write accesses to the memory to avoid encountering undefined behavior. This is the first step towards fixing tock#129 and tock#143.
222: Add the Allowed type to libtock_platform. r=hudson-ayers a=jrvanwhy Allowed is the type that will be returned by `Platform::allow()` in the successful case. It represents a memory buffer shared with the kernel. It uses raw pointers and volatile writes to perform read/write accesses to the memory to avoid encountering undefined behavior. This is the first step towards fixing #129 and #143. Co-authored-by: Johnathan Van Why <jrvanwhy@google.com>
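The idea in the description above (a handle to kernel-shared memory that only touches the memory through raw pointers and volatile accesses) can be illustrated roughly as follows. This is a minimal sketch and not the actual `libtock_platform` code: the name `SharedValue` and the methods `new`, `set`, and `get` are hypothetical, and the "kernel" write is simulated from `main`. The sketch also assumes `T` is a type for which every bit pattern is valid (such as `u32`), since the other side may write arbitrary bytes.

```rust
use std::marker::PhantomData;
use std::ptr::{self, NonNull};

/// Hypothetical handle to a value shared with another party (e.g. a kernel).
/// It keeps only a raw pointer, so no `&T` or `&mut T` to the shared memory
/// exists while the other party may still write to it.
pub struct SharedValue<'b, T: Copy> {
    ptr: NonNull<T>,
    // Ties the handle to the original borrow so it cannot outlive the buffer.
    _buffer: PhantomData<&'b mut T>,
}

impl<'b, T: Copy> SharedValue<'b, T> {
    /// Takes over the exclusive borrow of `buffer` for the lifetime `'b`.
    pub fn new(buffer: &'b mut T) -> Self {
        SharedValue { ptr: NonNull::from(buffer), _buffer: PhantomData }
    }

    /// Writes a value with a volatile store, which the compiler may not elide.
    pub fn set(&self, value: T) {
        // SAFETY: `ptr` came from a valid `&'b mut T`, and `'b` is still live.
        unsafe { ptr::write_volatile(self.ptr.as_ptr(), value) }
    }

    /// Reads the current value with a volatile load, so a concurrent update
    /// by the other party is not hidden by a cached earlier read.
    pub fn get(&self) -> T {
        // SAFETY: same argument as in `set`.
        unsafe { ptr::read_volatile(self.ptr.as_ptr()) }
    }
}

fn main() {
    let mut storage: u32 = 0;
    let shared = SharedValue::new(&mut storage);
    shared.set(7);
    assert_eq!(shared.get(), 7);

    // Simulated kernel write through the raw pointer (assumption: in a real
    // system this would happen outside the application's control).
    unsafe { ptr::write_volatile(shared.ptr.as_ptr(), 42) };
    assert_eq!(shared.get(), 42);
}
```

Copying values in and out, rather than handing out references, is what keeps the application from holding a Rust reference that aliases memory the other side can modify.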
The Tock 2.0 crates have a different Subscribe design that addresses this unsoundness. |
Overview
Bad news: `syscalls::subscribe` and `syscalls::allow` have a loophole. Their implementations rely on destructors being called before the subscribed closure/memory slice goes out of scope. Unfortunately, this is not guaranteed by Rust and can easily be circumvented using `mem::forget`, for example.
Solutions
I see different approaches to the problem, which are not necessarily mutually exclusive:
Use a closure-based scope instead of a `drop`-based one
Example
`ParallelSleepDriver`)
Use static subscriptions
Example
`Cell`s or `RefCell`s) and is limited to objects with a static lifetime
Keep the situation as is
`subscribe` and `allow` safe although they aren't. A justification could be that UB can only occur if destructors are avoided (which is unlikely to happen intentionally) or if a panic/OOM handler is called (which, in future, could be solved by reporting to the kernel that the application is in an invalid state).
Vision
I would propose to further investigate a combined solution. Given the buttons API as an example to be generalized over other drivers, the buttons API should have:
wait_for_button_pressed
In order to facilitate working with static subscriptions, we would use the approach that @jrvanwhy proposes in https://github.com/tock/design-explorations/blob/master/size_comparison/futures/src/tock_static.rs.
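To make the loophole from the overview concrete, here is a minimal host-side sketch that contrasts a drop-based subscription with a closure-based scope. Everything in it is hypothetical and only models the problem: `MockKernel` stands in for the kernel's record of a registered callback, and `DropSubscription` and `with_subscription` are illustrative names, not libtock-rs APIs.

```rust
use std::cell::Cell;
use std::mem;

/// Hypothetical stand-in for the kernel's record of a registered callback.
struct MockKernel {
    registered: Cell<bool>,
}

/// Drop-based subscription: relies on its destructor to unregister.
struct DropSubscription<'k> {
    kernel: &'k MockKernel,
}

impl<'k> DropSubscription<'k> {
    fn new(kernel: &'k MockKernel) -> Self {
        kernel.registered.set(true);
        DropSubscription { kernel }
    }
}

impl Drop for DropSubscription<'_> {
    fn drop(&mut self) {
        self.kernel.registered.set(false);
    }
}

/// Closure-based scope: the subscription exists only while `body` runs.
/// The cleanup guard is local to this function, so callers cannot forget it.
fn with_subscription<R>(kernel: &MockKernel, body: impl FnOnce() -> R) -> R {
    struct Unregister<'k>(&'k MockKernel);
    impl Drop for Unregister<'_> {
        fn drop(&mut self) {
            self.0.registered.set(false);
        }
    }

    kernel.registered.set(true);
    let _guard = Unregister(kernel);
    body()
}

/// Safe code that skips the destructor: the mock kernel stays registered.
fn leak_drop_based(kernel: &MockKernel) -> bool {
    let subscription = DropSubscription::new(kernel);
    mem::forget(subscription);
    kernel.registered.get()
}

/// Returns true if the callback was registered inside the scope and
/// unregistered again once the scope ended.
fn use_closure_based(kernel: &MockKernel) -> bool {
    let seen_inside = with_subscription(kernel, || kernel.registered.get());
    seen_inside && !kernel.registered.get()
}

fn main() {
    let kernel = MockKernel { registered: Cell::new(false) };
    // The forgotten subscription is never unregistered.
    assert!(leak_drop_based(&kernel));

    kernel.registered.set(false);
    assert!(use_closure_based(&kernel));
}
```

In the real system, the registration left behind by the forgotten subscription would point at a closure or buffer that may already be gone, which is where the undefined behavior comes from. The closure-based variant still depends on cleanup running when `body` returns or unwinds, so a panic handler that never unwinds would skip it as well.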