Use after free in Michael-Scott queue? #238
Comments
It seems like based on the invariant:
This looks right to me; now we no longer need to modify the source code to avoid such errors. |
@stjepang If it's okay, I would like to work on this. |
I think this issue is related to #221, especially "Using small epoch numbers" in #221 (comment). This can also solve the problem of epoch wraparound. I'll write down more details soon. |
Thanks a lot @jeehoonkang. It will be much appreciated, as I am new to Rust and to crossbeam. I am very interested in concurrency and crossbeam. Thanks again. |
@stjepang Thank you for reporting this. It seems much deeper than I first expected... I'd like to spend a little more time thinking about this issue... |
Any updates, @jeehoonkang? Anything I can start looking at?
|
Ping @jeehoonkang @stjepang: any mentoring notes on the issue? |
Hi @blitzerr. I discussed this problem with @jeehoonkang and here is the summary. TL;DR;
Details
Requirement for retirement (
|
Thanks a lot @tomtomjhj for the link to the paper. I will read it and get back to you if I have questions, which I am sure I will have many :) |
@stjepang We actually found out that freeing at E+3 is still buggy because of a similar scenario, in which thread 2 first pins at E and sleeps until thread 1 pins at E+1 and gets yielded just before the pushing CAS, as in your scenario. In this scenario the actual sequence of modifications of the queue by threads 1 and 2 is identical, but the "real unlinking" happens at E+1 instead of E. In this case, a thread that pins at E+2 may access the retired The real problem is that the implementation doesn't faithfully follow the original MSQueue algorithm. The original |
To be clear, it sounds like the four queues mentioned at the top of this issue are all potentially unsafe. If that is true, I think it's worth disclosing this information in the documentation for these queues. My understanding was that |
Afaict, |
@cynecx thanks for pointing that out, I think that's true. It doesn't look like anything in this repo is dependent on |
@twmb Well, actually there is, it's |
pop() must completely unlink the popped node from the data structure before it calls defer_destroy() to prevent use-after-free. closes crossbeam-rs#238
466: fix use-after-free in crossbeam-epoch/sync/queue r=jeehoonkang a=tomtomjhj `pop()` must completely unlink the popped node from the shared memory before it calls `defer_destroy()` to prevent use-after-free. This implementation is based on the variation by Doherty et al. where the `head == tail` check is done after a successful CAS, which can be slightly more efficient than the original MSQueue. closes #238 Co-authored-by: Jaehwang Jerry Jung <tomtomjhj@gmail.com>
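To make the idea in this PR description concrete, here is a minimal sketch of a `pop()` that finishes unlinking the old sentinel before handing it to deferred destruction. It is not the code from the PR: `ToyQueue`, `link_after_tail`, `NULL` and the index-based arena are all hypothetical, nodes are plain indices instead of heap pointers, and `SeqCst` is used everywhere for simplicity.

```rust
use std::sync::atomic::{AtomicUsize, Ordering::SeqCst};

/// Stands in for a null pointer (hypothetical marker).
const NULL: usize = usize::MAX;

/// Toy queue whose nodes are indices into a fixed arena; node 0 is the sentinel.
struct ToyQueue {
    head: AtomicUsize,
    tail: AtomicUsize,
    next: Vec<AtomicUsize>, // next[i] is the successor of node i, or NULL
}

impl ToyQueue {
    fn new(capacity: usize) -> Self {
        ToyQueue {
            head: AtomicUsize::new(0),
            tail: AtomicUsize::new(0),
            next: (0..capacity).map(|_| AtomicUsize::new(NULL)).collect(),
        }
    }

    /// First half of a push only: link `node` after the current tail node,
    /// deliberately leaving `tail` itself untouched (as if the pusher yielded).
    fn link_after_tail(&self, node: usize) -> bool {
        let tail = self.tail.load(SeqCst);
        self.next[tail]
            .compare_exchange(NULL, node, SeqCst, SeqCst)
            .is_ok()
    }

    /// Returns the node that was unlinked and may now be retired,
    /// or `None` if the queue is empty.
    fn pop(&self) -> Option<usize> {
        loop {
            let head = self.head.load(SeqCst);
            let next = self.next[head].load(SeqCst);
            if next == NULL {
                return None;
            }
            if self.head.compare_exchange(head, next, SeqCst, SeqCst).is_ok() {
                // The `head == tail` check happens after the successful CAS.
                // If `tail` still points at the node being removed, swing it
                // forward so the node is unreachable from both `head` and `tail`.
                let tail = self.tail.load(SeqCst);
                if head == tail {
                    let _ = self.tail.compare_exchange(tail, next, SeqCst, SeqCst);
                }
                // Only now would a real implementation call `defer_destroy(head)`.
                return Some(head);
            }
        }
    }
}
```

With a half-finished push in flight (`link_after_tail(1)` done, `tail` still 0), `pop()` returns `Some(0)` and leaves both `head` and `tail` at node 1, so a later pusher can no longer load the retired node from `tail`.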
Use-after-free could be an exploitable security vulnerability. Please make a semver-compatible release with a fix and file a security advisory so that dependent crates would know to upgrade to a fixed version. |
416: Require three epoch advancements for deallocation r=jeehoonkang a=tomtomjhj This patch implements the fix discussed in #238 (comment). * tag the Bag with the local epoch E * free at E+3 Co-authored-by: Jeehoon Kang <jeehoon.kang@sf.snu.ac.kr>
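The two bullet points in this patch description can be sketched as a bag that remembers the epoch it was sealed in and an expiry check against the global epoch. This is only an illustration under assumptions: `SealedBag`, `ADVANCES_BEFORE_FREE` and the use of plain `usize` epochs with wrapping subtraction are hypothetical, not the patch's actual types.

```rust
/// Number of epoch advancements required before a bag may be freed
/// (the "free at E+3" rule from the description above).
const ADVANCES_BEFORE_FREE: usize = 3;

/// Hypothetical bag of deferred destructors, tagged with the local epoch `E`
/// of the thread that sealed it. The deferred functions themselves are omitted.
struct SealedBag {
    epoch: usize,
}

impl SealedBag {
    /// True once the global epoch is at least three steps past the tag.
    /// `wrapping_sub` keeps the comparison meaningful if the counter wraps.
    fn is_expired(&self, global_epoch: usize) -> bool {
        global_epoch.wrapping_sub(self.epoch) >= ADVANCES_BEFORE_FREE
    }
}
```

A bag tagged with epoch 5 is kept at global epochs 5, 6 and 7 and becomes eligible at 8. Other comments in this thread report that this rule alone does not rule out every problematic interleaving.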
pop() must completely unlink the popped node from the shared memory before it calls defer_destroy() to prevent use-after-free. This implementation is based on the variation by Doherty et al. where the `head == tail` check is done after a successful CAS, which can be slightly more efficient than the original version. closes crossbeam-rs#238
Queues that suffer from the problem:
* `MsQueue`
* `SegQueue`
* `channel::unbounded()` (see Potential subtle unsafe memory reclaim #237, thanks to @Pslydhh)
* `Queue` in `crossbeam-epoch`
Consider the following scenario in `MsQueue`...

Suppose the current epoch is `E`. The queue is empty, meaning `head` and `tail` are pointing to the sentinel node (let's call it `node0`).

In other words, `head = node0`, `tail = node0`, `node0.next = null`.

The following happens next...
Thread `T1` gets pinned in epoch `E` and pushes a new value into the queue by allocating a new node `node1` and then it does `node0.next.cas(null, node1)`. The next operation it needs to do is `tail.cas(node0, node1)`, but the thread gets yielded first so `tail` is still `node0`.

The queue is now: `head = node0`, `tail = node0`, `node0.next = node1`, `node1.next = null`.

Thread `T2` gets pinned in epoch `E` and pops a value from the queue with `head.cas(node0, node0.next)`. The CAS succeeds and `head` becomes `node1`. Then we call `defer_destroy(node0)`, which means `node0` becomes garbage marked with epoch `E`. The thread then exits.

The queue is now: `head = node1`, `tail = node0`, `node0.next = node1`, `node1.next = null`.

The global epoch gets advanced to `E+1`.

Thread `T3` gets pinned in epoch `E+1` and attempts to push a new value into the queue. It allocates a new node called `node2`, loads the `tail` pointer and gets `node0`. Then, it would attempt to do `node0.next.cas(null, node2)`, but first it yields...

Thread `T1` resumes, sets `tail` to `node1`, and exits.

The queue is now: `head = node1`, `tail = node1`, `node1.next = null`.

The global epoch is advanced to `E+2`. Garbage from epoch `E` is collected and so `node0` gets deallocated.

Thread `T3` resumes and attempts to continue with `node0.next.cas(null, node2)`, but `node0` has been destroyed. This is use-after-free!

To solve this problem, I propose we just relax the invariant used by `defer`.

Currently, the invariant is: We may use `defer` to destroy an object only if it will become inaccessible when all active guards get dropped.

I propose we change it to: We may use `defer` to destroy an object only if it will become inaccessible when all active guards and future guards overlapping with them get dropped.

In other words, instead of saying "call `defer` if the object is inaccessible" we'd say "call `defer` if it is inaccessible or if it will become inaccessible before all active guards get dropped".

The PR fixing this would simply change the `global_epoch - garbage_epoch >= 2` invariant to `global_epoch - garbage_epoch >= 3`.

cc @Pslydhh @jeehoonkang
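The schedule described in this issue can be replayed deterministically in a toy, single-threaded model. Everything in the sketch below is hypothetical (`Sim`, `Node`, `replay`, arena indices instead of pointers, epoch `E` modeled as 0), and it does not model the rule that pinned threads hold back epoch advancement. It only checks whether `node0` has been freed by the time `T3` resumes, for a given `global_epoch - garbage_epoch >= threshold` rule.

```rust
/// Toy arena node: `next` is an arena index, `freed` marks deallocation.
struct Node {
    next: Option<usize>,
    freed: bool,
}

/// Hypothetical, single-threaded model of epoch-based reclamation.
struct Sim {
    global_epoch: u64,
    threshold: u64,             // free when global_epoch - garbage_epoch >= threshold
    nodes: Vec<Node>,
    garbage: Vec<(usize, u64)>, // (node index, epoch it was retired in)
}

impl Sim {
    fn new(threshold: u64) -> Self {
        // Index 0 is the sentinel `node0`; the starting epoch `E` is 0.
        let sentinel = Node { next: None, freed: false };
        Sim { global_epoch: 0, threshold, nodes: vec![sentinel], garbage: Vec::new() }
    }

    fn alloc(&mut self) -> usize {
        self.nodes.push(Node { next: None, freed: false });
        self.nodes.len() - 1
    }

    fn defer_destroy(&mut self, node: usize, pinned_epoch: u64) {
        self.garbage.push((node, pinned_epoch));
    }

    /// Advance the global epoch by one and free all sufficiently old garbage.
    fn advance(&mut self) {
        self.global_epoch += 1;
        let (now, threshold) = (self.global_epoch, self.threshold);
        let nodes = &mut self.nodes;
        self.garbage.retain(|&(node, retired)| {
            let expired = now - retired >= threshold;
            if expired {
                nodes[node].freed = true;
            }
            !expired
        });
    }
}

/// Replays the schedule; returns true if T3 would touch freed memory.
fn replay(threshold: u64) -> bool {
    let mut sim = Sim::new(threshold);
    let node0 = 0;
    let mut head = node0;
    let mut tail = node0;

    // T1 (pinned in E): links node1 after node0, then yields before moving `tail`.
    let node1 = sim.alloc();
    sim.nodes[node0].next = Some(node1);

    // T2 (pinned in E): pops by moving `head` to node0.next and retires node0.
    let t2_epoch = sim.global_epoch;
    head = sim.nodes[head].next.expect("queue is non-empty");
    sim.defer_destroy(node0, t2_epoch);

    sim.advance(); // global epoch: E+1

    // T3 (pinned in E+1): allocates node2, loads the stale `tail`, then yields.
    let _node2 = sim.alloc();
    let t3_tail = tail;

    // T1 resumes and finally moves `tail` to node1.
    tail = node1;
    assert_eq!((head, tail), (node1, node1));

    sim.advance(); // global epoch: E+2

    // T3 resumes and is about to touch `node0.next`.
    sim.nodes[t3_tail].freed
}
```

In this model `replay(2)` returns `true` (the use-after-free above) and `replay(3)` returns `false`. That only shows that this particular schedule is covered: a comment in this thread reports that freeing at E+3 is still unsafe under a different interleaving.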