Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

relax Rc restrictions and add strong/weak reference counted pointer type #10926

Merged
merged 4 commits into from
Jan 10, 2014
Merged

relax Rc restrictions and add strong/weak reference counted pointer type #10926

merged 4 commits into from
Jan 10, 2014

Conversation

thestinger
Copy link
Contributor

No description provided.

@brson
Copy link
Contributor

brson commented Dec 12, 2013

I'd like to understand the motivation here better.

It looks to me like Strong is exactly the same as Rc except that it can be 'downgraded'. How do strong pointers help prevent cycles?

@thestinger
Copy link
Contributor Author

@brson: If you want to have a tree with parent pointers without any unsafe code, you'll end up with something like this:

struct Node {
    children: ~[Rc<Node>],
    parent: Rc<Node>
}

However, now there's a cycle between the parent and each child, so the entire data structure will leak.

These types allow you to do the following instead:

struct Node {
    children: ~[Strong<Node>],
    parent: Weak<Node>
}

The Weak pointer will only hold onto the box allocation, but the object will be destroyed as soon as the last Strong reference goes out of scope. This prevents a leak because the destructor will do a normal walk down the tree, destroying the weak references as it goes. Using a Weak pointer requires upgrading it back to a Strong reference, and it will return None if the object is dead.

This works for any case where ownership is a directed, acyclic graph. The Weak pointers allow for non-owning cyclic references within the graph. A garbage collector allows you to ignore ownership completely, but you lose deterministic destruction. Until we have a tracing garbage collector this is the only way to prevent leaks from cycles without unsafe.

We could add this functionality to Rc instead, but adding an extra word to the allocation and a branch to the destructor is a undesirable. I'm not sure if it's worth paying that price for simplicity at the moment because having super lightweight reference counting is quite nice and the vast majority of use cases won't need weak references.

@emberian
Copy link
Member

👍

@thestinger
Copy link
Contributor Author

Any news on this? The ability to create cyclic references without leaks, unsafe code or non-deterministic destruction seems like an obvious thing to provide in the standard library.

@pcwalton
Copy link
Contributor

I find this compelling but I would like to discuss with the team at the next meeting.

@brson
Copy link
Contributor

brson commented Dec 19, 2013

If I understand correctly the Strong/Weak pointers don't provide any static guarantees - you still have to structure your code correctly to avoid the cycle. So this patch would leave us with no rc types that avoid cycles statically and two slightly different ones that allow cycles. I'm not sure this is the right balance, nor that we want to provide an abundance of smart pointer types by default - can we not come up with a single refcounted type that works well enough for standard and leave more special cases to the multiverse (extra)?

@brson
Copy link
Contributor

brson commented Dec 19, 2013

It's on the agenda for the next meeting, but that's not until 1/7. Maybe we can resolve this earlier.

@thestinger
Copy link
Contributor Author

@brson: The Strong type could replace the Rc type, if we don't care about the performance hit. It's not going to be huge, just the extra word in the allocation often bumping it into the next size class (2x memory usage) and an extra inner branch in the destructor (only hit when it's actually destroying - not a big deal).

@brson
Copy link
Contributor

brson commented Dec 19, 2013

We could possibly keep the size down by making the ref counts smaller, make them both u32 possibly.

I'm worried about how difficult it will be to correctly structure programs using Strong/Weak pointers to avoid cycles. If we made Strong a unique type and only allowed weak pointers it would be impossible to create strong cycles.

@thestinger
Copy link
Contributor Author

@brson: If Strong is unique, it would allow you to represent trees but not directed acyclic graphs. Objective-C and C++ both use reference counting with explicit weak pointers in this way, and in well structured programs it's not hard to prevent cycles. If you're using reference counting as a way to to ignore ownership, then it's not going to turn out well. For that case, we will eventually have a garbage collector.

@thestinger
Copy link
Contributor Author

By the way, I don't really think using u32 will work because then it will need branches to check for overflow which will be quite slow.

@Thiez
Copy link
Contributor

Thiez commented Dec 29, 2013

Why check for overflow with u32? It will only overflow when you have over 4294967295 objects referring to the same object, which is not even possible on a 32-bit system, and on a 64-bit system would require 32GB worth of pointers (pointer size = 8 bytes, need 2 ^ 32 pointers) (all of which would refer to the same object!). The risk of reference count overflow seems entirely acceptable to me, unless I'm missing something important?

@thestinger
Copy link
Contributor Author

@Thiez: You're missing that overflow causes memory unsafety. It's not acceptable for memory safety to be compromised...

@Thiez
Copy link
Contributor

Thiez commented Dec 29, 2013

Except that the case where memory unsafety occurs requires a program that uses over 32GB of RAM, and that is when it does absolutely nothing but attempt to create a far-fetched scenario and has no allocation overhead. If we assume those 4.3 billion object conspiring to make the refcount overflow also contain some useful data suddenly we need over 64GB just to have a chance of this scenario to occur. Perhaps use u32 for refcounts in the rust "desktop and embedded device edition" and u64 in the "enterprise server edition"? 😉

@thestinger
Copy link
Contributor Author

An infinite loop or infinite recursion is not a far-fetched scenario and is a perfectly viable way to exploit a Rust application. If you want to use u32 you need a branch, safety isn't negotiable in Rust. Servers with terabytes of memory exist today.

@liigo
Copy link
Contributor

liigo commented Jan 3, 2014

u32 for refcounts is sufficient

@thestinger
Copy link
Contributor Author

@liigo: It's not sufficient because it breaks memory safety on overflow.

@brson
Copy link
Contributor

brson commented Jan 7, 2014

I don't see how u32's require overflow checks but uintptr's don't.

@huonw
Copy link
Member

huonw commented Jan 7, 2014

Each pointer takes up at least one byte, so if you had enough clones of an Rc's to overflow a uintptr, you'd've filled up your entire addressable memory (and more) with them.

@emberian
Copy link
Member

emberian commented Jan 7, 2014

@brson one cannot overflow a uint without filling the entire address space
with Rc objects. It's trivial to do so with only u32's on any > 32bit arch.

On Mon, Jan 6, 2014 at 9:09 PM, Brian Anderson notifications@git.luolix.topwrote:

I don't see how u32's require overflow checks but uintptr's don't.


Reply to this email directly or view it on GitHubhttps://github.com//pull/10926#issuecomment-31707595
.

@emberian
Copy link
Member

emberian commented Jan 7, 2014

That is, the overflow check is a catastrophic failure.

On Mon, Jan 6, 2014 at 9:13 PM, Corey Richardson corey@octayn.net wrote:

@brson one cannot overflow a uint without filling the entire address space
with Rc objects. It's trivial to do so with only u32's on any > 32bit arch.

On Mon, Jan 6, 2014 at 9:09 PM, Brian Anderson notifications@git.luolix.topwrote:

I don't see how u32's require overflow checks but uintptr's don't.


Reply to this email directly or view it on GitHubhttps://github.com//pull/10926#issuecomment-31707595
.

@brson
Copy link
Contributor

brson commented Jan 7, 2014

In the meeting today we decided to get rid of the current Rc type and rename this Strong pointer to Rc, So we'll have rc::Rc and rc::Weak. https://github.com/mozilla/rust/wiki/Meeting-weekly-2013-01-07#rc-ptrs

@brson
Copy link
Contributor

brson commented Jan 7, 2014

Thanks @strcat

@alexcrichton
Copy link
Member

I canceled this build because I don't believe that the change to introduce and require a NonManaged bound has been discussed enough. I am unclear on why this bound is necessary, and it seems awfully unfortunate to introduce a whole new kind bound for this one use case.

If this is a pre-emptive fix for a future GC, then I do not think it should be added at this time. If this is a soundness requirement for the current GC, I'm curious for a code example which breaks in today's GC.

@alexcrichton
Copy link
Member

Also, would you mind rebasing the last few commits together?

@thestinger
Copy link
Contributor Author

I would prefer leaving off the ptr_eq methods and just having people use ref_eq(x.borrow(), y.borrow()), but other than that it should be good to go now.

bors added a commit that referenced this pull request Jan 9, 2014
@alexcrichton
Copy link
Member

Thanks for looking into whether NonManaged was necessary!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants