Switch from RAII to OBRM Terms, expand and improve parts (#322) #323

Draft
wants to merge 12 commits into
base: main
from
2 changes: 1 addition & 1 deletion .gitignore
@@ -1,3 +1,3 @@
# Generated output of mdbook
/book
book
.DS_Store
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -104,7 +104,7 @@ A good principle: "Work together, share ideas, teach others."

### Important Note

Please **don't force push** commits in your branch, in order to keep commit
Please __don't force push__ commits in your branch, in order to keep commit
history and make it easier for us to see changes between reviews.

Make sure to `Allow edits of maintainers` (under the text box) in the PR so
2 changes: 1 addition & 1 deletion SUMMARY.md
@@ -26,7 +26,7 @@
- [Command](./patterns/behavioural/command.md)
- [Interpreter](./patterns/behavioural/interpreter.md)
- [Newtype](./patterns/behavioural/newtype.md)
- [RAII Guards](./patterns/behavioural/RAII.md)
- [Resource management with OBRM (RAII)](./patterns/behavioural/RAII.md)
- [Strategy](./patterns/behavioural/strategy.md)
- [Visitor](./patterns/behavioural/visitor.md)
- [Creational](./patterns/creational/intro.md)
4 changes: 4 additions & 0 deletions idioms/dtor-finally.md
@@ -1,5 +1,9 @@
# Finalisation in destructors

<!-- I'm not sure this is idiomatic Rust; usually one would want to handle that in the types themselves. -->
<!-- Doesn't draw comparisons to `defer` in e.g. Go or Zig. -->
<!-- Is unnecessarily verbose: IF one wanted to do that, one could simply define a `Defer` type that holds a closure OR a function pointer that gets executed on drop. -->
<!-- There are also crates that aim to implement this via macros; I expect they use something like the above or below. -->
Comment on lines +3 to +6
Collaborator

I think this should go into a separate issue.

Author

Yeah, I just noticed that while I was there, and I neither had the time nor was sure how to approach it. But a separate issue for that is totally fine by me.

## Description

Rust does not provide the equivalent to `finally` blocks - code that will be
75 changes: 58 additions & 17 deletions patterns/behavioural/RAII.md
@@ -1,15 +1,27 @@
# RAII with guards

# Resource management with OBRM

<!-- TODO:
* wayback
* clear up dtor finally discussion
* execute and apply lints (contributing.md)
* check unlinted formatting
-->
## Description

[RAII][wikipedia] stands for "Resource Acquisition is Initialisation" which is a
terrible name. The essence of the pattern is that resource initialisation is done
in the constructor of an object and finalisation in the destructor. This pattern
is extended in Rust by using an RAII object as a guard of some resource and relying
on the type system to ensure that access is always mediated by the guard object.
"Ownership Based Resource Management" (OBRM) - also known as ["Resource Acquisition is Initialisation" (RAII)][wikipedia] - is an idiom meant to make handling resources easier and less error-prone.

In essence, an object serves as a proxy for a resource: to create the object you have to acquire the resource, and once the object is no longer used - determined by it becoming unreachable - the resource is released.
The object is said to guard access to the resource.
Contributor

I'd highlight the "O" in "OBRM" a bit more: the object "owns" the resource and thus is responsible for releasing it. But that may be personal preference.


The language supports this idiom by automatically inserting calls to the releasing code at the points where the object becomes unreachable.
The method that releases the resource is generally referred to as the destructor; in Rust, [drop][Drop::drop] serves that role.
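
A minimal sketch of the idea, assuming a hypothetical `Connection` type whose "resource" is only an entry in a shared log (none of these names come from the standard library or from the original text). Acquisition happens in the constructor, and the release code in `drop` runs when the owning variable goes out of scope:

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical resource handle: records acquisition and release in a shared log.
struct Connection {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Connection {
    /// Creating the object is the only way to acquire the (pretend) resource.
    fn open(name: &'static str, log: Rc<RefCell<Vec<String>>>) -> Self {
        log.borrow_mut().push(format!("acquire {name}"));
        Connection { name, log }
    }
}

impl Drop for Connection {
    /// Called automatically when the owner goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("release {}", self.name));
    }
}

/// Runs a small scenario and returns the recorded events in order.
fn run_scenario() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _outer = Connection::open("outer", Rc::clone(&log));
        {
            let _inner = Connection::open("inner", Rc::clone(&log));
        } // `_inner` is dropped here
        log.borrow_mut().push("between scopes".to_string());
    } // `_outer` is dropped here
    let events = log.borrow().clone();
    events
}

fn main() {
    for event in run_scenario() {
        println!("{event}");
    }
}
```

Note that no call to `drop` appears in `run_scenario`; the compiler inserts it at the end of each scope.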

## Example

OBRM is used to manage heap memory in Rust, determining when to free it.
`Box` and `Rc` are classical examples of that.
But most users will have closer contact with OBRM when managing other aspects.
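
As a small illustration of the heap-memory case (the `observe` function and its variable names are made up for this sketch), `Rc` frees its allocation once the last owner is dropped, which a `Weak` pointer can observe:

```rust
use std::rc::{Rc, Weak};

/// Returns the strong count while shared, the count after one owner is dropped,
/// and whether the value is gone after the last owner is dropped.
fn observe() -> (usize, usize, bool) {
    let first = Rc::new(String::from("shared"));
    // A `Weak` pointer does not keep the value alive; it only lets us check on it.
    let watcher: Weak<String> = Rc::downgrade(&first);
    let second = Rc::clone(&first);
    let while_shared = Rc::strong_count(&first);
    drop(second);
    let after_one_drop = Rc::strong_count(&first);
    drop(first); // last owner gone: the `String` is freed here
    (while_shared, after_one_drop, watcher.upgrade().is_none())
}

fn main() {
    let (while_shared, after_one_drop, freed) = observe();
    println!("{while_shared} {after_one_drop} {freed}");
}
```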

Mutex guards are the classic example of this pattern from the std library (this
is a simplified version of the real implementation):

@@ -71,24 +83,31 @@ fn baz(x: Mutex<Foo>) {

## Motivation

Where a resource must be finalised after use, RAII can be used to do this
finalisation. If it is an error to access that resource after finalisation, then
this pattern can be used to prevent such errors.
Often a user will not need to implement [Drop::drop] themselves, because the OBRM-Objects provided by the standard library or by the crates in use already cover their needs.

<!-- TODO that feels sluggish -->
Implementing it yourself is mostly helpful when managing external resources, for example when communicating with external systems, or when implementing your own resource types.

## Advantages

Prevents errors where a resource is not finalised and where a resource is used
after finalisation.

## Disadvantages

OBRM ensures correctness through implicit behavior that isn't visible in the source code (one needs to be aware that the object in question uses OBRM). It can also be difficult to implement in some complex situations, for example when resources are acquired and released in bulk, as in performance-critical code, or in code sections that must not fail, since resource acquisition is often fallible.

OBRM interaction with asynchronous code can also [be unexpected][Documentation of tokios Mutex].

## Discussion

RAII is a useful pattern for ensuring resources are properly deallocated or
finalised. We can make use of the borrow checker in Rust to statically prevent
errors stemming from using resources after finalisation takes place.
OBRM is a useful pattern for ensuring resources are properly handled.
The borrow checker in Rust will statically prevent
errors stemming from using resources after the resource has been released.

The core aim of the borrow checker is to ensure that references to data do not
outlive that data. The RAII guard pattern works because the guard object
contains a reference to the underlying resource and only exposes such
outlive that data. The OBRM guard pattern works because the guard object
Contributor

In the sections above you say "OBRM-Object". I'd stick to that one.

Author

It was an attempt to avoid rewriting the sections below, and also to introduce the guard terminology, which is also used in Rust's stdlib (e.g. with the mutex guard).

acts as a proxy to the underlying resource and enables access only via
references. Rust ensures that the guard cannot outlive the underlying resource
and that references to the resource mediated by the guard cannot outlive the
guard. To see how this works it is helpful to examine the signature of `deref`
@@ -108,6 +127,25 @@ Note that implementing `Deref` is not a core part of this pattern, it only makes
using the guard object more ergonomic. Implementing a `get` method on the guard
works just as well.
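
A minimal sketch of the `get` variant, assuming a hypothetical single-threaded `Locked`/`Guard` pair that is far simpler than the real `Mutex` (all names here are made up for illustration). The reference returned by `get` borrows from the guard, so it cannot outlive the guard, just as with `deref`:

```rust
use std::cell::Cell;

/// Hypothetical single-threaded "lock" around a value, for illustration only.
struct Locked<T> {
    value: T,
    held: Cell<bool>,
}

/// Guard object: while it exists, the lock counts as held.
struct Guard<'a, T> {
    owner: &'a Locked<T>,
}

impl<T> Locked<T> {
    fn new(value: T) -> Self {
        Locked { value, held: Cell::new(false) }
    }

    /// Acquires the lock; returns `None` if it is already held.
    fn try_lock(&self) -> Option<Guard<'_, T>> {
        if self.held.replace(true) {
            None
        } else {
            Some(Guard { owner: self })
        }
    }

    fn is_held(&self) -> bool {
        self.held.get()
    }
}

impl<'a, T> Guard<'a, T> {
    /// Plain accessor used instead of `Deref`.
    fn get(&self) -> &T {
        &self.owner.value
    }
}

impl<T> Drop for Guard<'_, T> {
    /// Releasing the lock is tied to the guard going away.
    fn drop(&mut self) {
        self.owner.held.set(false);
    }
}

/// Returns: length read through the guard, lock state while guarded,
/// whether a second lock attempt failed, lock state after the guard is dropped.
fn demo() -> (usize, bool, bool, bool) {
    let locked = Locked::new(String::from("resource"));
    let guard = locked.try_lock().expect("lock is free");
    let len = guard.get().len();
    let held_while_guarded = locked.is_held();
    let second_attempt_failed = locked.try_lock().is_none();
    drop(guard);
    let held_after_drop = locked.is_held();
    (len, held_while_guarded, second_attempt_failed, held_after_drop)
}

fn main() {
    println!("{:?}", demo());
}
```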

When compared with RAII in C++, there are a few significant differences:

* C++ code often interfaces with C code or with code in older styles that doesn't use RAII. Rust code does so much less often, and for a few reasons one often just pulls in a crate that already encapsulates the API. So it's far less common to implement OBRM yourself.
* C++ doesn't have a borrow checker, so code using RAII cannot achieve the same combination of safety and ergonomics.
* Perhaps most importantly, Rust has different semantics when it comes to moving and copying values; this is expanded on below.

C++ has complex rules for copying and moving values, which Rust managed to simplify while keeping most of the advantages.
In C++, the behavior on a "move" (which is semantically meant to signify passing held resources to the moved-to value) is customizable through a class's move constructor and move assignment operator.
But after a variable has been "moved out of", it must still be accessible in C++.
Contributor

This paragraph is not exactly wrong, but an imo important thing to understand when working with C++ is that the rules in the language are actually not that incredibly complex when it comes to object lifetimes. But they are from another era, and the choices back then had some unfortunate consequences, some of which you detail on further down.
Also, the whole "move" in C++ is really more a convention than a concept that exists in the language. And so you end up with zombie objects which are left in an "undefined-but-recoverable" state (simplified for educational purposes) after their contents were moved to another instance.

Author

You are correct, the rules are not really that complex. What is complex is executing it.

As you say, because moves are more of a convention in C++, which more or less mostly just added a move operation and let the developers implement it, you get undefined states for many objects even in the standard lib, with a few exceptions where behavior is defined in the standard, like unique_ptr AFAIK.

Contributor

Not exactly sure but I think even for unique_ptr the standard doesn't mandate that the object moved from is in a defined state but only that a call to std::unique_ptr::reset() will transition it to a defined state. But I imagine any sane implementation will leave a moved-from std::unique_ptr in a reset-state, already.

Author

@9SMTM6 9SMTM6 Oct 18, 2022

We are getting a bit sidetracked;-P.

I could rewrite that section to say something along the lines of "copy and move in C++ are just operations with connected conventions, which must be upheld across the respective copy (assignment) constructor and move (assignment) constructor and might also need adjustments to the destructor", which AFAIK would be the correct description but seems fairly complex right now, and more importantly, AFAIK isn't really how people usually think about copy and move.

Other suggestions, or do you think that suggestion is fine?

Contributor

No, I think your change is perfectly fine.

In Rust, a moved-out-of variable can not be used, only reassigned a new value (this is referred to as "destructive move"), and the behavior on a move is not customizable, instead a move simply copies the bytes of the moved-out value into the moved-into variable, and ensures the semantics of a destructive move.
Contributor

AFAIU you don't reassign a value to a variable moved away. You bind something new to a name which now happens to be unused. One consequence is that the new thing doesn't need to have the same type.

Also, I'd split this sentence into two, e.g.

Suggested change
In Rust, a moved-out-of variable can not be used, only reassigned a new value (this is referred to as "destructive move"), and the behavior on a move is not customizable, instead a move simply copies the bytes of the moved-out value into the moved-into variable, and ensures the semantics of a destructive move.
In Rust, a moved-out-of variable can not be used (this is referred to as "destructive move"), though the name of that variable can be re-used.
The behavior on a move is not customizable, instead a move simply copies the bytes of the moved-out value into the moved-into variable, and ensures the semantics of a destructive move.

Author

@9SMTM6 9SMTM6 Oct 17, 2022

AFAIU you don't reassign a value to a variable moved away. You bind something new to a name which now happens to be unused. One consequence is that the new thing doesn't need to have the same type.

I'm pretty sure you can reuse a variable that was moved out of, and I have done so. In that case the new value has to have the same type, and it'll use the same memory, which is the purpose of that rule (and probably the reason C++ requires that an object still be usable after being moved out of); Rust found a somewhat more elegant way.

If you only want to reuse the name, then yes, you can create a new let binding shadowing the variable, but that's not what I'm referring to.

But perhaps that's a sign that this should be made more explicit?

Otherwise splitting the sentence is a good suggestion.

Contributor

I think we're talking past each other. Are you referring to variables implementing Copy? Then I agree with you. But OBRM-Objects do, virtually by definition, implement Drop and thus not Copy.

I thought you meant creating a new binding using the same name. In this case you may end up reusing the memory previously occupied by the moved value, but that being specified in the language would be news to me.

Author

@9SMTM6 9SMTM6 Oct 18, 2022

Yes, we probably are. Here's some code to make sure we understand each other now:

struct MoveSemantics {
    field: String,
}

struct WhateverElse(u32);

fn main() {
    let mut to_move_out_of = MoveSemantics {
        field: String::from("Whatever"),
    };
    // Move the value out of `to_move_out_of`;
    // afterwards it is illegal to access the variable.
    let moved_to = to_move_out_of;
    // Compile error (use of moved value) if uncommented:
    // let attempt_access = to_move_out_of.field;
    // Reassigning the variable is legal and uses the same memory
    // (not for the string, which is a separate allocation, but even if
    // `to_move_out_of` were on the heap it would be reused nonetheless, AFAIK).
    // This might be beneficial in some situations.
    // The new value needs to be of the same type, of course.
    to_move_out_of = MoveSemantics {
        field: String::new(),
    };
    // Shadowing the variable name; this doesn't use the same memory.
    let to_move_out_of = WhateverElse(2);
}

I think I used the right terminology here: the variable is the name and refers to the memory, the value is what's written in the memory and, abstractly, also the connected resources, and you can define a new variable which has the same name as an old one, which would be what you described above, as far as I understood it.

Author

@9SMTM6 9SMTM6 Oct 18, 2022

I was referring to the reassignment of the variable.

The reason I felt the need to mention that possibility is that it's the one "usage" of the variable that is allowed; AFAIK all others are forbidden. So for completeness it's required, unless we find another formulation that covers only "usages" of a variable other than reassignment. Also, some C++ programmers might be looking for something like that, and might otherwise wrongly think you can't do it.

We could say that's not worth the effort and strike it, or we expand on it, or we keep it as-is and hope people understand it correctly.

Contributor

@neithernut neithernut Oct 18, 2022

Just as I thought, I misinterpreted what you wrote. I think we're on the same page now.
The only case where you can reuse a variable after passing it by value is if its type implements Copy, so maybe the following would work?

Suggested change
In Rust, a moved-out-of variable can not be used, only reassigned a new value (this is referred to as "destructive move"), and the behavior on a move is not customizable, instead a move simply copies the bytes of the moved-out value into the moved-into variable, and ensures the semantics of a destructive move.
In Rust, a moved-out-of variable can not be used (this is referred to as "destructive move") unless its type implements [Copy].
The behavior on a move is not customizable, instead a move simply copies the bytes of the moved-out value into the moved-into variable, and ensures the semantics of a destructive move.

The hint at Copy should be sufficient imo. C++ people new to rust will probably look it up.

Collaborator

@simonsan simonsan Oct 18, 2022

C++ people new to rust will probably look it up.

Please keep in mind that this book is not aimed at only either C++ developers or experienced programmers. We want to try to keep it as inclusive as possible, also to newcomers from other languages. I say that, because if there is a term, that C++ developers need to look up, it's probably good to at least try to explain it a bit or use a less complex explanation/go less into detail.

I haven't had much time the last days to review more of this article, but it's good that others did and you together keep on working on it. I will probably manage to read over it (in terms of reviewing) beginning of next week.


<!-- TODO this should be improved, I find it difficult to separate the creation and management of RAII Objects in the **-constructor - so at declaration time - from the one when using the RAII object. Feedback welcome. -->
This massively simplifies the creation and management of OBRM Objects compared to C++. There, one often has to do a lot more manual management of RAII classes, defining the `destructor`, the `copy constructor`, the `copy assignment operator`, the `move constructor` and the `move assignment operator` all at once, which is very error-prone. RAII objects in C++ also have to have a legal moved-out state, which often makes using these classes more problematic.
For example, `unique_ptr`, the C++ standard library type that serves a similar purpose to `Box`, can contain `nullptr`.
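
For contrast, a rough sketch of the Rust side (the `take_box` helper is made up for this illustration): a `Box` always points at a value, and a possibly-empty box has to be spelled out as `Option<Box<T>>`, so the "moved-out" state is visible in the type:

```rust
use std::mem::size_of;

/// Moves the box out of `slot`, leaving an explicit `None` behind.
fn take_box(slot: &mut Option<Box<i32>>) -> Option<Box<i32>> {
    slot.take()
}

fn main() {
    let mut slot = Some(Box::new(7));
    let taken = take_box(&mut slot);

    // The value now lives in `taken`; the old slot is visibly empty.
    assert_eq!(taken.as_deref(), Some(&7));
    assert!(slot.is_none());

    // The empty state costs no extra space: `None` is stored as the null
    // pointer that a plain `Box` can never hold.
    assert_eq!(size_of::<Option<Box<i32>>>(), size_of::<Box<i32>>());
}
```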

Rust also moves values by default. This can be opted out of by explicitly calling `Clone::clone` on each assignment, or at the type level by implementing `Copy`.
It is currently forbidden, and that is expected to continue, to implement `Copy` on a type that implements `Drop` or contains a type that implements `Drop`.
This means that resource acquisition in Rust is a lot more explicit than in C++, as it cannot happen during a simple assignment, as it can in C++.
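
A small sketch of that difference, assuming a hypothetical `Handle` type whose `live` counter stands in for the number of acquired resources (the names are illustrative only). A plain assignment moves the handle without acquiring anything; only the explicit `clone` call does:

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical handle; `live` counts how many handles currently exist.
struct Handle {
    live: Rc<Cell<u32>>,
}

impl Handle {
    fn acquire(live: &Rc<Cell<u32>>) -> Self {
        live.set(live.get() + 1);
        Handle { live: Rc::clone(live) }
    }
}

impl Clone for Handle {
    /// Cloning acquires a second resource, and only happens on an explicit call.
    fn clone(&self) -> Self {
        Handle::acquire(&self.live)
    }
}

impl Drop for Handle {
    fn drop(&mut self) {
        self.live.set(self.live.get() - 1);
    }
}

// Adding `#[derive(Copy)]` to `Handle` would be rejected by the compiler,
// because the type implements `Drop`.

/// Returns the live count after a move, after a clone, and after all drops.
fn live_counts() -> (u32, u32, u32) {
    let live = Rc::new(Cell::new(0));
    let first = Handle::acquire(&live);
    let moved = first; // a move: nothing is acquired, `first` is no longer usable
    let after_move = live.get();
    let copy = moved.clone(); // explicit: acquires a second resource
    let after_clone = live.get();
    drop(moved);
    drop(copy);
    (after_move, after_clone, live.get())
}

fn main() {
    println!("{:?}", live_counts());
}
```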
Contributor

You more or less captured this already:
In C++, objects are expected to stay where they are (in memory), but you end up copying objects every now and then (and knowing when it's acceptable is important for a C++ dev). Forbidding that is often an explicit choice, and moving is also often more explicit.
In Rust, values tend to move around and if they are simple enough (like a Plain Old Datatype in C/C++), they can impl Copy. If they are not you have to impl Clone in order to multiply an instance, and cloning is always explicit (you have to call Clone::clone).

Author

I believe if we address the points above we might integrate this with the section above about (currently) "complex C++ rules", but as it is this section still holds new information as far as I can see.

I'd defer this until we've cleared up the parts above.


## See also

[Finalisation in destructors idiom](../../idioms/dtor-finally.md)
@@ -117,5 +155,8 @@ RAII is a common pattern in C++: [cppreference.com](http://en.cppreference.com/w

[wikipedia]: https://en.wikipedia.org/wiki/Resource_Acquisition_Is_Initialization

[Style guide entry](https://doc.rust-lang.org/1.0.0/style/ownership/raii.html)
(currently just a placeholder).
[Drop::drop]: https://doc.rust-lang.org/std/ops/trait.Drop.html#tymethod.drop

[Documentation of tokios Mutex]: https://docs.rs/tokio/latest/tokio/sync/struct.Mutex.html#which-kind-of-mutex-should-you-use

Rustdoc to std::marker::Copy explaining why [Copy forbids implementing Drop]: <https://doc.rust-lang.org/std/marker/trait.Copy.html#when-cant-my-type-be-copy>