Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify and make Request safer #18

Merged
merged 6 commits into from
May 20, 2017
Merged

Conversation

Rufflewind
Copy link
Contributor

@Rufflewind Rufflewind commented Mar 27, 2017

Created two kinds of Scope objects:

  • LocalScope to track Requests with arbitrary lifetime 'a
  • StaticScope (no bookkeeping, but only 'static Requests)

The Request trait has been squashed into a single Request<S> data type, where S is the type of scope that this Request is associated with.

Fixes #17.


  • Generalize methods over Scope.
  • There's no way to transfer from an outer LocalScope to an inner LocalScope.
  • It would be nice to generalize Buffer to allow things like Rc, which would be compatible with StaticScope. (WIP: Rufflewind:bindgen-hack...Rufflewind:buffer)

@Rufflewind Rufflewind force-pushed the bindgen-hack branch 3 times, most recently from e0e1940 to 6cbe9d5 Compare March 27, 2017 08:01
Copy link
Collaborator

@bsteinb bsteinb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! Just a few nits:

  • I would prefer to separate the inactive request stuff out into another PR.
  • Requests should be deregistered from their scope when the associated operations are completed via explicitly calling a method or wrapping the request in a completion guard. This actually makes some of the examples fail. Unfortunately I forgot to return a non-zero exit status from one of the CI scripts, so it does not show up on Travis.

@Rufflewind
Copy link
Contributor Author

Rufflewind commented Mar 27, 2017

Some questions on design:

  • What should the semantics of dropping a Request be? Panic or wait? In 6cbe9d5 I actually did neither: I simply allowed it to be handled by the Scope, but this would be unsafe if owned buffers are attached a request (if we ever move toward that design). Edit: went with panic for the time being, although now the code is messier.
  • Similarly, what should be the semantics of forgetting a Request be? Panic or wait? In 6cbe9d5 I chose to wait. If we instead panicked, then the implementation of LocalScope could be simplified to just a reference counter instead of a HashMap.

@Rufflewind Rufflewind force-pushed the bindgen-hack branch 3 times, most recently from 63882b8 to dda0458 Compare March 28, 2017 01:40
@Rufflewind
Copy link
Contributor Author

Rufflewind commented Mar 28, 2017

Changes since yesterday:

  • Removed InactiveRequest.
  • Make Request panic on drop (restore old behavior for now).
  • Refactored request.rs a bit (hopefully I didn't forget to unregister things this time?)
  • Added 'a to Scope trait, which allows …
  • … generalization of point_to_point.rs and collective.rs methods over Scope.
  • Edit: Added shrink_scope_to.

@Rufflewind Rufflewind force-pushed the bindgen-hack branch 6 times, most recently from 0ee0a4e to b513eec Compare March 28, 2017 04:08
@Rufflewind
Copy link
Contributor Author

Rufflewind commented Mar 28, 2017

So I started working on a separate branch to generalize buffer types to support 'static buffers like Vec or Rc. That way, they can use StaticScope and Vec<f64> and not have to worry about LocalScopes. It's a WIP, and the names are not finalized yet (in particular, it's intended to replace the original Buffer and BufferMut traits).

  • Does this look interesting to you, or should I abandon this idea?

@Rufflewind Rufflewind changed the title [WIP] Simplify and make Request safer Simplify and make Request safer Mar 28, 2017
@bsteinb
Copy link
Collaborator

bsteinb commented Mar 28, 2017

Regarding the questions about the semantics of dropping or forgetting requests:

  • I think the semantics for dropping can remain the same for now, so panic. And then provide the WaitGuard and CancelGuard to opt into different behavior.
  • As for the semantics of forgotten requests that have not yet been deregistered when exiting from a scope: maybe it would make sense to mirror the choice of cleanup policy that is provided by the WaitGuard etc. here by parameterising Scope over another set of types WaitForgotten, CancelForgotten and PanicForgotten.

I am starting to wonder whether it would make sense to have the requests be owned by the scope. This would probably simplify all this cleaning up, but I think it would take away the ability to enforce a certain order of request completion.

@bsteinb
Copy link
Collaborator

bsteinb commented Mar 28, 2017

I must admit I am a bit confused by the refactoring in requests.rs. The former design where WaitGuard and CancelGuard were just wrappers around Request that implemented a different Drop and all of the logic about wait()ing and cancel()ing remained in Request seemed simpler. Is there a fundamental reason why this had to change that I am not seeing right now?

@bsteinb
Copy link
Collaborator

bsteinb commented Mar 28, 2017

Your other work definitely sounds interesting and I am grateful that you are investing time into this project. For the moment, I would like to focus on this PR first and move to the generalisation of buffers afterwards, if you have the patience. I am currently also juggling a regular day job, a move and a newborn, so time and mental space is scarce.

@Rufflewind
Copy link
Contributor Author

Rufflewind commented Mar 29, 2017

As for the semantics of forgotten requests that have not yet been deregistered when exiting from a scope …

IMO I think parametrizing LocalScope would add a lot of extra complexity for what is supposed to be just a safety net. It might be better to stick to panicking:

  • It's consistent with Request::drop.
  • It allows for a simple and efficient implementation of Scope using an integer counter (vs a full HashMap currently).
  • LocalScope::drop should only trigger if the user forgets a Request or accidentally builds a cycle with Rc, so panicking is a way to tell the user to fix their code. This is similar to how RefCell panics if you borrow_mut() twice – it's a run-time safety check.

Is there a fundamental reason why this had to change that I am not seeing right now?

When I tried it, it seemed that drop is called first on the outer wrapper, and then on the inner objects. This means if I do struct WaitRequest(Request), the Request destructor will be unconditionally called after WaitRequest's destructor.

@Rufflewind Rufflewind force-pushed the bindgen-hack branch 2 times, most recently from 970fa75 to 55a6b91 Compare March 29, 2017 00:07
Created two kinds of Scope objects:

  - LocalScope<'a> to track Requests with arbitrary lifetime 'a
  - StaticScope (no bookkeeping, but only 'static Requests)

The Request trait has been squashed into a single Request<'a, S> data
type, where S is the type of scope that this Request is associated with
and 'a describes the lifetime of this scope.  Any buffers associated
with the request must outlive 'a.

Fixes # 17.
This simplifies the implementation and efficiency of LocalScope
significantly and can reduce the risk of an accidental deadlock.
@bsteinb
Copy link
Collaborator

bsteinb commented Mar 29, 2017

When I tried it, it seemed that drop is called first on the outer wrapper, and then on the inner objects. This means if I do struct WaitRequest(Request), the Request destructor will be unconditionally called after WaitRequest's destructor.

True, this is why WaitGuard etc. in the original design are struct WaitGuard(Option<Request>) and drop() .take()s the value out of the Option to perform a wait or cancel on it. This way, by the time the drop() of the Option<Request> is invoked, it contains None and is reduced to a no-op.

It may be somewhat redundant to wrap the Request in an Option, since the underlying raw type MPI_Request has a defined null value, but I would say this is an optimisation.

@Rufflewind
Copy link
Contributor Author

Are there any changes that need to be made to complete this PR?

@bsteinb
Copy link
Collaborator

bsteinb commented May 18, 2017

Not necessarily. I would still like to see the wait() and cancel() logic moved out of the Guards and back into Request as I described in my previous comments, but I would be willing to just merge this and then work on that myself, if it is alright with you. I was just reluctant to just merge your work and then go and rewrite it.

@Rufflewind
Copy link
Contributor Author

Rufflewind commented May 19, 2017

Okay I’ve moved the logic into Request. I originally tried putting it in RawRequest to avoid the Option but that just made other parts more complicated. You can squash/rebase them if you want.

Having drop(&mut self) instead of drop(self) (rust-lang/rust#4330) made this whole thing really awkward.

@bsteinb
Copy link
Collaborator

bsteinb commented May 19, 2017

Wow, again, thank you so much. Looking again at my original implementation, I notice that I misremembered and my Guards did not actually use the wait() and test() methods of the Request trait, so thanks for going the extra mile there.

I also noticed just now, that my implementation of cancel() was completely unsafe even without forget()ting requests. Once more, good catch.

I have one more question which I will leave as an in-line comment immediately.

Having drop(&mut self) instead of drop(self) (rust-lang/rust#4330) made this whole thing really awkward.

Yes, I remember having the exact same reaction. It reminds me of the C++ move semantics. "Yep, this value is yours. If you want to trash it, go ahead. What? No, you don't actually get to have the value. Here, have this reference instead."

src/request.rs Outdated

/// Wait for the request to finish and unregister the request object from its scope.
///
/// This is unsafe because afterward the request is left in an unspecified but droppable state.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a leftover from an earlier version? The method is not marked unsafe and the Rust Request type is deconstructed by the call to into_raw() so I do not get what the unspecified but droppable state is?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this was from back when it took &mut self instead of self.

@bsteinb
Copy link
Collaborator

bsteinb commented May 20, 2017

Ugh, apparently I had not figured out how these review things work. I left the in-line comment yesterday, but only noticed today that I needed to submit the whole review. Sorry about that.

@Rufflewind
Copy link
Contributor Author

Should be fixed now!

@bsteinb bsteinb merged commit 69e3111 into rsmpi:bindgen-hack May 20, 2017
@bsteinb
Copy link
Collaborator

bsteinb commented May 20, 2017

Great, let's get this merged!

@Rufflewind Rufflewind deleted the bindgen-hack branch May 24, 2017 01:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants