Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lifecycle design #335

Draft
wants to merge 11 commits into
base: gh-pages
Choose a base branch
from
Draft

Lifecycle design #335

wants to merge 11 commits into from

Conversation

alsora
Copy link
Contributor

@alsora alsora commented Aug 17, 2023

No description provided.

@clalancette
Copy link
Contributor

@SteveMacenski Since Nav2 is a heavy user of lifecycle, would you mind giving this a look? We'd be very interested in your feedback.

@SteveMacenski
Copy link
Contributor

It'll be next week at this point for me, but yes I will! The only thing I see immediately that I would ask a question about is the throwing of exceptions. Who, if anyone, is able to catch that error in the state transition? It would be unacceptable if that was uncatchable by the user / rclcpp such that the server crashed.

@alsora alsora marked this pull request as draft August 24, 2023 18:01
@alsora
Copy link
Contributor Author

alsora commented Aug 24, 2023

@clalancette @SteveMacenski This wasn't supposed to be a public-facing PR, but rather to first go through a last round of internal reviews (I should have targeted our fork rather than the official repo)

I won't take it down but I converted it to draft.


To transition out of a primary state requires action from an external supervisory process, with the exception of an error being triggered in the `Active` state.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this remark has value to keep (unless you just moved it elsewhere)


### Primary State: Unconfigured
### Primary State Definitions and Expected Behaviors
Primary States allow for assumptions and expected behavior of the Lifecycle Node with respect to the node’s [Managed Entities](#managed-entities).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea what this sentences are trying to communicate. I also don't understand what the second sentence in this block is trying to say that shouldn't have already been made clear when introducing the idea of transition states.

This is the life cycle state the node is in immediately after being instantiated.
This is also the state in which a node may be retuned to after an error has happened.
In this state there is expected to be no stored state.
Note "processing" refers to Managed Entities receiving/sending messages, those messages being propagated through the `ROS 2` stack, and user callbacks processing these messages with respect to the executor.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand fully what this is communicating, but I think its overly technocratic.

"Processing in managed entities means messages, services, actions, etc are being processed, whereas not processing indicates that messages are not being sent or received by the middleware."


The main purpose of this state is to allow a node to be (re-)configured (changing configuration parameters, adding and removing topic publications/subscriptions, etc) without altering its behavior while it is running.
#### **Primary State: `Inactive`**
This state represents a node that is performing minimal processing with allocated Managed Entities.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is "minimal processing" the original documents' seems more clear (maybe just update this with the entity language)

This state represents a node that is not currently performing any processing.

The main purpose of this state is to allow a node to be (re-)configured (changing configuration parameters, adding and removing topic publications/subscriptions, etc) without altering its behavior while it is running.
#### **Primary State: `Inactive`**
This state represents a node that is performing minimal processing with allocated Managed Entities.
Thus, as defined in the [Managed Entities](#managed-entities) section, the Managed Entities will not be added to the `ROS` graph with data retention to topics following that of an unallocated Persistent Entity.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If pre-defined and within the definition, it needs not be restated repeatedly. Especially when the sentence prior basically just says "not processing"

If the user accepts and successfully handles the cancellation request, the state will follow the `CallbackReturn::FAILURE` path of the state machine.
This follows the "fallback" behavior of a transition.
If the user accepts and unsuccesfully handles the cancellation request, the state will follow the `CallbackReturn::ERROR` path of the state machine.
An ignored cancellation request becomes inactive and failed the instant a transition is inactive.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
An ignored cancellation request becomes inactive and failed the instant a transition is inactive.
An ignored cancellation request becomes inactive and failed the instant a transition is inactive.

I would think that we'd basically transition back down to what the cancel would have done for us if it were processed properly. E.g. If transitioning to Active and cancel is ignored, we just force the state machine back down to Configured.

Or, if cancel accepted + processed puts it always back in the unconfigured state, then an ignored cancel request would transition the system back down the unconfigured automatically so it seems as if it were handled regardless


### Destroy Transition
#### **Operation `raise_error()`**
This operation transitions the node from the primary states `{Unconfigured, Inactive, Active}` to `ErrorProcessing`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What exceptions can I throw that will do this? I think that needs to be explicit (e.g. std::runtime_error)


### Create Transition
### Return Values
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get that for transition states, but is that true for primary states as well? So no exception catching to cause us to go into error processing if an uncaught exception is thrown in a primary state? I think that's untrue given how the primary states work, so I think this needs clarification that we're only referring to transition states.

It will publish every time that a transition is triggered, whether successful or not.
### Supported Interfaces

This section outlines Persistent Entities created by the backend of Lifecycle nodes. These Persistent Entities make up the external facing management interface of a Lifecycle node. Each of these external interfaces must also include an internal equivalent (i.e., a function the user can call from within the node itself).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

L373 - L383 I have no idea what you're trying to communicate or why its an important thing to distinctly discuss

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this entire subsection "Supported Interfaces" I don't understand. Something is really unclear about what the intention is. There's some stuff about the ROS interfaces (e.g. topics/services) which I ack as standard practice, but no idea what anything before "The following interface behaviors will be supported" seems unrelated.

Also, all the interfaces are CamelCase and if these are supposed to be topics, then theey should be snake_case. If these are interface message types, that's unclear.


## Managed Entities

Managed entities transition with the state machine, being automatically allocated and spun.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh (on this whole subsection)

@SteveMacenski
Copy link
Contributor

SteveMacenski commented Aug 30, 2023

More general thoughts

  • I'm not sure why ROS 2 is in code-styling, its quite distracting.
  • I don't know if Persistent Entities is already part of this technology's lexicon. If its not, I might suggest changing it. I'm not sure it describes well what you want to differentiate. May I suggest Unmanaged Entities.
  • The T:Shutdown in the error processing marked half of the diagram is out of place and should be on the right hand side
  • Also, the diagram should be inverted, the left (first) half should be the main state machine. Its confusing to inspect it in reverse
  • I think some more of the concepts needs to be introduced up top before going into describing all the states. I don't think we should wait until introducing all the transition states to even tell someone what that means. Same with primary states, although its less an issue since its the first up. I think there needs to be a concepts section after Background to talk about the types of states like you do types of entities. Then you can lump all the states + expected behaviors in one section.

There are lots of things in here that are overly technocratic that even someone that uses lifecycle nodes every day it takes me 2-3 sometimes 4 times reading through to actually understand what you mean. I'd recommend looking over the original document where the same points were addressed previously that were sufficiently technical (might benefit from more detail) but written in language that I can grokk.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants