Define external life cycle management interface #99
base: gh-pages
Changes from 3 commits
1052ecc
c0f964b
db9b3a8
6a96db1
f7edec9
47068fa
7a7a72d
1ef7e71
391764a
c1afa2d
a8cee35
f65f48d
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -53,14 +53,17 @@ There are also 6 transition states which are intermediate states during a reques | |
In the transition states, logic will be executed to determine if the transition is successful. | ||
Success or failure shall be communicated to lifecycle management software through the lifecycle management interface. | ||
|
||
There are 7 transitions exposed to a supervisory process, they are: | ||
There are 5 transitions exposed to a supervisory process, they are: | ||
|
||
- `create` | ||
- `configure` | ||
- `cleanup` | ||
- `activate` | ||
- `deactivate` | ||
- `shutdown` | ||
|
||
Additionally, there are two transitions that may be exposed by a supervisory process if dynamic creation and destruction of nodes is supported by that supervisory process. | ||
|
||
- `create` | ||
- `destroy` | ||
|
||
The behavior of each state is as defined below. | ||
|
@@ -196,51 +199,176 @@ Transitions to `ErrorProcessing` may be caused by error return codes in callback | |
|
||
- If the `onShutdown` callback raises or results in any other result code the node will transition to `Finalized`. | ||
|
||
## Use of a supervisory process | ||
|
||
Managed nodes may be instantiated manually by a developer. | ||
In this case, the developer's own code is responsible for creating and destroying the node. | ||
However, the node may alternatively be owned by a container system. | ||
The container will be responsible for creating and destroying the node. | ||
Such a container will be responsible for exposing the following two transitions. | ||
|
||
### Create Transition | ||
|
||
This transition will instantiate the node, but will not run any code beyond the constructor. | ||
|
||
### Destroy Transition | ||
|
||
This transition will simply cause the deallocation of the node. | ||
In an object oriented environment it may just involve invoking the destructor. | ||
Otherwise it will invoke a standard deallocation method. | ||
This transition should always succeed. | ||
|
||
### Create Transition | ||
## Management Interface | ||
|
||
This transition will instantiate the node, but will not run any code beyond the constructor. | ||
A node that has a managed life cycle complying with the above life cycle shall provide the following interface via ROS topics and services. | ||
This interface is to be used by tools to manage the node's life cycle state transitions, either automatically or manually according to the tool's purpose. | ||
|
||
## Management Interface | ||
The topics and services of this interface shall function in all states of the node's life cycle. | ||
They shall not be disabled by the node shifting to the `Inactive` state, for example. | ||
|
||
A managed node will be exposed to the ROS ecosystem by the following interface, as seen by tools that perform the managing. | ||
This interface should not be subject to the restrictions on communications imposed by the lifecycle states. | ||
### Interface namespace | ||
|
||
It is expected that a common pattern will be to have a container class which loads a managed node implementation from a library and through a plugin architecture automatically exposes the required management interface via methods and the container is not subject to the lifecycle management. | ||
However, it is fully valid to consider any implementation which provides this interface and follows the lifecycle policies a managed node. | ||
Conversely, any object that provides these services but does not behave in the way defined in the life cycle state machine is malformed. | ||
The interface shall be provided in a namespace named "infra/lifecycle" underneath the node's namespace. | ||
|
||
For example, if a node named `talker` has a managed life cycle complying with the state machine described above, it shall provide topics under the namespace `/talker/infra/lifecycle`. | ||
|
||
All examples in the following sections are also given assuming a node named `talker`. | ||
|
||
If the `infra/lifecycle` namespace is available under a node's namespace, then that node shall be assumed to be functioning according to the managed life cycle. | ||
If the node is not functioning according to the managed life cycle, the `infra/lifecycle` namespace shall not exist. | ||
In other words, tooling shall judge if a node is managed or not by the presence of the `infra/lifecycle` namespace. | ||
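
As a rough illustration of the detection rule above, the following Python sketch checks whether any known topic or service name falls under a node's `infra/lifecycle` namespace. The helper names (`lifecycle_namespace`, `is_managed`) and the idea of passing in a flat list of names are assumptions made for this example; they are not part of the specification or of any ROS API.

```python
from typing import Iterable

LIFECYCLE_NAMESPACE = "infra/lifecycle"  # namespace named in the text above


def lifecycle_namespace(node_name: str) -> str:
    """Return the absolute lifecycle namespace, e.g. /talker/infra/lifecycle."""
    return "/" + node_name.strip("/") + "/" + LIFECYCLE_NAMESPACE


def is_managed(node_name: str, known_names: Iterable[str]) -> bool:
    """Hypothetical check: treat a node as managed if any name lives under its lifecycle namespace."""
    prefix = lifecycle_namespace(node_name) + "/"
    return any(name.startswith(prefix) for name in known_names)


# Illustrative list of names; a real tool would obtain these from graph introspection.
names = [
    "/talker/chatter",
    "/talker/infra/lifecycle/state_change",
    "/talker/infra/lifecycle/get_state",
    "/listener/chatter",
]
print(is_managed("talker", names))
print(is_managed("listener", names))
```

With the names listed, `talker` is reported as managed and `listener` is not, because only `talker` has entries under its `infra/lifecycle` namespace.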
|
||
### State enumerations | ||
|
||
The messages used by the interface shall use the following enumeration for indicating states. | ||
|
||
uint8 UNKNOWN=0 | ||
uint8 UNCONFIGURED=1 | ||
uint8 INACTIVE=2 | ||
uint8 ACTIVE=3 | ||
uint8 FINALIZED=4 | ||
uint8 CONFIGURING=10 | ||
uint8 CLEANING_UP=11 | ||
uint8 SHUTTING_DOWN=12 | ||
uint8 ACTIVATING=13 | ||
uint8 DEACTIVATING=14 | ||
uint8 ERROR_PROCESSING=15 | ||
We can link to the actual msg: https://github.com/ros2/rcl_interfaces/blob/master/lifecycle_msgs/msg/State.msg

Should the msg file follow the specification, or should the specification reference the msg? One needs to be considered the master definition, and the other needs to be kept in sync with it. Although I prefer keeping the specification document as the master definition, when considering maintenance, having the spec link to the msg file would be easier.

I don't mind either way. Linking to the spec might have the problem of tracking changes over time. Then again it may be just as hard to represent that in this document as well. So I don't see that one way is strictly better than the other. At least with the interface file being the sole authority, you can avoid some duplication of information if you choose.

Having thought about it more, I prefer keeping the spec as the master definition. In theory there could be other implementations. I will add a link, though.
||
|
||
### Life cycle state changes topic | ||
|
||
When the node's life cycle changes, it shall broadcast the following message on the `infra/lifecycle/state_change` topic. | ||
`<node_name>__transition_event`

See above about namespaces.
||
|
||
uint8 previous_state | ||
uint8 next_state | ||
string trigger | ||
|
||
These services may also be provided via attributes and method calls (for local management) in addition to being exposed ROS messages and topics/services (for remote management). | ||
In the case of providing a ROS middleware interface, specific topics must be used, and they should be placed in a suitable namespace. | ||
`trigger` may be filled in containing a reason for the life cycle change. | ||
This value is optional. | ||
|
||
Each possible supervisory transition will be provided as a service by the name of the transition except `create`. | ||
`create` will require an extra argument for finding the node to instantiate. | ||
The service will report whether the transition was successfully completed. | ||
This topic must at a minimum make the most recent message available to new subscribers at all times. | ||
This may be achieved by use of appropriate QoS settings. | ||
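
The "most recent message available to new subscribers" requirement can be pictured with a small, ROS-independent Python sketch. `LatchedTopic` is a hypothetical class written only for this illustration; in a real system the behaviour would come from the middleware's QoS settings rather than from code like this.

```python
from typing import Any, Callable, List, Optional


class LatchedTopic:
    """Toy topic that replays its most recent message to late subscribers."""

    def __init__(self) -> None:
        self._last: Optional[Any] = None
        self._subscribers: List[Callable[[Any], None]] = []

    def publish(self, msg: Any) -> None:
        # Remember the message so that future subscribers can still receive it.
        self._last = msg
        for callback in self._subscribers:
            callback(msg)

    def subscribe(self, callback: Callable[[Any], None]) -> None:
        self._subscribers.append(callback)
        # A late joiner immediately gets the latest state change, if there is one.
        if self._last is not None:
            callback(self._last)


topic = LatchedTopic()
topic.publish("UNCONFIGURED -> CONFIGURING")
topic.publish("CONFIGURING -> INACTIVE")

received: List[str] = []
topic.subscribe(received.append)  # subscribes after both messages were published
print(received)
```

The late subscriber sees only the last message published before it joined, which is the minimum the text asks for.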
|
||
### Lifecycle events | ||
For example, if the `talker` node transitions from the `Active` state to the `Finalized` state via the `ShuttingDown` state in response to an external request to shut down the node, it will produce the following sequence of messages on this topic (values for `trigger` are illustrative only): | ||
|
||
A topic should be provided to broadcast the new life cycle state when it changes. | ||
This topic must be latched. | ||
The topic must be named `lifecycle_state`; it will carry both the end state and the transition, with result code. | ||
It will publish every time that a transition is triggered, whether successful or not. | ||
previous_state = ACTIVE | ||
next_state = SHUTTING_DOWN | ||
trigger = "shutdown request" | ||
|
||
## Node Management | ||
previous_state = SHUTTING_DOWN | ||
next_state = FINALIZED | ||
trigger = "shutdown returned OK" | ||
|
||
There are several different ways in which a managed node may transition between states. | ||
Most state transitions are expected to be coordinated by an external management tool which will provide the node with its configuration and start it. | ||
The external management tool is also expected to monitor it and execute recovery behaviors in case of failures. | ||
A local management tool is also a possibility, leveraging method level interfaces. | ||
A node could also be configured to self manage; however, this is discouraged as it will interfere with external logic trying to manage the node via the interface. | ||
If the `talker` node transitions from the `Inactive` state to the `Unconfigured` state via a request to activate the node and an error occurring in activation processing, it will produce the following sequence of messages: | ||
|
||
There is one transition expected to originate locally, which is the `ERROR` transition. | ||
previous_state = INACTIVE | ||
next_state = ACTIVATING | ||
trigger = "activate request" | ||
|
||
A managed node may also want to expose arguments to automatically configure and activate when run in an unmanaged system. | ||
previous_state = ACTIVATING | ||
next_state = ERROR_PROCESSING | ||
trigger = "error in activating state" | ||
|
||
previous_state = ERROR_PROCESSING | ||
next_state = UNCONFIGURED | ||
trigger = "error processing returned OK" | ||
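
The failed-activation sequence above can be reproduced with a short Python sketch that turns a path of states into one state-change message per hop. The `StateChange` dataclass and the `state_changes` helper are hypothetical names for this illustration; only the three field names and the numeric state values come from the text.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Numeric values copied from the state enumeration (only those needed here).
UNCONFIGURED = 1
INACTIVE = 2
ACTIVATING = 13
ERROR_PROCESSING = 15


@dataclass
class StateChange:
    """Mirrors the three fields of the state-change message."""
    previous_state: int
    next_state: int
    trigger: str = ""  # optional in the specification, so it defaults to empty


def state_changes(path: Sequence[int], triggers: Sequence[str]) -> List[StateChange]:
    """Pair each state on the path with its successor to form one message per hop."""
    hops = zip(path, path[1:])
    return [StateChange(prev, nxt, why) for (prev, nxt), why in zip(hops, triggers)]


failed_activation = state_changes(
    [INACTIVE, ACTIVATING, ERROR_PROCESSING, UNCONFIGURED],
    ["activate request", "error in activating state", "error processing returned OK"],
)
for msg in failed_activation:
    print(msg)
```

A path of four states yields three messages, matching the three blocks in the example.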
|
||
### Current life cycle state service | ||
|
||
The node's current life cycle state shall be available via the `infra/lifecycle/get_state` service. | ||
See above about namespaces.
||
The service definition is: | ||
|
||
--- | ||
uint8 state | ||
string state_name | ||
|
||
`state_name` must assume one of the following values, according to the value of `state`. | ||
|
||
Value of state | Value of state_name | ||
UNKNOWN | unknown | ||
UNCONFIGURED | unconfigured | ||
INACTIVE | inactive | ||
ACTIVE | active | ||
FINALIZED | finalized | ||
CONFIGURING | configuring | ||
CLEANING_UP | cleaning_up | ||
SHUTTING_DOWN | shutting_down | ||
ACTIVATING | activating | ||
DEACTIVATING | deactivating | ||
ERROR_PROCESSING | error_processing | ||
|
||
The `UNKNOWN`/`unknown` value shall be assumed by clients of the service to indicate that the node is in an unknown state and thus unusable. | ||
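
Since every `state_name` in the table is the lower-case form of the corresponding enumeration name, the mapping can be derived rather than written out by hand. The following Python sketch shows one way to do that; the `State` enum mirrors the values given earlier, while `state_name` and `is_usable` are hypothetical helpers for illustration only.

```python
from enum import IntEnum


class State(IntEnum):
    # Values copied from the state enumeration in this document.
    UNKNOWN = 0
    UNCONFIGURED = 1
    INACTIVE = 2
    ACTIVE = 3
    FINALIZED = 4
    CONFIGURING = 10
    CLEANING_UP = 11
    SHUTTING_DOWN = 12
    ACTIVATING = 13
    DEACTIVATING = 14
    ERROR_PROCESSING = 15


def state_name(state: int) -> str:
    """Return the state_name string for a numeric state (raises ValueError if undefined)."""
    return State(state).name.lower()


def is_usable(state: int) -> bool:
    """Client-side rule from the text: a node reporting UNKNOWN is treated as unusable."""
    return State(state) is not State.UNKNOWN


print(state_name(11))
print(is_usable(State.UNKNOWN))
```

Deriving the string from the enumeration keeps the two from drifting apart, at the cost of tying the wire format to the identifier spelling.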
|
||
### Transition request service | ||
|
||
The `infra/lifecycle/change_state` service shall be provided by the life cycle interface. | ||
See above about namespaces.
||
|
||
When this service call is received, the node's life cycle state shall be shifted to the requested state via any appropriate intermediate states in accordance with the state diagram shown above. | ||
|
||
For example, if the node is in the `Inactive` state and a request is made to shift to the `Active` state, the node's life cycle shall first be shifted to the `Activating` state. | ||
Based on the result of the `onActivate()` function called during the `Activating` state, the node's life cycle shall then shift to either the `Active` state or the `ErrorProcessing` state. | ||
|
||
The service definition is: | ||
|
||
# Allowable transitions | ||
uint8 CONFIGURE=0 | ||
uint8 CLEANUP=1 | ||
uint8 ACTIVATE=2 | ||
uint8 DEACTIVATE=3 | ||
uint8 SHUTDOWN=4 | ||
# Transition results | ||
uint8 TRANSITION_ERROR=0 | ||
uint8 SUCCESS=1 | ||
uint8 WRONG_PREV_STATE=10 | ||
|
||
|
||
uint8 transition | ||
--- | ||
uint8 result | ||
|
||
`transition` must take one of the values defined in the transitions enumeration. | ||
Additionally, the allowable values for `transition` are determined by the current life cycle state. | ||
`transition` must take one of the following values, depending on the current life cycle state. | ||
|
||
Current state | `transition` allowable values | ||
Unconfigured | `CONFIGURE`, `SHUTDOWN` | ||
Inactive | `ACTIVATE`, `CLEANUP`, `SHUTDOWN` | ||
Active | `DEACTIVATE`, `SHUTDOWN` | ||
Finalized | None | ||
|
||
`result` shall be `SUCCESS` if the node's life cycle successfully moved to the requested state and the results of any intermediate state functions were all `success`. | ||
|
||
`result` shall be `TRANSITION_ERROR` if the result of any intermediate state function was anything other than `success` or an error was reported by any other means, and the node is now in the `ErrorProcessing` state or one of its successor states. | ||
|
||
`result` shall be `WRONG_PREV_STATE` if the node's life cycle is not in the correct predecessor state for the requested transition, as described above. | ||
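
The three result rules can be sketched as a small lookup in Python. The enumeration values and the allowable-transition table come from the text. The target state of each transition is taken from the life cycle state machine described earlier in the article. The `change_state` function, its `callback_ok` parameter (standing in for the outcome of the intermediate state function), and the use of plain strings for state names are assumptions made for this illustration. The sketch also stops at `ErrorProcessing` on failure and does not model its successor states.

```python
from enum import IntEnum
from typing import Dict, Tuple


class Transition(IntEnum):
    CONFIGURE = 0
    CLEANUP = 1
    ACTIVATE = 2
    DEACTIVATE = 3
    SHUTDOWN = 4


class Result(IntEnum):
    TRANSITION_ERROR = 0
    SUCCESS = 1
    WRONG_PREV_STATE = 10


# (current primary state, requested transition) -> target primary state
ALLOWED: Dict[Tuple[str, Transition], str] = {
    ("Unconfigured", Transition.CONFIGURE): "Inactive",
    ("Unconfigured", Transition.SHUTDOWN): "Finalized",
    ("Inactive", Transition.ACTIVATE): "Active",
    ("Inactive", Transition.CLEANUP): "Unconfigured",
    ("Inactive", Transition.SHUTDOWN): "Finalized",
    ("Active", Transition.DEACTIVATE): "Inactive",
    ("Active", Transition.SHUTDOWN): "Finalized",
}


def change_state(current: str, transition: Transition, callback_ok: bool) -> Tuple[Result, str]:
    """Return (result, resulting state) for a transition request."""
    target = ALLOWED.get((current, transition))
    if target is None:
        # Not a valid predecessor state for this transition; the state is unchanged.
        return Result.WRONG_PREV_STATE, current
    if not callback_ok:
        # The intermediate state function failed; successor states are not modelled here.
        return Result.TRANSITION_ERROR, "ErrorProcessing"
    return Result.SUCCESS, target


print(change_state("Inactive", Transition.ACTIVATE, True))
print(change_state("Finalized", Transition.CONFIGURE, True))
print(change_state("Inactive", Transition.ACTIVATE, False))
```

Keeping the table as data makes the `Finalized` row implicit: no key begins with `Finalized`, so every request from that state yields `WRONG_PREV_STATE`.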
|
||
### Provision of the interface | ||
|
||
What provides the interface is implementation dependent. | ||
It may be provided directly by the node object itself, by a container object, or by any other means as appropriate to the technologies being used to implement the node and ROS infrastructure. | ||
|
||
It is expected that a common pattern will be to have a container class which loads a managed node implementation from a library and through a plugin architecture automatically exposes the required management interface via methods and the container is not subject to the lifecycle management. | ||
However, it is fully valid to consider any implementation which provides this interface and follows the lifecycle policies a managed node. | ||
Conversely, any object that provides these services but does not behave in the way defined in the life cycle state machine is malformed. | ||
|
||
## Extensions | ||
|
||
|
As we don't have namespaces in ROS2.0 for the moment, I believe this paragraph can not be merged as is. Currently, the topics and services are advertised with `*__get_state`, `*__change_state` and `*__transition_event`.

Which is still a kind of namespace - just not with a `/` but with `__`.

Now that namespace support is being implemented, is this paragraph acceptable?

It should probably make use of the leading `_` in topic names to make it "hidden": http://design.ros2.org/articles/topic_and_service_names.html#hidden-topic-or-service-names
Otherwise, it uses a namespace as currently written, so I'm not sure what else you might be referring to.

I was referring to @dirk-thomas 's comment that it doesn't use an actual namespace (separated with a `/`) but uses `__` to mimic a namespace. The proposal from @Karsten1987 is to change the topics used by the lifecycle interface to not use namespaces.

That's what I was getting at.
And on going to add the underscore to make it a hidden interface, I found myself unsure where to add it. On `infra` or on `lifecycle`? Is `infra` going to be a widely-used thing for ROS infrastructure topics? (I hope it is.)

I don't know how I feel about `infra` over `infrastructure` or `impl` or `ros`, or even just letting individual things create their own "impl" namespaces. For example, maybe the common namespace for these things should be just `~/_lifecycle/`. But in general, namespacing the plumbing is something I think we should be doing.

I'm in favour of having all infrastructure stuff under a single namespace because it makes things tidier, but I can see the argument of "who defines/controls what is 'infrastructure' enough to go in there?".

I can see that angle too. It seems to be a bit of a bikeshed thing atm. I'm fine with leaving it as-is because we can always come back and change it later once we get more of these "infrastructure" topics and services in place. Perhaps then a better pattern will emerge.

OK, I make the `infra` namespace hidden. We can wait a while and see if anything else starts using this same namespace. If not, we can remove it.