-
Notifications
You must be signed in to change notification settings - Fork 137
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[REP-2016] ROS 2 Interface Type Description #381
base: master
Are you sure you want to change the base?
[REP-2016] ROS 2 Interface Type Description #381
Conversation
bf2e62f
to
fd7eca4
Compare
For debugging and introspection, the type version hash will be accessible via the ROS graph API, by extending the ``rmw_topic_endpoint_info_t`` struct, and related types and functions, to include the type version hash, ``topic_type_hash``. | ||
It should be alongside the ``topic_type`` string in that struct, which remains unchanged. | ||
|
||
This information will be transmitted as part of the discovery process. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding context conversation from REP-2011 PR:
@vmayoral wrote
I'm not sure if I understand properly the meaning of the "discovery process" in here.
To my understanding, generally, for most DDS implementations, discovery traffic is sent prior/outside of the security plugins domain (i.e. this information can be used for reconnaissance). If this information is sent also as part of the DDS discovery process, won't it mean that we'd be conveying it in an uncrypted manner (regardless of whether security is enabled or not in the graph)?
Also, (a final answer would of course depend on the implementation but) can you briefly comment on what's the expectation regarding additional traffic generated due to sending this information as part of discovery? Will it be just the type version hash for each ROS interface?
@methylDragon wrote
Well, topic names are already sent as part of the rmw_topic_endpoint_info_t struct. Adding a hash in my opinion is not much of a security risk, since the hash would be degenerate, and by design, meant to not be reversible.
Furthermore, depending on how the hash is generated (e.g. factoring in the field names in the type description), it can be made more robust to rainbow table attacks (definitely more than something like, say, brute forcing of MD5 hashes for passwords.)
Regarding additional traffic, I'd think it'd just be the type version hash. The rest of the stuff (e.g. type description distribution) is planned to be sent after discovery, after the participants have determined that they're mismatched via the hash on discovery.
@wjwwood wrote
@methylDragon is right, it should just be the version hash, which would be in addition to the type name which already being sent via discovery. So the impact to information leak and extra bandwidth used should be minimal. The hash is going to be a sha-1 of the type description struct (most likely) so that is reversible, but not easily and not for much gain I think.
The actual full type descriptions (and other meta data) would be sent via the ~/_get_type_description service which will have to be considered in security set ups just all the other built-in node topics/services must.
"""""""""""""""""""""""""""""""""""""""""""""""" | ||
|
||
A combination of the original ``idl`` / ``msg`` file and any other information needed for serialization and deserialization being sent allows for one to cover the weaknesses of the other. | ||
Specifically, given that certain use-cases (e.g., ``rosbag``) might encounter situations where consumers of a message are using a different middleware or serialization scheme the message was serialized with, it becomes extremely important to send enough information to both reconstruct the type support, and also allow the message fields to be accessed in a human readable fashion to aid in the writing of transfer functions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding conversational context from REP-2011
@james-rms wrote
It's probably an important detail that the full idl or msg definition of all field types needs to be provided along with the toplevel definition, otherwise there may be an unresolved incompatibility between types at a field level. It's worth noting that ROS1 tries to solve this by concatenating all of its msg definitions using gendeps --cat and sending this in the connection header for new connections.
@methylDragon wrote
Yup! You're right! They're meant to be sent along the main type description under the referenced_type_descriptions field.
At least, for any non-primitive types.
Do you see value in concatenating the referenced file contents as well (including the comments)? Or would a parsed definition suffice?
@james-rms wrote
Do you see value in concatenating the referenced file contents as well (including the comments)? Or would a parsed definition suffice?
It depends on how frozen-in-time the parsing procedure is. If the parsed definition evolves over time, I'd keep the original file content around, because then if there's a Type Description schema mismatch, a node can fall back to rebuilding the type description from the original file content. Stripping comments seems safe enough, since at least then the content is still valid IDL or MSG.
@amacneil wrote
@wjwwood regarding storing a type definition source (msg or idl) as a string, I wanted to call your attention to work that has been happening over in mcap.
The specific problem I am referencing is how to store both the message definition and all direct and transitive dependencies in the proposed type_description_raw field. ROS 1 solves this using the gendeps --cat format (mentioned by @james-rms above), which consists of msg files delimited by 80 = characters plus a message name line.
For storing concatenated source message definitions in MCAP, we opted to follow the ROS 1 convention for msg files, which we use for both ROS 1 and ROS 2 data. For IDL source message definitions, we modified the concatenation slightly so that all types including the top level type are explicitly named.
You can see our specification for this here, and discussion on the spec PR at foxglove/mcap#553. This spec was then implemented in ros-tooling/rosbag2_storage_mcap#53.
I think it would be best if we can use this same concatenation standard for REP 2011 types.
@amacneil wrote
It was pointed out to me that you could avoid concatenating message definition strings entirely by providing an array of types (the top level type and any dependent types, each with a name, format, and string body).
Signed-off-by: Emerson Knapp <emerson.b.knapp@gmail.com>
Signed-off-by: Emerson Knapp <emerson.b.knapp@gmail.com>
Signed-off-by: Emerson Knapp <emerson.b.knapp@gmail.com>
Signed-off-by: Emerson Knapp <emerson.b.knapp@gmail.com>
Signed-off-by: Emerson Knapp <emerson.b.knapp@gmail.com>
Signed-off-by: Emerson Knapp <emerson.b.knapp@gmail.com>
Signed-off-by: Emerson Knapp <emerson.b.knapp@gmail.com>
Signed-off-by: Emerson Knapp <emerson.b.knapp@gmail.com>
eabe46a
to
b8287f2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While reading I found some typos, feel free to ignore those if you don't want to touch the PR again ;)
The recommended DDS implementation for type hash discovery adds the type hash in a new previously-unused key, therefore it will be ignored by prior ROS distributions. | ||
This leaves discovery unaffected for backwards compatibility. | ||
|
||
References |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about adding http://wiki.ros.org/ROS/Technical%20Overview#Message_serialization_and_msg_MD5_sums as reference? that could give information what we have done with ROS 1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, good idea thank you
Thus the representation includes: | ||
|
||
- the package, namespace, and type name, for example `sensor_msgs/msg/Image` | ||
- a list of field names and types |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i am not sure if this is supported with current code, but constants should be also supported to detect the mismatch?
- a list of field names and types | |
- a list of field names, types and constants |
|
||
The ``TypeDescription`` messages are defined as just another ROS interface type, using all the same tooling. | ||
This means that its data structures are not available yet, during build time, to describe itself and other interface packages. | ||
The final decision to support this is a checked-in mirror of the generated type structs, provided in ``rosidl_runtime_c``, for use by C (used by Python) and C++ code generation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(question) as user who builds the source code with their own CI/CD pipeline, descriptions are generated during build time, including TypeDescription
messages. these descriptions will be the same with the ones from released binary packages as long as source code is the same version. just checking my understanding is correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that is correct! As long as the source code is the same version, then the TypeDescriptions
and hashes will come out the same, they are deterministic.
The hash must only be computed using fields that affect communication compatibility, so that trivial changes do not change the hash | ||
Thus the hash excludes one aspect of Type Descriptions: it omits field default values. | ||
This is because default values for fields are only used by the writer of a packet, the recipient always receives some value in the field and can read it, thus defaults cannot affect compatibility. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so after all, the constants are not supported at this moment. i understand the that is not requirement to check message field compatibility, but user application might check the value using constants on subscription? from user application perspective, this could be categorized as mismatch?
publisher sends the data using DEFAULT_VALUE
as 1, but on the subscription using the same DEFAULT_VALUE
because message constants are updated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This point does make sense to me. To put a full example:
Publisher's My.msg
uint8 GOOD_STATE = 1
uint8 BAD_STATE = 2
uint8 current_state
Publisher
My msg;
msg.current_state = My::GOOD_STATE;
publisher.publish(msg);
Subscription's My.msg
uint8 GOOD_STATE = 0
uint8 BAD_STATE = 1
uint8 UNKNOWN_STATE = 2
uint8 current_state
Subscription code
void message_callback(My msg) {
if (msg.current_state == My::BAD_STATE) {
ERROR("Bad state!");
}
}
This will trigger the bad state on the subscription, even though the publisher intended a good state. However, this doesn't currently trigger an update.
(tangent: I think almost all constants in practice are enums, so it's unfortunate that we don't have enum support in messages)
@wjwwood I could use your thoughts on this, I'm forgetting if we discussed constants in the original design. They are not represented in IndividualTypeDescription
so it's not that they are specifically ignored from the hash, they're not available to that calculation at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah - we did not come to a final decisions, it was left as a TODO https://github.com/ros2/rcl_interfaces/tree/rolling/type_description_interfaces#todo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great feature to be exposed on ROS 2. I have some concerns regarding some design decisions.
* Causes a mass of parameter event messages being sent at once on init, worsening the network initialization problem | ||
- Store the type description on a centralized node per machine | ||
* Helps reduce network bandwidth, but makes it non-trivial to find the correct centralized node to query, and introduces issues of resolving the local message package, such as when nodes are started from different sourced workspaces. | ||
- Send type description alongside discovery with middlewares |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand why this idea was discarded.
It seems that the specifics of how the type descriptions are distributed could very well be left to the RMWs as they are the middleware adapters. With this approach, DDS based RMWs (such as the default) could leverage the already existing DDS X-Types mechanism instead of further "polluting" the nodes with more topics and endpoints, with their non-negligible increment of discovery traffic, which is by default already intense because of similar design decisions. Non-DDS based middleware RMWs could always implement the service you are describing here internally (or use any other method for that matter).
For DDS based implementations, pursuing this native approach will still be interoperable, as X-Types is built in DDS. Regarding compatibility with other middleware technologies, they will not interoperate in the first place because the protocols are different, so interoperability of this particular aspect is not a problem.
From what I understand, the important part for the ROS 2 users is the API regarding evolving types, their description and distribution, so their ROS 2 applications can dynamically adapt to newly discovered types. The details on how the specific RMW handles the type discovery and distribution should be transparent to the ROS 2 user, and ideally as optimized as possible. As long as an application can query a node for the type description of a specific type given its ID, the rest can very well be RMW specific, which will result in more optimal implementations when native support is present (case in point).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, my opinion is the same. Instead of adding services, left this to the middleware.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see the advantage to type transmission being considered an RMW implementation detail. Maybe it's a service/topic-pair in some implementations, but not necessarily designed that way or exposed as such to the end user.
I'm open to revising this design to be less specific about the way type definitions are transmitted, and to reviewing an RMW API proposal that would expose these types.
One thing I think we want to avoid is sending the full type descriptions in discovery traffic, and only getting them on request - I don't know XTypes well enough to know how this would be implemented. I think we want an actual implementation at least at draft stage for Tier 1 RMWs, if we are going to base a design decision on that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@emersonknapp we are open to make such proposal for an API were we can provide ROS 2 with this functionality by leveraging existing standardized technologies such as DDS X-Types, we could start after the ROSCon.
In X-Types v1.3, the type descriptions (Type Objects in our lingo) are not distributed during discovery. Instead, DATA(w) and DATA(r) (the endpoints' discovery packets) contain an identifier (similar to this REPs hash) of the used type (known as Type Information), and optionally the type information of the dependencies of the type. If the remote participant does not have such type registered, then it can ask for it over a builtin service called Type Lookup Service, which manages the Type Object (type description) distribution. Mind the importance of the service being builtin, as what that entail in DDS is that there is no discovery traffic associated with the service endpoints; knowing the participant is enough to use it. IMO we should aim to leverage this feature of the DDS spec to avoid further pollution of the graph and network with a new set of endpoints for each node, which will make discovery and therefore ROS 2 less scalable.
BTW, from some comment I read above, with DDS security and therefore in SROS 2, the only traffic in the clear is the DATA(p) (the information about which ports the specific participant is using). Further discovery exchanges (including types and the Type Lookup Service) happen over secured topics, so there is no information about them exposed to ill-intended observers.
It is also worth noting the DDS X-Types also provides a mechanism for type versioning an incompatibilities with a bit of finer grain control (for instance you may allow for field names to change, or for fields to be added at the end of the struct).
IMO this REP is attempting to provide ROS 2 with a powerful new set of capabilities, but it might be at the same time reinventing the wheel of X-Types while at the same time leaving available capabilities out of the picture. As I said before, I think the course of action would be to put precise requirements in the functionality we want, and leave for the RMWs the details of how to implement those. In the case of DDS based RMW that'd make the type distribution system interoperable, and for those RMW not supporting X-Types or not based on DDS, they were not going to interoperate anyways, so we do not need a common representation for them all.
* Causes a mass of parameter event messages being sent at once on init, worsening the network initialization problem | ||
- Store the type description on a centralized node per machine | ||
* Helps reduce network bandwidth, but makes it non-trivial to find the correct centralized node to query, and introduces issues of resolving the local message package, such as when nodes are started from different sourced workspaces. | ||
- Send type description alongside discovery with middlewares |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, my opinion is the same. Instead of adding services, left this to the middleware.
Co-authored-by: Mirko Ferrati <mirko.ferrati@gmail.com> Co-authored-by: Chris Lalancette <clalancette@gmail.com> Co-authored-by: Tomoya Fujita <Tomoya.Fujita@sony.com> Co-authored-by: Francisco Martín Rico <fmrico@gmail.com>
This pull request has been mentioned on ROS Discourse. There might be relevant details there: https://discourse.ros.org/t/ros-2-tsc-meeting-minutes-2023-09-21/33733/1 |
Discussing this with a colleague I tried to find how constants in messages are treated (apologies if I overlooked something of course), but can't find any mention of them. @emersonknapp: would changing/adding a constant change the hash of a message? |
Afaik, they're not involved in the calculation, since they're not a part of the type description message at the moment. There's a to-do to decide what to do with them: https://github.com/ros2/rcl_interfaces/tree/rolling/type_description_interfaces#todo |
Ok, thanks for that comment @methylDragon. I tried this out by adding a constant to |
That's interesting... The schema of what is considered can be found here https://github.com/ros2/rosidl/blob/rolling/rosidl_generator_type_description/resource/HashedTypeDescription.schema.json Let me see if I can find the code that does the hashing, sec EDIT: Type name is included, which the file name represents. If in doubt, look at your build/install folder and look at the generated .JSON type description files! You'll be able to debug any changes there. Those json files are the inputs to the hasher. Intuitively this makes sense. Consider: TypeA.msg and TypeB.msg could have the same fields, but the fact their names differ is a clue that their semantics differ. (A more clear example: MyNanosecondsInt.msg and MySecondsInt.msg, both with a single int field called "value" means something drastically different despite having the same on the wire representation.) |
thanks @methylDragon for answering so quickly, I'll just confirm that you got it correct. Constants do not affect hash, but type name (from filename) does |
thanks for confirming @emersonknapp. Is it correct / expected the REP doesn't mention constants explicitly? It would be great to have information about what exactly is used as input for the hash somewhere in the REP itself. The JSON schema is good to have, but not very human friendly (I also can't point my colleagues to it). |
No, it doesn't right now, just this #381 (comment) unresolved thread about it. It should probably be mentioned. There is some balance we're trying to strike, between being detailed here, and letting the implementation be the source of truth. But this point is worth specifying one way or the other, I agree. it's been implemented as of Iron to not include constants, but it's an open question whether that's the long term goal state |
Related to #358, takes out the feature subset implemented in ros2/ros2#1159 and defines that as its own REP.
Linked with https://github.com/wjwwood/rep/pull/9/files
Defines a feature set to represent ros2 interface types as communicable messages, a hashing algorithm for compactly representing those definitions across implementations, and ways to retrieve the hashes and descriptions from remote nodes.