Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[REP-2016] ROS 2 Interface Type Description #381

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

emersonknapp
Copy link
Contributor

@emersonknapp emersonknapp commented Jun 13, 2023

Related to #358, takes out the feature subset implemented in ros2/ros2#1159 and defines that as its own REP.
Linked with https://github.com/wjwwood/rep/pull/9/files

Defines a feature set to represent ros2 interface types as communicable messages, a hashing algorithm for compactly representing those definitions across implementations, and ways to retrieve the hashes and descriptions from remote nodes.

@emersonknapp emersonknapp changed the title [WIP] [REP-2016] ROS 2 Interface Type Description [REP-2016] ROS 2 Interface Type Description Jun 13, 2023
@emersonknapp emersonknapp force-pushed the type_description_rep branch from bf2e62f to fd7eca4 Compare June 27, 2023 18:59
For debugging and introspection, the type version hash will be accessible via the ROS graph API, by extending the ``rmw_topic_endpoint_info_t`` struct, and related types and functions, to include the type version hash, ``topic_type_hash``.
It should be alongside the ``topic_type`` string in that struct, which remains unchanged.

This information will be transmitted as part of the discovery process.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding context conversation from REP-2011 PR:

@vmayoral wrote

I'm not sure if I understand properly the meaning of the "discovery process" in here.

To my understanding, generally, for most DDS implementations, discovery traffic is sent prior/outside of the security plugins domain (i.e. this information can be used for reconnaissance). If this information is sent also as part of the DDS discovery process, won't it mean that we'd be conveying it in an uncrypted manner (regardless of whether security is enabled or not in the graph)?

Also, (a final answer would of course depend on the implementation but) can you briefly comment on what's the expectation regarding additional traffic generated due to sending this information as part of discovery? Will it be just the type version hash for each ROS interface?

@methylDragon wrote

Well, topic names are already sent as part of the rmw_topic_endpoint_info_t struct. Adding a hash in my opinion is not much of a security risk, since the hash would be degenerate, and by design, meant to not be reversible.

Furthermore, depending on how the hash is generated (e.g. factoring in the field names in the type description), it can be made more robust to rainbow table attacks (definitely more than something like, say, brute forcing of MD5 hashes for passwords.)

Regarding additional traffic, I'd think it'd just be the type version hash. The rest of the stuff (e.g. type description distribution) is planned to be sent after discovery, after the participants have determined that they're mismatched via the hash on discovery.

@wjwwood wrote

@methylDragon is right, it should just be the version hash, which would be in addition to the type name which already being sent via discovery. So the impact to information leak and extra bandwidth used should be minimal. The hash is going to be a sha-1 of the type description struct (most likely) so that is reversible, but not easily and not for much gain I think.

The actual full type descriptions (and other meta data) would be sent via the ~/_get_type_description service which will have to be considered in security set ups just all the other built-in node topics/services must.

""""""""""""""""""""""""""""""""""""""""""""""""

A combination of the original ``idl`` / ``msg`` file and any other information needed for serialization and deserialization being sent allows for one to cover the weaknesses of the other.
Specifically, given that certain use-cases (e.g., ``rosbag``) might encounter situations where consumers of a message are using a different middleware or serialization scheme the message was serialized with, it becomes extremely important to send enough information to both reconstruct the type support, and also allow the message fields to be accessed in a human readable fashion to aid in the writing of transfer functions.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding conversational context from REP-2011

@james-rms wrote

It's probably an important detail that the full idl or msg definition of all field types needs to be provided along with the toplevel definition, otherwise there may be an unresolved incompatibility between types at a field level. It's worth noting that ROS1 tries to solve this by concatenating all of its msg definitions using gendeps --cat and sending this in the connection header for new connections.

@methylDragon wrote

Yup! You're right! They're meant to be sent along the main type description under the referenced_type_descriptions field.

At least, for any non-primitive types.

Do you see value in concatenating the referenced file contents as well (including the comments)? Or would a parsed definition suffice?

@james-rms wrote

Do you see value in concatenating the referenced file contents as well (including the comments)? Or would a parsed definition suffice?

It depends on how frozen-in-time the parsing procedure is. If the parsed definition evolves over time, I'd keep the original file content around, because then if there's a Type Description schema mismatch, a node can fall back to rebuilding the type description from the original file content. Stripping comments seems safe enough, since at least then the content is still valid IDL or MSG.

@amacneil wrote

@wjwwood regarding storing a type definition source (msg or idl) as a string, I wanted to call your attention to work that has been happening over in mcap.

The specific problem I am referencing is how to store both the message definition and all direct and transitive dependencies in the proposed type_description_raw field. ROS 1 solves this using the gendeps --cat format (mentioned by @james-rms above), which consists of msg files delimited by 80 = characters plus a message name line.

For storing concatenated source message definitions in MCAP, we opted to follow the ROS 1 convention for msg files, which we use for both ROS 1 and ROS 2 data. For IDL source message definitions, we modified the concatenation slightly so that all types including the top level type are explicitly named.

You can see our specification for this here, and discussion on the spec PR at foxglove/mcap#553. This spec was then implemented in ros-tooling/rosbag2_storage_mcap#53.

I think it would be best if we can use this same concatenation standard for REP 2011 types.

@amacneil wrote

It was pointed out to me that you could avoid concatenating message definition strings entirely by providing an array of types (the top level type and any dependent types, each with a name, format, and string body).

@emersonknapp emersonknapp marked this pull request as ready for review July 11, 2023 09:35
@clalancette clalancette self-assigned this Jul 11, 2023
Signed-off-by: Emerson Knapp <emerson.b.knapp@gmail.com>
Signed-off-by: Emerson Knapp <emerson.b.knapp@gmail.com>
Signed-off-by: Emerson Knapp <emerson.b.knapp@gmail.com>
Signed-off-by: Emerson Knapp <emerson.b.knapp@gmail.com>
Signed-off-by: Emerson Knapp <emerson.b.knapp@gmail.com>
Signed-off-by: Emerson Knapp <emerson.b.knapp@gmail.com>
Signed-off-by: Emerson Knapp <emerson.b.knapp@gmail.com>
Signed-off-by: Emerson Knapp <emerson.b.knapp@gmail.com>
Signed-off-by: Emerson Knapp <emerson.b.knapp@gmail.com>
Copy link

@MirkoFerrati MirkoFerrati left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While reading I found some typos, feel free to ignore those if you don't want to touch the PR again ;)

rep-2016.rst Outdated Show resolved Hide resolved
rep-2016.rst Outdated Show resolved Hide resolved
rep-2016.rst Outdated Show resolved Hide resolved
rep-2016.rst Outdated Show resolved Hide resolved
rep-2016.rst Outdated Show resolved Hide resolved
rep-2016.rst Outdated Show resolved Hide resolved
rep-2016.rst Outdated Show resolved Hide resolved
rep-2016.rst Outdated Show resolved Hide resolved
rep-2016.rst Outdated Show resolved Hide resolved
rep-2016.rst Outdated Show resolved Hide resolved
rep-2016.rst Outdated Show resolved Hide resolved
rep-2016.rst Outdated Show resolved Hide resolved
rep-2016.rst Outdated Show resolved Hide resolved
rep-2016.rst Outdated Show resolved Hide resolved
rep-2016.rst Outdated Show resolved Hide resolved
rep-2016.rst Outdated Show resolved Hide resolved
rep-2016.rst Show resolved Hide resolved
rep-2016.rst Show resolved Hide resolved
The recommended DDS implementation for type hash discovery adds the type hash in a new previously-unused key, therefore it will be ignored by prior ROS distributions.
This leaves discovery unaffected for backwards compatibility.

References
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding http://wiki.ros.org/ROS/Technical%20Overview#Message_serialization_and_msg_MD5_sums as reference? that could give information what we have done with ROS 1?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good idea thank you

rep-2016.rst Outdated Show resolved Hide resolved
Thus the representation includes:

- the package, namespace, and type name, for example `sensor_msgs/msg/Image`
- a list of field names and types
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i am not sure if this is supported with current code, but constants should be also supported to detect the mismatch?

Suggested change
- a list of field names and types
- a list of field names, types and constants


The ``TypeDescription`` messages are defined as just another ROS interface type, using all the same tooling.
This means that its data structures are not available yet, during build time, to describe itself and other interface packages.
The final decision to support this is a checked-in mirror of the generated type structs, provided in ``rosidl_runtime_c``, for use by C (used by Python) and C++ code generation.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(question) as user who builds the source code with their own CI/CD pipeline, descriptions are generated during build time, including TypeDescription messages. these descriptions will be the same with the ones from released binary packages as long as source code is the same version. just checking my understanding is correct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is correct! As long as the source code is the same version, then the TypeDescriptions and hashes will come out the same, they are deterministic.

Comment on lines +194 to +196
The hash must only be computed using fields that affect communication compatibility, so that trivial changes do not change the hash
Thus the hash excludes one aspect of Type Descriptions: it omits field default values.
This is because default values for fields are only used by the writer of a packet, the recipient always receives some value in the field and can read it, thus defaults cannot affect compatibility.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so after all, the constants are not supported at this moment. i understand the that is not requirement to check message field compatibility, but user application might check the value using constants on subscription? from user application perspective, this could be categorized as mismatch?

publisher sends the data using DEFAULT_VALUE as 1, but on the subscription using the same DEFAULT_VALUE because message constants are updated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This point does make sense to me. To put a full example:

Publisher's My.msg

uint8 GOOD_STATE = 1
uint8 BAD_STATE = 2

uint8 current_state

Publisher

My msg;
msg.current_state = My::GOOD_STATE;
publisher.publish(msg);

Subscription's My.msg

uint8 GOOD_STATE = 0
uint8 BAD_STATE = 1
uint8 UNKNOWN_STATE = 2

uint8 current_state

Subscription code

void message_callback(My msg) {
  if (msg.current_state == My::BAD_STATE)  {
    ERROR("Bad state!");
  }
}

This will trigger the bad state on the subscription, even though the publisher intended a good state. However, this doesn't currently trigger an update.
(tangent: I think almost all constants in practice are enums, so it's unfortunate that we don't have enum support in messages)

@wjwwood I could use your thoughts on this, I'm forgetting if we discussed constants in the original design. They are not represented in IndividualTypeDescription so it's not that they are specifically ignored from the hash, they're not available to that calculation at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah - we did not come to a final decisions, it was left as a TODO https://github.com/ros2/rcl_interfaces/tree/rolling/type_description_interfaces#todo

Copy link

@EduPonz EduPonz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great feature to be exposed on ROS 2. I have some concerns regarding some design decisions.

* Causes a mass of parameter event messages being sent at once on init, worsening the network initialization problem
- Store the type description on a centralized node per machine
* Helps reduce network bandwidth, but makes it non-trivial to find the correct centralized node to query, and introduces issues of resolving the local message package, such as when nodes are started from different sourced workspaces.
- Send type description alongside discovery with middlewares
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand why this idea was discarded.

It seems that the specifics of how the type descriptions are distributed could very well be left to the RMWs as they are the middleware adapters. With this approach, DDS based RMWs (such as the default) could leverage the already existing DDS X-Types mechanism instead of further "polluting" the nodes with more topics and endpoints, with their non-negligible increment of discovery traffic, which is by default already intense because of similar design decisions. Non-DDS based middleware RMWs could always implement the service you are describing here internally (or use any other method for that matter).

For DDS based implementations, pursuing this native approach will still be interoperable, as X-Types is built in DDS. Regarding compatibility with other middleware technologies, they will not interoperate in the first place because the protocols are different, so interoperability of this particular aspect is not a problem.

From what I understand, the important part for the ROS 2 users is the API regarding evolving types, their description and distribution, so their ROS 2 applications can dynamically adapt to newly discovered types. The details on how the specific RMW handles the type discovery and distribution should be transparent to the ROS 2 user, and ideally as optimized as possible. As long as an application can query a node for the type description of a specific type given its ID, the rest can very well be RMW specific, which will result in more optimal implementations when native support is present (case in point).

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, my opinion is the same. Instead of adding services, left this to the middleware.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the advantage to type transmission being considered an RMW implementation detail. Maybe it's a service/topic-pair in some implementations, but not necessarily designed that way or exposed as such to the end user.

I'm open to revising this design to be less specific about the way type definitions are transmitted, and to reviewing an RMW API proposal that would expose these types.

One thing I think we want to avoid is sending the full type descriptions in discovery traffic, and only getting them on request - I don't know XTypes well enough to know how this would be implemented. I think we want an actual implementation at least at draft stage for Tier 1 RMWs, if we are going to base a design decision on that.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@emersonknapp we are open to make such proposal for an API were we can provide ROS 2 with this functionality by leveraging existing standardized technologies such as DDS X-Types, we could start after the ROSCon.

In X-Types v1.3, the type descriptions (Type Objects in our lingo) are not distributed during discovery. Instead, DATA(w) and DATA(r) (the endpoints' discovery packets) contain an identifier (similar to this REPs hash) of the used type (known as Type Information), and optionally the type information of the dependencies of the type. If the remote participant does not have such type registered, then it can ask for it over a builtin service called Type Lookup Service, which manages the Type Object (type description) distribution. Mind the importance of the service being builtin, as what that entail in DDS is that there is no discovery traffic associated with the service endpoints; knowing the participant is enough to use it. IMO we should aim to leverage this feature of the DDS spec to avoid further pollution of the graph and network with a new set of endpoints for each node, which will make discovery and therefore ROS 2 less scalable.

BTW, from some comment I read above, with DDS security and therefore in SROS 2, the only traffic in the clear is the DATA(p) (the information about which ports the specific participant is using). Further discovery exchanges (including types and the Type Lookup Service) happen over secured topics, so there is no information about them exposed to ill-intended observers.

It is also worth noting the DDS X-Types also provides a mechanism for type versioning an incompatibilities with a bit of finer grain control (for instance you may allow for field names to change, or for fields to be added at the end of the struct).

IMO this REP is attempting to provide ROS 2 with a powerful new set of capabilities, but it might be at the same time reinventing the wheel of X-Types while at the same time leaving available capabilities out of the picture. As I said before, I think the course of action would be to put precise requirements in the functionality we want, and leave for the RMWs the details of how to implement those. In the case of DDS based RMW that'd make the type distribution system interoperable, and for those RMW not supporting X-Types or not based on DDS, they were not going to interoperate anyways, so we do not need a common representation for them all.

rep-2016.rst Outdated Show resolved Hide resolved
rep-2016.rst Outdated Show resolved Hide resolved
rep-2016.rst Show resolved Hide resolved
* Causes a mass of parameter event messages being sent at once on init, worsening the network initialization problem
- Store the type description on a centralized node per machine
* Helps reduce network bandwidth, but makes it non-trivial to find the correct centralized node to query, and introduces issues of resolving the local message package, such as when nodes are started from different sourced workspaces.
- Send type description alongside discovery with middlewares
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, my opinion is the same. Instead of adding services, left this to the middleware.

Co-authored-by: Mirko Ferrati <mirko.ferrati@gmail.com>
Co-authored-by: Chris Lalancette <clalancette@gmail.com>
Co-authored-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
Co-authored-by: Francisco Martín Rico <fmrico@gmail.com>
@ros-discourse
Copy link

This pull request has been mentioned on ROS Discourse. There might be relevant details there:

https://discourse.ros.org/t/ros-2-tsc-meeting-minutes-2023-09-21/33733/1

@gavanderhoorn
Copy link
Contributor

gavanderhoorn commented Sep 9, 2024

Discussing this with a colleague I tried to find how constants in messages are treated (apologies if I overlooked something of course), but can't find any mention of them.

@emersonknapp: would changing/adding a constant change the hash of a message?

@methylDragon
Copy link

Discussing this with a colleague I tried to find how constants in messages are treated (apologies if I overlooked something of course), but can't find any mention of them.

@emersonknapp: would changing/adding a constant change the hash of a message?

Afaik, they're not involved in the calculation, since they're not a part of the type description message at the moment.

There's a to-do to decide what to do with them: https://github.com/ros2/rcl_interfaces/tree/rolling/type_description_interfaces#todo

@gavanderhoorn
Copy link
Contributor

Ok, thanks for that comment @methylDragon.

I tried this out by adding a constant to example_interfaces/msg/Bool.msg and it did change the hash, but perhaps that was because I actually copied that .msg to another file -- and filename is included in the hash calculation?

@methylDragon
Copy link

methylDragon commented Sep 9, 2024

Ok, thanks for that comment @methylDragon.

I tried this out by adding a constant to example_interfaces/msg/Bool.msg and it did change the hash, but perhaps that was because I actually copied that .msg to another file -- and filename is included in the hash calculation?

That's interesting...

The schema of what is considered can be found here https://github.com/ros2/rosidl/blob/rolling/rosidl_generator_type_description/resource/HashedTypeDescription.schema.json

Let me see if I can find the code that does the hashing, sec

EDIT: Type name is included, which the file name represents.

If in doubt, look at your build/install folder and look at the generated .JSON type description files! You'll be able to debug any changes there. Those json files are the inputs to the hasher.

Intuitively this makes sense. Consider: TypeA.msg and TypeB.msg could have the same fields, but the fact their names differ is a clue that their semantics differ. (A more clear example: MyNanosecondsInt.msg and MySecondsInt.msg, both with a single int field called "value" means something drastically different despite having the same on the wire representation.)

@emersonknapp
Copy link
Contributor Author

thanks @methylDragon for answering so quickly, I'll just confirm that you got it correct. Constants do not affect hash, but type name (from filename) does

@gavanderhoorn
Copy link
Contributor

thanks for confirming @emersonknapp.

Is it correct / expected the REP doesn't mention constants explicitly? It would be great to have information about what exactly is used as input for the hash somewhere in the REP itself.

The JSON schema is good to have, but not very human friendly (I also can't point my colleagues to it).

@emersonknapp
Copy link
Contributor Author

emersonknapp commented Sep 9, 2024

No, it doesn't right now, just this #381 (comment) unresolved thread about it. It should probably be mentioned. There is some balance we're trying to strike, between being detailed here, and letting the implementation be the source of truth. But this point is worth specifying one way or the other, I agree. it's been implemented as of Iron to not include constants, but it's an open question whether that's the long term goal state

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants