Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Intra-process Type adaptation failing silently if types mismatch #2291

Closed
nirwester opened this issue Aug 29, 2023 · 16 comments · Fixed by #2294
Closed

Intra-process Type adaptation failing silently if types mismatch #2291

nirwester opened this issue Aug 29, 2023 · 16 comments · Fixed by #2294
Labels
bug Something isn't working

Comments

@nirwester
Copy link

Bug report

Intra-process communication is currently supported together with type adaptation, and works fine as long as both the publisher and the subscriber both use a ROS-msg type or the same custom-type. However, if the publisher uses a custom-type and the subscriber a ROS-msg type, the communication (as can be expected) fails if the intra-process communication is enabled.

When this kind of failure occurs, no warning / error / exception is reported, resulting in a silent failure which is potentially difficult to debug. I'm not sure if implementing this kind of check would be complicated, but I think it would make using type-adaptation less error-prone.

  • Operating System:
    • Ubuntu 22.04
  • Installation type:
    • binaries
  • Version or commit hash:
    • Latest Iron release
  • DDS implementation:
    • Tested on Cyclone and Fast-RTPS
  • Client library (if applicable):
    • rclcpp

Steps to reproduce issue

A detailed description and the code / dockerfile to reproduce the issue can be found here, where a Pointcloud2 is adapted to a pcl::Pointcloud:
https://github.com/nirwester/ros2_journey_examples/tree/master/homeworks/chap4/type_adaptation

Expected behavior

Warnings, errors, exceptions, or ros2 doctor logs are reported when the types used by the publisher and the subscriber are not matching and the intra-process communication is enabled.

Actual behavior

The communication fails (no message is received by the subscriber) silently.

@roncapat
Copy link
Contributor

Hi @nirwester, may I ask you about this:

However, if the publisher uses a custom-type and the subscriber a ROS-msg type, the communication (as can be expected) fails if the intra-process communication is enabled.

Where you found evidence / documentation? If intra-process is enabled the adapter is not able to do conversion from/to message?

@nirwester
Copy link
Author

Hi @roncapat , I found no documentation in this regard, but I verified it through tests (see the linked repo). It is also intuitive that, if a conversion needes to be applied, it is not possible to efficiently share a pointer between publisher and subscriber. However, I think some sort of warning or a fallback to the normal communication mechanism would be desirable in this scenario.

@roncapat
Copy link
Contributor

Exactly, I just thought a fallback would happen in that case... that's why now I'm interested in this thread :)

@alsora
Copy link
Collaborator

alsora commented Aug 30, 2023

Are there use-cases to have a publisher and a subscriber in the same process where one uses a ROS-msg type and the other uses a custom type?

I can't think of any valid use-case, so I would throw an exception rather than a warning.

@MarcoMatteoBassa
Copy link

I imagine this could accidentally happen if the subscriber doesn't know which mechanism is being used by the publisher (for example if the programmer misses access to the pub source code). Yet this wouldn't really be a "use-case", since the subscriber can always disable the intra-process option in this scenario, so I suppose an exception would be fine.

@roncapat
Copy link
Contributor

I have a library of components that use my private message adapters. If I want to compose them with 3rd party packages with intra-process on (because I need it for some topics that exchange data with other components of mine) I can't make the mix work (I may have to enable/disable intra process per-topic via param file). So yes, now that I recall, I already experienced this issue in the past.

@clalancette
Copy link
Contributor

The case of an intraprocess pub/sub, with one side being a ROSMsg and the other side being Custom, should be supported for the reasons @roncapat points out. It obviously involves a copy, so isn't as efficient as a ROSMsg<->ROSMsg or Custom<->Custom, but it should still work.

It's been a couple of years since I looked at this, but I believe we even tested that at the time we added in type adaptation. My opinion is that this should be treated as a bug and fixed.

@clalancette clalancette added the bug Something isn't working label Aug 30, 2023
@alsora
Copy link
Collaborator

alsora commented Aug 30, 2023

Good point; I wasn't thinking of "composition" involving third-party code (that you may or may not have control over its source).

We should still add a warning when the pub and sub discover each others (even if there's a valid use-case for it, notifying the users via a log will help them identify possible undesired inefficiencies in their pipeline).

@iuhilnehc-ynos
Copy link
Collaborator

Let me share my two cents.

The

if constexpr (rclcpp::TypeAdapter<MessageT>::is_specialized::value) {
can not be deduced to use the https://github.com/nirwester/ros2_journey_examples/blob/91cc788badf36e8ebe0f2efc729ed3b72169c659/homeworks/chap4/type_adaptation/src/test_type_adaptation/include/test_type_adaptation/type_adapter.hpp#L34-L38, but
template<typename CustomType, typename ROSMessageType = void, class Enable = void>
struct TypeAdapter
{
using is_specialized = std::false_type;
.

I think something below might be correct.

diff --git a/rclcpp/include/rclcpp/experimental/intra_process_manager.hpp b/rclcpp/include/rclcpp/experimental/intra_process_manager.hpp
index a152632a..dfcda620 100644
--- a/rclcpp/include/rclcpp/experimental/intra_process_manager.hpp
+++ b/rclcpp/include/rclcpp/experimental/intra_process_manager.hpp
@@ -486,13 +486,13 @@ private:
                 "subscription use different allocator types, which is not supported");
       }
 
-      if constexpr (rclcpp::TypeAdapter<MessageT>::is_specialized::value) {
+      if constexpr (rclcpp::TypeAdapter<MessageT, ROSMessageType>::is_specialized::value) {
         ROSMessageTypeAllocator ros_message_alloc(allocator);
         auto ptr = ros_message_alloc.allocate(1);
         ros_message_alloc.construct(ptr);
         ROSMessageTypeDeleter deleter;
         allocator::set_allocator_for_deleter(&deleter, &allocator);
-        rclcpp::TypeAdapter<MessageT>::convert_to_ros_message(*message, *ptr);
+        rclcpp::TypeAdapter<MessageT, ROSMessageType>::convert_to_ros_message(*message, *ptr);
         auto ros_msg = std::unique_ptr<ROSMessageType, ROSMessageTypeDeleter>(ptr, deleter);
         ros_message_subscription->provide_intra_process_message(std::move(ros_msg));
       } else {

@MarcoMatteoBassa
Copy link

MarcoMatteoBassa commented Aug 31, 2023

Thanks a lot @iuhilnehc-ynos, I can confirm that applying the patch on top of rclcpp 21.0.2 solved the problem, at least for this specific test.

[component_container-1] [INFO] [1693472229.766338070] [adapted_publisher]: Publishing to ADAPTED topic cloud with address: 0x5581E17A9020
[component_container-1] [INFO] [1693472229.766626558] [adapted_publisher]: Publishing to NOT ADAPTED topic cloud with address: 0x5581E159DC20
[component_container-1] [INFO] [1693472229.767263217] [adapted_subscriber]: Adapted sub received cloud with address: 0x5581E17A9020
[component_container-1] [INFO] [1693472229.767467450] [adapted_subscriber]: Normal sub received cloud with address: 0x5581E1AA1740

@roncapat
Copy link
Contributor

Nice!
Next steps? PR with some warnings added?

@clalancette
Copy link
Contributor

Next steps? PR with some warnings added?

I'd say let's make a PR with that change and a new test. Adding a warning I think should be done in a separate PR, so we can discuss it separately.

@roncapat
Copy link
Contributor

@iuhilnehc-ynos as the patch is yours, would you like to open the PR?

@fujitatomoya
Copy link
Collaborator

@iuhilnehc-ynos please take care of opening the PR, i am happy to do review.

@fujitatomoya
Copy link
Collaborator

Just FYI, backport to iron and humble are completed.

@nirwester
Copy link
Author

Thanks a lot @fujitatomoya and everybody!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants