-
Notifications
You must be signed in to change notification settings - Fork 33
"wrong type writer" error if topics with same token have different types #234
Comments
The different types cause connext to get mixed up, see ros2/rmw_connext#234
* Topics returned by get_topic_names_and_types have leading forward slashes now * get_topic_names_and_types returns a list of types now * Fix a bug in the reception of best effort messages * Kinder failing without matplotlib * Add missing __init__.py files * Print warning when topics which appropriate names are ignored because of types * No issues with empty waitsets anymore (There's always a signal handler) * Add linters * Install the whole package rather than using py_modules * Don't publish on topic with the same name token The different types cause connext to get mixed up, see ros2/rmw_connext#234 * Flake8 import order wants different things on different machines? * Add launch as exec_depend
I can reproduce the problem. It looks like this is a regression of #227. The topic names are being split into the basename and the partition. Then only the basename is being used to lookup the topic type ( rmw_connext/rmw_connext_cpp/src/functions.cpp Line 378 in 363ffcf
I will try to come up with a fix for this. I think it is significant enough to be made for the beta 2 even though we are in code freeze. |
Reading more about partitions and the API in Connext it looks to me that the topic description (which is identifying the type) is being looked up by DDSDomainParticipant::lookup_topicdescription. That function only takes the topic name but not the partition as an arguments. Might it be the case that the same topic in different partitions must have the same type (at least in Connext)? If that is the case I fear that the concept of partitions can't be used to implement the concept of ROS namespace??? (I hope I am missing something here.) @Karsten1987 @wjwwood Maybe either of you has some more insight on this since you were most involved in this part. |
I haven't read the details yet, but if you're correct that the types must match for topics in different partitions: there might still be a way we can use partitions to implement namespaces, if we can somehow "trick" connext into using FQNs for the topic type lookup. It never complained in beta1 when doing psuedo-namespacing by putting "test/chatter" as the topic name before the namespace implementation was introduced (that is, there's no restriction on using slashes in topic names in connext). This isn't a concrete suggestion at the moment, just something to get some ideas rolling while I have time to read up. |
So from what I understood the actual problem comes from I'll have to look at bit more into what options we have. |
"Participant level" seems to be the key here: you can run the following with no complaints from connext, since they're different participants:
Then the two [topic, type] pairs work fine independently.
while in a single process/node/participant like the modified talker demo, namespaces or not, there is the "wrong type writer" issue if you try to publish different types to the same topic. I don't mean to say that you should be allowed to publish different types to the same topic. My point is that we can probably get around our current issue in the same way: by using different participants. |
Looks like it's recommended to use as few participants as possible, so my previous suggestion is not likely to be an ideal solution for the namespacing issue: https://community.rti.com/best-practices/create-few-domainparticipants-possible |
From reading RTI responses on forum (1, 2, 3) I agree that it looks like connext is not supposed to support different types on the same topic from within a participant, and namespaces aren't taken into consideration with that. I did find a workaround though that's less dramatic than creating separate participants: We can get around the '1 type per topic' restriction if we "just" embed the name of the topic type into the ROS -> DDS topic name mangling. Not pretty, but it gets hidden below the RMW layer for the most part. For interoperability we have to apply the same thing across all DDS rmw implementations, but we already have mangling in our topics names anyway. With no modification to the above example (but changes to rmw_connext_cpp + rmw_fastrtps_cpp to embed the topic type into the topic name), it now works fine:
That's just a hacky prototype. We'd have to use a version of the type names that doesn't have characters not allowed by the spec (not that connext or fastrtps seem to mind), and this will limit users' topic name length, but otherwise I think the only other thing that has to change is the get_topic_names_and_types de-mangling. Since we can infer the part of the topic name to remove from the topic type, we don't have to limit what topic name options/characters are allowed for users, but 'wasting' characters when we already have a length limit may be an issue. It's not elegant, but this is one possible workaround. By keeping partitions for namespaces we save the implementation of aliasing and dynamically change namespaces easily, so maybe it's worth the trade-off. |
That will certainly work, but at the same time that could also imply that we go back to pack the FQDN "/test/chatter" completely into the topic field with "test__chatter" by applying underscores. If we can't get around modifying the topic string in any case, we may consider keeping it somewhat intuitive so that external DDS components can make sense out of it. argh, that's a bummer. Now as for beta2, we could agree on leaving the system as is, stating that creating multiple publisher/subscriber/etc. within the same node, the same topic and using connext will lead to undefined behavior. I do believe it's exactly this combination which happens to fail. Other methods such as loading multiple nodes via composition will work as they have their own participant. |
Unfortunately choosing DDS was a huge mistake, you chose ~1000 pages of over-engineered specification that was never fixed and which took ~10 years just to implement version 1. As such you should have expected endless issues like this one. |
Thank you for sharing your opinion. From the feedback we have received so far it seems that many people are disagreeing with that and think choosing DDS as the default option for the RMW implementation makes a lot of sense.
The problem described in this ticket is actually not really a problem in the DDS spec but due to the way ROS 2 chose to map namespaces to partitions. The next release of ROS 2 will actually use a different approach which resolves this problem.
I am pretty confident that DDS users are already using the software in such scenarios. And since ROS 2 doesn't even expose that many QoS settings the user doesn't have to deal with that "complexity". |
We've switched back to not using partitions for the implementation of namespace (ros2/ros2#476). Regression test for this issue in ros2/rcl#257 |
* Topics returned by get_topic_names_and_types have leading forward slashes now * get_topic_names_and_types returns a list of types now * Fix a bug in the reception of best effort messages * Kinder failing without matplotlib * Add missing __init__.py files * Print warning when topics which appropriate names are ignored because of types * No issues with empty waitsets anymore (There's always a signal handler) * Add linters * Install the whole package rather than using py_modules * Don't publish on topic with the same name token The different types cause connext to get mixed up, see ros2/rmw_connext#234 * Flake8 import order wants different things on different machines? * Add launch as exec_depend
I have a node which subscribes to topic
data
and publishes reception rate data on topicreception_rate/data
. Those topics have different types.I get a
TDataWriter::narrow:ERROR: Bad parameter: wrong type writer
error when I try to publish, but only if it's publishing to*/data
. If it publishes toreception_rate/data_
there's no error. I suspect that the typesupport is being mixed up with the other topic's.Here's how to reproduce it with two publishers in the same node:
Behaviour with fastrtps (matches expected behaviour):
Behaviour with connext (fresh daemon):
That last output showing
/test/chatter
having typestd_msgs/String
is suspicious.The text was updated successfully, but these errors were encountered: