add design document for node name uniqueness #187
Comments
What about the fact that, when using composition, the nodes in the same container get the same name? For more information: This is an example:
Both nodes get the
This is the result of the
|
There's no reason you have to end up with nodes that share a fully qualified node name when doing composition. You can remap the node name and/or namespace when you construct the node, or when you launch it with launch. The warning you're getting tells you that the logging feature doesn't work correctly unless your node names are unique. In the future we may enforce this some other way, or we may allow duplicate node names; that's what the design doc described in this issue would decide. I would bet on node names needing to be unique, for two reasons. First, some features we have implemented right now assume it (like the logging), and for some of those I cannot think of a way to do it with non-unique node names in an async distributed system (no central master). Second, having unique node names is really nice for introspection: if you get a console message like "[zed.zed_node] there was a problem", which node did that come from? |
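To make the point about fully qualified names concrete, here is a minimal sketch in plain Python. It does not use the ROS 2 client libraries; `fully_qualified_name`, `find_duplicates`, and the name `zed_node_rear` are hypothetical and only illustrate why two composed nodes that keep the same name and namespace collide, while remapping one of them avoids the collision.

```python
from collections import Counter


def fully_qualified_name(namespace: str, name: str) -> str:
    """Join a namespace and a node name into one absolute name."""
    ns = "/" + namespace.strip("/")
    return ns.rstrip("/") + "/" + name


def find_duplicates(nodes):
    """Return the fully qualified names that occur more than once, sorted."""
    counts = Counter(fully_qualified_name(ns, name) for ns, name in nodes)
    return sorted(fqn for fqn, n in counts.items() if n > 1)


# Two composed nodes that both kept the same name and namespace collide.
container = [("/zed", "zed_node"), ("/zed", "zed_node")]
print(find_duplicates(container))

# Giving one of them a different name (hypothetical) removes the collision.
remapped = [("/zed", "zed_node"), ("/zed", "zed_node_rear")]
print(find_duplicates(remapped))
```

A check like this only covers nodes known to one process or one launch description; it says nothing about nodes discovered later over the network.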
Many systems use a two-layer approach: a long ID for uniqueness purposes and a human-readable label, e.g. IP vs host name, or DDS Participant GUID vs DDS Participant Name. Many use cases for non-unique names match what has already been solved in the web server world (load balancing, redundancy...), so I was wondering if that experience could be leveraged. This change will also impact SROS, as the X.509 certificate DN must match the participant name. However, I don't understand the DDS Security spec well enough to tell whether node name uniqueness would be an improvement or not. I guess the cons of node name uniqueness are:
Are there any other use cases where having non-unique node names is useful? |
@wjwwood I can set different names when creating the two classes, and it works if I use "ros2 run", but in the launch file I must specify a node name (it is required for lifecycle nodes), and that overwrites the names specified in the code. |
Yeah, those actions in launch cannot support a process with more than one node in it at a time. As I said in my answer to your question on answers.ros.org, actions to handle that case were considered, but not implemented yet, see also:
That's true, but for For the DDS participant name, that's actually an extension, and not part of the DDS standard, see: The DDS Participant GUID is something we could expose/use, but it's not something that is trivial to create, and would be a considerable burden for other rmw implementations which aren't based on RTPS, which is a secondary use case, but still something people are already doing.
But they don't achieve this with a non-unique IP or host name; instead they have proxies. They can play tricks with DNS, but the result is never that I ask for google.com and get more than one IP back, which would be the case if I asked for the unique id of Instead, an equivalent setup in ROS (in my opinion) would either be a node which you call the service on and which then relays the service call to one of a couple of backends (each with their own name), or we could make service names namespaced by node name (which was a proposal we didn't follow through on) so that you'd do something like this:
And the above could be the implementation of a service server, maybe something like For topics, it's easier because the node name actually just doesn't matter if you're doing load balancing or high availability. For high availability you just need to set the priority QoS (which we currently don't expose, but we could) or push the topic selection code into the user's application (maybe a library call).
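The relay alternative mentioned above (one front node that forwards each service call to one of several uniquely named backends) can be sketched in plain Python. This is only an illustration of the idea, not ROS 2 code: `RelayService`, the backend names, and the round-robin selection are all assumptions made for the example.

```python
from itertools import cycle


class RelayService:
    """Forward each request to the next backend in round-robin order."""

    def __init__(self, backends):
        # backends maps a unique (hypothetical) node name to a callable.
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = dict(backends)
        self._order = cycle(sorted(self._backends))

    def call(self, request):
        name = next(self._order)
        # Return the backend name too, so logs can say who answered.
        return name, self._backends[name](request)


relay = RelayService({
    "/adder_backend_a": lambda x: x + 1,
    "/adder_backend_b": lambda x: x + 1,
})
for _ in range(3):
    print(relay.call(41))
```

Every backend keeps its own unique name, so the caller only ever sees one service while introspection can still tell the backends apart.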
I don't have any knowledge about that, but input from the security folks would be essential for this design document. Though I have to imagine that the configuration files (at least) would be a lot simpler if the node names were required to be unique.
Yeah, we'd basically need to expose the unique id everywhere (at least optionally) otherwise things like logs will become a nightmare if you're running lots of instances of reusable nodes. Basically even if we supported non-unique node names in the end, I'd still recommend having unique node names in your system for your own sanity's sake.
You just need to create them with different names. We need to make that easier, and it should become easy at some point, because, as I said above, in most cases the node name doesn't affect your ability to have backups for reliability.
Personally, I have a hard time coming up with scenarios where non-unique node names are required, and I can only come up with a few cases where it might be slightly more convenient to have duplicate node names. Most of the time it just makes the system more confusing. The main reason I didn't just make it the case from the start of |
If we decide to adopt the dual approach proposed by @thomas-moulard, then giving every node a UUID, with the option of manually specifying it and potentially using a different type of label for systems that need that determinism, seems reasonable to me. |
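As a rough illustration of that dual approach, the sketch below pairs a human-readable label with a UUID that is random by default but can be supplied manually for deterministic systems. `NodeIdentity`, `make_identity`, and the log prefix format are hypothetical names chosen for this example, not an existing or proposed ROS 2 API.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class NodeIdentity:
    label: str  # human-readable, may be shared by several nodes
    node_id: uuid.UUID = field(default_factory=uuid.uuid4)  # unique part

    def log_prefix(self) -> str:
        """Label plus a short id, so duplicate labels stay distinguishable."""
        return f"[{self.label}#{str(self.node_id)[:8]}]"


def make_identity(label: str, fixed_id: Optional[str] = None) -> NodeIdentity:
    """Use a caller-supplied UUID when determinism is needed, else a random one."""
    if fixed_id is None:
        return NodeIdentity(label)
    return NodeIdentity(label, uuid.UUID(fixed_id))


a = make_identity("zed_node")
b = make_identity("zed_node")
print(a.log_prefix(), b.log_prefix())  # same label, different ids
```

With this split, logs and introspection tools would need to show the id (at least optionally) whenever labels are allowed to repeat.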
I agree that having a unique name for each node is the right way to go, for all the reasons you reported. |
@wjwwood wrote:
Just an observation: we could still classify it as a "trick", but round robin dns actually works exactly like that:
Poor man's failover can be implemented like this (although not as efficiently as with more explicit systems/implementations). |
I would like to get back to the second part of William's question:
Some subsystems can probably ensure more easily that node names are unique. E.g. when launch starts N nodes, it can easily check among those. But what about scenarios where nodes with the same name discover each other later, asynchronously? E.g. two robots roaming around that, when they first join the same WiFi network, discover each other and happen to have duplicate node names. What behaviors are we expecting in these situations? Logging a message, stopping either / all of the non-unique nodes, other options? |
Yeah I thought when I was writing it that might be the case, but flawed as my analogy was I still think the IP address as host is a decent analogy. I suppose we could have a two tier system, where there are node names and node addresses (like IP address, but maybe GID or something) and a name look up system (to translate from node name to node address), but I don't really believe we need that.
In my opinion, we should let that be configurable, as all of those cases might be appropriate. In ROS 1 the older node would automatically shut down, allowing you to replace just one node in a system by simply running a new instance of it (useful for development in some cases), but I think that would be surprising behavior for most people. For a default (non-production) behavior, I think having the newer node raise an exception makes sense, as that's the program output the developer is most likely to be watching when the collision is detected. In both cases, i.e. old closes or new closes, a warning could be logged on the console of the node that didn't shut down, for visibility. In the case of a tie (timestamp of node creation), which is unlikely unless manufactured, we can have some resolution (it's a common problem in shared decision-making systems). For production systems, I expect neither of the above strategies will be considered safe to allow. So most likely the right approach is to ignore the collision with a log message and/or offer a callback to handle it programmatically (this can be used to implement the other cases). I think this is similar to how in iOS you can get a callback just before your application is killed due to out-of-memory. It gives you an opportunity to handle it yourself, but by default there is some reasonable behavior. |
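The configurable behavior described above could look roughly like the following plain-Python sketch. Everything here is an assumption made for illustration: the `Policy` values, the `NodeInfo` fields, the string actions, and in particular the tie-break (comparing an id when creation timestamps are equal) are not taken from any ROS 2 implementation.

```python
from dataclasses import dataclass
from enum import Enum


class Policy(Enum):
    NEWER_RAISES = "newer_raises"          # suggested development default
    OLDER_SHUTS_DOWN = "older_shuts_down"  # ROS 1 style replacement
    IGNORE_AND_LOG = "ignore_and_log"      # likely production choice


@dataclass(frozen=True)
class NodeInfo:
    name: str
    created_at: float  # creation timestamp
    node_id: str       # hypothetical unique id, used only to break ties


def newer(a: NodeInfo, b: NodeInfo) -> NodeInfo:
    """Pick the newer node; equal timestamps are resolved by comparing ids."""
    return max(a, b, key=lambda n: (n.created_at, n.node_id))


def resolve(local, remote, policy, on_collision=None) -> str:
    """Return the action the local node should take on a name collision."""
    if on_collision is not None:
        # A user callback overrides the built-in policies.
        return on_collision(local, remote)
    local_is_newer = newer(local, remote) is local
    if policy is Policy.NEWER_RAISES:
        return "raise" if local_is_newer else "warn"
    if policy is Policy.OLDER_SHUTS_DOWN:
        return "warn" if local_is_newer else "shutdown"
    return "warn"


old = NodeInfo("/zed/zed_node", created_at=10.0, node_id="a")
new = NodeInfo("/zed/zed_node", created_at=20.0, node_id="b")
print(resolve(new, old, Policy.NEWER_RAISES))
print(resolve(old, new, Policy.NEWER_RAISES))
```

Because both sides evaluate the same comparison on the same data, they reach consistent decisions without a central master, and the node that keeps running is the one that logs the warning.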
Have there been any internal discussions in the ROS team about this design document? |
I suggest having an option to enable or disable node name uniqueness through an environment variable. That would let downstream users turn it on or off without many changes to their code. What are your thoughts? |
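A minimal sketch of such a switch, in plain Python, might look like this. The variable name `ROS_ENFORCE_UNIQUE_NODE_NAMES`, the accepted values, and the default of enforcing uniqueness are all hypothetical; no such setting is confirmed to exist in ROS 2.

```python
import os

# Hypothetical variable name, not an existing ROS 2 setting.
ENV_VAR = "ROS_ENFORCE_UNIQUE_NODE_NAMES"

_TRUE = {"1", "true", "on", "yes"}
_FALSE = {"0", "false", "off", "no"}


def uniqueness_enforced(environ=None, default=True) -> bool:
    """Read the toggle from the environment, falling back to a default."""
    env = os.environ if environ is None else environ
    raw = env.get(ENV_VAR)
    if raw is None:
        return default
    value = raw.strip().lower()
    if value in _TRUE:
        return True
    if value in _FALSE:
        return False
    raise ValueError(f"{ENV_VAR} must be one of on/off, got {raw!r}")


print(uniqueness_enforced({ENV_VAR: "off"}))
print(uniqueness_enforced({}))
```

Rejecting unknown values rather than silently using the default keeps a typo from disabling the check unnoticed.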
Currently many design documents and features implicitly rely on node names being unique, but we have never sorted out how to enforce node name uniqueness in a distributed system, and therefore we've never officially declared that node names need to be unique. We need a design document that states whether node names need to be unique, with the rationale. If they must be unique, it should say how to enforce that; if not, it should say how to deal with existing features that assume and rely on unique node names.