Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add rmw count clients,services impl #427

Merged
merged 4 commits into from
Sep 29, 2023
Merged

Add rmw count clients,services impl #427

merged 4 commits into from
Sep 29, 2023

Conversation

leeminju531
Copy link
Contributor

this PR is for service counts using it's name like topic.
Related to ros2/ros2cli#771

Signed-off-by: leeminju531 dlalswn531@naver.com

Copy link
Collaborator

@eboasson eboasson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a few comments. I've only made them for rmw_count_clients, obviously they also apply to rmw_count_services.

RMW_CHECK_ARGUMENT_FOR_NULL(count, RMW_RET_INVALID_ARGUMENT);
auto common_context = &node->context->impl->common;
const std::string mangled_rq_service_name =
make_fqtopic(ROS_SERVICE_REQUESTER_PREFIX, service_name, "Request", false);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to have a const std::string for the "Request" and "Reply" strings and use that everywhere (in particular also in rmw_init_cs).

return ret;
}
if (number_of_request_publishers != number_of_response_subscribers) {
return RMW_RET_ERROR;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This'll lead to spurious error returns: discovery of publishers and subscribers happens independently, and so the odd discrepancy between the two is to be expected. It is not trivial to decide what to do, retrying is an option, but then you could end up in a live lock if someone is creating and deleting clients all the time, or in an even worse situation if someone deliberately creates a DDS reader but no writer (or vice versa, but the reader seems more likely to me).

Given that all the (Cyclone DDS) clients have a "client id" in the user data (

std::string user_data = std::string(is_service ? "serviceid=" : "clientid=") + csid_to_string(
), one could also take the union of all of these ids and then return the size of the set. The vast majority of the time it'll be exactly the same as the number of response subscribers ...

Copy link
Contributor Author

@leeminju531 leeminju531 Oct 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've searched methods to get the size of client ids.

for this, to my knowledge, a type of thedds_entity_t should be needed like below.

if (get_matched_endpoints(client.pub->enth, dds_get_matched_subscriptions, rds) < 0 ||
get_matched_endpoints(client.sub->enth, dds_get_matched_publications, wrs) < 0)
{
RMW_SET_ERROR_MSG("rmw_service_server_is_available: failed to get reader/writer matches");
return RMW_RET_ERROR;
}
// first extract all service ids from matched readers
std::set<std::string> needles;
for (const auto & rdih : rds) {
auto rd = get_matched_subscription_data(client.pub->enth, rdih);
std::string serviceid;
if (rd && get_user_data_key(rd->qos, "serviceid", serviceid)) {
needles.insert(serviceid);
}

however, the function i made have no topic entity and i tried to find it (or the type of rmw_client_t , rmw_publisher_t ... ) using node and service_name provided by the input parameter of the function, but i'd not find it.

to implement the service count, there are two ways in my idea.

  1. search all nodes (

    extern "C" rmw_ret_t rmw_get_node_names(
    const rmw_node_t * node,
    rcutils_string_array_t * node_names,
    rcutils_string_array_t * node_namespaces)
    )
    and count the matched service.
    (
    extern "C" rmw_ret_t rmw_get_client_names_and_types_by_node(
    const rmw_node_t * node,
    rcutils_allocator_t * allocator,
    const char * node_name,
    const char * node_namespace,
    rmw_names_and_types_t * sntyp)
    )

  2. immediately return number_of_response_subscribers without comparing with number_of_request_publishers

    return common_context->graph_cache.get_reader_count(mangled_topic_name, count);

What should I do? could you tell me if there's a good way I haven't found?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It really depends on what you want to use it for, right?

If all it is ever going to be used for is for a user to get an indication of the number of clients/services from the CLI, then I'd say just look at the number of response subscribers and be done with it. (It would also make it really trivial to have a single implementation for counting clients and subscribers: all you need to do is pass in the topic name pattern you're looking for.)

If you do want to protect against all kinds of things that might happen if people start to write parts of the system in DDS without using the ROS libraries, I'd go a step further. If you want to have a well-defined notion of what number you actually return (there is no such thing as the "current" number of subscribers in a distributed system that's not in steady state) then the first problem is to define what exactly the returned value means.

I suspect it is, at least for now, the first case. In that case, it is probably best to keep it as simple as possible.

I just checked the related PRs to get a better sense of the goal, and I noticed that the comment in the RMW says "does not allocate memory". I'd say that's not true, e.g., just constructing the topic node likely involves memory allocation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you for your information and sorry for misunderstanding

I suspect it is, at least for now, the first case. In that case, it is probably best to keep it as simple as possible.

Does the first case mean just looking at the number of response subscribers ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i just saw comments again. i don't yet have a deep understanding of DDS and ROS.
my current estimate is below.

  1. immediately return number_of_response_subscribers without comparing with number_of_request_publishers.
    return common_context->graph_cache.get_reader_count(mangled_topic_name, count);

this just get the number of data readers for a DDS topic. it can't protect for those to use reading/writing parts in DDS without ROS as you mentioned.

  1. extern "C" rmw_ret_t rmw_get_client_names_and_types_by_node(
    const rmw_node_t * node,
    rcutils_allocator_t * allocator,
    const char * node_name,
    const char * node_namespace,
    rmw_names_and_types_t * sntyp)

in this case, it can get the numbers by Node which is a concept of ROS, so this is likely to protect from reading/writing parts in DDS without using the ROS

I suspect it is, at least for now, the first case. In that case, it is probably best to keep it as simple as possible.

this means the way which get the numbers using Node, right?

i'll post PR again as soon as you respond.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eboasson Friendly ping, I think we are waiting for a response here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed ... sorry that I let it slip through the cracks ...

I think for the purpose of showing the number of clients and services in the ros2 tool, there is no harm in assuming that every part of the system is either built on ROS 2, or plays by the rules that ROS 2 follows. That is, that there are no other readers/writers for the service topics.

I also think it is fine to not worry about all the edge cases one I come up with (years of practice there ...), again, given the purpose. I would put a note in the documentation that the number is only reliable "in steady state", or something to that effect, just so people will realise there are limitations to these operations.

And then, just go with: return common_context->graph_cache.get_reader_count(mangled_topic_name, count);

Does that sound reasonable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

absolutely.
I fixed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping @eboasson, could you review this again?
I'm going to quickly update other PRs related to this upon your acceptance.

if (number_of_request_publishers != number_of_response_subscribers) {
return RMW_RET_ERROR;
}
return common_context->graph_cache.get_reader_count(mangled_rp_service_name, count);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you consider it an error when the number of publisher & the number of subscribers don't match, then arguably you should here return number_of_response_subscribers, because the graph cache may already have changed here.

@leeminju531 leeminju531 requested a review from eboasson November 10, 2022 13:19
Copy link
Collaborator

@eboasson eboasson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, @leeminju531! And thanks for your patience with me ☺️

@methylDragon
Copy link
Contributor

methylDragon commented Feb 9, 2023

@methylDragon
Copy link
Contributor

rmw interface changes have been merged

Signed-off-by: leeminju531 <dlalswn531@naver.com>
Signed-off-by: leeminju531 <dlalswn531@naver.com>
Signed-off-by: leeminju531 <dlalswn531@naver.com>
Signed-off-by: leeminju531 <dlalswn531@naver.com>
@clalancette
Copy link
Contributor

CI for this is in ros2/rmw_implementation#208 (comment)

@clalancette clalancette merged commit 2263814 into ros2:rolling Sep 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants