SimpleActionClient sometimes not subscribe result topic #119
Comments
Thank you for the detailed report. I just noticed that your example code works if we change it to
Do you have any good reason to start your |
Thank you for testing and commenting on my code. |
Sorry, I accidentally closed this, so I am reopening it. |
So, why do you initialize ac within every loop?
◉ Kei Okada
|
In normal use, there is no good reason to initialize the action client on every loop iteration. |
After additional investigation, I think this is a reconnection issue in the action client. I wrote another reproduction program that repeatedly launches both the action server and the client, and the issue was not reproduced in 13,000 tries. I then tried a loop that launches and exits only the action client while keeping the action server running, and the issue was reproduced twice in 8,609 tries. The code is here. I assume that reconnection of the action client does occur occasionally in practice, for example because of the respawn option in a launch file. |
+1 |
This issue is probably what currently makes the tests flaky; see the unstable build for PR #158.
With additional debug output, it seems like the server receives the goal, but the test node doesn't get the result. |
I have a roscpp node that has an The debug output of the
Which is printed here. Doing a Doing a
Environment
I am running this on ROS Melodic. Both server and client are C++ nodes. I have never seen this issue when running the nodes outside of Docker. |
I think I found out what was causing the issue for me. The issue was that my Docker containers had a much higher I found the solution thanks to #93 (comment). I'm guessing something like this is happening moby/moby#38814:
See also ros/ros_comm#1122 |
I posted this issue on ROS Answers, but I have not received a clear answer. I think it is a new bug in ROS, so I am raising an issue here. The original post is here.
Reproduction procedure
I have simple reproduction code here; it attempts action communication in a loop. In my environment, waitForResult() times out because the client fails to subscribe to the result topic, after approximately 500 to 10,000 tries. I'm running ROS Kinetic, actionlib 1.11.13, ros-comm 1.12.14, and Ubuntu 16.04.5 LTS AMD64, with kernel 4.15.0.
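For readers without the linked code, the bookkeeping of such a loop can be sketched roughly as below. This is only an illustration, not the reproduction program itself: `LoopResult`, `run_until_failure`, and the `attempt` callable are hypothetical names, and the ROS part is left as a callback so the sketch compiles without ROS.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Outcome of a repeated-trial run.
struct LoopResult {
    std::size_t attempts = 0;       // number of attempts actually made
    std::size_t first_failure = 0;  // 1-based index of the first failure, 0 if none
};

// Calls `attempt` up to `max_tries` times and stops at the first failure.
// `attempt` is a hypothetical stand-in: in a real reproduction it would
// construct a SimpleActionClient, send a goal, and return the value of
// waitForResult(timeout), which is false when the wait times out.
LoopResult run_until_failure(const std::function<bool()>& attempt,
                             std::size_t max_tries) {
    assert(max_tries > 0);
    LoopResult result;
    for (std::size_t i = 1; i <= max_tries; ++i) {
        result.attempts = i;
        if (!attempt()) {
            result.first_failure = i;  // remember which try timed out
            break;
        }
    }
    return result;
}
```

Because the failure shows up only once in hundreds or thousands of tries, recording the index of the first failing attempt makes it easier to compare runs across environments.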
Possible Sources of Error
To investigate further, I checked result_pub_.getNumSubscribers() in publishResult() in action_server_impl.h. The number of subscribers is one in the first execution and then becomes two. When a timeout occurred, the number of subscribers on the result topic had decreased. Perhaps, when running in the loop, one of the subscribers is the previous client's subscriber and the other is the current one.
I suspect this is a bug in which the subscriber sometimes fails to register, so it may occur not only in the loop but also on the first execution.
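The observation above (the count dropping around a timeout) could be tracked with a small helper like the following sketch. `SubscriberCountMonitor` is a hypothetical name and not part of actionlib; the count is passed in so the example compiles without ROS, whereas in the server it would come from the publisher's subscriber count inside publishResult().

```cpp
#include <cstdint>

// Tracks the subscriber count seen at each publish and reports drops.
// Assumption: the caller feeds in the value read from the result publisher
// each time a result is published.
class SubscriberCountMonitor {
public:
    // Records `count` and returns true when it is lower than the previous
    // observation, which is the pattern reported around a timeout.
    bool observe(std::uint32_t count) {
        const bool dropped = has_previous_ && count < previous_;
        previous_ = count;
        has_previous_ = true;
        return dropped;
    }

private:
    std::uint32_t previous_ = 0;
    bool has_previous_ = false;
};
```

Logging a warning whenever observe() returns true would mark the iterations in which a subscriber disappeared, which can then be matched against the client-side timeouts.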
Result of rostest: catkin_make run_tests