-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mixing Wall time with simulation time in Costmap2DROS::mapUpdateLoop #3325
Comments
Confirmed: the Costmap2D publisher works on very low rates for me locally:
<- Average rate should be
<- Average rate should be The situation when the problem appear became to be |
@AzaelCicero A PR is always appreciated.
If the issue is that the actual updating of the costmap is too long, that's not a problem with clocks / timers. That's trying to do too much "stuff" to be faster than real-time algorithmically.
The ROS rate should be doing that for us to keep a regular frequency - unless there's something wrong with that. |
Yes, that was my oversight. The problem is not related to the performance, as I initially thought.
Indeed the
Locally, it works for me: when changing Gazebo sim rate from 1000 -> down to 200, the |
I have created PR which replaces the WallRate with ROS timer created with the node clock instance. I had to remove the functionality of waiting for the first publish, as now the update is done on the ROS executor loop. I am not sure of the purpose of this functionality, so I cannot predict what impact It may have. I have tested this functionality in my use case where I am testing some planning algorithms on data replayed from ROS bags. Algorithms are using costmap as input. I have noticed that the frequency of costmap publishing is far off the expected. Then I started analyzing the code and found out that map update is controlled by system time, not simulation time. With the change proposed by PR, the rate of map updates and map publishing is as expected. I think that solution proposed by @AlexeyMerzlyakov is also viable, with the remark that this WallRate with ROS clock should be used on a separate thread, not on ROS executor thread if SingleThreadedExecutor is in use. Otherwise, we would be waiting for clock progression which would never come as we would block receiving clock updates. One test is not passing, I am going to fix that if you will give a green light to my PR. |
Just for noting: Adding
|
The above observations makes me think about the accuracies of both solutions from this ticket (switching to As follows from the results, both approaches ( However, when comparing the time errors of measured ROS-time between Which makes |
Yay! rclcpp Rate merged in! |
@AlexeyMerzlyakov what's the story with this ticket now? With the new Rate updates in rclcpp, can we use that now in the costmap thread and close out this ticket? |
@SteveMacenski, the scope of related works are reported in #3303 (comment). This issue is covered by #3303, as there are already discussing all the places need to be ROS-time respective, incl. new |
From @AlexeyMerzlyakov translating to this ticket: More general, by grepping where we are using
|
@AlexeyMerzlyakov is #3404 good to go with the rclcpp merge? Or, does it make sense to instead just change https://github.com/ros-planning/navigation2/blob/main/nav2_costmap_2d/src/costmap_2d_ros.cpp#L496 WallRate to the new Rate with |
@SteveMacenski, please refer to the #3404 (comment) which explains the intention |
OK, so I think we're all squared away on this issue. I have decided that things like Costmap's loop rate, waypoint follower's loop rate, and the controller's loop rate should be tied to the system clock. Internal timings of systems are using ROS time when in simulation, but module-level operations should remain as tied to wall time. This is because systems like the costmap and controller server are not systems that can be simulated at 2x-3x-10x speeds due to their computational complexity. The added variation due to the simulation clock is not worth the advantages and results in a poorer quality and explainability of navigation behavior at those higher speeds since you'll see problems not seen at 1x speeds. I'm not so dead set on this that I'm not open to considering this change in the future, however today I think it is not a worthwhile trade off. It also created a number of serious issues breaking testing behavior which given the above statements (which almost certainly impacts real simulation use-cases as well), after about half a day of debugging I ultimately decided wasn't worth continuing to address. The behaviors (timed behavior, spin, drive on heading, backup, wait, etc), planners planning time, smoothers smoothing time, and rate controller have all been updated to use simulation time from the clock as internal behavioral matters. Costmap's & waypoint follower's loops; controller will remain tied to walltime since those are system level behaviors where changes in timing are important. You can see that in the PR #4000 |
Bug report
Required Info:
Steps to reproduce issue
Just use costmap_2d node with simulation time, which has very different rate than wall time. You will observe that HZ of map update and map publish is not reaching the expected value and are far off the target.
Expected behavior
Configured rates of update and publish should be achieved in respect to simulation time.
Actual behavior
Map update is reaching expected rate in respect to wall time. Publish rate is bounded to the update rate, as it can only occur during the update, yet it is triggered in respect to simulation time. Depending on the mismatch of the rate between simulation and wall time you will observe different HZ of update and publish.
Additional information
Code should be written in respect to one time source, in this case I think that simulation time is better suited.
https://github.com/ros-planning/navigation2/blob/main/nav2_costmap_2d/src/costmap_2d_ros.cpp#L461-L498
The text was updated successfully, but these errors were encountered: