Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix high CPU - Use executor to spin and stop node in tf_listener thread #119

Merged
merged 3 commits into from
Jul 31, 2019

Conversation

bpwilcox
Copy link
Contributor

Previously, #114 resolved the issue of terminating the work in the thread spinning the node which lead to hanging and unclean shutdown when destructing the node. However, the current implementation runs in a tight loop using rclcpp::spin_some, which leads to very high and inefficient cpu usage even when there is no work to be processed.

This problem was resolved in ros-navigation/navigation2#866, by putting the node on an executor and spinning. As opposed to rclcpp::spin, this change allows the executor to manage the termination of the spin thread by calling executor->cancel in the destructor. This pattern seems to work well when needing to cleanly join a thread spinning a node.

@bpwilcox bpwilcox changed the title Use executor to spin and stop node in thread Fix high CPU - Use executor to spin and stop node in tf_listener thread Jun 25, 2019
@rotu
Copy link
Contributor

rotu commented Jul 10, 2019

Would love to get this moving. TF_ROS is making robot_localization se_node peg one CPU core at 100% versus 30% with this PR.

@clalancette @cottsay

@mjeronimo
Copy link
Contributor

@cottsay Would appreciate any attention you can provide here. This is also impacting the nav2 stack.

@@ -54,19 +54,24 @@ TransformListener::TransformListener(tf2::BufferCore & buffer, bool spin_thread)

TransformListener::~TransformListener()
{
stop_thread_ = true;
if (spin_thread_) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this boolean actually necessary? Could we just use the unique_ptr's value and eliminate the extra value?

Suggested change
if (spin_thread_) {
if (executor_) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Pushed a commit to remove the extra boolean.

bpwilcox added 2 commits July 17, 2019 11:29
initialize executor

set spin_thread to true in initThread

Signed-off-by: bpwilcox <bpwilcox@eng.ucsd.edu>
@bpwilcox bpwilcox force-pushed the tf_listener_executor_thread branch from aabbd6d to 249fe87 Compare July 17, 2019 18:39
Copy link
Member

@cottsay cottsay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change looks great to me, but it does break ABI, which makes me hesitant to nominate it for backport into Dashing. If you're OK with this landing in Eloquent, we can merge as-is, or we can iterate to find an ABI-compatible solution.

@cottsay cottsay added the bug Something isn't working label Jul 18, 2019
@bpwilcox
Copy link
Contributor Author

I believe we'd like this for a Dashing backport. This change is internal to the transform listener interface, can you elaborate on how this breaks?

@sloretz
Copy link
Contributor

sloretz commented Jul 18, 2019

This change is internal to the transform listener interface, can you elaborate on how this breaks?

@bpwilcox It looks like this patch could change the size of TransformListener if std::atomic<bool> and std::unique_ptr<...> are different sizes or have different alignment requirements.

@cottsay
Copy link
Member

cottsay commented Jul 18, 2019

What if we keep the executor instance scoped within the thread, restore the bool, and use spin_some(std::chrono::nanoseconds) on the executor with some reasonable timeout before checking the bool for termination?

https://github.com/ros2/rclcpp/blob/8c48dbede7de2bc7e5199f340b6bfd33ffd6a784/rclcpp/include/rclcpp/executor.hpp#L192-L205

@bpwilcox
Copy link
Contributor Author

What if we keep the executor instance scoped within the thread, restore the bool, and use spin_some(std::chrono::nanoseconds) on the executor with some reasonable timeout before checking the bool for termination?

This could be an interim solution (though I don't like throttling the spin with an arbitrary timeout), but I'd want this PR as-is (or perhaps using the alternative ros2/rclcpp#764 if merged) to go into the next release since it is more efficient.

I can try to test your suggestion locally and push a new PR for the Dashing backport. Should that also go to the ros2 branch or another branch?

@cottsay
Copy link
Member

cottsay commented Jul 19, 2019

I opened #131. It looks like the duration values passed to spin_some and spin_once are different, and only the spin_once value acts as we would need it to, so I used that. So my earlier reference to spin_some are not correct.

@@ -116,7 +116,7 @@ class TransformListener
rclcpp::Subscription<tf2_msgs::msg::TFMessage>::SharedPtr message_subscription_tf_;
rclcpp::Subscription<tf2_msgs::msg::TFMessage>::SharedPtr message_subscription_tf_static_;
tf2::BufferCore & buffer_;
std::atomic<bool> stop_thread_{false};
std::unique_ptr<rclcpp::executors::SingleThreadedExecutor> executor_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at #133, I actually think that we can get rid of both the stop_thread_ and executor_ member variables, and just use a local executor in initThread.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to use the local executor, should we go with the approach in #133 to be pushed to ros2, just removing the stop_thread?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More-or-less. For this PR targeting master, we don't care about ABI breaks, so we can remove both the stop_thread_ and executor_ member variables, and just use a local executor. For #133 targeting dashing, we'll still use the local executor, but we'll leave the stop_thread_ and executor_ member variables. They won't be used, but they'll maintain ABI. Make sense?

tf2_ros/src/transform_listener.cpp Outdated Show resolved Hide resolved
remove executor member variable
@bpwilcox bpwilcox force-pushed the tf_listener_executor_thread branch from 4160862 to 74ee27b Compare July 24, 2019 18:22
@bpwilcox
Copy link
Contributor Author

@clalancette I pushed the change to match #133 capturing the local executor so we can get this moving.

Also, if we believe this pattern to solve a common hangup, I'd suggest allowing for an executor to be passed into the rclcpp::spin function as I introduced in ros2/rclcpp#764

@cottsay cottsay merged commit fccfd98 into ros2:ros2 Jul 31, 2019
cottsay pushed a commit that referenced this pull request Jul 31, 2019
Reduce high CPU usage in TransformListener, and do it in a way that 
preserves ABI compatibility.
This is a backport of #119.

1. Use the executor `spin` method, so the executor can idle when it has 
   no work.
2. Capture the shared executor so we can (synchronously) call `cancel`
   in the `unique_ptr<thread>` deleter instead of polling `stop_thread_`
   in the thread run.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants