Incomplete closed subscriber initial_pose_sub_ in nav2_amcl may cause use-after-free bug #4166
Comments
Additional information
To make sure whether there's any possibility that a subscriber's ongoing work would completely shut down and exit after
Part 1. For experiments
To make sure if subscriber's callback function like
Code insert
I inserted some logging code as follows:
1> In function
void
AmclNode::handleInitialPose(geometry_msgs::msg::PoseWithCovarianceStamped & msg)
{
//log insert
RCLCPP_INFO(get_logger(),"[debug]:in handleInitialPose()");
std::this_thread::sleep_for(std::chrono::seconds(1));// sleep for 1s
RCLCPP_INFO(get_logger(),"[debug]:still in handlerInitialPose()");
//insert end
...
...
2> In function
nav2_util::CallbackReturn
AmclNode::on_cleanup(const rclcpp_lifecycle::State & /*state*/)
{
RCLCPP_INFO(get_logger(), "Cleaning up");
...
...
initial_pose_sub_.reset();
//log insert
RCLCPP_INFO(get_logger(),"[debug]:initial_pose_sub_.reset() has been done");
//insert end
...
Thus, if a log line "still in handlerInitialPose()" occurs after how I use
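The experiment above can be modelled without ROS. The following is a minimal sketch in plain C++, not rclcpp code: `FakeSubscription` and `run_reset_experiment` are hypothetical names, and the sketch assumes that the dispatching thread holds its own reference to the subscription while the callback runs. It shows that `reset()` on the owner's `shared_ptr` returns while the callback is still executing.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a subscription: it only owns a callback.
struct FakeSubscription
{
  std::function<void()> callback;
};

// Runs the experiment and returns the recorded order of events.
std::vector<std::string> run_reset_experiment()
{
  std::mutex log_mutex;
  std::vector<std::string> log;
  auto record = [&](const std::string & event) {
      std::lock_guard<std::mutex> lock(log_mutex);
      log.push_back(event);
    };

  std::atomic<bool> entered{false};
  std::atomic<bool> reset_returned{false};

  auto sub = std::make_shared<FakeSubscription>();
  sub->callback = [&]() {
      record("in callback");
      entered = true;
      // Stay inside the callback until the owner's reset() has returned.
      while (!reset_returned) {
        std::this_thread::yield();
      }
      record("still in callback");
    };

  // Assumption: the dispatching thread keeps its own reference while the
  // callback runs, so the owner's reset() cannot destroy the object yet.
  std::thread executor([local = sub]() {local->callback();});

  while (!entered) {
    std::this_thread::yield();
  }
  sub.reset();  // drops only the owner's reference; it does not block
  assert(sub == nullptr);
  record("reset done");
  reset_returned = true;

  executor.join();
  return log;
}
```

If `reset()` waited for the callback, this sketch would deadlock; instead it finishes with "reset done" recorded between the two callback messages, which matches the log order the experiment is looking for.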
I thought the subscription class's destructor would block until in-flight callbacks complete, but after closer inspection of the source code and your logs, it does not appear to. I think filing a ticket with rclcpp, noting that the subscriber's destructor does not wait until currently processing callbacks have exited, is a good idea. Separately, now that I'm thinking about this more, I am a little unclear how the subscription callback could possibly still be executing if we're executing on the
I've opened an issue with rclcpp for this: ros2/rclcpp#2447. By the way, I have tried to understand how the class
I took a look based on the ros2/rclcpp#2447 issue, and I don't think this is related to destructors. Rather, it looks to me that the Based on the ASAN trace there, the It appears that there is a second single threaded executor being spawned in the on_configure method here: https://github.com/ros-planning/navigation2/blob/47374622dee01a27e5f9b8ae08f3d19a15de9b3a/nav2_amcl/src/amcl_node.cpp#L249-L251
Yep, I believe so as well. So I think a solution would be to let the subscription callback thread join the on_cleanup thread after
Yes, there are several moving pieces here, but I think that once you have
That is my bad then! I missed that second executor. AMCL isn't something I've spent a lot of time looking at since the original port back in 2018. That's something I should have caught before looping you guys in, sorry about that. If we
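The idea discussed in the comments above, stopping the second executor's thread and joining it before anything it uses is freed, can be sketched in plain C++. This is an illustration under stated assumptions and not the nav2 implementation: `Component`, `Filter`, and `run_cleanup_demo` are hypothetical names, and a simple loop stands in for a spinning executor.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <thread>

// Hypothetical stand-in for the resource the callback touches.
struct Filter
{
  int updates = 0;
};

// Hypothetical component with its own worker thread, loosely modelled on a
// node that spins a dedicated executor.
class Component
{
public:
  ~Component() {cleanup();}

  void configure()
  {
    filter_ = std::make_unique<Filter>();
    running_ = true;
    worker_ = std::thread([this]() {
          while (running_) {
            ++filter_->updates;  // the "callback" uses the shared resource
            ++callbacks_run_;
            std::this_thread::yield();
          }
        });
  }

  void cleanup()
  {
    running_ = false;  // 1. ask the worker to stop
    if (worker_.joinable()) {
      worker_.join();  // 2. wait for any in-flight callback to finish
    }
    assert(!worker_.joinable());
    filter_.reset();  // 3. only now release what the callback used
  }

  int callbacks_run() const {return callbacks_run_;}
  bool filter_released() const {return filter_ == nullptr;}

private:
  std::atomic<bool> running_{false};
  std::atomic<int> callbacks_run_{0};
  std::unique_ptr<Filter> filter_;
  std::thread worker_;
};

// Returns the number of callbacks that ran before cleanup, or -1 on failure.
int run_cleanup_demo()
{
  Component c;
  c.configure();
  while (c.callbacks_run() == 0) {
    std::this_thread::yield();
  }
  c.cleanup();
  return c.filter_released() ? c.callbacks_run() : -1;
}
```

The ordering in `cleanup()` is the point of the sketch: because the join happens before `filter_.reset()`, no callback can be running when the resource is released.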
Bug report
Required Info:
Steps to reproduce issue
Launch navigation2 normally, with the following steps:
Curious about how navigation2 responds to interfering messages received during operation, I kept sending messages to the topic /initial_pose at intervals.
Finally, I sent Ctrl+C to shut down navigation2 before stopping the message sending.
An ASAN report file was found in my execution environment.
Expected behavior
Actual behavior
ASAN reported a use-after-free bug, as follows:
Additional information
We found a pointer named initial_pose_sub_, created by nav2_amcl, which is the only entry point to the function initialPoseReceived() (amcl_node.cpp#L1528-L1530).
Also, at amcl_node.cpp#L356-L357, the pointer pf_ is freed by pf_free(pf_).
Thus, if initial_pose_sub_ is still processing a received message when pf_free(pf_) has completed, the callback runs the following call chain: initialPoseReceived() -> handleInitialPose() -> pf_init(), and finally uses the freed pointer pf_ at amcl_node.cpp#L601.
However, initial_pose_sub_ has already been released before pf_free(pf_) is called (amcl_node.cpp#L329-357).
This is strange. Does it mean that releasing a pointer with **_sub_.reset() does not cancel and exit the subscriber's ongoing work?
A suggestion that may be helpful:
**_sub_.reset()
?