Fix segfault at reconfigure of AdmittanceController #1248
The changes look ok, but I'm not sure if this is the right way to fix the seg fault issue you are having
What would you suggest as an alternative?
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1248      +/-   ##
==========================================
- Coverage   80.43%   80.41%   -0.02%
==========================================
  Files         105      105
  Lines        9353     9355       +2
  Branches      818      819       +1
==========================================
  Hits         7523     7523
- Misses       1556     1557       +1
- Partials      274      275       +1
We need to investigate further, because your fix is just masking the real issue. What do you say?
My guess as to what happens there: we basically delete and recreate the instance of the kinematics plugin. It might be possible that at the same time an access happens through the reference we hold into the dynamic library, but it is already deleted. That it does not happen every time is a strong indicator that multiple threads are probably accessing the instance of the kinematics plugin at the same time. I think not reloading the plugin if it is already loaded is just fine.
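The "do not reload if already loaded" idea can be sketched as follows. This is only an illustration, not the controller's actual code: `FakeKinematics` and `FakeController` are hypothetical stand-ins for `kinematics_interface::KinematicsInterface` and the admittance controller, so the example compiles without ROS or pluginlib.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for kinematics_interface::KinematicsInterface.
struct FakeKinematics
{
  int initialize_calls = 0;

  // Mimics a (re)initialization step; fails only for an empty tip name.
  bool initialize(const std::string & tip)
  {
    ++initialize_calls;
    return !tip.empty();
  }
};

// Hypothetical stand-in for the part of the controller touched on (re)configure.
struct FakeController
{
  std::unique_ptr<FakeKinematics> kinematics_;
  int instances_created = 0;

  // Create the plugin instance only on the first call;
  // every later call just re-runs initialize() on the existing instance.
  bool configure(const std::string & tip)
  {
    if (!kinematics_)
    {
      kinematics_ = std::make_unique<FakeKinematics>();
      ++instances_created;
    }
    return kinematics_->initialize(tip);
  }
};
```

Calling `configure("tool0")` twice on one `FakeController` creates a single instance and initializes it twice, so nothing is destroyed during reconfigure. The tip name `"tool0"` is just an example value.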
If the controller is not active, the update loop is not run at all; in that case the only thread calling the configure method is the non-RT one. That's why I'm very curious to know why this happens only on reconfigure. Can you try resetting the kinematics_ pointer with
I guess you were right! I just changed the code to:

    // Destroy the existing plugin instance before its class loader is replaced.
    if (kinematics_loader_)
    {
      kinematics_.reset();
    }
    kinematics_loader_ =
      std::make_shared<pluginlib::ClassLoader<kinematics_interface::KinematicsInterface>>(
        parameters_.kinematics.plugin_package, "kinematics_interface::KinematicsInterface");
    kinematics_ = std::unique_ptr<kinematics_interface::KinematicsInterface>(
      kinematics_loader_->createUnmanagedInstance(parameters_.kinematics.plugin_name));
    if (!kinematics_->initialize(
          node->get_node_parameters_interface(), parameters_.kinematics.tip))
    {
      return controller_interface::return_type::ERROR;
    }

and no segfault. But it is nevertheless very interesting that this is causing an issue.

Edit: I think we can still keep the original fix I made, as there is no real reason to actually reload the plugin rather than just reinitializing it.
@saikishor So what do you think? Shall we keep the fix as it currently is, or do you suggest a different approach?
@firesurfer have you taken a look at the pluginlib documentation? I recommend doing so and making a proper fix. As we are talking about reconfiguring, it could be that the user wants to change the plugin type, and that would only be possible at reconfigure time; otherwise they have to unload, then load and configure the controller again.
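One way to reconcile this concern with the wish to avoid needless reloads might look like the sketch below: reload only when the requested plugin name differs from the loaded one. This is not what the PR implements; `FakePlugin`, `FakeLoader` and `ReconfigurableController` are hypothetical stand-ins, and real pluginlib calls are deliberately left out.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for a loaded plugin instance.
struct FakePlugin
{
  std::string name;
};

// Hypothetical stand-in for a class loader that creates plugin instances.
struct FakeLoader
{
  std::unique_ptr<FakePlugin> create(const std::string & name)
  {
    auto plugin = std::make_unique<FakePlugin>();
    plugin->name = name;
    return plugin;
  }
};

// Hypothetical controller that reloads its plugin only when the name changes.
struct ReconfigurableController
{
  std::shared_ptr<FakeLoader> loader_;
  std::unique_ptr<FakePlugin> plugin_;
  int loads = 0;

  void configure(const std::string & plugin_name)
  {
    // Same plugin requested again: keep the existing instance.
    if (plugin_ && plugin_->name == plugin_name)
    {
      return;
    }
    // Different plugin: destroy the old instance first, then replace the loader.
    plugin_.reset();
    loader_ = std::make_shared<FakeLoader>();
    plugin_ = loader_->create(plugin_name);
    ++loads;
  }
};
```

With this shape, configuring twice with the same name performs one load, while switching to another name performs a second load with the instance released before its loader.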
@saikishor I just pushed the adaptation I showed in my previous comment. I think this could be a proper solution as it makes sure that the
I did take a look at the ClassLoader header file, but as far as I can tell there is no information on how to properly re-create a ClassLoader or about the destruction order.
What I could imagine happens there is: when we destroy the class loader, the Linux dynamic library loader may unload/destroy the instance of the dynamic library (see https://linux.die.net/man/3/dlopen). If we then destroy the interface handle, it calls the destructor of the KinematicsInterface, which results in a segfault in case the library is already unloaded. I would guess that the timing between calling dlclose and the library actually being unloaded may not be deterministic. But as I said, this is just a guess.
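The destruction-order guess above can be illustrated with a small simulation. It assumes, as the comment does, that destroying the loader unloads the library; that behaviour is modelled with a plain flag here, not with real `dlopen`/`dlclose`. `FakeLoader`, `FakeInstance` and `reconfigure` are hypothetical names. In the real case, running a destructor from an unloaded library would be undefined behaviour, whereas this sketch only records the order of events.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Records events in the order they happen.
using EventLog = std::vector<std::string>;

// Hypothetical loader: marks its "library" as unloaded when destroyed.
struct FakeLoader
{
  EventLog & log;
  std::shared_ptr<bool> library_loaded = std::make_shared<bool>(true);

  explicit FakeLoader(EventLog & l) : log(l) {}
  ~FakeLoader()
  {
    *library_loaded = false;
    log.push_back("library unloaded");
  }
};

// Hypothetical plugin instance: its destructor "lives" in the library.
struct FakeInstance
{
  EventLog & log;
  std::shared_ptr<bool> library_loaded;

  FakeInstance(EventLog & l, std::shared_ptr<bool> loaded)
  : log(l), library_loaded(std::move(loaded)) {}
  ~FakeInstance()
  {
    log.push_back(
      *library_loaded ? "instance destroyed (library loaded)"
                      : "instance destroyed (library GONE)");
  }
};

// Simulates one reconfigure and returns the events recorded during it.
EventLog reconfigure(bool reset_instance_first)
{
  EventLog log;  // declared first so it outlives loader and instance
  auto loader = std::make_shared<FakeLoader>(log);
  auto instance = std::make_unique<FakeInstance>(log, loader->library_loaded);

  if (reset_instance_first)
  {
    instance.reset();  // destructor runs while the library is still loaded
  }
  loader = std::make_shared<FakeLoader>(log);  // old loader is destroyed here
  // Any old instance still alive is destroyed by this assignment.
  instance = std::make_unique<FakeInstance>(log, loader->library_loaded);

  EventLog during_reconfigure = log;  // snapshot before scope-exit teardown
  return during_reconfigure;
}
```

`reconfigure(true)` records the instance destructor before the unload, while `reconfigure(false)` records the unload first and the instance destructor afterwards, which is the ordering suspected of causing the segfault.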
I think this is an elegant solution to the problem you encountered
The failing CI checks are, as far as I can tell, due to an unrelated test.
LGTM (after the dlopen discussions).
@saikishor Is it fine for you to merge it like this? Does my explanation sound reasonable?
(cherry picked from commit 31f7fbe)
This PR fixes #1181.
I tested it on iron and rebased it on rolling afterwards.
This PR simply avoids recreating the kinematics plugin and just reinitializes it instead, which seems to work.