Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle HW errors on read and write in CM by stopping controllers #742

Merged
merged 2 commits into from
Sep 28, 2022

Conversation

destogl
Copy link
Member

@destogl destogl commented Jun 29, 2022

In short: when a hardware fails, i.e., error happens on read and write and hardware goes to "UNCONFIGURED" or "FINALIZED" state, controllers should be stopped automatically.
this addresses few recent issues.

  • Add code for deactivating controller when hardware gets an error on read and write.

  • Remove all interface from available list for hardware when an error happens.

  • To add in follow up PR: automatically restart Broadcasters (and trigger publishing before stopping)

@mergify
Copy link
Contributor

mergify bot commented Jun 30, 2022

This pull request is in conflict. Could you fix it @destogl?

@ejalaa12
Copy link
Contributor

ejalaa12 commented Jul 4, 2022

If a hardware stopped working you would stop the controller.
But in any case if that happens the controller manager won't be able to recover from it.
We might need a way to control the hardware interface lifecycle as well no? Otherwise, this hardware is not controllable anymore until we restart the whole controller manager.
Or did I miss something?

@destogl
Copy link
Member Author

destogl commented Jul 5, 2022

If a hardware stopped working you would stop the controller. But in any case if that happens the controller manager won't be able to recover from it. We might need a way to control the hardware interface lifecycle as well no? Otherwise, this hardware is not controllable anymore until we restart the whole controller manager. Or did I miss something?

I think (and hope) you missed something 😄

There is a service set_hardware_state where one can control hardware state from an external component, terminal. Would this be sufficient for now?

And yes, unfortunately, this is manual process for now. I hope to get some external components soon that would be a state-machines for ros2_control. Adding this into framework would it make even more cluttered and complex.

@destogl
Copy link
Member Author

destogl commented Jul 15, 2022

I'll split this into two PRs

@@ -1496,6 +1541,20 @@ void ControllerManager::list_hardware_components_srv_cb(
hwi.name = interface;
hwi.is_available = resource_manager_->command_interface_is_available(interface);
hwi.is_claimed = resource_manager_->command_interface_is_claimed(interface);
// TODO(destogl): Add here mapping to controller that has claimed or
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what shall we do with this?

@mergify
Copy link
Contributor

mergify bot commented Sep 24, 2022

This pull request is in conflict. Could you fix it @destogl?

…ead and write.

Fix misleading variable name in the tests.

Error handling in RM of read/write is tested.

Remove all interface from available list for hardware when an error happens.

Working verison of stopping controller on read error. Write error also works but the tests are missing.

Finalized tests for deactivating controllers when write failes.

Do some more variable renaming to the new nomenclature.
@bmagyar bmagyar merged commit ecf6c69 into ros-controls:master Sep 28, 2022
@destogl destogl deleted the handle-hardware-errors-in-cm branch September 28, 2022 18:40
@destogl
Copy link
Member Author

destogl commented Oct 15, 2022

@Mergifyio backport humble

mergify bot pushed a commit that referenced this pull request Oct 15, 2022
Add code for deactivating controller when hardware gets an error on read and write.
Fix misleading variable name in the tests.
Remove all interface from available list for hardware when an error happens.
Do some more variable renaming to the new nomenclature.

(cherry picked from commit ecf6c69)
@mergify
Copy link
Contributor

mergify bot commented Oct 15, 2022

backport humble

✅ Backports have been created

destogl added a commit that referenced this pull request Mar 13, 2023
Add code for deactivating controller when hardware gets an error on read and write.
Fix misleading variable name in the tests.
Remove all interface from available list for hardware when an error happens.
Do some more variable renaming to the new nomenclature.

(cherry picked from commit ecf6c69)
destogl added a commit that referenced this pull request Apr 11, 2023
Add code for deactivating controller when hardware gets an error on read and write.
Fix misleading variable name in the tests.
Remove all interface from available list for hardware when an error happens.
Do some more variable renaming to the new nomenclature.

(cherry picked from commit ecf6c69)
destogl added a commit that referenced this pull request Aug 9, 2023
Add code for deactivating controller when hardware gets an error on read and write.
Fix misleading variable name in the tests.
Remove all interface from available list for hardware when an error happens.
Do some more variable renaming to the new nomenclature.

(cherry picked from commit ecf6c69)
pac48 pushed a commit to pac48/ros2_control that referenced this pull request Jan 26, 2024
(cherry picked from commit 5178503f1044cc988663c196c32a0039c72418b0)

Co-authored-by: Christoph Fröhlich <christophfroehlich@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants