Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix VTOL example #403

Merged
merged 3 commits into from
May 22, 2018
Merged

Fix VTOL example #403

merged 3 commits into from
May 22, 2018

Conversation

julianoes
Copy link
Collaborator

Fixes #401.

This fixes a segfault that sometimes happened on exit because the
iterator of the _mavlink_handler_table was invalidated because plugins
were being destructed that removed their callbacks from the map.
If instantiated earlier, the action plugin can "prepare" and by the
time it's used it will have received the landed state.
This is nicer than just exiting immediately.
@@ -136,6 +137,16 @@ void MAVLinkSystem::process_mavlink_message(const mavlink_message_t &message)
it->callback(message);
_mavlink_handler_table_mutex.lock();
}

if (_iterator_invalidated) {
// Someone messed with the map while we were doing the callback.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can that happen? Don't we have the _mavlink_handler_table_mutex lock for preventing that?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that we have to release the lock during the callback. If we didn't release the lock during the callback, we could lock ourselves out if the code in the callback tries to register or unregister a message.

An alternative would be to use a recursive_lock, so we wouldn't lock ourselves out, however, we would still change the map and invalidate the iterator.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The proper solution is to have events in queues instead of callbacks, I think but we're far away from that 😞.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Well, that's fine for now :-).

Copy link
Collaborator

@hamishwillee hamishwillee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It works :-)

@hamishwillee
Copy link
Collaborator

@julianoes Is there a test of some kind that we don't have that we should have had to catch this?

@julianoes
Copy link
Collaborator Author

@julianoes Is there a test of some kind that we don't have that we should have had to catch this?

We should have integration tests in CI to catch this in the future. For now I try to eliminate one segfault or deadlock after the other when I run into it. This way I've been able to fix a couple in the past weeks.

@julianoes julianoes merged commit ae1ad25 into develop May 22, 2018
@julianoes julianoes deleted the fix-vtol-example branch May 22, 2018 14:23
@hamishwillee
Copy link
Collaborator

Thanks @julianoes
It would be good to add those tests. For every segfault you fix I think I could inject 3 :-)

rt-2pm2 pushed a commit to rt-2pm2/DronecodeSDK that referenced this pull request Nov 27, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants