Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix shutdown ordering in DeviceController. #7430

Merged
merged 1 commit into from
Jun 8, 2021

Conversation

bzbarsky-apple
Copy link
Contributor

Before this fix we would tear down some things (importantly the secure
session manager) before starting shutdown of the network layer. This
would lead to a window of time during which we can still receive
messages while in a partially torn down state, which would leave to
crashes.

Moving network layer shutdown, and in particular platform manager
shutdown, to be first in the shutdown sequence ensures this can't
happen by shutting down the message processing thread before we tear
down any other state.

Fixes #7297

Problem

We can process messages while partially torn down.

Change overview

Ensure message processing is the first thing we stop doing.

Testing

I removed the various workarounds added in #7381, changed scripts/tests/test_suites.sh to run the tests many times, and tested with/without this change. Without this change, I typically crashed within just a few test runs (< 10). With this change I did >1000 runs with no crashes.

@woody-apple
Copy link
Contributor

bzbarsky-apple added a commit to bzbarsky-apple/connectedhomeip that referenced this pull request Jun 7, 2021
@woody-apple
Copy link
Contributor

@andy31415 ?

Before this fix we would tear down some things (importantly the secure
session manager) before starting shutdown of the network layer.  This
would lead to a window of time during which we can still receive
messages while in a partially torn down state, which would leave to
crashes.

Moving network layer shutdown, and in particular platform manager
shutdown, to be first in the shutdown sequence ensures this can't
happen by shutting down the message processing thread before we tear
down any other state.

Fixes project-chip#7297
@bzbarsky-apple bzbarsky-apple force-pushed the fix-controller-shutdown branch from 3f0b66e to 0d2df68 Compare June 8, 2021 18:17
@mspang mspang merged commit 2877e1e into project-chip:master Jun 8, 2021
@bzbarsky-apple bzbarsky-apple deleted the fix-controller-shutdown branch June 8, 2021 20:06
bzbarsky-apple added a commit to bzbarsky-apple/connectedhomeip that referenced this pull request Jun 8, 2021
@andy31415
Copy link
Contributor

Cherrypick tag based on request for Arun.

@andy31415
Copy link
Contributor

This seems to give me merge conflicts since #7429 also changed order of shutdown. Likely both should be cherrypicked.

andy31415 pushed a commit that referenced this pull request Jun 9, 2021
Before this fix we would tear down some things (importantly the secure
session manager) before starting shutdown of the network layer.  This
would lead to a window of time during which we can still receive
messages while in a partially torn down state, which would leave to
crashes.

Moving network layer shutdown, and in particular platform manager
shutdown, to be first in the shutdown sequence ensures this can't
happen by shutting down the message processing thread before we tear
down any other state.

Fixes #7297
bzbarsky-apple added a commit that referenced this pull request Jun 9, 2021
…7434)

This change depends on
#7428 and
#7430 which fix
the problems properly.

Fixes #7408
Fixes #7409
Fixes #7410
doublemis1 pushed a commit to doublemis1/connectedhomeip that referenced this pull request Jul 7, 2021
Before this fix we would tear down some things (importantly the secure
session manager) before starting shutdown of the network layer.  This
would lead to a window of time during which we can still receive
messages while in a partially torn down state, which would leave to
crashes.

Moving network layer shutdown, and in particular platform manager
shutdown, to be first in the shutdown sequence ensures this can't
happen by shutting down the message processing thread before we tear
down any other state.

Fixes project-chip#7297
nikita-s-wrk pushed a commit to nikita-s-wrk/connectedhomeip that referenced this pull request Sep 23, 2021
Before this fix we would tear down some things (importantly the secure
session manager) before starting shutdown of the network layer.  This
would lead to a window of time during which we can still receive
messages while in a partially torn down state, which would leave to
crashes.

Moving network layer shutdown, and in particular platform manager
shutdown, to be first in the shutdown sequence ensures this can't
happen by shutting down the message processing thread before we tear
down any other state.

Fixes project-chip#7297
nikita-s-wrk pushed a commit to nikita-s-wrk/connectedhomeip that referenced this pull request Sep 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Crashes in chip-tool client after PR 7060
8 participants