-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Crashes in chip-tool client after PR 7060 #7297
Comments
IMHO, this case is quite tricky…… since in most apps we won't call shotdown and this won't be a problem (i.e. the Shutdown is never called by example apps since we terminates it without gracefully shutting down) One possible solution would be:
|
Just to be clear, on iOS we do in fact call Shutdown. In practice we end up having to call it a good bit, so we're interested in it working properly. ;) The right solution here seems to be to:
|
For things that land us inside |
Before this fix we would tear down some things (importantly the secure session manager) before starting shutdown of the network layer. This would lead to a window of time during which we can still receive messages while in a partially torn down state, which would leave to crashes. Moving network layer shutdown, and in particular platform manager shutdown, to be first in the shutdown sequence ensures this can't happen by shutting down the message processing thread before we tear down any other state. Fixes project-chip#7297
Before this fix we would tear down some things (importantly the secure session manager) before starting shutdown of the network layer. This would lead to a window of time during which we can still receive messages while in a partially torn down state, which would leave to crashes. Moving network layer shutdown, and in particular platform manager shutdown, to be first in the shutdown sequence ensures this can't happen by shutting down the message processing thread before we tear down any other state. Fixes project-chip#7297
Before this fix we would tear down some things (importantly the secure session manager) before starting shutdown of the network layer. This would lead to a window of time during which we can still receive messages while in a partially torn down state, which would leave to crashes. Moving network layer shutdown, and in particular platform manager shutdown, to be first in the shutdown sequence ensures this can't happen by shutting down the message processing thread before we tear down any other state. Fixes #7297
Before this fix we would tear down some things (importantly the secure session manager) before starting shutdown of the network layer. This would lead to a window of time during which we can still receive messages while in a partially torn down state, which would leave to crashes. Moving network layer shutdown, and in particular platform manager shutdown, to be first in the shutdown sequence ensures this can't happen by shutting down the message processing thread before we tear down any other state. Fixes #7297
Before this fix we would tear down some things (importantly the secure session manager) before starting shutdown of the network layer. This would lead to a window of time during which we can still receive messages while in a partially torn down state, which would leave to crashes. Moving network layer shutdown, and in particular platform manager shutdown, to be first in the shutdown sequence ensures this can't happen by shutting down the message processing thread before we tear down any other state. Fixes project-chip#7297
Before this fix we would tear down some things (importantly the secure session manager) before starting shutdown of the network layer. This would lead to a window of time during which we can still receive messages while in a partially torn down state, which would leave to crashes. Moving network layer shutdown, and in particular platform manager shutdown, to be first in the shutdown sequence ensures this can't happen by shutting down the message processing thread before we tear down any other state. Fixes project-chip#7297
Problem
C++ chip-tool intermittently crashes when sending commands, in a build with PR #7060 in it. Attached is a log showing the crash plus some debugging information, but the upshot is that the following things happen, in order:
DeviceController::Shutdown
is called on thread 1SecureSessionMgr
has a nullmTransportMgr
.log.txt
What's happening here is that we send a command and get a response, which we process on thread 2. That response ends up in
CommandSender::OnMessageReceived
. This processes the response, and, after the PR #7060 changes, calls the consumer's callback. The callback in this case is one of the functions defined inexamples/chip-tool/commands/common/Commands.h
, which callsSetCommandExitStatus
. That notifies the condvar that thread 1 (which sent the command) is waiting for.At this point the two threads are racing against each other. Thread 2 proceeds to finish processing the message and close the exchange context, which flushes out the CRMP/MRP ack. That's the second stack in the attached log, leading to the message send. At the same time, Thread 1 is running and eventually calling
Shutdown
. The question is who wins the race: do we get our message off before we are shut down or not? If we do, we're fine. If not, we crash.Proposed Solution
Not have a race here. At the very least, we should not be able to enter shutdown while we are still in the middle of message processing. This is generally related to #6841.
Before PR #7060 I don't think this was a problem because we always closed the exchange and flushed out the pending ack before notifying our consumer. But we don't really want to be forced to do that....
@erjiaqing @pan-apple @andy31415 @arunbharadwaj
The text was updated successfully, but these errors were encountered: