-
Notifications
You must be signed in to change notification settings - Fork 6.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Option for safe k_thread_abort #21484
Comments
So... storage for k_thread structs (we don't have anything Zephyr called a "TCB") is managed by the user, not the kernel. After the return from a k_thread_abort(), the kernel has promised[1] not to have saved or cached or otherwise remembered the pointer to it, and not to use any of the memory for any purpose. Given this, our abort API has always been synchronous and hasn't needed to wait for other code to get out of the way[2]. Basically, the kernel doesn't manage storage for this object, and the object lifecycle is deterministically under the control of the app. Apps can do this for themselves, as I see it. What's an example of a code pattern that doesn't have a straightforward safe implementation in Zephyr but does elsewhere? [1] Though the k_thread_foreach_unlocked() patch in #20937 does complicate that by allowing the user callback to have an unlocked reference to a struct that might in theory be aborted. There it seemed like we all agreed it was resolvable with a "don't do that" note in the docs. [2] Actually there actually is a circumstance in SMP where we need to delay the abort of a thread running on the other CPU until it can handle an IPI and self-abort. But even here the k_thread_abort() implementation still looks synchronous to the caller -- it spins waiting on the other CPU. |
For example mbed has per thread mutex that is used when thread is destroyed. Moreover - the join mechanism is implemented - so other task may wait for thread termination before releasing the memory. |
This is how Zephyr works, thread state is captured entirely within the struct k_thread object and its associated stack object. There's no mechanism to deallocate anything when a thread exits. |
this would be solved with k_thread_join() |
This is actually how I plan on fixing this, the idle thread will have its priority raised temporarily. |
this has been addressed already. |
Is your enhancement proposal related to a problem? Please describe.
There is a discussion ongoing here: #20937 (comment).
The problem is that if we wish to support k_thread_abort, and allow the user to reuse the memory that was allocated for the TCB (Task Control Block), we may end reading data from a block that does not exist. It seems that task creation is quite safe, but destructive functions may need a protection.
One more time - it would not be an issue in a system where we abort a thread but do not destroy/deallocate the TCB. It might be a problem when user removes the TCB directly after thread abort.
Describe the solution you'd like
We should add some configuration like "THREAD_SAFE_ABORT" that would use mutex when we are about to remove a thread.
Also for interrupts, and that may access thread data, we should add additional variable with a thread handle that is going to be removed - then if anyone wishes to access thread data that is going to be deleted we may return error code (thread_resume/suspend, thread_name_set/get/copy, etc).
Describe alternatives you've considered
We may implement a part of how it is solved inside FreeRTOS, where real thread deletion may be moved into Idle thread (but only if thread tries to remove itself). But I have some doubts about this as it may lead the high priority task to wait until idle thread executes.
We may consider not to block thread_abort, but add mechanism similar than linux join - where we are waiting for the thread to finish. Then the user code may continue to delete any thread data only after he waits for the thread to really finish. Windows has similar mechanism, where we can wait for a thread as any other waitable object - it would be released when the thread is finished.
This two would then require the API extension. But in second hand - it would not add Mutex overhead when we are just aborting the thread without reusing nor deleting its data block.
Additional context
NC
The text was updated successfully, but these errors were encountered: