Option for safe k_thread_abort #21484

rakons · 2019-12-18T10:58:36Z

Is your enhancement proposal related to a problem? Please describe.
There is a discussion ongoing here: #20937 (comment).
The problem is that if we wish to support k_thread_abort and allow the user to reuse the memory that was allocated for the TCB (Task Control Block), we may end up reading data from a block that no longer exists. Task creation seems quite safe, but destructive functions may need protection.

One more time - it would not be an issue in a system where we abort a thread but do not destroy/deallocate the TCB. It might be a problem when the user removes the TCB directly after the thread abort.

Describe the solution you'd like
We should add a configuration option such as "THREAD_SAFE_ABORT" that would use a mutex when we are about to remove a thread.
Also, for interrupts and other code that may access thread data, we should add a variable holding the handle of the thread that is about to be removed. Then, if anyone tries to access the data of a thread that is being deleted, we can return an error code (thread_resume/suspend, thread_name_set/get/copy, etc.).
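A minimal sketch of the "handle of the thread being removed" idea, in plain C rather than against the Zephyr kernel. All names here (`demo_thread`, `pending_abort`, `demo_abort_begin`, `demo_abort_end`, `demo_thread_suspend`) are hypothetical and exist only for illustration; the choice of `-EBUSY` as the error code is also an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a thread object; not Zephyr's struct k_thread. */
struct demo_thread {
    int suspended;
};

/* Handle of the thread currently being removed, or NULL if there is none. */
static _Atomic(struct demo_thread *) pending_abort = NULL;

/* Mark a thread as "being removed" before its data is torn down. */
static void demo_abort_begin(struct demo_thread *t)
{
    atomic_store(&pending_abort, t);
}

/* Clear the marker once the teardown has finished. */
static void demo_abort_end(void)
{
    atomic_store(&pending_abort, NULL);
}

/* Accessor that refuses to touch a thread that is being removed. */
static int demo_thread_suspend(struct demo_thread *t)
{
    if (atomic_load(&pending_abort) == t) {
        return -EBUSY; /* assumed error code for "thread is going away" */
    }
    t->suspended = 1;
    return 0;
}
```

Note that the check and the access are two separate steps here, so this sketch alone does not close the race: a real implementation would presumably have to perform both under the same lock or with interrupts masked.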

Describe alternatives you've considered
We may implement a part of how it is solved inside FreeRTOS, where real thread deletion may be moved into Idle thread (but only if thread tries to remove itself). But I have some doubts about this as it may lead the high priority task to wait until idle thread executes.
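The deferred-deletion idea can be sketched roughly as follows. This is a single-threaded simulation in plain C, not FreeRTOS or Zephyr code; `demo_task`, `pending_termination`, `demo_task_delete_self` and `demo_idle_reap` are hypothetical names, and a real kernel would need to protect the list with a lock or by masking interrupts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical task object with an intrusive list link. */
struct demo_task {
    struct demo_task *next;
    int resources_released;
};

/* Tasks that deleted themselves and still wait for cleanup. */
static struct demo_task *pending_termination = NULL;

/* Called by a task that removes itself: it only queues the request,
 * because it cannot free the stack it is still running on. */
static void demo_task_delete_self(struct demo_task *self)
{
    self->next = pending_termination;
    pending_termination = self;
}

/* Called from the idle loop: performs the real cleanup for every
 * queued task and returns how many tasks were reaped. */
static int demo_idle_reap(void)
{
    int reaped = 0;

    while (pending_termination != NULL) {
        struct demo_task *t = pending_termination;

        pending_termination = t->next;
        t->next = NULL;
        t->resources_released = 1;
        reaped++;
    }
    return reaped;
}
```

The sketch also makes the stated concern visible: nothing is released until `demo_idle_reap()` runs, so anyone waiting for the cleanup depends on the idle loop getting CPU time.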

Alternatively, we could leave thread_abort non-blocking and add a mechanism similar to the Linux join, where the caller waits for the thread to finish. User code would then delete any thread data only after it has waited for the thread to really finish. Windows has a similar mechanism, where a thread can be waited on like any other waitable object; the wait is released when the thread finishes.
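For reference, the join pattern on Linux looks roughly like this with POSIX threads. `pthread_create` and `pthread_join` are the standard calls; `worker_ctx`, `worker_main` and `run_worker_and_release` are hypothetical names used only for this sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Per-thread data owned by the creator (hypothetical layout). */
struct worker_ctx {
    int input;
    int result;
};

/* Thread entry point: reads and writes the creator-owned context. */
static void *worker_main(void *arg)
{
    struct worker_ctx *ctx = arg;

    ctx->result = ctx->input * 2;
    return NULL;
}

/* Starts a worker, waits for it to finish, then releases its data.
 * Returns the worker's result, or -1 on failure. */
static int run_worker_and_release(int input)
{
    struct worker_ctx *ctx = malloc(sizeof(*ctx));
    pthread_t tid;
    int result;

    if (ctx == NULL) {
        return -1;
    }
    ctx->input = input;
    ctx->result = 0;

    if (pthread_create(&tid, NULL, worker_main, ctx) != 0) {
        free(ctx); /* no thread was started, so nobody else uses ctx */
        return -1;
    }

    /* Block until the worker has really finished. */
    if (pthread_join(tid, NULL) != 0) {
        return -1; /* ctx is deliberately not freed: the thread may still use it */
    }

    result = ctx->result;
    free(ctx); /* safe only because the join has completed */
    return result;
}
```

The point of the pattern is the ordering: the memory is released only after the join returns, so the worker can never touch a freed block.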

These two alternatives would require an API extension. On the other hand, they would not add mutex overhead when we are just aborting a thread without reusing or deleting its data block.

Additional context
NC

andyross · 2019-12-18T17:37:11Z

So... storage for k_thread structs (we don't have anything in Zephyr called a "TCB") is managed by the user, not the kernel. After the return from a k_thread_abort(), the kernel has promised[1] not to have saved or cached or otherwise remembered the pointer to it, and not to use any of the memory for any purpose. Given this, our abort API has always been synchronous and hasn't needed to wait for other code to get out of the way[2].

Basically, the kernel doesn't manage storage for this object, and the object lifecycle is deterministically under the control of the app. Apps can do this for themselves, as I see it. What's an example of a code pattern that doesn't have a straightforward safe implementation in Zephyr but does elsewhere?

[1] Though the k_thread_foreach_unlocked() patch in #20937 does complicate that by allowing the user callback to have an unlocked reference to a struct that might in theory be aborted. There it seemed like we all agreed it was resolvable with a "don't do that" note in the docs.

[2] Actually, there is a circumstance in SMP where we need to delay the abort of a thread running on the other CPU until it can handle an IPI and self-abort. But even here the k_thread_abort() implementation still looks synchronous to the caller -- it spins waiting on the other CPU.

rakons · 2019-12-19T10:45:03Z

For example, mbed has a per-thread mutex that is used when a thread is destroyed. Moreover, a join mechanism is implemented, so another task may wait for thread termination before releasing the memory.

carlescufi · 2019-12-19T10:51:48Z

@rakons and @andyross
Note that there is now an RFC for a new k_thread_join() API: #21500

andrewboie · 2019-12-19T20:44:28Z

> it would not be an issue in a system where we abort a thread but do not destroy/deallocate the TCB

This is how Zephyr works: thread state is captured entirely within the struct k_thread object and its associated stack object. There's no mechanism to deallocate anything when a thread exits.

nashif · 2020-02-13T23:46:13Z

this would be solved with k_thread_join()

andrewboie · 2020-08-13T17:59:28Z

Closely related issue: #23062 and also #26486

We need thread->fn_abort to run outside the context of the aborting thread.

andrewboie · 2020-08-13T18:17:19Z

> We may implement a part of how it is solved inside FreeRTOS, where real thread deletion may be moved into Idle thread (but only if thread tries to remove itself). But I have some doubts about this as it may lead the high priority task to wait until idle thread executes.

This is actually how I plan on fixing this: the idle thread will have its priority raised temporarily.

nashif · 2021-04-23T14:57:00Z

this has been addressed already.

rakons added the Enhancement Changes/Updates/Additions to existing features label Dec 18, 2019

rakons changed the title ~~k_thread_abort implementation is not safe~~ Option for safe k_thread_abort Dec 18, 2019

carlescufi assigned andyross Dec 18, 2019

carlescufi added the area: Kernel label Dec 18, 2019

nashif assigned andrewboie and unassigned andyross Feb 13, 2020

nashif unassigned andrewboie Apr 23, 2021

nashif closed this as completed Apr 23, 2021
