Main thread spread exception when thread-mgr enabled #1889
@@ -1794,6 +1798,9 @@ wasm_runtime_call_wasm(WASMExecEnv *exec_env,
                                      result_argc, argv)) {
        wasm_runtime_set_exception(exec_env->module_inst,
                                   "the result conversion is failed");
#if WASM_ENABLE_THREAD_MGR != 0
        wasm_cluster_spread_exception(exec_env);
Should we also spread the exception when wasm_runtime_prepare_call_function fails (L1767)?
As we discussed, I've moved the spread into the wasm_set_exception function, so we don't need to handle it anywhere else.
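A minimal sketch of that idea, assuming a simplified model in which each thread owns an instance with its own exception buffer: the setter is the single place that writes the exception and copies it to every sibling, so call sites never spread it themselves. The `Demo*` types and functions are hypothetical stand-ins, not the runtime's actual structures or API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define DEMO_MAX_THREADS 4
#define DEMO_EXCEPTION_LEN 128

/* Hypothetical stand-ins for a per-thread instance and its cluster. */
typedef struct DemoInstance {
    char cur_exception[DEMO_EXCEPTION_LEN];
    struct DemoCluster *cluster; /* NULL when the instance is not clustered */
} DemoInstance;

typedef struct DemoCluster {
    DemoInstance *members[DEMO_MAX_THREADS];
    int member_count;
} DemoCluster;

/* Write the message to one instance, or clear it when msg is NULL. */
static void
demo_write_exception(DemoInstance *inst, const char *msg)
{
    if (msg)
        snprintf(inst->cur_exception, sizeof(inst->cur_exception), "%s", msg);
    else
        inst->cur_exception[0] = '\0';
}

/* Single entry point: update the caller, then every other member. */
void
demo_set_exception(DemoInstance *inst, const char *msg)
{
    int i;

    assert(inst != NULL);
    demo_write_exception(inst, msg);

    if (!inst->cluster)
        return;

    for (i = 0; i < inst->cluster->member_count; i++) {
        DemoInstance *other = inst->cluster->members[i];
        if (other != inst)
            demo_write_exception(other, msg);
    }
}
```

With this shape, a call such as `demo_set_exception(inst, "the result conversion is failed")` reaches every thread in the cluster, and passing `NULL` clears the exception everywhere. The sketch omits locking, which a real multi-threaded runtime would need around the member list.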
core/iwasm/aot/aot_runtime.c
Outdated
@@ -902,7 +902,7 @@ create_exports(AOTModuleInstance *module_inst, AOTModule *module,
static bool
clear_wasi_proc_exit_exception(AOTModuleInstance *module_inst)
{
#if WASM_ENABLE_LIBC_WASI != 0
#if (WASM_ENABLE_LIBC_WASI != 0) && (WASM_ENABLE_THREAD_MGR == 0)
Why skip clearing the exception when multi-threading is enabled?
Restored, thanks~
@@ -4021,7 +4021,7 @@ fast_jit_call_func_bytecode(WASMModuleInstance *module_inst,
static bool
clear_wasi_proc_exit_exception(WASMModuleInstance *module_inst)
{
#if WASM_ENABLE_LIBC_WASI != 0
#if (WASM_ENABLE_LIBC_WASI != 0) && (WASM_ENABLE_THREAD_MGR == 0)
Same as above: why skip the clear when multi-threading is enabled?
Thanks for the suggestions.
I think we can add a status field to record the status of a cluster. When we are spreading the exception:
Then, if any other thread is creating a new thread in the meantime, we fail it directly if the cluster's status is
@yamt @wenyongh what do you think of this solution?
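The status-field proposal might look roughly like the sketch below. It assumes a boolean flag guarded by the cluster lock; the field name `is_spreading`, the `demo_*` functions, and the step that resets the flag afterwards are all hypothetical, since the comment does not name the actual status value or say when it is reset.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cluster with a status flag guarded by its lock. */
typedef struct DemoCluster {
    pthread_mutex_t lock;
    bool is_spreading; /* assumed status: an exception spread is in progress */
    int thread_count;
} DemoCluster;

void
demo_cluster_init(DemoCluster *c)
{
    assert(c != NULL);
    pthread_mutex_init(&c->lock, NULL);
    c->is_spreading = false;
    c->thread_count = 0;
}

/* Mark the cluster so that no new thread can join during the spread. */
void
demo_begin_spread(DemoCluster *c)
{
    pthread_mutex_lock(&c->lock);
    c->is_spreading = true;
    pthread_mutex_unlock(&c->lock);
}

/* Assumption: the flag is reset once the spread has finished. */
void
demo_end_spread(DemoCluster *c)
{
    pthread_mutex_lock(&c->lock);
    c->is_spreading = false;
    pthread_mutex_unlock(&c->lock);
}

/* Register a new thread; refuse while a spread is in progress. */
bool
demo_register_thread(DemoCluster *c)
{
    bool ok;

    pthread_mutex_lock(&c->lock);
    ok = !c->is_spreading;
    if (ok)
        c->thread_count++;
    pthread_mutex_unlock(&c->lock);
    return ok;
}
```

Checking the flag and updating the count under the same lock is what closes the race: a thread created concurrently either registers before the spread starts, and so receives the exception, or is refused.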
Yes, and
I think it works.
Thank you.
It looks good to me.
    bh_assert(module_inst_comm->module_type == Wasm_Module_Bytecode
              || module_inst_comm->module_type == Wasm_Module_AoT);

    const char *exception = wasm_get_exception(module_inst);
It would be better to declare the variable at the beginning of the function; older compilers might otherwise report a warning.
Done, thanks
#if WASM_ENABLE_THREAD_MGR != 0
    wasm_cluster_spread_exception(
        wasm_clusters_search_exec_env((WASMModuleInstanceCommon *)module_inst),
        exception ? false : true);
When exception is NULL, does that mean the exception of the other threads is cleared?
Yes
When adding lock for I use this strategy to avoid this:
core/iwasm/aot/aot_runtime.c
Outdated
@@ -956,7 +941,8 @@ execute_start_function(AOTModuleInstance *module_inst)
    u.f(exec_env);

    wasm_exec_env_destroy(exec_env);
    (void)clear_wasi_proc_exit_exception(module_inst);
    (void)clear_wasi_proc_exit_exception(
        (WASMModuleInstanceCommon *)module_inst);
There seems to be no need to clear the WASI proc exit exception here. A WASI module doesn't set an internal start function index; instead, it exports a function named "_start". If WASI proc exit really is called here, we can simply let the instantiation fail. @lum1n0us what is your opinion?
Agreed, and updated.
@@ -1743,6 +1743,30 @@ wasm_runtime_finalize_call_function(WASMExecEnv *exec_env,
}
#endif

bool
This can be made static if the above is OK.
    traverse_list(&cluster->exec_env_list, terminate_thread_visitor, NULL);
Does traverse_list lock the list? Why not remove L845 and L849?
In terminate_thread_visitor, we call os_thread_join to wait for the other threads to exit, and an exiting thread needs to acquire cluster->lock to access the list, so we can't hold the lock during terminate_thread_visitor.
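The locking constraint described above can be illustrated with plain POSIX threads. This is a sketch under assumptions, not the runtime's code: `DemoCluster`, `demo_worker`, `demo_spawn`, and `demo_terminate_all` are hypothetical, and `pthread_join` stands in for `os_thread_join`. Each worker takes the cluster lock on its way out, so the joining thread must not hold that lock while it waits.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define DEMO_MAX_WORKERS 4

/* Hypothetical cluster: live_count is guarded by lock. */
typedef struct DemoCluster {
    pthread_mutex_t lock;
    pthread_t workers[DEMO_MAX_WORKERS];
    int live_count;
} DemoCluster;

/* On exit, every worker takes the cluster lock to deregister itself. */
static void *
demo_worker(void *arg)
{
    DemoCluster *c = (DemoCluster *)arg;

    pthread_mutex_lock(&c->lock);
    c->live_count--;
    pthread_mutex_unlock(&c->lock);
    return NULL;
}

/* Start up to n workers and return how many were actually started. */
int
demo_spawn(DemoCluster *c, int n)
{
    int i;

    assert(n >= 0 && n <= DEMO_MAX_WORKERS);
    pthread_mutex_init(&c->lock, NULL);
    c->live_count = 0;

    /* Hold the lock so that no worker deregisters before it is counted. */
    pthread_mutex_lock(&c->lock);
    for (i = 0; i < n; i++) {
        if (pthread_create(&c->workers[i], NULL, demo_worker, c) != 0)
            break;
        c->live_count++;
    }
    pthread_mutex_unlock(&c->lock);
    return i;
}

/* Join every worker WITHOUT holding c->lock: a worker needs that lock to
 * exit, so joining while holding it would deadlock. */
int
demo_terminate_all(DemoCluster *c, int started)
{
    int i, remaining;

    for (i = 0; i < started; i++)
        pthread_join(c->workers[i], NULL);

    pthread_mutex_lock(&c->lock);
    remaining = c->live_count;
    pthread_mutex_unlock(&c->lock);
    return remaining;
}
```

If `demo_terminate_all` took `c->lock` before the join loop, the first worker that had not yet deregistered would block forever in `demo_worker`, and the join would never return.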
Got it, thanks.
LGTM
…ance#1889) And refactor clear_wasi_proc_exit_exception, refer to bytecodealliance#1869