fix(websocket): wait for task on destroy (IDFGH-14533) #751
@gabsuren thanks for considering this one too
Nice, thanks for the fix!
So the real issue is actually using `client->run` to determine if the client is still running. And we do the same thing in `ws_client_stop()`, correct?
Maybe checking `status_bits` for `STOPPED_BIT` would solve the issue, too, and we could even reduce some code duplication and do:
    esp_err_t esp_websocket_client_stop(esp_websocket_client_handle_t client)
    {
        return check_and_stop_internal(client);
    }

    esp_err_t esp_websocket_client_destroy(esp_websocket_client_handle_t client)
    {
        esp_err_t err = check_and_stop_internal(client);
        // ignore ESP_FAIL for some "already stopped" case
        destroy_and_free_resources(client);
        return err;
    }
just an idea on possible improvement, but your current changes in this PR LGTM!
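For illustration, the shared helper suggested above could look roughly like the following host-side sketch. Everything here is hypothetical: `model_client_t`, `MODEL_STOPPED_BIT`, and the `model_*` functions are stand-ins invented for this example, a plain `unsigned` replaces the FreeRTOS event group, and the blocking wait is collapsed into a direct assignment. It is not the code in this PR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the real esp_websocket_client types. */
#define MODEL_STOPPED_BIT (1u << 0)
enum { MODEL_OK = 0, MODEL_FAIL = -1 };

typedef struct {
    unsigned status_bits; /* stands in for the FreeRTOS event group */
    bool run;             /* request flag read by the task loop */
} model_client_t;

/* Shared by stop and destroy. The decision is based on the status bit,
 * not on `run`, because the task may clear `run` on its own. */
static int model_check_and_stop_internal(model_client_t *c)
{
    if (c == NULL) {
        return MODEL_FAIL;
    }
    if (c->status_bits & MODEL_STOPPED_BIT) {
        return MODEL_FAIL; /* already stopped */
    }
    c->run = false;
    /* Real code would block here until the task sets the bit;
     * this single-threaded model sets it directly. */
    c->status_bits |= MODEL_STOPPED_BIT;
    return MODEL_OK;
}

static int model_stop(model_client_t *c)
{
    return model_check_and_stop_internal(c);
}

static int model_destroy(model_client_t *c)
{
    int err = model_check_and_stop_internal(c);
    /* The "already stopped" result does not prevent cleanup. */
    free(c);
    return err;
}
```

In this model, stopping twice reports `MODEL_FAIL` the second time, and destroy still frees the client in that case.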
Need to fix unit tests per: https://github.com/espressif/esp-protocols/actions/runs/13029806393/job/36348129127
This change broke the basic init-destroy use case (since `STOPPED_BIT` is not set upon initialization).
Also, destroy is called on any unsuccessful init, so it needs to take into account that the event group might not be initialized yet.
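The two edge cases raised here (destroy right after init, and destroy on a partially initialized client) can be sketched in a small host-side model. This is only one possible way to guard the wait, under stated assumptions: `sketch_client_t`, `task_started`, `SKETCH_STOPPED_BIT`, and the `sketch_*` functions are hypothetical names for this example, and the blocking wait on the event group is replaced by setting the bit directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define SKETCH_STOPPED_BIT (1u << 0) /* hypothetical stand-in */

typedef struct { unsigned bits; } sketch_event_group_t;

typedef struct {
    sketch_event_group_t *status; /* NULL if init failed before creating it */
    bool task_started;            /* set only once a task has been started */
} sketch_client_t;

typedef enum { SKETCH_NO_WAIT, SKETCH_WAITED } sketch_wait_result_t;

/* Models init; with_event_group == false models an early init failure. */
static sketch_client_t *sketch_new(bool with_event_group)
{
    sketch_client_t *c = calloc(1, sizeof *c);
    if (c != NULL && with_event_group) {
        c->status = calloc(1, sizeof *c->status); /* STOPPED bit not set */
    }
    return c;
}

static sketch_wait_result_t sketch_destroy(sketch_client_t *c)
{
    sketch_wait_result_t result = SKETCH_NO_WAIT;
    if (c == NULL) {
        return result;
    }
    /* Wait only if the event group exists AND a task was ever started. */
    if (c->status != NULL && c->task_started) {
        if (!(c->status->bits & SKETCH_STOPPED_BIT)) {
            /* Real code would block on the event group here. */
            c->status->bits |= SKETCH_STOPPED_BIT;
        }
        result = SKETCH_WAITED;
    }
    free(c->status); /* free(NULL) is a no-op */
    free(c);
    return result;
}
```

With this guard, init followed directly by destroy does not wait on a bit that was never set, and a failed init with no event group is cleaned up without touching it.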
Force-pushed from d252280 to 7fd85c6
Yes, checking for the [...] I followed your suggestion partially; the check is different for stop vs. destroy. In stop, it's okay to log a warning; in destroy, it shouldn't warn if it wasn't started. So they now share the logic to stop and wait, and both check for the [...]
Force-pushed from 7fd85c6 to bb87623
Force-pushed from bb87623 to 42674b4
Thanks for the updates!
Description
This ensures that the task is awaited on destroy.
This fixes a race condition between two independent actions: a call to `esp_websocket_client_destroy()`, and the task setting `run` to `false`, posting the `DISCONNECTED` event, breaking, then posting the `FINISHED` event and deleting the task.
Currently, `esp_websocket_client_destroy()` would immediately destroy all resources, because `run` is already `false`. However, the task may still be running, and it needs the resources to cleanly stop the task (e.g. post the `FINISHED` event).
This change ensures that the task has stopped. The task must be stopped before resources can be freed.
The `run` check in the destroyer is also racy, so the wait shouldn't happen in an `else`.
Testing
I tested this locally.
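The ordering problem from the description can also be illustrated with a deterministic, single-threaded model in which the task is advanced one step at a time. This is a sketch under assumptions, not the test that was run: `race_model_t`, `task_step`, `destroy_racy`, `destroy_fixed`, and `simulate` are hypothetical names, and "touching freed resources" is only counted, never actually performed.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { T_LOOP, T_POST_DISCONNECTED, T_POST_FINISHED, T_DONE } task_phase_t;

typedef struct {
    bool run;             /* flag the task clears on its error path */
    bool resources_valid; /* false once destroy has freed everything */
    bool stopped;         /* stands in for STOPPED_BIT */
    task_phase_t phase;
    int use_after_free;   /* times the task touched freed resources */
} race_model_t;

static void touch_resources(race_model_t *m)
{
    if (!m->resources_valid) {
        m->use_after_free++;
    }
}

/* One step of the task's shutdown path. */
static void task_step(race_model_t *m)
{
    switch (m->phase) {
    case T_LOOP:              /* error in the loop: clear run */
        m->run = false;
        m->phase = T_POST_DISCONNECTED;
        break;
    case T_POST_DISCONNECTED: /* posting an event needs the resources */
        touch_resources(m);
        m->phase = T_POST_FINISHED;
        break;
    case T_POST_FINISHED:
        touch_resources(m);
        m->stopped = true;    /* task is about to delete itself */
        m->phase = T_DONE;
        break;
    case T_DONE:
        break;
    }
}

/* Old behaviour: only stop and wait when run is still true. */
static void destroy_racy(race_model_t *m)
{
    if (m->run) {
        m->run = false;
        while (!m->stopped) {
            task_step(m);
        }
    }
    m->resources_valid = false;
}

/* New behaviour: always wait for the task, regardless of run. */
static void destroy_fixed(race_model_t *m)
{
    m->run = false;
    while (!m->stopped) {
        task_step(m);
    }
    m->resources_valid = false;
}

/* The task clears run, then destroy is called before the task finishes. */
static int simulate(bool fixed)
{
    race_model_t m = { .run = true, .resources_valid = true,
                       .stopped = false, .phase = T_LOOP, .use_after_free = 0 };
    task_step(&m);
    if (fixed) {
        destroy_fixed(&m);
    } else {
        destroy_racy(&m);
    }
    while (!m.stopped) {
        task_step(&m);
    }
    return m.use_after_free;
}
```

In this model, `simulate(false)` counts two accesses to freed resources (the `DISCONNECTED` and `FINISHED` posts), while `simulate(true)` counts none because the wait is unconditional.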