-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: command response processing issues #40
Conversation
Fix command response processing issues introduced by notification handling. This includes: * Random hangs due to writes to channels with no readers. * Potential race when reusing client response buffer. * Close of notify channel while handler was still running. * Eliminated race on IsConnected check in ExecCmd. Also: * Use just \n for keep-alive to reduce data sent across the wire. * Wrap more errors to improve error reporting. * Remove io.(Reader|Writer) wrapping as that breaks scanner io.EOF handling.
Increase the golangci-lint timeout on Windows as it constantly takes longer than Linux and Mac leading to timeout failures.
92360f2
to
9cb8b6d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First of all thank you for your commitment!
I actually had something similar in mind, but I couldn't really wrap my head around it.
All goroutines seem to end now.
Running the tests there are still random failures every few hundred tries, which I still think might have something to do with the mock server. So I wouldn't really bother with it in this pull request.
There is another thing I have noticed, which is probably not scope of this pull request:
There is no way to find out if the connection got closed without executing a command or using c.IsConnected()
. c.notify
is only closed inside of c.Close()
which has to be called manually.
This leads to indefinitely blocking when just listening to events, without sending anything.
From what I can tell this has been like this before as well. Exposing c.done
might be a good way for an easy fix.
Move notify close so its done as soon as the sender exits. Revert keep alive change as '\n' isn't enough to prevent connection timeout.
Document that notification channel will be closed.
Could you post the details of this, as I've not been able reproduce even with 10k runs here.
I've moved the close of |
I tried it again on a docker container running on my home docker server and even after one hour (~3.5k runs) there was not a single failure. Maybe what I observe is some weird Windwos/WSL issue on my end, where the mockserver is not reachable sometimes. In this instance for example it seems like the connection could not be established, as after connecting there immediately is an
|
Fix a comment typo for responseErrTimeout Co-authored-by: HalloTschuess <hallo.ich.f@gmail.com>
Remove wrap on Write, Read and Close methods in mock servers to ensure we don't trigger unexpected behaviour such as io.EOF check failures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I tried your branch and it seems to have fixed all the issues I had with connections being freezed. |
Can confirm this fixing all the errors. Also LGTM Any chance this will be merged? @stevenh or @HalloTschuess |
@danmrichards / @lwaddicor can we get this merged please |
Thanks for this, approved and merged. @stevenh do you feel a v1.1.1 or v1.2.0 is most appropriate here? |
Thanks Lewis, as its a backwards compatible bug fix https://semver.org/ says its a patch so v1.1.1 |
Fix command response processing issues introduced by notification handling.
This includes:
Also: