Fix racing in Open #171
Related to #170. There is an edge case where the background frame handler receives an error before `Open` completes. In this case, there is a data race to assign the value of the allocator. The destructor in shutdown is already a critical section. Adding a tiny critical section in openComplete to protect the allocator. Signed-off-by: Aitor Perez Cedres <acedres@vmware.com>
Related to #170. There is an edge case where the TCP connection between client and server is set up and the AMQP handshake proceeds up to the point right after the Tune frame is sent. At this point, the background frame reader receives an error from the TCP socket and starts the shutdown sequence. Meanwhile, the Tune function has not completed yet, so it continues and attempts to set the ChannelMax field in the connection struct while the shutdown sequence reads that same field. This creates a data race. A potential solution is to add a critical section in tune to protect access to the ChannelMax field. The destructor in the shutdown sequence is already a critical section, protected by the struct mutex. Signed-off-by: Aitor Perez Cedres <acedres@vmware.com>
Why do we bother re-initializing these two fields when the connection is shut down (and thus should never be used again)?
https://github.com/rabbitmq/amqp091-go/blob/issue-170/connection.go#L554-L555
I wish the past authors of a79bd14 had left better commit messages to explain why this was necessary 🥲 Earlier, the critical section was smaller and those fields were not modified, looking back at lines 370 to 408 in 425d3a7.
Summary
Description
#170 reported a data race in `Connection.Open`. The situation described is an edge case, where the connection reader (responsible for reading data sent from the server over the wire) initiates the shutdown sequence in `Connection.shutdown` before `Connection.Open` completes. This creates a data race on the `Connection.allocator` and `Connection.Config.ChannelMax` fields.

The shutdown function is already protected by the `Connection` struct mutex. The accesses made from the `Open` call stack are not protected. This PR adds two new critical sections, each as small as possible, protected by the struct mutex during `Open`.

It is quite challenging to add a reliable test for this edge case, even with a fake server, since we have to inject a failure just before the frame reader reads, but right after the Tune frame is sent. The original reporter validated the fix.
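For readers who want to see the shape of the fix, here is a minimal sketch of the locking pattern described above. It is not the actual patch: the `conn` type, its fields, and the `openVersusShutdown` helper are hypothetical stand-ins for `Connection`, and the slice is only a placeholder for the real allocator. It assumes shutdown reads the channel limit and resets the allocator while holding the mutex.

```go
package main

import (
	"errors"
	"sync"
)

// conn is a hypothetical stand-in for the library's Connection struct.
type conn struct {
	m          sync.Mutex
	channelMax int      // stands in for Config.ChannelMax
	allocator  []uint16 // placeholder for the real channel ID allocator
	closed     bool
	err        error
}

// tune mimics the step of Open that stores the negotiated channel limit.
// The write is wrapped in a critical section that is as small as possible.
func (c *conn) tune(negotiated int) {
	c.m.Lock()
	c.channelMax = negotiated
	c.m.Unlock()
}

// openComplete mimics the last step of Open, which assigns the allocator.
func (c *conn) openComplete() {
	c.m.Lock()
	defer c.m.Unlock()
	c.allocator = make([]uint16, 0, c.channelMax)
}

// shutdown mimics the teardown started by the frame reader on a socket
// error. Assumption: it reads channelMax and resets the allocator, all
// while holding the same mutex.
func (c *conn) shutdown(err error) {
	c.m.Lock()
	defer c.m.Unlock()
	c.closed = true
	c.err = err
	c.allocator = make([]uint16, 0, c.channelMax)
}

// openVersusShutdown runs the two competing paths concurrently and
// returns the connection once both have finished.
func openVersusShutdown(negotiated int) *conn {
	c := &conn{}
	var wg sync.WaitGroup
	wg.Add(2)
	go func() { // the Open call stack
		defer wg.Done()
		c.tune(negotiated)
		c.openComplete()
	}()
	go func() { // the background frame reader hitting a socket error
		defer wg.Done()
		c.shutdown(errors.New("read: connection reset by peer"))
	}()
	wg.Wait()
	return c
}
```

Because every access to the two shared fields goes through the same mutex, the race detector (`go run -race` or `go test -race`) should have nothing to report for this sketch, whichever goroutine wins. Removing the lock in `tune` or `openComplete` reintroduces the kind of unsynchronized access described in #170.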
Closes #170