proxylib: Fix data races in unit tests #17141
The updated test checks need to be fixed.
This fixes a data race where we were accessing raw integers in unit tests from two different goroutines without using atomics.

Fixes: cilium#16315
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
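As a rough illustration of the atomics approach described in this commit message, the sketch below increments a counter from several goroutines and reads it back through `sync/atomic`. The `counters` type, its accessor, and `recordAcks` are hypothetical names chosen for this example; they are not taken from the proxylib test code.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// counters is a hypothetical stand-in for the ack/nack fields of a test suite.
type counters struct {
	acks  uint64 // accessed atomically
	nacks uint64 // accessed atomically
}

// Acks returns the current ack count using an atomic load.
func (c *counters) Acks() uint64 { return atomic.LoadUint64(&c.acks) }

// Nacks returns the current nack count using an atomic load.
func (c *counters) Nacks() uint64 { return atomic.LoadUint64(&c.nacks) }

// recordAcks simulates n callbacks, each running in its own goroutine,
// and returns once all of them have incremented the counter.
func recordAcks(c *counters, n int) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			atomic.AddUint64(&c.acks, 1)
		}()
	}
	wg.Wait()
}

func main() {
	c := &counters{}
	recordAcks(c, 100)
	fmt.Println(c.Acks(), c.Nacks())
}
```

Running a test built this way with `go test -race` should no longer report the counter accesses, since every read and write goes through `sync/atomic`.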
This introduces a mutex for the connections list, as there is a race where `Close()` starts iterating over the connections while the accept loop is trying to append a new connection to it.

Fixes: cilium#16371
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
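The race described in this commit message can be sketched as follows: one goroutine appends to a slice while another iterates over it, and a mutex serializes the two. All type and function names below (`server`, `conn`, `add`, `acceptAndClose`) are assumptions made for the example, not the actual proxylib test server.

```go
package main

import (
	"fmt"
	"sync"
)

// conn is a hypothetical connection handle.
type conn struct {
	id     int
	closed bool
}

// server guards its connection list with a mutex so that Close and the
// accept path never touch the slice at the same time.
type server struct {
	mu    sync.Mutex
	conns []*conn
}

// add is what an accept loop would call for each new connection.
func (s *server) add(c *conn) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.conns = append(s.conns, c)
}

// Close marks every tracked connection as closed and returns how many
// connections it closed.
func (s *server) Close() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	for _, c := range s.conns {
		c.closed = true
	}
	n := len(s.conns)
	s.conns = nil
	return n
}

// acceptAndClose adds n connections from separate goroutines, waits for
// them, and then closes the server.
func acceptAndClose(n int) int {
	s := &server{}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			s.add(&conn{id: id})
		}(i)
	}
	wg.Wait()
	return s.Close()
}

func main() {
	fmt.Println(acceptAndClose(20))
}
```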
00474c3 to cd4a4c2
This commit contains no functional change.

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
@@ -27,8 +28,8 @@ func Test(t *testing.T) {
 }

 type ClientSuite struct {
-	acks  int
-	nacks int
+	acks  uint64
I think you are having to make this an atomic variable just to synchronize the goroutine writes with the reads in the assert calls in the test case. Again, nyc, but it looks like the test case should have synchronized the calls via a channel instead of adding a sleep. The callback function in UpsertNetworkPolicy can signal on a channel that the test case blocks on. Anyway, I can revisit this in a separate PR.
I don't really understand the sleep duration specified in this block:
// Create version 1 with resource 0.
s.UpsertNetworkPolicy(c, xdsServer, resources[0])
time.Sleep(DialDelay * BackOffLimit) // <-- there is no back-off?
c.Assert(s.acks, Equals, 1)
c.Assert(s.nacks, Equals, 0)
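The channel-based alternative suggested in this comment might look roughly like the sketch below: the completion callback sends on a channel, and the test blocks on that channel with a timeout instead of sleeping for a fixed duration. Here `upsert` is a hypothetical stand-in for `UpsertNetworkPolicy`; its signature is an assumption made for this example only.

```go
package main

import (
	"fmt"
	"time"
)

// upsert is a hypothetical stand-in for UpsertNetworkPolicy. It applies the
// update asynchronously and reports the outcome through the callback.
func upsert(callback func(acked bool)) {
	go func() {
		callback(true) // pretend the policy was ACKed
	}()
}

// waitForAck issues an upsert and blocks until the callback fires or the
// timeout expires, so no fixed sleep is needed.
func waitForAck(timeout time.Duration) (bool, error) {
	result := make(chan bool, 1) // buffered so the callback never blocks
	upsert(func(acked bool) { result <- acked })
	select {
	case acked := <-result:
		return acked, nil
	case <-time.After(timeout):
		return false, fmt.Errorf("no response within %v", timeout)
	}
}

// ackWithinOneSecond wraps waitForAck with a fixed one-second timeout.
func ackWithinOneSecond() (bool, error) {
	return waitForAck(time.Second)
}

func main() {
	acked, err := ackWithinOneSecond()
	fmt.Println(acked, err)
}
```

Because the channel receive happens after the send, the test observes the callback's result without needing atomic counters for that particular read.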
Fully agreed, I have the same feeling here. But I haven't spent the time to understand it yet.
I'll leave this open for now (as there is no urgency to merge it). If I find time in the next few days to clean this up, I will do it in this PR; otherwise I'll label it ready-to-merge (assuming this doesn't need a Jenkins run, since it's unit tests only).
@@ -25,7 +24,7 @@ import (
 type AccessLogServer struct {
 	Path    string
 	Logs    chan cilium.EntryType
-	closing uint32 // non-zero if closing, accessed atomically
+	done    chan struct{}
Looks much better now. Thanks!
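For readers unfamiliar with the pattern in the hunk above, replacing an atomically accessed `closing` flag with a `done` channel usually works along these lines. This is only a sketch: `logServer`, `newLogServer`, and `isClosing` are hypothetical names, and the use of `sync.Once` to make `Close` idempotent is an assumption of this example rather than something shown in the diff.

```go
package main

import (
	"fmt"
	"sync"
)

// logServer is a hypothetical reduction of a server that signals shutdown
// through a channel instead of an atomic flag.
type logServer struct {
	done chan struct{} // closed when the server is shutting down
	once sync.Once     // assumption: guards against closing done twice
}

func newLogServer() *logServer {
	return &logServer{done: make(chan struct{})}
}

// Close signals shutdown. Closing a channel twice panics, so the close is
// wrapped in sync.Once to keep Close safe to call repeatedly.
func (s *logServer) Close() {
	s.once.Do(func() { close(s.done) })
}

// isClosing reports whether Close has been called, without blocking.
func (s *logServer) isClosing() bool {
	select {
	case <-s.done:
		return true
	default:
		return false
	}
}

func main() {
	s := newLogServer()
	fmt.Println(s.isClosing())
	s.Close()
	fmt.Println(s.isClosing())
}
```

One advantage of the channel over a flag is that goroutines can also block on `<-s.done` inside a `select`, which lets them wake up as soon as shutdown begins.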
Marking this ready to merge. I may make the discussed improvements in a follow-up PR (which then also doesn't have to be backported). Both tests are covered by Travis, so there is no need to run Jenkins.
This fixes two data races in the proxylib unit tests. I've verified this locally using:
Fixes: #16315
Fixes: #16371