-
Notifications
You must be signed in to change notification settings - Fork 326
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix initial sync race #208
Conversation
bb115b6
to
8be4a97
Compare
660aaa6
to
29ec319
Compare
thks @lkysow looks good to me |
7c43b7e
to
27472e2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Left a recommendation for clarification in a comment and clearer timing in the test.
catalog/to-consul/syncer_test.go
Outdated
// Create a service directly in Consul. Since it was created directly we | ||
// expect it to be deleted but not until the initial sync is performed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Create a service directly in Consul. Since it was created directly we | |
// expect it to be deleted but not until the initial sync is performed. | |
// Create a service directly in Consul. Since it was created on the | |
// synthetic sync node and has the sync-associated tag, we expect | |
// it to be deleted but not until the initial sync is performed. |
A service needs the sync tag and to be registered on the sync node or the sync process would ignore it.
catalog/to-consul/syncer_test.go
Outdated
require.NoError(t, err) | ||
|
||
// We wait until the syncer has had the time to delete the service. | ||
time.Sleep(200 * time.Millisecond) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like we might run into edge cases having this value exactly equal the SyncPeriod
in the test ConsulSyncer. What about setting up the ConsulSyncer with an even smaller interval?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes! Great idea. I didn't like this sleep but a retry loop doesn't work because it might run through before syncFull has run :D
A race condition was occurring where we would delete services via the `watchReapableServices` function if `Sync` wasn't called fast enough by `resource.go`. This occurred because `watchReapableServices` makes a call to the catalog and then deletes everything with a k8s tag that isn't in its `s.services` map. If `Sync` isn't called, then this map is empty and so it tries to delete everything. I've solved this by blocking until `Sync` is first called.
Address: a.HTTPAddr, | ||
}) | ||
require.NoError(t, err) | ||
s, closer := testConsulSyncer(client) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion:
s, closer := testConsulSyncerWithConfig(client, func(s *ConsulSyncer) {
s.SyncPeriod = 5 * time.Millisecond
})
Since the testConsulSyncer
also Runs
the syncer, the sync period change may not get picked up if it's made afterwards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. PR coming.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A race condition was occurring where we would delete services via the
watchReapableServices
function ifSync
wasn't called fast enough byresource.go
. This occurred becausewatchReapableServices
makes acall to the catalog and then deletes everything with a k8s tag that
isn't in its
s.services
map. IfSync
isn't called, then this mapis empty and so it tries to delete everything.
I've solved this by blocking until
Sync
is first called.Closes #191