-
Notifications
You must be signed in to change notification settings - Fork 124
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Whereabouts comprehensive go e2e test #181
Whereabouts comprehensive go e2e test #181
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a first pass at your PR.
8c89afa
to
7002dab
Compare
6143016
to
31892fa
Compare
31892fa
to
6d8071f
Compare
Pull Request Test Coverage Report for Build 2196380365
💛 - Coveralls |
72efbe1
to
1a4746b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love how the test reads.
I have some minor nits, but we're close :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Furthermore, once this test is OK, you should remove the original bash script.
bca53c5
to
24b4816
Compare
1b6074b
to
84f6b48
Compare
@nicklesimba I think you can remove the I'll give it one more round of reviews, but, from my perspective, all this PR is missing is removing everything related to the bash script (the script itself + the github action). Just to have some figures:
The golang based tests furthermore feature one additional test (takes ~= 15 seconds ...). Very good work. |
@maiqueb I'm going to do one last CI run before I remove Martin's scripts. I want to ensure that my test does proper cleanup (Martin's test will fail if it doesn't). Then I'll make a final commit (probably squashing everything to one commit). Thanks for all the help/code review along the way! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost there.
f33f238
to
db7f67b
Compare
dab19ac
to
a61bd94
Compare
e2e/e2e_test.go
Outdated
} | ||
|
||
if len(ipPool.Allocations()) != 0 { | ||
return fmt.Errorf("Expected 0 IP pool allocations, got %d instead", len(ipPool.Allocations())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm seeing this fail in the e2e tests; I think it would help somehow to know in which iteration did the test fail.
How about letting the caller indicate that ? Something like (in line 152):
Expect(checkZeroIPPoolAllocationsAndReplicas(clientInfo, k8sIPAM)).To(Succeed(), fmt.Sprintf("Stale IP allocations in iteration: %d", i))
EDIT:using yet another helper would make it more readable, something like:
Expect(
checkZeroIPPoolAllocationsAndReplicas(clientInfo, k8sIPAM)).To(
Succeed(),
expectNoStaleAllocationsErrorMsg(i))
func expectNoStaleAllocationsErrorMsg(iterationNumber int) string {
return fmt.Sprintf("Stale IP allocations in iteration: %d", i)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Miguel, for the record, I somehow missed a lot of these comments from 10 days ago. Apologies! I'm catching up on these, but for this comment I think (for now) the problem is solved. I created a WaitForZeroIPPoolAllocations() function that gives kubernetes adequate time to clean up. I'm going to leave this comment unresolved, because it's a great idea in the event the problem resurfaces.
f24ca20
to
c6f8d83
Compare
c6f8d83
to
75c0b75
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really the only beef I have with your code :)
Signed-off-by: nicklesimba <simha.nikhil@gmail.com>
75c0b75
to
49a69bb
Compare
Signed-off-by: nicklesimba <simha.nikhil@gmail.com>
e2e/e2e_test.go
Outdated
@@ -48,12 +57,22 @@ var _ = Describe("Whereabouts functionality", func() { | |||
Context("Test setup", func() { | |||
var ( | |||
clientInfo *ClientInfo | |||
envVars *envVars |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the variable name is shadowing the variable type - please rename the variable type to something else - testConfig
for instance.
e2e/e2e_test.go
Outdated
createTimeout = 10 * time.Second | ||
deleteTimeout = 2 * createTimeout | ||
wbLabelEqual = "tier=whereabouts-scale-test" | ||
testNamespace = "kube-system" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should use a different namespace for the tests: this is where the kubernetes infrastructure pods live.
I am OK with using the default
namespace.
A follow up PR should generate a new namespace for each test, to ensure no collisions on the tests.
e2e/e2e_test.go
Outdated
createTimeout = 10 * time.Second | ||
deleteTimeout = 2 * createTimeout | ||
wbLabelEqual = "tier=whereabouts-scale-test" | ||
testNamespace = "kube-system" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is why the stateful set pods stopped being ready: their pods are looking for the network-attachment-definition in the default namespace.
I recommend you use that namespace as well.
Signed-off-by: nicklesimba <simha.nikhil@gmail.com> e2e tests: remove manual ip pool reconciliation Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com> e2e tests: use the default namespace Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com> e2e tests: read the IPAllocations from the correct namespace Furthermore, harden the `isIPPoolAllocationsEmpty` function - currently, if the IPPool does not exist, it will be created without allocations. When that happens, the pool is indeed empty. Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com> Changed variable name for readability (envVars *envVars changed to testConfig *envVars) Signed-off-by: nicklesimba <simha.nikhil@gmail.com>
54d90bf
to
81769fa
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is good enough.
More improvements can happen in follow-up PRs.
Thanks for the patience, and good work.
Would you mind squashing the 3 commits into one ? |
* build: generate ip pool clientSet/informers/listers Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com> * vendor: update vendor stuff Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com> * build: vendor net-attach-def-client types Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com> * config: look for the whereabouts config file in multiple places The reconciler controller will have access to the whereabouts configuration via a mount point. As such, we need a way to specify its path. Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com> * reconcile-loop: requires the IP ranges in normalized format The IP reconcile loop also requires the IP ranges in a normalized format; as such, we export it into a function, which will be used in a follow-up commit. Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com> * config: allow IPAM config parsing from a NetConfList Currently whereabouts is only able to parse network configurations in the strict [0] format - i.e. **do not accept** a plugin list - [1]. The `ip-control-loop` must recover the full plugin configuration, which may be in the network configuration format. This commit allows whereabouts to now understand both formats. Furthermore, the current CNI release - v1.0.Z - removed the support for [0], meaning that only the configuration list format is now supported [2]. [0] - https://github.com/containernetworking/cni/blob/v0.8.1/SPEC.md#network-configuration [1] - https://github.com/containernetworking/cni/blob/v0.8.1/SPEC.md#network-configuration-lists [2] - https://github.com/containernetworking/cni/blob/master/SPEC.md#released-versions Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com> * reconcile-loop: add a controller Listen to pod deletion, and for every deleted pod, assure their IPs are gone. The rough algorithm goes like this: - for every network-status in the pod's annotations: - read associated net-attach-def from the k8s API - extract the range from the net-attach-def - find the corresponding IP pool - look for allocations belonging to the deleted pod - delete them using `IPManagement(..., types.Deallocate, ...)` All the API reads go through the informer cache, which is kept updated whenever the objects are updated on the API. The dockerfiles are also updated, to ship this new binary. Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com> * e2e tests: remove manual cluster reconciliation This would leave the `ip-control-loop` as the reconciliation tool. Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com> * unit tests: assure stale IPAllocation cleanup This commit adds a unit where it is checked that the pod deletion leads to the cleanup of a stale IP address. This commit features the automatic provisioning of the controller informer cache with the data present on the fake clientset tracker (the "fake" datastore). This way, users can just create the client with provisioned data, and that'll trickle down to the informer cache of the pod controller. Because the `network-attachment-definitions` resources feature dashes, the heuristic function that guesses - yes, guesses. very deterministic ... - the name of the resource can't be used - [0]. As such, it was needed to create an alternate `newFakeNetAttachDefClient` where it is possible to specify the correct resource name. [0] - https://github.com/k8snetworkplumbingwg/network-attachment-definition-client/blob/2fd7267afcc4d48dfe6a8cd756b5a08bd04c2c97/vendor/k8s.io/client-go/testing/fixture.go#L331 Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com> * unit tests: move helper funcs to other files The helper files are tagged with the `test` build tag, to prevent them from being shipped on the production code binary. Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com> * control loop, queueing: use a rate-limiting queue Using a queue allows us to re-queue errors. Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com> * control loop: add IPAllocation cleanup related events Adds two new events related to garbage collection of the whereabouts IP addresses: - when an IP address is garbage collected - when a cleanup operation fails and is not re-queued The former event looks like: ``` 116s Normal IPAddressGarbageCollected pod/macvlan1-worker1 \ successful cleanup of IP address [192.168.2.1] from network \ whereabouts-conf ``` The latter event looks like: ``` 10s Warning IPAddressGarbageCollectionFailed failed to garbage \ collect addresses for pod default/macvlan1-worker1 ``` Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com> * e2e tests: check out statefulset scenarios Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com> * e2e tests: test different scale up/down order and instance deltas Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com> * ci: test e2e bash scripts last These ugly tests do not cleanup after themselves; this way, the golang based tests (which **do** cleanup after themselves) will not be impacted by these left-overs. Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com> * ip control loop, unit tests: test negative scenarios Check the event thrown when a request is dropped from the queue, and assure reconciling an allocation is impossible without having access to the attachment configuration data. Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com> * e2e tests: test fix for issue k8snetworkplumbingwg#182 Issue [0] reports an error when a pod associated to a `StatefulSet` whose IPPool is already full is deleted. According to it, the new pod - scheduled by the `StatefulSet` - cannot run because the IPPool is already full, and the old pod's IP cannot be garbage collected because we match by pod reference - and the "new" pod is stuck in `creating` phase. [0] - k8snetworkplumbingwg#182 Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com> * ip-control-loop: strip pod before queueing it The ip reconcile loop only requires the pod metadata and its network status annotatations to garbage collect the stale IP addresses. As such, we remove the status and spec parameters from the pod before queueing it. Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com> * reconcile-loop: focus on networks w/ whereabouts IPAM type Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com> Replaced bash e2e test with Golang e2e test (k8snetworkplumbingwg#181) Fixed code errors causing e2e tests to not compile Signed-off-by: nicklesimba <simha.nikhil@gmail.com> e2e tests: remove manual ip pool reconciliation Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com> e2e tests: use the default namespace Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com> e2e tests: read the IPAllocations from the correct namespace Furthermore, harden the `isIPPoolAllocationsEmpty` function - currently, if the IPPool does not exist, it will be created without allocations. When that happens, the pool is indeed empty. Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com> Changed variable name for readability (envVars *envVars changed to testConfig *envVars) Signed-off-by: nicklesimba <simha.nikhil@gmail.com>
c18c4b0
to
81769fa
Compare
The commit history is extremely messy under the hood, which is more than likely my own fault. I'm going to close this PR shortly and open a new one with the changes added as a single commit - since they're approved here anyway. I'll reference and link this PR in the new one. This way, context is provided to anyone who needs it. |
Abandoned in lieu of #215 . |
This PR introduces a more complex test that is based off of Martin's bash test - but is written in Golang. Once merged, Martin's test should be removed and replaced by this. The goal here is that a Golang based e2e will become more easily maintainable and readable in the long term.