Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: refactor and improve K8s sync provider #443

Merged

Conversation

Kavindu-Dodan
Copy link
Contributor

This PR

fixes #434

This is a refactoring and internal improvement for K8s ISync provider.

Improvements include,

  • Reduce K8s API load by utilizing Informer cache
  • Yet, provide a fallback if cache miss occurs (Note - we rely on K8s Informer for cache refill and consistency)
  • Informer now only watches a specific namespace (compared to *) - this is a potential performance improvement and a security improvement
  • Reduced informer handlers with extracted common logics
  • Unit tests where possible

@codecov
Copy link

codecov bot commented Feb 27, 2023

Codecov Report

Merging #443 (4f551fc) into main (dc71e84) will decrease coverage by 3.91%.
The diff coverage is 33.94%.

❗ Current head 4f551fc differs from pull request most recent head 35e5e14. Consider uploading reports for the commit 35e5e14 to get more accurate results

@@            Coverage Diff             @@
##             main     #443      +/-   ##
==========================================
- Coverage   66.17%   62.27%   -3.91%     
==========================================
  Files          11       13       +2     
  Lines        1437     1617     +180     
==========================================
+ Hits          951     1007      +56     
- Misses        430      547     +117     
- Partials       56       63       +7     
Impacted Files Coverage Δ
pkg/sync/kubernetes/kubernetes_sync.go 36.36% <33.94%> (ø)
pkg/sync/kubernetes/inotify.go 100.00% <0.00%> (ø)
pkg/store/flags.go 86.57% <0.00%> (+5.42%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

Copy link
Member

@thisthat thisthat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left only minor suggestions to improve logging a bit. I see that parts of the code are not tested, in particular the one with K8s details. For that, I recommend having a look at the fake client (docs)

pkg/sync/kubernetes/kubernetes_sync.go Outdated Show resolved Hide resolved
pkg/sync/kubernetes/kubernetes_sync.go Outdated Show resolved Hide resolved
pkg/sync/kubernetes/kubernetes_sync.go Outdated Show resolved Hide resolved
@toddbaert
Copy link
Member

I tested this in OFO locally (latest version) and it works as expected.

@Kavindu-Dodan
Copy link
Contributor Author

Kavindu-Dodan commented Feb 28, 2023

I left only minor suggestions to improve logging a bit. I see that parts of the code are not tested, in particular the one with K8s details. For that, I recommend having a look at the fake client (docs)

Thank you. I addressed the log and renaming related comments. Regarding testing, K8s backed logic. Even with a fake client, we still need to mock the shared informer interface as we have a fallback. I am not sure how worth these mocks are against real e2e tests which easily test most of the logic.

@toddbaert
Copy link
Member

I agree with @thisthat that we probably need a bit more coverage here (I know that there was no tests here before, so that's not your fault). I think now is the time. If there's any way to mock the k8s data I think we should do some basics here. It's worth another day or so of effort I think.

Maybe @thisthat has some specific recommendations.

Besides that everything here looks good to me and I think this is an improvement.

@thisthat
Copy link
Member

thisthat commented Mar 1, 2023

K8s provides great fake clients that can be used to unit-test your informer logic.
I think a good starting point to see an example is this test file in Keptn: https://github.com/keptn/lifecycle-toolkit/blob/91e57cadba32fce6d873bc480408f90bcb8964d9/metrics-operator/cmd/metrics/adapter/provider/provider_test.go

@Kavindu-Dodan
Copy link
Contributor Author

K8s provides great fake clients that can be used to unit-test your informer logic. I think a good starting point to see an example is this test file in Keptn: https://github.com/keptn/lifecycle-toolkit/blob/91e57cadba32fce6d873bc480408f90bcb8964d9/metrics-operator/cmd/metrics/adapter/provider/provider_test.go

Yes, I understand the motivation here. However, if you check the FakeInformer internals, then you see that this is incomplete ex:- [1]

The main blocker is not the K8s client but the behavior of the informer. We either need a test suite to recreate an informer mock and react on the eventing or spin up a real K8s backend based on envtest.Environment [2]

Anyway, I know testing is important and I am fully agreeing to the coverage requirements. However, this PR was specifically focusing on an improvement that is blocking (I assume) another set of tasks. So I do not know how much effort we should put into this PR and try to introduce tests that were never there before.

[1] - https://pkg.go.dev/sigs.k8s.io/controller-runtime/pkg/controller/controllertest#FakeInformer.GetStore
[2] - https://pkg.go.dev/sigs.k8s.io/controller-runtime@v0.14.4/pkg/controller/controllertest#pkg-overview

@toddbaert
Copy link
Member

@Kavindu-Dodan we can create a new issue for additional testing. I think at a minimum we should get the coverage diff here to 0% with some meaningful tests unless it's really impossible. This isn't a place where we can afford to lose coverage.

@Kavindu-Dodan
Copy link
Contributor Author

@Kavindu-Dodan we can create a new issue for additional testing. I think at a minimum we should get the coverage diff here to 0% with some meaningful tests unless it's really impossible. This isn't a place where we can afford to lose coverage.

It is not impossible. It's time to deliver. Soon there will be merged conflicts with upcoming PRs and given that we constantly change API contracts, the efforts can go wasted.

Besides, I introduced new tests and simply touched code with the refactoring, reducing complexity. Even with new tests, for some reason, we see a reduction in coverage. To be honest I do not know how this happened.

@Kavindu-Dodan
Copy link
Contributor Author

@Kavindu-Dodan we can create a new issue for additional testing. I think at a minimum we should get the coverage diff here to 0% with some meaningful tests unless it's really impossible. This isn't a place where we can afford to lose coverage.

It is not impossible. It's time to deliver. Soon there will be merged conflicts with upcoming PRs and given that we constantly change API contracts, the efforts can go wasted.

Besides, I introduced new tests and simply touched code with the refactoring, reducing complexity. Even with new tests, for some reason, we see a reduction in coverage. To be honest I do not know how this happened.

To prove my point, here is a PR with the change we need - #451

This will work and since I am not refactoring, code coverage won't complain.

@toddbaert
Copy link
Member

@Kavindu-Dodan we can create a new issue for additional testing. I think at a minimum we should get the coverage diff here to 0% with some meaningful tests unless it's really impossible. This isn't a place where we can afford to lose coverage.

It is not impossible. It's time to deliver. Soon there will be merged conflicts with upcoming PRs and given that we constantly change API contracts, the efforts can go wasted.
Besides, I introduced new tests and simply touched code with the refactoring, reducing complexity. Even with new tests, for some reason, we see a reduction in coverage. To be honest I do not know how this happened.

To prove my point, here is a PR with the change we need - #451

This will work and since I am not refactoring, code coverage won't complain.

Alright. I will create a new issue to get some coverage here.

@toddbaert toddbaert self-requested a review March 1, 2023 21:19
@Kavindu-Dodan Kavindu-Dodan force-pushed the feat/k8s-sync-improvement branch 2 times, most recently from 246821d to 2bea3aa Compare March 2, 2023 00:10
@Kavindu-Dodan
Copy link
Contributor Author

@thisthat @toddbaert I added more tests however the coverage is below the previous value due to the reason I explained [1]

Please have another round of review :)

[1] #443 (comment)

Signed-off-by: Kavindu Dodanduwa <kavindudodanduwa@gmail.com>
Signed-off-by: Kavindu Dodanduwa <kavindudodanduwa@gmail.com>
Signed-off-by: Kavindu Dodanduwa <kavindudodanduwa@gmail.com>
Signed-off-by: Kavindu Dodanduwa <kavindudodanduwa@gmail.com>
Signed-off-by: Kavindu Dodanduwa <kavindudodanduwa@gmail.com>
Signed-off-by: Kavindu Dodanduwa <kavindudodanduwa@gmail.com>
Signed-off-by: Kavindu Dodanduwa <kavindudodanduwa@gmail.com>
Signed-off-by: Kavindu Dodanduwa <kavindudodanduwa@gmail.com>
@Kavindu-Dodan Kavindu-Dodan merged commit 4c03bfc into open-feature:main Mar 2, 2023
@toddbaert
Copy link
Member

@Kavindu-Dodan @thisthat I created #460 for testing.

beeme1mr pushed a commit that referenced this pull request Mar 2, 2023
🤖 I have created a release *beep* *boop*
---


##
[0.4.0](v0.3.7...v0.4.0)
(2023-03-02)


### ⚠ BREAKING CHANGES

* Use OTel to export metrics (metric name changes)
([#419](#419))

### 🧹 Chore

* add additional sections to the release notes
([#449](#449))
([798f71a](798f71a))
* attach image sbom to release artefacts
([#407](#407))
([fb4ee50](fb4ee50))
* **deps:** update actions/configure-pages digest to fc89b04
([#417](#417))
([04014e7](04014e7))
* **deps:** update amannn/action-semantic-pull-request digest to b6bca70
([#441](#441))
([ce0ebe1](ce0ebe1))
* **deps:** update docker/login-action digest to ec9cdf0
([#437](#437))
([2650670](2650670))
* **deps:** update docker/metadata-action digest to 3343011
([#438](#438))
([e7ebf32](e7ebf32))
* **deps:** update github/codeql-action digest to 32dc499
([#439](#439))
([f91d91b](f91d91b))
* **deps:** update google-github-actions/release-please-action digest to
d3c71f9 ([#406](#406))
([6e1ffb2](6e1ffb2))
* disable caching tests in CI
([#442](#442))
([28a35f6](28a35f6))
* fix race condition on init read
([#409](#409))
([0c9eb23](0c9eb23))
* integration test stability
([#432](#432))
([5a6a5d5](5a6a5d5))
* integration tests
([#312](#312))
([6192ac8](6192ac8))
* reorder release note sections
([df7bfce](df7bfce))
* use -short flag in benchmark tests
([#431](#431))
([e68a6aa](e68a6aa))


### 🐛 Bug Fixes

* **deps:** update kubernetes packages to v0.26.2
([#450](#450))
([2885227](2885227))
* **deps:** update module github.com/bufbuild/connect-go to v1.5.2
([#416](#416))
([feb7f04](feb7f04))
* **deps:** update module
github.com/open-feature/go-sdk-contrib/providers/flagd to v0.1.9
([#427](#427))
([42d2705](42d2705))
* **deps:** update module github.com/open-feature/open-feature-operator
to v0.2.29 ([#429](#429))
([b7fae81](b7fae81))
* **deps:** update module github.com/stretchr/testify to v1.8.2
([#440](#440))
([ab3e674](ab3e674))
* **deps:** update module golang.org/x/net to v0.7.0
([#410](#410))
([c6133b6](c6133b6))
* **deps:** update module sigs.k8s.io/controller-runtime to v0.14.5
([#454](#454))
([f907f11](f907f11))
* remove non-error error log from parseFractionalEvaluationData
([#446](#446))
([34aca79](34aca79))


### ✨ New Features

* add debug logging for merge behaviour
([#456](#456))
([dc71e84](dc71e84))
* add Health and Readiness probes
([#418](#418))
([7f2358c](7f2358c))
* Add version to startup message
([#430](#430))
([8daf613](8daf613))
* introduce flag merge behaviour
([#414](#414))
([524f65e](524f65e))
* introduce grpc sync for flagd
([#297](#297))
([33413f2](33413f2))
* refactor and improve K8s sync provider
([#443](#443))
([4c03bfc](4c03bfc))
* Use OTel to export metrics (metric name changes)
([#419](#419))
([eb3982a](eb3982a))


### 📚 Documentation

* add .net flagd provider
([73d7840](73d7840))
* configuration merge docs
([#455](#455))
([6cb66b1](6cb66b1))
* documentation for creating a provider
([#413](#413))
([d0c099d](d0c099d))
* updated filepaths for schema store regex
([#344](#344))
([2d0e9d9](2d0e9d9))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@github-actions github-actions bot mentioned this pull request Dec 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Investigate K8s-go-client caching/load implications
6 participants