Investigate K8s-go-client caching/load implications #434

Closed
Kavindu-Dodan opened this issue Feb 24, 2023 · 4 comments · Fixed by #443
@Kavindu-Dodan
Contributor

Kavindu-Dodan commented Feb 24, 2023

Flagd is designed to watch K8s CRD changes and update flag configurations accordingly (e.g. updates, removals). This is performed by the K8s ISync implementation [1].

The focus of this task is to understand the impact of on-demand CRD fetching on the K8s API. This knowledge is a prerequisite for removing any caching between sync providers and flag evaluations/gRPC streams.

Research:

  • Does the K8s API cache its responses?
  • The impact of overloading the K8s API (e.g. throttling, performance, and stability)

Outcome:

Determine whether we require a cache or whether we can rely on the K8s API in high-load scenarios.

[1] - https://github.com/open-feature/flagd/blob/main/pkg/sync/kubernetes/kubernetes_sync.go
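For context, a minimal sketch (in Go, using client-go's dynamic informer) of what watching a CRD and reacting to changes looks like; the group/version/resource, namespace handling, and handler bodies are illustrative assumptions rather than flagd's actual implementation:

```go
import (
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/dynamic/dynamicinformer"
	"k8s.io/client-go/tools/cache"
)

// watchFlagCRD wires an informer that reacts to add/update/delete events for a
// CRD, which is the general mechanism a K8s sync implementation relies on.
func watchFlagCRD(dynClient dynamic.Interface, namespace string, stopCh <-chan struct{}) {
	// Assumed GVR for illustration only.
	gvr := schema.GroupVersionResource{
		Group:    "core.openfeature.dev",
		Version:  "v1alpha1",
		Resource: "featureflagconfigurations",
	}

	// A resync period of 0 disables periodic re-listing; events drive updates.
	factory := dynamicinformer.NewFilteredDynamicSharedInformerFactory(dynClient, 0, namespace, nil)
	informer := factory.ForResource(gvr).Informer()

	informer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc:    func(obj interface{}) { /* load/refresh flag configuration */ },
		UpdateFunc: func(oldObj, newObj interface{}) { /* refresh flag configuration */ },
		DeleteFunc: func(obj interface{}) { /* remove flag configuration */ },
	})

	factory.Start(stopCh)
	factory.WaitForCacheSync(stopCh)
}
```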

@Kavindu-Dodan Kavindu-Dodan converted this from a draft issue Feb 24, 2023
@Kavindu-Dodan Kavindu-Dodan self-assigned this Feb 24, 2023
@beeme1mr beeme1mr changed the title [research/OFO] Investigate K8s-go-client caching/load implications Investigate K8s-go-client caching/load implications Feb 24, 2023
@Kavindu-Dodan
Contributor Author

The K8s client currently used by the K8s ISync implementation is not backed by a cache [1]:

// New returns a new Client using the provided config and Options.
// The returned client reads and writes directly from the server
// (it doesn't use object caches). It understands how to work with
// normal types (both custom resources and aggregated/built-in resources),
// as well as unstructured types

  • Source [2]

Next step: consider a caching client, if one exists.

[1] - https://github.com/open-feature/flagd/blob/main/pkg/sync/kubernetes/kubernetes_sync.go#L155
[2] - https://github.com/kubernetes-sigs/controller-runtime/blob/main/pkg/client/client.go#L90-L99
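For illustration, a minimal sketch of constructing and using such a direct (non-caching) client with controller-runtime; the GVK and object key are illustrative assumptions:

```go
import (
	"context"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/client/config"
)

// getFlagConfig reads a custom resource with a non-caching client: every call
// goes straight to the API server, so frequent reads add API load.
func getFlagConfig(ctx context.Context, namespace, name string) (*unstructured.Unstructured, error) {
	cfg, err := config.GetConfig()
	if err != nil {
		return nil, err
	}

	// client.New returns a client that reads and writes directly from the
	// server; it does not use object caches.
	c, err := client.New(cfg, client.Options{})
	if err != nil {
		return nil, err
	}

	u := &unstructured.Unstructured{}
	// Assumed GVK for illustration only.
	u.SetGroupVersionKind(schema.GroupVersionKind{
		Group:   "core.openfeature.dev",
		Version: "v1alpha1",
		Kind:    "FeatureFlagConfiguration",
	})
	err = c.Get(ctx, client.ObjectKey{Namespace: namespace, Name: name}, u)
	return u, err
}
```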

@Kavindu-Dodan
Contributor Author

One option is to rely on the store backing the informer that we already use for eventing [1]. The downside is that we could be reading stale data.

[1] - https://github.com/open-feature/flagd/blob/main/pkg/sync/kubernetes/kubernetes_sync.go#L168
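A minimal sketch of that option, reading from the informer's local store instead of the API server (the informer handle and key format are assumptions for illustration, not flagd's actual code):

```go
import (
	"fmt"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/client-go/tools/cache"
)

// getFromStore reads the current object from the informer's backing store
// rather than issuing a GET against the API server.
func getFromStore(informer cache.SharedIndexInformer, namespace, name string) (*unstructured.Unstructured, error) {
	// Default store keys are "<namespace>/<name>" (MetaNamespaceKeyFunc).
	key := namespace + "/" + name

	item, exists, err := informer.GetStore().GetByKey(key)
	if err != nil {
		return nil, err
	}
	if !exists {
		// The store can lag behind the API server (or the object was deleted),
		// which is the staleness caveat mentioned above.
		return nil, fmt.Errorf("%s not found in informer store", key)
	}
	return item.(*unstructured.Unstructured), nil
}
```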

@Kavindu-Dodan
Contributor Author

Good reading resources

[1] https://kubernetes.io/docs/concepts/cluster-administration/flow-control/
[2] https://cloud.google.com/kubernetes-engine/quotas#limit_for_api_requests

Next steps: investigate improvements in our ISync implementation to reduce API impact and reuse resources.
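As a related, generic lever (not necessarily what was done here), client-go supports client-side rate limiting via rest.Config, which caps how hard a client can hit the API server; the QPS/Burst values below are illustrative, not tuned recommendations:

```go
import (
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/rest"
)

// newRateLimitedClient builds a dynamic client whose requests are throttled
// client-side, so bursts from a sync provider cannot overload the API server.
func newRateLimitedClient(cfg *rest.Config) (dynamic.Interface, error) {
	cfg.QPS = 5    // steady-state requests per second (illustrative)
	cfg.Burst = 10 // short-term burst allowance (illustrative)
	return dynamic.NewForConfig(cfg)
}
```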

@Kavindu-Dodan
Contributor Author

PR #443 improves the K8s Sync provider internals to utilize the informer store. However, we still fall back to an API get when a cache miss occurs.

This will allow us to avoid a caching layer in the flagd server.
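A hedged sketch of that pattern: read from the informer store first and fall back to a direct API get on a miss (the dynamic client, GVR, and function name are illustrative assumptions, not the actual PR code):

```go
import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/tools/cache"
)

// getWithFallback prefers the informer's local store and only hits the API
// server on a cache miss; the informer keeps the store consistent afterwards.
func getWithFallback(
	ctx context.Context,
	informer cache.SharedIndexInformer,
	dynClient dynamic.Interface,
	gvr schema.GroupVersionResource,
	namespace, name string,
) (*unstructured.Unstructured, error) {
	item, exists, err := informer.GetStore().GetByKey(namespace + "/" + name)
	if err == nil && exists {
		return item.(*unstructured.Unstructured), nil
	}

	// Cache miss (or store error): fall back to a direct API get.
	return dynClient.Resource(gvr).Namespace(namespace).Get(ctx, name, metav1.GetOptions{})
}
```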

Kavindu-Dodan added a commit that referenced this issue Mar 2, 2023
## This PR
fixes #434

This is a refactoring and internal improvement of the K8s ISync provider.

Improvements include:

- Reduce K8s API load by utilizing the informer cache
- Still provide a fallback if a cache miss occurs (note: we rely on the K8s informer for cache refill and consistency)
- The informer now watches only a specific namespace (compared to **\***), which is both a potential performance improvement and a security improvement
- Reduced informer handlers by extracting common logic
- Unit tests where possible

---------

Signed-off-by: Kavindu Dodanduwa <kavindudodanduwa@gmail.com>