Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NETOBSERV-1247: using Meta informer for replicaSets #474

Merged
merged 3 commits into from
Nov 7, 2023

Conversation

msherif1234
Copy link
Contributor

@msherif1234 msherif1234 commented Aug 23, 2023

This is follow up to the work @jpinsonneau started in https://github.com/jpinsonneau/flowlogs-pipeline/tree/update_metainformer

  • switch to metadata informer for RS

@openshift-ci-robot
Copy link
Collaborator

openshift-ci-robot commented Aug 23, 2023

@msherif1234: This pull request references NETOBSERV-1247 which is a valid jira issue.

In response to this:

This is follow up to the work @jpinsonneau started in https://github.com/jpinsonneau/flowlogs-pipeline/tree/update_metainformer

  • switch to metadata informer for RS
  • update go-client to v0.28.0

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot
Copy link
Collaborator

openshift-ci-robot commented Aug 23, 2023

@msherif1234: This pull request references NETOBSERV-1247 which is a valid jira issue.

In response to this:

This is follow up to the work @jpinsonneau started in https://github.com/jpinsonneau/flowlogs-pipeline/tree/update_metainformer

  • switch to metadata informer for RS
  • update go-client to v0.28.0 and its dependencies

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@msherif1234
Copy link
Contributor Author

/ok-to-test

@openshift-ci openshift-ci bot added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Aug 23, 2023
@github-actions
Copy link

New image:
quay.io/netobserv/flowlogs-pipeline:27c0dcf

It will expire after two weeks.

To deploy this build, run from the operator repo, assuming the operator is running:

USER=netobserv VERSION=27c0dcf make set-flp-image

@codecov
Copy link

codecov bot commented Aug 23, 2023

Codecov Report

Attention: 26 lines in your changes are missing coverage. Please review.

Comparison is base (e3a97f1) 67.07% compared to head (aaf2505) 66.90%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #474      +/-   ##
==========================================
- Coverage   67.07%   66.90%   -0.17%     
==========================================
  Files          94       94              
  Lines        6724     6741      +17     
==========================================
  Hits         4510     4510              
- Misses       1948     1965      +17     
  Partials      266      266              
Flag Coverage Δ
unittests 66.90% <0.00%> (-0.17%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
pkg/pipeline/transform/kubernetes/kubernetes.go 17.94% <0.00%> (-1.20%) ⬇️

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@github-actions github-actions bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Aug 23, 2023
@msherif1234
Copy link
Contributor Author

/ok-to-test

@openshift-ci-robot
Copy link
Collaborator

openshift-ci-robot commented Aug 23, 2023

@msherif1234: This pull request references NETOBSERV-1247 which is a valid jira issue.

In response to this:

This is follow up to the work @jpinsonneau started in https://github.com/jpinsonneau/flowlogs-pipeline/tree/update_metainformer

  • switch to metadata informer for RS

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Aug 23, 2023
@github-actions
Copy link

New image:
quay.io/netobserv/flowlogs-pipeline:c6e2afd

It will expire after two weeks.

To deploy this build, run from the operator repo, assuming the operator is running:

USER=netobserv VERSION=c6e2afd make set-flp-image

@msherif1234
Copy link
Contributor Author

image

@jpinsonneau
Copy link
Collaborator

image

The initial issue was on a cluster having 15k+ replicasets. We should try to reproduce something around this number 😸

jpinsonneau
jpinsonneau previously approved these changes Aug 24, 2023
Copy link
Collaborator

@jpinsonneau jpinsonneau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks good, not tested yet
Do you see differences in the enriched flows ?

@msherif1234
Copy link
Contributor Author

msherif1234 commented Aug 24, 2023

  • using hey-ho tool /hey-ho.sh -n 50, which generate the following RS load
$ oc get rs -A | wc -l
343

metadata_informer

if we ignore few flp pods with PR showing much higher usage then I would say this is mostly the same , I am not able to increase RS to lagre number w/o increasing current default of 250 pods/node which prevented me from swapping FLP images to test before and after

@jpinsonneau
Copy link
Collaborator

if we ignore few flp pods with PR showing much higher usage then I would say this is mostly the same , I am not able to increase RS to lagre number w/o increasing current default of 250 pods/node which prevented me from swapping FLP images to test before and after

So do you think it's not worth it or should it be verified by QEs ?

@msherif1234
Copy link
Contributor Author

if we ignore few flp pods with PR showing much higher usage then I would say this is mostly the same , I am not able to increase RS to lagre number w/o increasing current default of 250 pods/node which prevented me from swapping FLP images to test before and after

So do you think it's not worth it or should it be verified by QEs ?

I think QE verified on scale run we might have better numbers there @memodi WDYT ?

@memodi
Copy link

memodi commented Aug 25, 2023

I guess the idea here to rollout the new revisions to existing deployments, we don't have any tool that does that today. Simply, increasing the cluster size won't help since FLP pods would scale equally.

@memodi
Copy link

memodi commented Aug 25, 2023

@msherif1234 @jpinsonneau - does all FLP tries to build similar cache? I am trying to think if we had say 120 node cluster with 100 deployments on each worker would be appropriate scenario? cc @nathan-weinberg

@jpinsonneau
Copy link
Collaborator

@msherif1234 @jpinsonneau - does all FLP tries to build similar cache?

Yes every FLP build the exact same cache. That's why the kube API is taking a burst on Netobserv startup
https://issues.redhat.com/browse/NETOBSERV-1248

I am trying to think if we had say 120 node cluster with 100 deployments on each worker would be appropriate scenario? cc @nathan-weinberg

So you would be around 12k pods ? More if you plan multiple replicas ?
That sounds good ! Compare FLPs overall CPU / Memory without that PR and with that PR.
A plus would also be checking kube api perfs.

@memodi
Copy link

memodi commented Sep 5, 2023

/rm-ok-test

@nathan-weinberg
Copy link

/ok-to-test

@jpinsonneau jpinsonneau added lgtm ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. and removed lgtm ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. labels Sep 5, 2023
@openshift-ci openshift-ci bot added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Nov 3, 2023
Copy link

github-actions bot commented Nov 3, 2023

New image:
quay.io/netobserv/flowlogs-pipeline:89ba22d

It will expire after two weeks.

To deploy this build, run from the operator repo, assuming the operator is running:

USER=netobserv VERSION=89ba22d make set-flp-image

@github-actions github-actions bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Nov 6, 2023
@jotak
Copy link
Member

jotak commented Nov 6, 2023

@msherif1234 hope you don"t mind :) I pushed a fix on this PR, FLP was CLBO due to a wrong conversion, the transform func now expecting partial metadata rather than replicaset

@jotak
Copy link
Member

jotak commented Nov 6, 2023

/lgtm
For test failing: this should fix it: #534

@openshift-ci openshift-ci bot added the lgtm label Nov 6, 2023
@jotak jotak added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Nov 6, 2023
Copy link

github-actions bot commented Nov 6, 2023

New image:
quay.io/netobserv/flowlogs-pipeline:217cdb2

It will expire after two weeks.

To deploy this build, run from the operator repo, assuming the operator is running:

USER=netobserv VERSION=217cdb2 make set-flp-image

Signed-off-by: msherif1234 <mmahmoud@redhat.com>
Signed-off-by: msherif1234 <mmahmoud@redhat.com>
@openshift-ci openshift-ci bot removed the lgtm label Nov 7, 2023
@github-actions github-actions bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Nov 7, 2023
(cherry picked from commit 0da2955e9738d2b5a53d716e7d1cbe5fddffc5f3)
@msherif1234
Copy link
Contributor Author

/ok-to-test

@openshift-ci openshift-ci bot added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Nov 7, 2023
Copy link

github-actions bot commented Nov 7, 2023

New image:
quay.io/netobserv/flowlogs-pipeline:e98f7ba

It will expire after two weeks.

To deploy this build, run from the operator repo, assuming the operator is running:

USER=netobserv VERSION=e98f7ba make set-flp-image

@msherif1234
Copy link
Contributor Author

/hold cancel

@openshift-ci openshift-ci bot added the lgtm label Nov 7, 2023
@nathan-weinberg
Copy link

/label qe-approved

@openshift-ci openshift-ci bot added the qe-approved QE has approved this pull request label Nov 7, 2023
@openshift-ci-robot
Copy link
Collaborator

openshift-ci-robot commented Nov 7, 2023

@msherif1234: This pull request references NETOBSERV-1247 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.15.0" version, but no target version was set.

In response to this:

This is follow up to the work @jpinsonneau started in https://github.com/jpinsonneau/flowlogs-pipeline/tree/update_metainformer

  • switch to metadata informer for RS

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@msherif1234
Copy link
Contributor Author

/approve

Copy link

openshift-ci bot commented Nov 7, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: msherif1234

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved label Nov 7, 2023
@openshift-merge-bot openshift-merge-bot bot merged commit 47c2095 into netobserv:main Nov 7, 2023
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved jira/valid-reference lgtm ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. qe-approved QE has approved this pull request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants