Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NETOBSERV-123: Added infra health dashboard #299

Merged
merged 4 commits into from
Mar 22, 2023

Conversation

OlivierCazade
Copy link
Contributor

@OlivierCazade OlivierCazade commented Mar 16, 2023

This PR add infra health dashboard.

I first try to generate the dashboard using FLP confgen but I faced multiple limitations:

  • Not possible to use rows definitions to have multiple foldable subsections
  • Not possible to have multiple query on the same graph. Here I wanted to have ingested flows and sent to loki flows in the same graph
  • Not possible to manage graph width so we can have mulitple graphs on the same row. Here for CPU and memory utilization.

You can check this branch which is working on how it was using FLP confgen.

If at some point we have to update this dashboard too often. We may want to implement this limitations in FLP confgen and use it.

@openshift-ci-robot
Copy link
Collaborator

openshift-ci-robot commented Mar 16, 2023

@OlivierCazade: This pull request references NETOBSERV-123 which is a valid jira issue.

In response to this:

This PR add infra health dashboard.

I first try to generate the dashboard using FLP confgen but I faced multiple limitations:

  • Not possible to use rows definitions to have multiple foldable subsections
  • Not possible to have multiple query on the same graph. Here I wanted to have ingested flows and sent to loki flows in the same graph
  • Not possible to manage graph width so we can have mulitple graphs on the same row. Here for CPU and memory utilization.
    You can check this branch which is working on how it was using FLP confgen.

If at some point we have to update this dashboard too often. We may want to implement this limitations in FLP confgen and use it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot
Copy link
Collaborator

openshift-ci-robot commented Mar 16, 2023

@OlivierCazade: This pull request references NETOBSERV-123 which is a valid jira issue.

In response to this:

This PR add infra health dashboard.

I first try to generate the dashboard using FLP confgen but I faced multiple limitations:

  • Not possible to use rows definitions to have multiple foldable subsections
  • Not possible to have multiple query on the same graph. Here I wanted to have ingested flows and sent to loki flows in the same graph
  • Not possible to manage graph width so we can have mulitple graphs on the same row. Here for CPU and memory utilization.

You can check this branch which is working on how it was using FLP confgen.

If at some point we have to update this dashboard too often. We may want to implement this limitations in FLP confgen and use it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@codecov
Copy link

codecov bot commented Mar 16, 2023

Codecov Report

Merging #299 (648babe) into main (44e1b37) will increase coverage by 0.58%.
The diff coverage is 66.33%.

@@            Coverage Diff             @@
##             main     #299      +/-   ##
==========================================
+ Coverage   49.66%   50.24%   +0.58%     
==========================================
  Files          43       43              
  Lines        5020     5069      +49     
==========================================
+ Hits         2493     2547      +54     
+ Misses       2324     2316       -8     
- Partials      203      206       +3     
Flag Coverage Δ
unittests 50.24% <66.33%> (+0.58%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
api/v1alpha1/flowcollector_types.go 100.00% <ø> (ø)
api/v1alpha1/flowcollector_webhook.go 0.00% <0.00%> (ø)
api/v1alpha1/zz_generated.conversion.go 0.27% <0.00%> (ø)
api/v1beta1/flowcollector_types.go 100.00% <ø> (ø)
api/v1beta1/zz_generated.deepcopy.go 42.33% <0.00%> (ø)
controllers/consoleplugin/consoleplugin_objects.go 92.45% <0.00%> (ø)
controllers/constants/constants.go 100.00% <ø> (ø)
controllers/flowcollector_controller.go 58.60% <40.00%> (-0.39%) ⬇️
pkg/helper/flowcollector.go 61.70% <50.00%> (ø)
controllers/flowlogspipeline/flp_reconciler.go 80.00% <78.57%> (+17.73%) ⬆️
... and 3 more

... and 3 files with indirect coverage changes

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@OlivierCazade
Copy link
Contributor Author

Here is a screenshot of the deployed dashboard:

image

@OlivierCazade
Copy link
Contributor Author

It looks like we have a regression because of some change in the monitoring operator.

The firing alert is not labelled with the namespace anymore and we were relying on this label to filter the alert for the banner.
I added an app label.

},
},
Data: map[string]string{
healthDashboardCMFile: string(healthDashboardEmbed),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to use string().
healthDashboardCMFile: healthDashboardEmbed,

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, thanks!

@memodi
Copy link
Contributor

memodi commented Mar 20, 2023

@OlivierCazade - tests are failing ? could you PTAL ? let me know once they're fixed.

]
},
"timezone":"browser",
"title":"Netobserv / Health",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be capital O : NetObserv

@jotak
Copy link
Member

jotak commented Mar 21, 2023

FYI the failing test is something I experienced also (and fixed) in this PR: #294 - if it's merged first, this one should pass I think

@jotak
Copy link
Member

jotak commented Mar 21, 2023

@jotak
Copy link
Member

jotak commented Mar 21, 2023

/lgtm

@openshift-ci openshift-ci bot removed the lgtm label Mar 21, 2023
@jotak
Copy link
Member

jotak commented Mar 22, 2023

/lgtm

@openshift-ci openshift-ci bot added the lgtm label Mar 22, 2023
@OlivierCazade
Copy link
Contributor Author

/approve

@openshift-ci
Copy link

openshift-ci bot commented Mar 22, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: OlivierCazade

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit 460d0b0 into netobserv:main Mar 22, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants