Expose HCO monitoring on plain k8s #2393

Merged
1 commit merged on Jun 28, 2023

@assafad
Contributor

assafad commented Jun 21, 2023

What this PR does / why we need it:
Currently, HCO creates its monitoring-related resources only on OpenShift clusters, so HCO metrics and alerts are created and exposed to Prometheus only on those clusters. On plain k8s clusters, HCO still implements its metrics, but they are not exposed to Prometheus, and alerts are not created or reconciled at all.
This PR also exposes metrics and alerts on plain k8s clusters by enabling the creation and reconciliation of the required monitoring-related resources (e.g. PrometheusRule, ServiceMonitor), as long as Prometheus is installed on the cluster.
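
The gating described above ("as long as Prometheus is installed on the cluster") can be illustrated with a minimal Go sketch. This is not the code from this PR: the function name `monitoringAvailable`, the variable `requiredMonitoringKinds`, and the idea of passing in a plain list of served kinds are all assumptions made for illustration. How the list of kinds is obtained from the cluster (for example through API discovery) is deliberately left out.

```go
package main

import "fmt"

// requiredMonitoringKinds lists the Prometheus Operator kinds named in the
// PR description. Gating on exactly these two kinds is an assumption of
// this sketch, not something taken from the PR's diff.
var requiredMonitoringKinds = []string{"PrometheusRule", "ServiceMonitor"}

// monitoringAvailable (hypothetical helper) reports whether every required
// kind appears in the list of kinds the cluster serves.
func monitoringAvailable(servedKinds []string) bool {
	served := make(map[string]struct{}, len(servedKinds))
	for _, kind := range servedKinds {
		served[kind] = struct{}{}
	}
	for _, kind := range requiredMonitoringKinds {
		if _, ok := served[kind]; !ok {
			return false
		}
	}
	return true
}

func main() {
	// A plain k8s cluster without the Prometheus Operator CRDs.
	plainK8s := []string{"Deployment", "Service"}
	// The same cluster after the Prometheus Operator CRDs are installed.
	withPrometheus := []string{"Deployment", "Service", "PrometheusRule", "ServiceMonitor"}

	fmt.Println(monitoringAvailable(plainK8s))
	fmt.Println(monitoringAvailable(withPrometheus))
}
```

Under this sketch, an operator would skip creating and reconciling monitoring resources when the check returns false, instead of branching on whether the cluster is OpenShift.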

Reviewer Checklist

Reviewers are supposed to review the PR for every aspect below one by one. To check an item means the PR is either "OK" or "Not Applicable" in terms of that item. All items are supposed to be checked before merging a PR.

  • PR Message
  • Commit Messages
  • How to test
  • Unit Tests
  • Functional Tests
  • User Documentation
  • Developer Documentation
  • Upgrade Scenario
  • Uninstallation Scenario
  • Backward Compatibility
  • Troubleshooting Friendly

Jira Ticket: https://issues.redhat.com/browse/CNV-26009

Release note:

Expose HCO monitoring on plain k8s

@kubevirt-bot added the release-note and dco-signoff: yes labels Jun 21, 2023
@assafad
Contributor Author

assafad commented Jun 21, 2023

/hold

@kubevirt-bot added the do-not-merge/hold and size/M labels Jun 21, 2023
@assafad
Contributor Author

assafad commented Jun 21, 2023

Tested with kubevirtci and kind clusters. Steps to view metrics and alerts using kubevirtci:

  • export KUBEVIRT_DEPLOY_PROMETHEUS=true
  • make cluster-up
  • make cluster-sync
  • to expose alerts to Prometheus, edit the Prometheus resource so that it selects all existing PrometheusRules:
    • ./cluster/kubectl.sh edit prometheus k8s -n monitoring
    • change
      ruleSelector:
        matchLabels:
          prometheus: k8s
          role: alert-rules
      to
      ruleSelector: {}
  • ./cluster/kubectl.sh port-forward service/prometheus-k8s 9090:9090 -n monitoring
  • access the Prometheus dashboard at http://localhost:9090

@assafad force-pushed the expose-monitoring branch 2 times, most recently from 8bd43e6 to 0e7a5f8, June 21, 2023 16:03
Signed-off-by: assafad <aadmi@redhat.com>
@sonarcloud

sonarcloud bot commented Jun 21, 2023

Kudos, SonarCloud Quality Gate passed!

  • Bugs: 0 (rating A)
  • Vulnerabilities: 0 (rating A)
  • Security Hotspots: 0 (rating A)
  • Code Smells: 0 (rating A)
  • Coverage: no coverage information
  • Duplication: 0.0%

@coveralls
Collaborator

Pull Request Test Coverage Report for Build 5337253780

  • 23 of 34 (67.65%) changed or added relevant lines in 4 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.06%) to 85.903%

Changes Missing Coverage                                   Covered Lines   Changed/Added Lines   %
controllers/alerts/rbac.go                                 8               9                     88.89%
pkg/util/cluster.go                                        14              17                    82.35%
controllers/hyperconverged/hyperconverged_controller.go    0               7                     0.0%

Totals (Coverage Status)
Change from base Build 5330400492:   -0.06%
Covered Lines:                       4942
Relevant Lines:                      5753

💛 - Coveralls

@assafad
Contributor Author

assafad commented Jun 25, 2023

Hi @nunnatsa, @machadovilaca, could you please review this PR?

@assafad
Contributor Author

assafad commented Jun 25, 2023

/retest

@nunnatsa
Collaborator

/retest

@nunnatsa
Collaborator

Great job!
/lgtm

@kubevirt-bot added the lgtm label Jun 26, 2023
@openshift-ci

openshift-ci bot commented Jun 26, 2023

@assafad: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name                        Commit    Details   Required   Rerun command
ci/prow/hco-e2e-kv-smoke-azure   5665934   link      true       /test hco-e2e-kv-smoke-azure

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@nunnatsa
Collaborator

/override coverage/coveralls
/approve

@kubevirt-bot
Contributor

@nunnatsa: Overrode contexts on behalf of nunnatsa: coverage/coveralls

In response to this:

/override coverage/coveralls
/approve


@kubevirt-bot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: nunnatsa

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubevirt-bot added the approved label Jun 26, 2023
@nunnatsa
Collaborator

hco-e2e-kv-smoke-gcp lane passed

/override ci/prow/hco-e2e-kv-smoke-azure

@kubevirt-bot
Contributor

@nunnatsa: Overrode contexts on behalf of nunnatsa: ci/prow/hco-e2e-kv-smoke-azure

In response to this:

hco-e2e-kv-smoke-gcp lane passed

/override ci/prow/hco-e2e-kv-smoke-azure


@assafad
Contributor Author

assafad commented Jun 28, 2023

/unhold

@kubevirt-bot removed the do-not-merge/hold label Jun 28, 2023
@kubevirt-bot merged commit 811fa02 into kubevirt:main Jun 28, 2023
3 checks passed
@kubevirt-bot
Contributor

@assafad: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name                                                   Commit    Details   Required   Rerun command
pull-hyperconverged-cluster-operator-e2e-k8s-1.26-centos9   5665934   link      unknown    /test pull-hyperconverged-cluster-operator-e2e-k8s-1.26-centos9
pull-hyperconverged-cluster-operator-e2e-k8s-1.27           5665934   link      unknown    /test pull-hyperconverged-cluster-operator-e2e-k8s-1.27


Labels: approved, dco-signoff: yes, lgtm, release-note, size/L
4 participants