Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Emit CSV metric on startup #2216

Merged

Conversation

josefkarasek
Copy link
Contributor

csv_succeeded metric is lost between pod restarts.
This is because this metric is only emitted when CSV.Status is changed.

Description of the change:
Emit csv_succeeded/csv_abnormal metric during OLM startup

Motivation for the change:
csv_succeeded metric is lost between pod restarts.

Reviewer Checklist

  • Implementation matches the proposed design, or proposal is updated to match implementation
  • Sufficient unit test coverage
  • Sufficient end-to-end test coverage
  • Docs updated or added to /doc
  • Commit messages sensible and descriptive

@openshift-ci
Copy link

openshift-ci bot commented Jun 28, 2021

Hi @josefkarasek. Thanks for your PR.

I'm waiting for a operator-framework member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jun 28, 2021
@openshift-ci openshift-ci bot requested review from exdx and hasbro17 June 28, 2021 12:59
@openshift-ci openshift-ci bot added the bugzilla/severity-medium Referenced Bugzilla bug's severity is medium for the branch this PR is targeting. label Jun 28, 2021
@openshift-ci
Copy link

openshift-ci bot commented Jun 28, 2021

@josefkarasek: This pull request references Bugzilla bug 1952576, which is valid. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.9.0) matches configured target release for branch (4.9.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

Requesting review from QA contact:
/cc @jianzhangbjz

In response to this:

Bug 1952576: Emit CSV metric on startup

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot added the bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. label Jun 28, 2021
@openshift-ci
Copy link

openshift-ci bot commented Jun 28, 2021

@openshift-ci[bot]: GitHub didn't allow me to request PR reviews from the following users: jianzhangbjz.

Note that only operator-framework members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

@josefkarasek: This pull request references Bugzilla bug 1952576, which is valid. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.9.0) matches configured target release for branch (4.9.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

Requesting review from QA contact:
/cc @jianzhangbjz

In response to this:

Bug 1952576: Emit CSV metric on startup

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@josefkarasek
Copy link
Contributor Author

This is an alternative solution along with #2213

Copy link
Contributor

@anik120 anik120 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/ok-to-test

cmd/olm/main.go Outdated Show resolved Hide resolved
@openshift-ci openshift-ci bot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jun 28, 2021
cmd/olm/main.go Outdated Show resolved Hide resolved
cmd/olm/main.go Outdated Show resolved Hide resolved
Copy link
Contributor

@anik120 anik120 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jul 12, 2021
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Jul 12, 2021
Copy link
Member

@dinhxuanvu dinhxuanvu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jul 12, 2021
cmd/olm/main.go Outdated Show resolved Hide resolved
@ecordell
Copy link
Member

/hold

if we can make this change in the operator package it will be much easier to backport

we can un-hold if there's a reason it needs to be in main, but it's not clear to me from the existing discussion why it would need to be

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 13, 2021
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Jul 20, 2021
@josefkarasek josefkarasek requested a review from ecordell July 20, 2021 16:34
@@ -116,6 +117,45 @@ var _ = Describe("Metrics are generated for OLM managed resources", func() {
})
})
})

When("the OLM pod restars", func() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
When("the OLM pod restars", func() {
When("the OLM pod restarts", func() {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The setup of the test feels a little unnatural to me. I would expect a When block inside this Context. And the layout ideally would something be:

Context("Metrics emitted by objects during operator installation", func() {

  When("A CSV is created", func() {
			BeforeEach(func() {
                                   // create CSV
                       }
                      It("emits a CSV metrics", func() {
                              // check for CSV metrics
                      }
                     When("The OLM pod restarts", func() {
                         BeforeEach(func(){
                                // kill the deployment and wait for it to come up again
                         }
                        It("The csv metrics are preserved" func(){
                                 // check for CSV metrics again
                       }) 
                     })
  }

This leaves room for extending this portion of the test suite in the future

@estroz
Copy link
Member

estroz commented Aug 19, 2021

I don't see necessarily a "better" place for the EnsureCSVMetrics call since queueinformer.Operator.Run can't easily be wrapped, so this looks fine to me.

If the spelling nit gets addressed, I'll re-lgtm this.

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 19, 2021
@@ -116,6 +117,45 @@ var _ = Describe("Metrics are generated for OLM managed resources", func() {
})
})
})

When("the OLM pod restars", func() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The setup of the test feels a little unnatural to me. I would expect a When block inside this Context. And the layout ideally would something be:

Context("Metrics emitted by objects during operator installation", func() {

  When("A CSV is created", func() {
			BeforeEach(func() {
                                   // create CSV
                       }
                      It("emits a CSV metrics", func() {
                              // check for CSV metrics
                      }
                     When("The OLM pod restarts", func() {
                         BeforeEach(func(){
                                // kill the deployment and wait for it to come up again
                         }
                        It("The csv metrics are preserved" func(){
                                 // check for CSV metrics again
                       }) 
                     })
  }

This leaves room for extending this portion of the test suite in the future

@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Aug 20, 2021
@anik120
Copy link
Contributor

anik120 commented Aug 20, 2021

This looks good to me. If Evan's concerns have been assuaged (I don't see any conversation here that suggests that they have been yet), we can remove the hold and lgtm it.

@josefkarasek
Copy link
Contributor Author

/retest

1 similar comment
@josefkarasek
Copy link
Contributor Author

/retest

Signed-off-by: Josef Karasek <jkarasek@redhat.com>
@josefkarasek
Copy link
Contributor Author

/cc @anik120

@openshift-ci openshift-ci bot requested a review from anik120 August 24, 2021 15:05
@josefkarasek josefkarasek requested a review from estroz August 25, 2021 07:58
Copy link
Member

@dinhxuanvu dinhxuanvu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Sep 3, 2021
@openshift-ci
Copy link

openshift-ci bot commented Sep 3, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dinhxuanvu, josefkarasek, kevinrizza

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [dinhxuanvu,kevinrizza]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kevinrizza
Copy link
Member

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 3, 2021
@openshift-merge-robot openshift-merge-robot merged commit de4bebe into operator-framework:master Sep 3, 2021
anik120 pushed a commit to anik120/operator-lifecycle-manager that referenced this pull request May 3, 2022
Signed-off-by: Josef Karasek <jkarasek@redhat.com>
anik120 pushed a commit to anik120/operator-lifecycle-manager that referenced this pull request May 3, 2022
Signed-off-by: Josef Karasek <jkarasek@redhat.com>
openshift-merge-robot pushed a commit that referenced this pull request May 12, 2022
* Update getMetricsFromPort to infer port number

Problem: The getMetricsFromPod function assumes that metrics are exposed
on port 8080. This function fails to retrieve metrics from the olm or
catalog operator when the port is changed.

Solution: Name the port in each of the deployments and update the
getMetricsFromPod function to infer the port number from the
deployments.

Signed-off-by: Alexander Greene <greene.al1991@gmail.com>

* Emit CSV metric on startup (#2216)

Signed-off-by: Josef Karasek <jkarasek@redhat.com>

* fix e2e CSV metric is preserved failure (#2530)

Signed-off-by: akihikokuroda <akihikokuroda2020@gmail.com>

Co-authored-by: Alexander Greene <greene.al1991@gmail.com>
Co-authored-by: Josef Karasek <jkarasek@redhat.com>
Co-authored-by: Akihiko (Aki) Kuroda <16141898+akihikokuroda@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants