-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OCPBUGS-44244: use node node in deleted machine phases #29269
OCPBUGS-44244: use node node in deleted machine phases #29269
Conversation
@kannon92: This pull request references Jira Issue OCPBUGS-44244, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (schoudha@redhat.com), skipping review request. The bug has been updated to refer to the pull request using the external bug tracker. In response to this: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
/jira refresh |
@kannon92: This pull request references Jira Issue OCPBUGS-44244, which is valid. 3 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (schoudha@redhat.com), skipping review request. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
/hold I need to write some unit tests. |
} | ||
|
||
if newHasNodeRef { | ||
nodeName = machine.Status.NodeRef.Name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So getting a correct node name into the interval annotation will allow your test logic to correctly correlate this deleting machine with a node not ready, and ignore it going forward? i.e. this will cause LESS failures without having to make the test flake.
Suggest explaining some of that in the commit message + PR when you finish this up and have added the unit tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a lot of real estate to work with for commit message. I added this to the description.
f44e549
to
13c26b8
Compare
@kannon92: An error was encountered querying GitHub for users with public email (schoudha@redhat.com) for bug OCPBUGS-44244 on the Jira server at https://issues.redhat.com/. No known errors were detected, please see the full error message for details. Full error message.
Post "http://ghproxy/graphql": dial tcp 172.30.229.2:80: i/o timeout
Please contact an administrator to resolve this issue, then request a bug refresh with In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
cc @rphillips @deads2k @dgoodwin This is ready. I want to run the tests off of this PR so please don't merge until we confirm this works. |
Nice. This looks good... Let's see how the tests go. |
/hold cancel Something is up with these tests.. |
/jira refresh |
@kannon92: This pull request references Jira Issue OCPBUGS-44244, which is valid. 3 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (schoudha@redhat.com), skipping review request. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
@kannon92: This pull request references Jira Issue OCPBUGS-44244. The bug has been updated to no longer refer to the pull request using the external bug tracker. All external bug links have been closed. The bug has been moved to the NEW state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
@kannon92: This pull request references Jira Issue OCPBUGS-44244, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (schoudha@redhat.com), skipping review request. The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
13c26b8
to
8b023ce
Compare
Job Failure Risk Analysis for sha: 8b023ce
|
/test e2e-aws-ovn-microshift |
Found a little issue and kicked off a new round of testing. I did not realize that the MachinePhase interval was constructed. So I added it to the Once a test finishes, I will verify the artifacts to see if this is correct. |
44e19ca
to
dd5eb16
Compare
/retest /hold cancel Confirmed that the machines have the right node names in the MachinePhase Intervals. |
Job Failure Risk Analysis for sha: dd5eb16
|
@deads2k @dgoodwin @rphillips |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me just a couple nits.
unexpectedReason: monitorapi.NodeUnexpectedReadyReason, | ||
}, | ||
{ | ||
// OCPBUG-44244: if machine monitor does not have node name than we will get no failures |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test looks to expect a failure. Just a comment that needs updating?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea. updated.
unexpectedReason: monitorapi.NodeUnexpectedUnreachableReason, | ||
}, | ||
{ | ||
// OCPBUG-44244: if machine monitor does not have node name than we will get no failures |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
expects a failure as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated.
/lgtm |
/hold Revision 1bf8c85 was retested 3 times: holding |
Job Failure Risk Analysis for sha: 1bf8c85
|
1bf8c85
to
de21794
Compare
/hold cancel |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dgoodwin, kannon92 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@kannon92: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
5e59368
into
openshift:master
@kannon92: Jira Issue OCPBUGS-44244: All pull requests linked via external trackers have merged: Jira Issue OCPBUGS-44244 has been moved to the MODIFIED state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
[ART PR BUILD NOTIFIER] Distgit: openshift-enterprise-tests |
/cherry-pick release-4.17 |
@kannon92: #29269 failed to apply on top of branch "release-4.17":
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
This PR makes sure to add a node name to the DeletedMachine monitor.
Once that is added, we also added some unit tests to this code to make sure that we are correctly detecting UnexpectedNodeNotReady / UnexpectedUnreachable.
The main edge case we had to work around was when Machines get deleted and that node goes not ready. This is correct behavior but our test was flagging this as this code path does not go through a Machine Config change.