Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tweaks to get tests to work with private registry #1870

Merged
merged 5 commits into from
Sep 17, 2019

Conversation

duglin
Copy link

@duglin duglin commented Sep 11, 2019

Signed-off-by: Doug Davis dug@us.ibm.com

Proposed Changes

While trying to get the e2e tests to work with a private registry I had to make some "interesting"
choices. This PR isn't all of them, but they're most of the ones I can share at this time. They include:

  • when creating the various "cluster roles", if the role already exists then we just ignore the error. TBH, I'm not sure why other people are not seeing this, perhaps my env already has them by default, or perhaps they're left around from a previous run and never got cleaned up, but either way, I don't think this should be a test-stopping error condition.
  • I added code to look for a secret in the default namespace called
    kn-eventing-test-pull-secret when creating new namespaces and service accounts. And
    if it exists, we will now inject that secret as an ImagePullSecret into the namespace's
    default service accounts, or the new service account.

I'm not 100% sure these are the best solutions, but they at least worked for me. I'm open to
other ideas for how to get the PullSecret injected. If the PullSecret approach sounds out
I can then update the test docs to mention it.

Note - this does not get around #1862

Release Note

NONE

Signed-off-by: Doug Davis <dug@us.ibm.com>
@googlebot googlebot added the cla: yes Indicates the PR's author has signed the CLA. label Sep 11, 2019
@knative-prow-robot knative-prow-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. area/test-and-release Test infrastructure, tests or release labels Sep 11, 2019
@chizhg
Copy link
Member

chizhg commented Sep 11, 2019

/uncc @chaodaiG
/cc @grantr

@knative-prow-robot knative-prow-robot requested review from grantr and removed request for chaodaiG September 11, 2019 17:38
}

// CreateClusterRoleOrFail creates the given ClusterRole or fail the test if there is an error.
func (client *Client) CreateClusterRoleOrFail(cr *rbacv1.ClusterRole) {
crs := client.Kube.Kube.RbacV1().ClusterRoles()
if _, err := crs.Create(cr); err != nil {
client.T.Fatalf("Failed to create cluster role %q: %v", cr.Name, err)
if !strings.Contains(err.Error(), "already exists") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you change these to use:
"k8s.io/apimachinery/pkg/api/errors"
and in particular IsNotFound()

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was looking for that - thanks!


testSecret, _ := defSecI.Get(TestPullSecretName, metav1.GetOptions{})

// Check again. I've seen cases where it lies and if we need it
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems odd, and is there a reason why you don't check the err returned? I'm not sure what you mean by "It" lies, are you saying above line returns nil, but the one below returns true?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No I mean that the apiserver will return nil+err even though the secret does exist. I can't explain why or how, but I ran into it often enough that I decided to just ask again to be sure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vagababov Have you seen something like this before? Seems b0rk3n?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really. What is the actual error you're getting? If I were to guess you got either network problem error (connection refused, dial timeout) or overload error from API server.

Which leads me to the main problem I see here: "never ignore errors".

Also, I'd highly recommend to switch to informers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the err that's returned?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me change the code to show the error (if I even get one) to see what's up.. But to add more intrigue, I've also seen cases like this....

I had this code:

		if testSecret != nil {
			// Found the secret, so now make a copy in our new namespace
			newSecret, err := nsSecI.Create(
				&corev1.Secret{
					ObjectMeta: metav1.ObjectMeta{
						Name: testSecret.ObjectMeta.Name,
					},
					Data: testSecret.Data,
					Type: testSecret.Type,
				})
			if err != nil {
				t.Fatalf("TestSetup: Error copying the secret: %s", err)
			}

Notice I would use the "name" from "testSecret", which should never be empty. Yet I would get this error:

test_runner.go:82: namespace is : "test-default-broker-with-many-deprecated-triggers" test_runner.go:168: 
TestSetup: Error copying the secret: Secret "" is invalid: metadata.name: Required value: name or
generateName is required

I never saw this on my local testing, or IKS, only using Prow.

testSecret, _ = defSecI.Get(TestPullSecretName, metav1.GetOptions{})
}

if testSecret != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function seems similar in description, yet it's different and I'm trying to understand why? :) Can this be hoisted into a separate function and reused instead of having two versions of it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

they are similar but different. This one only deals with the "default" SA and knows it can blindly copy the secret. Th other one deals with a new SA (so not "default") and needs to check to see if the secret already exists so it doesn't try to duplicate it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They both have the check and re-check (which I must say is extremely worrying and seems extremely flaky and a bug in the k8s if that's the case).
I think having a method that deals with copying the secret into the namespace would be good to hoist out, then rejiggering the ServiceAccount secret could be the only difference between the two if I understand correctly?

@duglin duglin force-pushed the hackTests branch 2 times, most recently from e40f7ff to 6844dad Compare September 12, 2019 00:20
@duglin
Copy link
Author

duglin commented Sep 12, 2019

/test pull-knative-eventing-integration-tests

Signed-off-by: Doug Davis <dug@us.ibm.com>
@duglin
Copy link
Author

duglin commented Sep 12, 2019

@vaikas-google are you ok with the general direction I went for the pull secrets? if so, I'd like to update the docs as part of this PR before it gets merged.

testSecret, _ = defSecI.Get(TestPullSecretName, metav1.GetOptions{})
}

if testSecret != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverse the flow to be more go-style

if tS == nil { fail/return/break}
// normal code goes here unindented

@knative-prow-robot knative-prow-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Sep 13, 2019
@duglin duglin force-pushed the hackTests branch 3 times, most recently from 0cb93f7 to 8280bfd Compare September 13, 2019 20:00
@duglin
Copy link
Author

duglin commented Sep 13, 2019

/hold

until I do more testing

@knative-prow-robot knative-prow-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 13, 2019
@vaikas
Copy link
Contributor

vaikas commented Sep 13, 2019

So from the test logs it's indeed failing with 'not found':
test_runner.go:83: namespace is : "test-default-broker-with-many-deprecated-triggers"
test_runner.go:201: Error copying the secret into ns "test-default-broker-with-many-deprecated-triggers": secrets "kn-eventing-test-pull-secret" not found

Where is this secret created, is there a race there somewhere?

@duglin
Copy link
Author

duglin commented Sep 13, 2019

those logs were old - it shouldn't print that "not found" error in Prow. That should be fixed now, but I'm testing locally, where I do have the secret created, to see if it does the copy correctly.

Signed-off-by: Doug Davis <dug@us.ibm.com>
@knative-prow-robot knative-prow-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 14, 2019

// Just double checking
if srcSecret == nil {
return nil, fmt.Errorf("Error copying secret, it's nil w/o error")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no formatting errors.New.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

// it.
// It'll either return a pointer to the new Secret or and error indicating
// why it couldn't do it.
func CopySecret(client *Client, srcNS string, srcSecretName string, tgtNS string, svcAccount string) (*corev1.Secret, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function is inconsistent. Sometimes it calls Fatalf sometimes returns errors (should always do second, though).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep missed that - fixed

@duglin duglin force-pushed the hackTests branch 5 times, most recently from 3f0d474 to 4a5b5fc Compare September 14, 2019 23:47
Signed-off-by: Doug Davis <dug@us.ibm.com>
@vaikas
Copy link
Contributor

vaikas commented Sep 16, 2019

/approve

leaving lgtm for @vagababov in case he had anything else.

@knative-prow-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: duglin, vaikas-google

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow-robot knative-prow-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 16, 2019
Copy link
Contributor

@vagababov vagababov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, since you asked for a real review, I'll leave some obligatory nits :-)

@@ -17,6 +17,7 @@ limitations under the License.
package common

import (
goerrs "errors"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather we alias k8s package rather than built in, but 🤷‍♂

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I thought about that but decided to keep the amount of changes lower - but I'll do it

_, err = tgtNSSvcAccI.Patch(svcAccount, types.StrategicMergePatchType,
[]byte(`{"imagePullSecrets":[{"name":"`+srcSecretName+`"}]}`))
if err != nil {
return nil, fmt.Errorf("Patch failed on NS/SA (%s/%s): %s",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return nil, fmt.Errorf("Patch failed on NS/SA (%s/%s): %s",
return nil, fmt.Errorf("patch failed on NS/SA (%s/%s): %s",

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

// If the secret already exists then that's ok - some other test
// must have created it
if err != nil && !errors.IsAlreadyExists(err) {
return nil, fmt.Errorf("Error copying the Secret: %s", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return nil, fmt.Errorf("Error copying the Secret: %s", err)
return nil, fmt.Errorf("frror copying the Secret: %s", err)

See: https://github.com/golang/go/wiki/CodeReviewComments#error-strings

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'll go with error instead of frror :-)


// Just double checking
if srcSecret == nil {
return nil, goerrs.New("Error copying Secret, it's nil w/o error")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return nil, goerrs.New("Error copying Secret, it's nil w/o error")
return nil, goerrs.New("error copying Secret, it's nil w/o error")

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

// This is needed for cases where the images are in a private registry.
_, err := CopySecret(client, "default", TestPullSecretName, namespace, "default")
if err != nil && !errors.IsNotFound(err) {
t.Fatalf("Error copying the secret into ns %q: %s", namespace, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
t.Fatalf("Error copying the secret into ns %q: %s", namespace, err)
t.Fatalf("error copying the secret into ns %q: %s", namespace, err)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@@ -38,6 +40,8 @@ import (
_ "k8s.io/client-go/plugin/pkg/client/auth/oidc"
)

var TestPullSecretName = "kn-eventing-test-pull-secret"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
var TestPullSecretName = "kn-eventing-test-pull-secret"
const TestPullSecretName = "kn-eventing-test-pull-secret"

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@knative-prow-robot knative-prow-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Sep 16, 2019
Copy link
Contributor

@vagababov vagababov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
Thanks.

@knative-prow-robot knative-prow-robot added the lgtm Indicates that a PR is ready to be merged. label Sep 16, 2019
@knative-prow-robot knative-prow-robot removed the lgtm Indicates that a PR is ready to be merged. label Sep 16, 2019
Signed-off-by: Doug Davis <dug@us.ibm.com>
@duglin
Copy link
Author

duglin commented Sep 17, 2019

forgot to re-ping after my latest push. I think all comments are addressed.
@vaikas-google @vagababov

Copy link
Contributor

@vagababov vagababov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@knative-prow-robot knative-prow-robot added the lgtm Indicates that a PR is ready to be merged. label Sep 17, 2019
@chizhg
Copy link
Member

chizhg commented Sep 17, 2019

Not sure if you'll have to manually merge this change on master - #1898

/retest

@chizhg
Copy link
Member

chizhg commented Sep 17, 2019

/retest

1 similar comment
@chizhg
Copy link
Member

chizhg commented Sep 17, 2019

/retest

@knative-prow-robot knative-prow-robot merged commit 72ba863 into knative:master Sep 17, 2019
@duglin duglin deleted the hackTests branch September 18, 2019 08:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/test-and-release Test infrastructure, tests or release cla: yes Indicates the PR's author has signed the CLA. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants