Validate cluster and machine objects #298

wangzhen127 · 2018-06-06T22:05:17Z

What this PR does / why we need it:
This PR makes clusterctl validate cluster command validate the cluster and machines object. This is part of #168.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

Special notes for your reviewer:

Release note:

Add "kubectl validate cluster" that validates cluster and machine objects.

@kubernetes/kube-deploy-reviewers

wangzhen127 · 2018-06-06T22:05:46Z

/cc @jessicaochen @karan

spew · 2018-06-07T14:29:41Z

clusterctl/cmd/validate_cluster.go

+}
+
+func validateMachineObjects(machines *v1alpha1.MachineList) error {
+	for _, machine := range machines.Items {


Rather than report the first machine that has something wrong with it, it may be better if an error was returned / printed for each non-valid machine.

Not entirely sure how this command is intended to be used, but if I wanted to use it to 'fix' my cluster, as a user I will need to run this, correct what is wrong with the first machine, run it again, etc.

Good point. Updated.

spew

@jessicaochen and @karan have spent more time thinking about this so please wait for their reviews as well, but here are some thoughts i have.

spew · 2018-06-07T14:32:56Z

clusterctl/cmd/validate_cluster.go

+	glog.Infof("Pass. The cluster %q in namespace %q is validated.", cluster.Name, vco.Namespace)
+}
+
+func validateClusterAPIObjects() *v1alpha1.Cluster {


why does this function return a cluster? That doesn't seem like a result type I would expect from a validate function.

Mostly because it needs to print cluster name at the end. And since it allows omitting cluster name, it needs to find the cluster first. I moved getting cluster out of the function now. So it currently only returns error.

spew · 2018-06-07T14:33:46Z

clusterctl/cmd/validate_cluster.go

+		clientcmd.NewDefaultClientConfigLoadingRules(),
+		&clientcmd.ConfigOverrides{}).ClientConfig()
+	if err != nil {
+		glog.Fatalf("Failed to create client for talking to the apiserver: %v", err)


Recommend that this function adds a second return value of 'error' and the calling function, calls glog.Fatal(...) like this:

cluster, error := validateClusterAPIObjects() if err != nil { glog.Fatal(err) }

It's generally undesirable to have calls to exit an application littered in various places because it becomes more complicated to reason about application behavior as there are more possible ways to exit. It's harder to standardize the error messages that are displayed in these exit scenarios as well as there is not a single location where the final error message is printed and exit is called. In addition, it is more difficult to unit test code that can call exit.

And like I said in a previous comment, ideally we would keep the single glog.Exit(...) call in the main Run function defined on the command to match all other commands in clusterctl. That will be easier for future developers to understand.

Makes sense. Updated

spew · 2018-06-07T14:37:17Z

clusterctl/cmd/validate_cluster.go

-		if err := RunValidateCluster(); err != nil {
-			glog.Exit(err)
-		}
+		RunValidateCluster()


I think i would keep the single glog.Exit (...) call to match the other commands in the application and make the application simpler to understand for contributing developers. If we are to change that, we should change it wholesale in each command.

Changed back now.

wangzhen127

Updated the PR with fix.

wangzhen127 · 2018-06-07T18:32:52Z

clusterctl/cmd/validate_cluster.go

-		if err := RunValidateCluster(); err != nil {
-			glog.Exit(err)
-		}
+		RunValidateCluster()


Changed back now.

wangzhen127 · 2018-06-07T18:33:54Z

clusterctl/cmd/validate_cluster.go

+	glog.Infof("Pass. The cluster %q in namespace %q is validated.", cluster.Name, vco.Namespace)
+}
+
+func validateClusterAPIObjects() *v1alpha1.Cluster {


Mostly because it needs to print cluster name at the end. And since it allows omitting cluster name, it needs to find the cluster first. I moved getting cluster out of the function now. So it currently only returns error.

wangzhen127 · 2018-06-07T18:34:02Z

clusterctl/cmd/validate_cluster.go

+		clientcmd.NewDefaultClientConfigLoadingRules(),
+		&clientcmd.ConfigOverrides{}).ClientConfig()
+	if err != nil {
+		glog.Fatalf("Failed to create client for talking to the apiserver: %v", err)


Makes sense. Updated

wangzhen127 · 2018-06-07T18:34:17Z

clusterctl/cmd/validate_cluster.go

+}
+
+func validateMachineObjects(machines *v1alpha1.MachineList) error {
+	for _, machine := range machines.Items {


Good point. Updated.

karan

First pass

karan · 2018-06-07T21:27:15Z

clusterctl/cmd/validate_cluster.go

+		return fmt.Errorf("Failed to create client for talking to the apiserver: %v", err)
+	}
+
+	client, err := clientset.NewForConfig(config)


Can you use https://github.com/kubernetes-sigs/cluster-api/blob/master/util/util.go#L102 instead?

karan · 2018-06-07T21:28:06Z

clusterctl/cmd/validate_cluster.go

+	}
+
+	cluster, err := getClusterObject(client, vco.ClusterName, vco.Namespace)
+	if err != nil {


cuddle this and everywhere else

if err := foo(); err != nil { }

At this place, we need |cluster| outside of the if scope. I have cuddled other places where only err is returned.

karan · 2018-06-07T21:28:43Z

clusterctl/cmd/validate_cluster.go

+
+	cluster, err := getClusterObject(client, vco.ClusterName, vco.Namespace)
+	if err != nil {
+		return err


Is there any useful info you can log here?

This error is already formated in |getClusterObject|, so we do not need to add more info here.

karan · 2018-06-07T21:28:55Z

clusterctl/cmd/validate_cluster.go

+	}
+
+	err = validateClusterAPIObjects(client, cluster)
+	if err != nil {


cuddle, and same comment about logging.

karan · 2018-06-07T21:30:46Z

clusterctl/cmd/validate_cluster.go

+
+	// TODO(wangzhen127): Also validate the cluster in addition to the cluster API objects.
+
+	glog.Infof("Pass. The cluster %q in namespace %q is validated.", cluster.Name, vco.Namespace)


It's not clear what "Pass" means. I like the UI of unit tests where you can see each test that ran and the status of each one. We should shoot for something similar.

Checking foo bar... PASS Checking bar foo... FAIL - Failure log: reason for failure Checking baz baz... PASS

I thought about the unit test style. Ideally, I wanted to do the following:

// Print "Checking foo bar...", but not newline err := RunFooBar() // Print "PASS or FAIL" on the same line as "Checking foo bar..." and then new line

But glog seems to always print newline. Do you know how to work this around? The reason of separating them in 2 prints is because RunFooBar() may take a long time and it is better to tell the user what it is doing.

Use stdout instead?

wangzhen127

Updated. PTAL

wangzhen127 · 2018-06-08T16:05:30Z

clusterctl/cmd/validate_cluster.go

+	}
+
+	cluster, err := getClusterObject(client, vco.ClusterName, vco.Namespace)
+	if err != nil {


At this place, we need |cluster| outside of the if scope. I have cuddled other places where only err is returned.

wangzhen127 · 2018-06-08T16:06:17Z

clusterctl/cmd/validate_cluster.go

+
+	cluster, err := getClusterObject(client, vco.ClusterName, vco.Namespace)
+	if err != nil {
+		return err


This error is already formated in |getClusterObject|, so we do not need to add more info here.

wangzhen127 · 2018-06-08T16:06:33Z

clusterctl/cmd/validate_cluster.go

+	}
+
+	err = validateClusterAPIObjects(client, cluster)
+	if err != nil {


wangzhen127 · 2018-06-08T16:17:32Z

clusterctl/cmd/validate_cluster.go

+
+	// TODO(wangzhen127): Also validate the cluster in addition to the cluster API objects.
+
+	glog.Infof("Pass. The cluster %q in namespace %q is validated.", cluster.Name, vco.Namespace)


I thought about the unit test style. Ideally, I wanted to do the following:

// Print "Checking foo bar...", but not newline err := RunFooBar() // Print "PASS or FAIL" on the same line as "Checking foo bar..." and then new line

But glog seems to always print newline. Do you know how to work this around? The reason of separating them in 2 prints is because RunFooBar() may take a long time and it is better to tell the user what it is doing.

wangzhen127 · 2018-06-08T17:02:38Z

clusterctl/cmd/validate_cluster.go

+		return fmt.Errorf("Failed to create client for talking to the apiserver: %v", err)
+	}
+
+	client, err := clientset.NewForConfig(config)


jessicaochen · 2018-06-11T21:11:28Z

clusterctl/cmd/validate_cluster.go

+	return nil
+}
+
+func validateClusterAPIObjects(client *clientset.Clientset, cluster *v1alpha1.Cluster) error {


Not going to join the review since you seem to have enough reviewers. (Do let me know if you want me to formally review).

My 2cents: Please break the actual validation leg work into a different package. cmd.go files should only contain the limited stuff to do with how to input via command line. Ie. flags etc

Now put into clusterctl/validation.

Should we put them in pkg/validation or staying in clusterctl/ is fine. This depends on whether other code wants to reuse this.

wangzhen127

Updated. PTAL

wangzhen127 · 2018-06-18T18:09:01Z

clusterctl/cmd/validate_cluster.go

+	return nil
+}
+
+func validateClusterAPIObjects(client *clientset.Clientset, cluster *v1alpha1.Cluster) error {


Now put into clusterctl/validation.

Should we put them in pkg/validation or staying in clusterctl/ is fine. This depends on whether other code wants to reuse this.

wangzhen127 · 2018-06-18T18:09:34Z

clusterctl/cmd/validate_cluster.go

+
+	// TODO(wangzhen127): Also validate the cluster in addition to the cluster API objects.
+
+	glog.Infof("Pass. The cluster %q in namespace %q is validated.", cluster.Name, vco.Namespace)


wangzhen127 · 2018-06-20T23:00:02Z

I rebased and squashed the commits to remove conflicts. PTAL.

wangzhen127 · 2018-06-22T17:08:42Z

@spew , since @karan is very busy recently, can I ask you as the main reviewer for this PR?

dims · 2018-06-22T22:00:00Z

/ok-to-test

spew · 2018-06-23T13:37:57Z

I am taking over this review -- please hold until I am able to review.

spew

This is my first pass -- I avoided comments about the UX (the actual output of the command) as it is difficult to imagine without seeing the output in some form. I have some comments in the review about adding tests that verify the output of the command, by adding those tests, and their associated sample output, it will be obvious what the output of the command is and the impact of various print statements, etc. So once those tests are added I may have some comments about the UX with regards to command output.

I also avoided focussing too much on simplifying the code inside of validate_cluster.go and validate_cluster_api_objects.go because I want to wait until some other changes are made as it may impact those files.

Please add release notes -- this PR contains user impacting features and changes.

spew · 2018-06-23T13:40:59Z

clusterctl/validation/validate_cluster_api_objects.go

+
+	machines, err := client.ClusterV1alpha1().Machines(namespace).List(meta_v1.ListOptions{})
+	if err != nil {
+		return fmt.Errorf("Failed to get the machines from the apiserver in namespace %q: %v", namespace, err)


go style: error strings should be lowercase so that they can be easily chained together. Please fix by making all error strings in this PR lowercase.

spew · 2018-06-23T13:42:28Z

clusterctl/cmd/validate_cluster.go

+			kubeconfig = util.GetDefaultKubeConfigPath()
+		}
+	}
+	client, err := clientcmd.NewClusterApiClientForDefaultSearchPath(kubeconfig)


NewClusterApiClientForDefaultSearchPath(...) does all of the work that lines 66 -> 71 do automatically, i.e. loading from $KUBECONFIG and looking in the $HOME folder for kubeconfig(s) (and handling extra cases). Please remove and just pass vco.Kubeconfig to NewClusterApiClientForDefaultSearchPath(...). The empty string is expected and handled by that method.

spew · 2018-06-23T13:44:20Z

clusterctl/cmd/validate_cluster.go

+	validateClusterCmd.Flags().StringVarP(
+		&vco.Namespace, "namespace", "n", v1.NamespaceDefault,
+		"If present, validate the cluster in the specified namespace.")
+	validateClusterCmd.Flags().StringVarP(


These description strings are better than what is present in kubectl, however, it may be better to provide a consistent experience which means we should push back improvmenets to client-go. Please see comments from @roberthbailey about this on my PR here: https://github.com/kubernetes-sigs/cluster-api/pull/384/files

spew · 2018-06-23T13:46:57Z

clusterctl/cmd/validate_cluster.go

+		return fmt.Errorf("Failed to create client for talking to the apiserver: %v", err)
+	}
+
+	if err = validation.ValidateClusterAPIObjects(client, vco.ClusterName, vco.Namespace); err != nil {


I have a PR coming for clusterctl delete cluster where I add the same name and namespace flags. After that PR it will be possible to get a kubernetes & cluster api client that is scoped to the appropraite namespace and clustername so it will mean you don't need to pass them around everywhere and the complexity of this code can be reduced. That PR is likely to merge first so please integrate with it: https://github.com/kubernetes-sigs/cluster-api/pull/384/files

The client object is not scoped to the provided namespace and cluster name only, right? I think we still need to pass in the cluster name and namespace.

spew · 2018-06-23T13:48:24Z

clusterctl/cmd/validate_cluster.go

-			glog.Exit(err)
+			os.Stdout.Sync()
+			fmt.Fprintf(os.Stderr, "ERROR: %v\n", err)
+			os.Exit(1)


These three lines should be reverted and we should leave the consistent "glog.Exit" across all commands (I think there is one inconsistency right now but we should not make it even more inconsistent). If there is an issue with that approach it should be addressed in its own PR and for all commands in clusterctl.

fmt print functions are used in validate_cluster_api_objects.go instead of glog. The reason is that we want to use unittest style of output, like the following.

Checking machine object "gce-master-test1"... PASS Checking machine object "gce-node-4k648"... PASS

In the code, we do:

Print Checking machine object "XYZ"...

Actually do the check

Print PASS if actually pass, otherwise FAIL on the same line.

So in step 1, we need to print some message without newline character after .... glog cannot do that (always adding a newline character). So I have to use fmt print. To be consistent, I used fmt print here, too (glog has its own formatting).

Let me know if you are aware of any other good ways to work around.

spew · 2018-06-23T14:45:13Z

pkg/apis/cluster/v1alpha1/testutil/testutil.go

@@ -16,7 +16,11 @@ limitations under the License.

 package testutil

-import "sigs.k8s.io/cluster-api/pkg/apis/cluster/v1alpha1"
+import (


util packages, especially one as general purpose as "testutil" are an anti pattern -- and inevitably becoming a dumping ground for tons of stuff they shouldn't have. These methods are not yet being used by any other tests / code so there isn't a reason to add them to a util package. These are not things that belong in a package named "testutil" as they are specific to tests needing cluster api fixtures and not every tests. It's also extremely unlikely that the methods can be flexible enough to cover all use cases of all tests that would potentially need a cluster with error fixture or a machine with error fixture so they are out of place in a util package.

Instead, it's better to just put these methods in validate_cluster_api_objects_test.go where they are being used. Please move them there.

spew · 2018-06-23T15:04:48Z

clusterctl/validation/validate_cluster_api_objects_test.go

@@ -0,0 +1,242 @@
+/*


The tests in this file do a good job of verifying the error cases, however, validate_cluster_api_objects.go has a side effect which is a lot of string output. We should have tests that verify this output and catch when changes are made to the output. The 'fmt' library has methods which take an io.Writer. We should change validate_cluster_api_objects to take an io.Writer.

The CLI command can pass in os.Stdout and potentially os.Stderr as well if we want to support writing to a regular output and an error output. The tests would pass in a bytes.Buffer which can be used in place of the io.Writer interface: https://golang.org/pkg/bytes/#Buffer.Write. The buffer can be converted to string and then the output can be compared against what is expected. The expected output should probably be checked into golden files in the testdata folder so that we can reason about the user experience of this command / file.

Please add tests that verify the fmt output of at least two scenarios:

Success (no errors / FAILs)

Errors (at least 2 errors / FAILS in the output).

spew · 2018-06-23T15:34:29Z

clusterctl/validation/validate_cluster_api_objects_test.go

+)
+
+func TestGetClusterObject(t *testing.T) {
+	testClusterName := "test-cluster"


This function has three groupings of test cases, No cluster, one cluster, and two clusters. Tests should be for a single case or a single table driven set of cases all in the same general area testing the same method. It will be extremely difficult for a future developer to understand and make changes to these tests.

Please refactor this into three different test methods.

spew · 2018-06-23T15:38:46Z

clusterctl/validation/validate_cluster_api_objects.go

+	if err = validateMachineObjects(machines); err != nil {
+		return err
+	}
+


please create an TODO issue and link it here for the missing verification of MachineDeployments and MachineSets.

Do we need the verification of those two? Would verifying machines be enough?

Or put it in another way, what are the cases where machines are all ok, but MachineDeployments or MachineSets have errors?

spew · 2018-06-23T15:39:59Z

clusterctl/validation/validate_cluster_api_objects.go

+	if numOfClusters := len(clusters.Items); numOfClusters == 0 {
+		return nil, fmt.Errorf("Fail: No cluster exists in namespace %q.", namespace)
+	} else if numOfClusters > 1 {
+		return nil, fmt.Errorf("Fail: There are more than one cluster in namespace %q. Please specify --cluster-name.", namespace)


grammar issue, please change: "There are more" -> "There is more"

spew · 2018-06-24T00:25:23Z

/assign @spew

wangzhen127

Thanks for the review. Updated the PR. Also adds release notes and test data for output tests. PTAL

wangzhen127 · 2018-06-28T17:17:04Z

clusterctl/cmd/validate_cluster.go

-			glog.Exit(err)
+			os.Stdout.Sync()
+			fmt.Fprintf(os.Stderr, "ERROR: %v\n", err)
+			os.Exit(1)


fmt print functions are used in validate_cluster_api_objects.go instead of glog. The reason is that we want to use unittest style of output, like the following.

Checking machine object "gce-master-test1"... PASS Checking machine object "gce-node-4k648"... PASS

In the code, we do:

Print Checking machine object "XYZ"...

Actually do the check

Print PASS if actually pass, otherwise FAIL on the same line.

So in step 1, we need to print some message without newline character after .... glog cannot do that (always adding a newline character). So I have to use fmt print. To be consistent, I used fmt print here, too (glog has its own formatting).

Let me know if you are aware of any other good ways to work around.

wangzhen127 · 2018-06-28T17:30:56Z

clusterctl/cmd/validate_cluster.go

+			kubeconfig = util.GetDefaultKubeConfigPath()
+		}
+	}
+	client, err := clientcmd.NewClusterApiClientForDefaultSearchPath(kubeconfig)


wangzhen127 · 2018-06-28T17:34:14Z

clusterctl/cmd/validate_cluster.go

+		return err
+	}
+
+	// TODO(wangzhen127): Also validate the cluster in addition to the cluster API objects.


Added the link to github issue now, though I saw other places are not following this rule.

wangzhen127 · 2018-06-28T18:02:24Z

clusterctl/cmd/validate_cluster.go

+		return fmt.Errorf("Failed to create client for talking to the apiserver: %v", err)
+	}
+
+	if err = validation.ValidateClusterAPIObjects(client, vco.ClusterName, vco.Namespace); err != nil {


The client object is not scoped to the provided namespace and cluster name only, right? I think we still need to pass in the cluster name and namespace.

wangzhen127 · 2018-06-28T18:07:39Z

clusterctl/cmd/validate_cluster.go

 		}
 	},
 }

 func init() {
+	validateClusterCmd.Flags().StringVarP(


wangzhen127 · 2018-06-28T18:34:43Z

clusterctl/validation/validate_cluster_api_objects.go

+func validateMachineObjects(machines *v1alpha1.MachineList) error {
+	pass := true
+	for _, machine := range machines.Items {
+		fmt.Printf("Checking machine object %q... ", machine.Name)


wangzhen127 · 2018-06-28T18:36:41Z

clusterctl/validation/validate_cluster_api_objects.go

+		fmt.Printf("PASS\n")
+	}
+	if !pass {
+		return fmt.Errorf("Failed to validate machine objects.")


Changed to "failed to validate some machine object." Just like to start with "fail". :)

wangzhen127 · 2018-06-28T18:45:40Z

clusterctl/validation/validate_cluster_api_objects.go

+	for _, machine := range machines.Items {
+		fmt.Printf("Checking machine object %q... ", machine.Name)
+		if machine.Status.ErrorReason != nil || machine.Status.ErrorMessage != nil {
+			var reason common.MachineStatusError = ""


ErrorReason has its own type MachineStatusError. Doing this means that we need type conversion. Is it ok to always assume ErrorReason being a string under the hood?

wangzhen127 · 2018-06-28T20:46:51Z

clusterctl/validation/validate_cluster_api_objects_test.go

+)
+
+func TestGetClusterObject(t *testing.T) {
+	testClusterName := "test-cluster"


wangzhen127 · 2018-06-28T20:49:15Z

pkg/apis/cluster/v1alpha1/testutil/testutil.go

@@ -16,7 +16,11 @@ limitations under the License.

 package testutil

-import "sigs.k8s.io/cluster-api/pkg/apis/cluster/v1alpha1"
+import (


mkjelland · 2018-07-10T15:10:09Z

clusterctl/cmd/validate_cluster.go

 		}
 	},
 }

 func init() {
+	validateClusterCmd.Flags().StringVarP(
+		&vco.Kubeconfig, "kubeconfig", "", "",
+		"The file path of the kubeconfig file for Cluster API management stack. If not specified, $KUBECONFIG environment variable or ${HOME}/.kube/config is used.")


I would prefer if this said "The file path of the kubeconfig file for the cluster to validate." The kubeconfig doesn't actually specify anything about the cluster api stack directly, instead it will give access to the core api server, which knows about the cluster api server.

mkjelland · 2018-07-10T15:18:28Z

clusterctl/validation/validate_cluster_api_objects.go

+	if err != nil {
+		return fmt.Errorf("failed to get the machines from the apiserver in namespace %q: %v", namespace, err)
+	}
+	if err = validateMachineObjects(w, machines); err != nil {


You can reduce the number of lines here by directly returning validateMachineObjects(w, machines).

mkjelland · 2018-07-10T15:18:46Z

clusterctl/validation/validate_cluster_api_objects.go

+
+	machines, err := client.ClusterV1alpha1().Machines(namespace).List(meta_v1.ListOptions{})
+	if err != nil {
+		return fmt.Errorf("failed to get the machines from the apiserver in namespace %q: %v", namespace, err)


Some of these errors and wrapped before returning, and some are not, it there a reason for that?

That's because |getClusterObject| and |validateClusterObject| are also functions inside this file. They wrap their errors by themselves already to provide the context. Line 40 is some function provided by other pkg, so I want to wrap it and provide some context here.

mkjelland · 2018-07-10T15:22:17Z

clusterctl/validation/validate_cluster_api_objects.go

+	if cluster.Status.ErrorReason != "" || cluster.Status.ErrorMessage != "" {
+		fmt.Fprintf(w,"FAIL\n")
+		fmt.Fprintf(w,"\t[%v]: %s\n", cluster.Status.ErrorReason, cluster.Status.ErrorMessage)
+		return fmt.Errorf("failed to validate the cluster %q.", cluster.Name)


I think it would make more sense to say "cluster failed validation". To me, saying "failed to validate the cluster" implies that the validation code had an error, but in this case, the cluster itself was invalid.

mkjelland · 2018-07-10T15:28:19Z

clusterctl/validation/validate_cluster_api_objects.go

+		fmt.Fprintf(w,"\t[%v]: %s\n", reason, message)
+		return false
+	}
+	if machine.Status.NodeRef == nil {


Do you have plans later to check that the nodeRef points to an existing node, and that the node is in the ready state?

mkjelland · 2018-07-10T22:39:41Z

clusterctl/validation/validate_cluster_api_objects_test.go

+		t.Run(testcase.name, func(t *testing.T) {
+			cluster, err := getClusterObject(client, testcase.clusterName, testcase.namespace)
+			if (testcase.expectErr && err == nil) || (!testcase.expectErr && err != nil) {
+				t.Fatalf("Unexpected returned error. Got: %v, Want Err: %v", err, testcase.expectErr)


Since expectErr is a boolean, this will read:
"Unexpected returned error. Got: error message, Want Err: true"
Is that the intended behavior? It seems like it would be more useful to print the error message that you expected

wangzhen127

Updated. PTAL.

wangzhen127 · 2018-07-11T18:20:14Z

clusterctl/validation/validate_cluster_api_objects.go

+
+	machines, err := client.ClusterV1alpha1().Machines(namespace).List(meta_v1.ListOptions{})
+	if err != nil {
+		return fmt.Errorf("failed to get the machines from the apiserver in namespace %q: %v", namespace, err)


That's because |getClusterObject| and |validateClusterObject| are also functions inside this file. They wrap their errors by themselves already to provide the context. Line 40 is some function provided by other pkg, so I want to wrap it and provide some context here.

wangzhen127 · 2018-07-11T18:21:01Z

clusterctl/validation/validate_cluster_api_objects.go

+	if err != nil {
+		return fmt.Errorf("failed to get the machines from the apiserver in namespace %q: %v", namespace, err)
+	}
+	if err = validateMachineObjects(w, machines); err != nil {


wangzhen127 · 2018-07-11T18:22:52Z

clusterctl/validation/validate_cluster_api_objects.go

+	if cluster.Status.ErrorReason != "" || cluster.Status.ErrorMessage != "" {
+		fmt.Fprintf(w,"FAIL\n")
+		fmt.Fprintf(w,"\t[%v]: %s\n", cluster.Status.ErrorReason, cluster.Status.ErrorMessage)
+		return fmt.Errorf("failed to validate the cluster %q.", cluster.Name)


wangzhen127 · 2018-07-11T18:24:36Z

clusterctl/validation/validate_cluster_api_objects.go

+		fmt.Fprintf(w,"\t[%v]: %s\n", reason, message)
+		return false
+	}
+	if machine.Status.NodeRef == nil {


mkjelland · 2018-07-16T15:53:15Z

clusterctl/validation/validate_cluster_api_objects_test.go

+			var b bytes.Buffer
+			err := validateMachineObjects(&b, &machines, k8sClient)
+			if (testcase.expectErr && err == nil) || (!testcase.expectErr && err != nil) {
+				t.Fatalf("Unexpected returned error. Got: %v, Want Err: %v", err, testcase.expectErr)


The text that gets printed from this line is a bit unclear to me.

In one case, we didn't expect an error but we got one, in that case, this will be printed:
"Unexpected returned error. Got: error message, Want Err: false"
It reads as if the "false" at the end should be an error that we DID expect to get.

In the other case, we did expect an error but didn't get one, in that case it will read:
"Unexpected returned error. Got: nil, Want Err: true"
The first sentence reads as if we got an error, but in fact the error was nil.

It might help to split these cases up so that the text can be different in these 2 cases, what do you think?

mkjelland · 2018-07-16T20:15:29Z

lgtm, can you squash the commits?

@karan Would you mind taking another pass at this and adding the lgtm label if it looks good?

karan · 2018-07-16T20:16:08Z

Yes. I'll take a look

wangzhen127 · 2018-07-16T20:19:46Z

I will squash now. Github is not showing my 3rd commit yet. Will let you know once I squash, so that all changes are in place.

wangzhen127 · 2018-07-16T20:23:55Z

Just rebased and squshed. And it seems that github is functioning correctly now. PTAL @karan

karan · 2018-07-16T20:25:38Z

clusterctl/validation/validate_cluster_api_objects.go

+)
+
+func ValidateClusterAPIObjects(w io.Writer, clusterApiClient *clientset.Clientset, k8sClient kubernetes.Interface, clusterName string, namespace string) error {
+	fmt.Fprintf(w, "Validating cluster API objects in namespace %q\n", namespace)


nit: where Cluster API is used as a noun, should be title case.

karan · 2018-07-16T20:26:22Z

clusterctl/validation/validate_cluster_api_objects.go

+	if err != nil {
+		return err
+	}
+	if err := validateClusterObject(w, cluster); err != nil {


Seems like the cluster object you get above is not used anywhere else in this method. I recommend you getClusterObject in validateClusterObject itself.

Separating them makes testing much easier. So when testing validateClusterObject, I can just provide fake cluster objects. If I move getClusterObject into validateClusterObject, I will have to manipulate with the fake client, too.

To keep current tests, another way is to change to the following:

func ValidateClusterAPIObjects(...) { getAndValidateClusterObject(...) getAndValidateMachineObjects(...) } func getAndValidateClusterObject(...) { cluster := getClusterObject(...) validateClusterObject(cluster ...) } func getAndValidateMachineObjects(...) { machines := getMachineObjects(...) validateMachineObjects(machines ...) }

But I think the above is actually making current code more complex.

karan · 2018-07-16T20:26:45Z

clusterctl/validation/validate_cluster_api_objects.go

+		return fmt.Errorf("failed to get the machines from the apiserver in namespace %q: %v", namespace, err)
+	}
+
+	return validateMachineObjects(w, machines, k8sClient)


Same as above.

wangzhen127

Updated. PTAL

wangzhen127 · 2018-07-16T20:43:43Z

clusterctl/validation/validate_cluster_api_objects.go

+)
+
+func ValidateClusterAPIObjects(w io.Writer, clusterApiClient *clientset.Clientset, k8sClient kubernetes.Interface, clusterName string, namespace string) error {
+	fmt.Fprintf(w, "Validating cluster API objects in namespace %q\n", namespace)


wangzhen127 · 2018-07-16T20:54:53Z

clusterctl/validation/validate_cluster_api_objects.go

+	if err != nil {
+		return err
+	}
+	if err := validateClusterObject(w, cluster); err != nil {


Separating them makes testing much easier. So when testing validateClusterObject, I can just provide fake cluster objects. If I move getClusterObject into validateClusterObject, I will have to manipulate with the fake client, too.

To keep current tests, another way is to change to the following:

func ValidateClusterAPIObjects(...) { getAndValidateClusterObject(...) getAndValidateMachineObjects(...) } func getAndValidateClusterObject(...) { cluster := getClusterObject(...) validateClusterObject(cluster ...) } func getAndValidateMachineObjects(...) { machines := getMachineObjects(...) validateMachineObjects(machines ...) }

But I think the above is actually making current code more complex.

wangzhen127 · 2018-07-16T20:55:00Z

clusterctl/validation/validate_cluster_api_objects.go

+		return fmt.Errorf("failed to get the machines from the apiserver in namespace %q: %v", namespace, err)
+	}
+
+	return validateMachineObjects(w, machines, k8sClient)


wangzhen127 · 2018-07-20T18:04:56Z

Ping @karan :)

wangzhen127 · 2018-07-24T17:39:20Z

Thanks! Just rebased and sqaushed all commits.

karan · 2018-07-24T17:40:23Z

/lgtm

karan · 2018-07-24T18:20:59Z

/approve

karan · 2018-07-24T20:09:51Z

/approve

k8s-ci-robot · 2018-07-24T20:09:53Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: karan, wangzhen127

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [karan]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

…timeout Enable custom boot timeout, VNC, & packer flags

k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jun 6, 2018

k8s-ci-robot requested review from jessicaochen and karan June 6, 2018 22:05

spew reviewed Jun 7, 2018

View reviewed changes

wangzhen127 force-pushed the validate-cluster-api branch from 4aeb3ed to 60114fa Compare June 7, 2018 18:31

wangzhen127 commented Jun 7, 2018

View reviewed changes

karan reviewed Jun 7, 2018

View reviewed changes

wangzhen127 commented Jun 8, 2018

View reviewed changes

jessicaochen reviewed Jun 11, 2018

View reviewed changes

wangzhen127 commented Jun 18, 2018

View reviewed changes

wangzhen127 force-pushed the validate-cluster-api branch 2 times, most recently from a134530 to eb250f5 Compare June 20, 2018 22:52

wangzhen127 force-pushed the validate-cluster-api branch from eb250f5 to f424e85 Compare June 22, 2018 20:24

k8s-ci-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jun 22, 2018

spew reviewed Jun 23, 2018

View reviewed changes

k8s-ci-robot assigned spew Jun 24, 2018

wangzhen127 force-pushed the validate-cluster-api branch from f424e85 to 5ef8f64 Compare June 28, 2018 22:49

k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jun 28, 2018

wangzhen127 force-pushed the validate-cluster-api branch from 5ef8f64 to 470bd9b Compare June 29, 2018 17:57

wangzhen127 commented Jun 29, 2018

View reviewed changes

mkjelland suggested changes Jul 10, 2018

View reviewed changes

wangzhen127 commented Jul 11, 2018

View reviewed changes

wangzhen127 force-pushed the validate-cluster-api branch 3 times, most recently from 5f5cc57 to bdfde6a Compare July 13, 2018 21:31

mkjelland reviewed Jul 16, 2018

View reviewed changes

wangzhen127 force-pushed the validate-cluster-api branch from c6968cb to 735cee7 Compare July 16, 2018 20:22

karan suggested changes Jul 16, 2018

View reviewed changes

wangzhen127 commented Jul 16, 2018

View reviewed changes

wangzhen127 force-pushed the validate-cluster-api branch from e748852 to 1fef768 Compare July 19, 2018 01:09

karan approved these changes Jul 24, 2018

View reviewed changes

wangzhen127 force-pushed the validate-cluster-api branch from 1fef768 to cc879bc Compare July 24, 2018 17:37

Validate cluster and machine objects

4f7bdf7

wangzhen127 force-pushed the validate-cluster-api branch from cc879bc to 4f7bdf7 Compare July 24, 2018 17:38

k8s-ci-robot assigned karan Jul 24, 2018

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 24, 2018

k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 24, 2018

k8s-ci-robot merged commit ecc34cd into kubernetes-sigs:master Jul 24, 2018

wangzhen127 deleted the validate-cluster-api branch July 24, 2018 20:41

jayunit100 pushed a commit to jayunit100/cluster-api that referenced this pull request Jan 31, 2020

Merge pull request kubernetes-sigs#298 from akutz/feature/image-boot-…

2f34dee

…timeout Enable custom boot timeout, VNC, & packer flags


		// TODO(wangzhen127): Also validate the cluster in addition to the cluster API objects.

		glog.Infof("Pass. The cluster %q in namespace %q is validated.", cluster.Name, vco.Namespace)

Validate cluster and machine objects #298

Validate cluster and machine objects #298

Conversation

wangzhen127 commented Jun 6, 2018 • edited Loading

wangzhen127 commented Jun 6, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

spew left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

wangzhen127 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

karan left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

wangzhen127 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

wangzhen127 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

wangzhen127 commented Jun 20, 2018

wangzhen127 commented Jun 22, 2018

dims commented Jun 22, 2018

spew commented Jun 23, 2018

spew left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

spew Jun 23, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

spew Jun 23, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

spew commented Jun 24, 2018

wangzhen127 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

wangzhen127 commented Jun 6, 2018 •

edited

Loading

spew left a comment •

edited

Loading

spew Jun 23, 2018 •

edited

Loading

spew Jun 23, 2018 •

edited

Loading