Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validate cluster and machine objects #298

Merged

Conversation

wangzhen127
Copy link
Contributor

@wangzhen127 wangzhen127 commented Jun 6, 2018

What this PR does / why we need it:
This PR makes clusterctl validate cluster command validate the cluster and machines object. This is part of #168.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

Special notes for your reviewer:

Release note:

Add "kubectl validate cluster" that validates cluster and machine objects.

@kubernetes/kube-deploy-reviewers

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jun 6, 2018
@wangzhen127
Copy link
Contributor Author

/cc @jessicaochen @karan

}

func validateMachineObjects(machines *v1alpha1.MachineList) error {
for _, machine := range machines.Items {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than report the first machine that has something wrong with it, it may be better if an error was returned / printed for each non-valid machine.

Not entirely sure how this command is intended to be used, but if I wanted to use it to 'fix' my cluster, as a user I will need to run this, correct what is wrong with the first machine, run it again, etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Updated.

Copy link
Contributor

@spew spew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jessicaochen and @karan have spent more time thinking about this so please wait for their reviews as well, but here are some thoughts i have.

glog.Infof("Pass. The cluster %q in namespace %q is validated.", cluster.Name, vco.Namespace)
}

func validateClusterAPIObjects() *v1alpha1.Cluster {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why does this function return a cluster? That doesn't seem like a result type I would expect from a validate function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly because it needs to print cluster name at the end. And since it allows omitting cluster name, it needs to find the cluster first. I moved getting cluster out of the function now. So it currently only returns error.

clientcmd.NewDefaultClientConfigLoadingRules(),
&clientcmd.ConfigOverrides{}).ClientConfig()
if err != nil {
glog.Fatalf("Failed to create client for talking to the apiserver: %v", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommend that this function adds a second return value of 'error' and the calling function, calls glog.Fatal(...) like this:

cluster, error := validateClusterAPIObjects()
if err != nil {
    glog.Fatal(err)
}

It's generally undesirable to have calls to exit an application littered in various places because it becomes more complicated to reason about application behavior as there are more possible ways to exit. It's harder to standardize the error messages that are displayed in these exit scenarios as well as there is not a single location where the final error message is printed and exit is called. In addition, it is more difficult to unit test code that can call exit.

And like I said in a previous comment, ideally we would keep the single glog.Exit(...) call in the main Run function defined on the command to match all other commands in clusterctl. That will be easier for future developers to understand.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Updated

if err := RunValidateCluster(); err != nil {
glog.Exit(err)
}
RunValidateCluster()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think i would keep the single glog.Exit (...) call to match the other commands in the application and make the application simpler to understand for contributing developers. If we are to change that, we should change it wholesale in each command.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed back now.

Copy link
Contributor Author

@wangzhen127 wangzhen127 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the PR with fix.

if err := RunValidateCluster(); err != nil {
glog.Exit(err)
}
RunValidateCluster()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed back now.

glog.Infof("Pass. The cluster %q in namespace %q is validated.", cluster.Name, vco.Namespace)
}

func validateClusterAPIObjects() *v1alpha1.Cluster {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly because it needs to print cluster name at the end. And since it allows omitting cluster name, it needs to find the cluster first. I moved getting cluster out of the function now. So it currently only returns error.

clientcmd.NewDefaultClientConfigLoadingRules(),
&clientcmd.ConfigOverrides{}).ClientConfig()
if err != nil {
glog.Fatalf("Failed to create client for talking to the apiserver: %v", err)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Updated

}

func validateMachineObjects(machines *v1alpha1.MachineList) error {
for _, machine := range machines.Items {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Updated.

Copy link
Contributor

@karan karan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass

return fmt.Errorf("Failed to create client for talking to the apiserver: %v", err)
}

client, err := clientset.NewForConfig(config)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

}

cluster, err := getClusterObject(client, vco.ClusterName, vco.Namespace)
if err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cuddle this and everywhere else

if err := foo(); err != nil {

}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this place, we need |cluster| outside of the if scope. I have cuddled other places where only err is returned.


cluster, err := getClusterObject(client, vco.ClusterName, vco.Namespace)
if err != nil {
return err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any useful info you can log here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error is already formated in |getClusterObject|, so we do not need to add more info here.

}

err = validateClusterAPIObjects(client, cluster)
if err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cuddle, and same comment about logging.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


// TODO(wangzhen127): Also validate the cluster in addition to the cluster API objects.

glog.Infof("Pass. The cluster %q in namespace %q is validated.", cluster.Name, vco.Namespace)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear what "Pass" means. I like the UI of unit tests where you can see each test that ran and the status of each one. We should shoot for something similar.

Checking foo bar... PASS
Checking bar foo... FAIL
  - Failure log: reason for failure
Checking baz baz... PASS

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about the unit test style. Ideally, I wanted to do the following:

// Print "Checking foo bar...", but not newline
err := RunFooBar()
// Print "PASS or FAIL" on the same line as "Checking foo bar..." and then new line

But glog seems to always print newline. Do you know how to work this around? The reason of separating them in 2 prints is because RunFooBar() may take a long time and it is better to tell the user what it is doing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use stdout instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor Author

@wangzhen127 wangzhen127 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated. PTAL

}

cluster, err := getClusterObject(client, vco.ClusterName, vco.Namespace)
if err != nil {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this place, we need |cluster| outside of the if scope. I have cuddled other places where only err is returned.


cluster, err := getClusterObject(client, vco.ClusterName, vco.Namespace)
if err != nil {
return err
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error is already formated in |getClusterObject|, so we do not need to add more info here.

}

err = validateClusterAPIObjects(client, cluster)
if err != nil {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


// TODO(wangzhen127): Also validate the cluster in addition to the cluster API objects.

glog.Infof("Pass. The cluster %q in namespace %q is validated.", cluster.Name, vco.Namespace)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about the unit test style. Ideally, I wanted to do the following:

// Print "Checking foo bar...", but not newline
err := RunFooBar()
// Print "PASS or FAIL" on the same line as "Checking foo bar..." and then new line

But glog seems to always print newline. Do you know how to work this around? The reason of separating them in 2 prints is because RunFooBar() may take a long time and it is better to tell the user what it is doing.

return fmt.Errorf("Failed to create client for talking to the apiserver: %v", err)
}

client, err := clientset.NewForConfig(config)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

return nil
}

func validateClusterAPIObjects(client *clientset.Clientset, cluster *v1alpha1.Cluster) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not going to join the review since you seem to have enough reviewers. (Do let me know if you want me to formally review).

My 2cents: Please break the actual validation leg work into a different package. cmd.go files should only contain the limited stuff to do with how to input via command line. Ie. flags etc

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+100

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now put into clusterctl/validation.

Should we put them in pkg/validation or staying in clusterctl/ is fine. This depends on whether other code wants to reuse this.

Copy link
Contributor Author

@wangzhen127 wangzhen127 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated. PTAL

return nil
}

func validateClusterAPIObjects(client *clientset.Clientset, cluster *v1alpha1.Cluster) error {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now put into clusterctl/validation.

Should we put them in pkg/validation or staying in clusterctl/ is fine. This depends on whether other code wants to reuse this.


// TODO(wangzhen127): Also validate the cluster in addition to the cluster API objects.

glog.Infof("Pass. The cluster %q in namespace %q is validated.", cluster.Name, vco.Namespace)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@wangzhen127 wangzhen127 force-pushed the validate-cluster-api branch 2 times, most recently from a134530 to eb250f5 Compare June 20, 2018 22:52
@wangzhen127
Copy link
Contributor Author

I rebased and squashed the commits to remove conflicts. PTAL.

@wangzhen127
Copy link
Contributor Author

@spew , since @karan is very busy recently, can I ask you as the main reviewer for this PR?

@dims
Copy link
Member

dims commented Jun 22, 2018

/ok-to-test

@k8s-ci-robot k8s-ci-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jun 22, 2018
@spew
Copy link
Contributor

spew commented Jun 23, 2018

I am taking over this review -- please hold until I am able to review.

Copy link
Contributor

@spew spew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is my first pass -- I avoided comments about the UX (the actual output of the command) as it is difficult to imagine without seeing the output in some form. I have some comments in the review about adding tests that verify the output of the command, by adding those tests, and their associated sample output, it will be obvious what the output of the command is and the impact of various print statements, etc. So once those tests are added I may have some comments about the UX with regards to command output.

I also avoided focussing too much on simplifying the code inside of validate_cluster.go and validate_cluster_api_objects.go because I want to wait until some other changes are made as it may impact those files.

Please add release notes -- this PR contains user impacting features and changes.


machines, err := client.ClusterV1alpha1().Machines(namespace).List(meta_v1.ListOptions{})
if err != nil {
return fmt.Errorf("Failed to get the machines from the apiserver in namespace %q: %v", namespace, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

go style: error strings should be lowercase so that they can be easily chained together. Please fix by making all error strings in this PR lowercase.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

kubeconfig = util.GetDefaultKubeConfigPath()
}
}
client, err := clientcmd.NewClusterApiClientForDefaultSearchPath(kubeconfig)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NewClusterApiClientForDefaultSearchPath(...) does all of the work that lines 66 -> 71 do automatically, i.e. loading from $KUBECONFIG and looking in the $HOME folder for kubeconfig(s) (and handling extra cases). Please remove and just pass vco.Kubeconfig to NewClusterApiClientForDefaultSearchPath(...). The empty string is expected and handled by that method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

validateClusterCmd.Flags().StringVarP(
&vco.Namespace, "namespace", "n", v1.NamespaceDefault,
"If present, validate the cluster in the specified namespace.")
validateClusterCmd.Flags().StringVarP(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These description strings are better than what is present in kubectl, however, it may be better to provide a consistent experience which means we should push back improvmenets to client-go. Please see comments from @roberthbailey about this on my PR here: https://github.com/kubernetes-sigs/cluster-api/pull/384/files

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK

return fmt.Errorf("Failed to create client for talking to the apiserver: %v", err)
}

if err = validation.ValidateClusterAPIObjects(client, vco.ClusterName, vco.Namespace); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a PR coming for clusterctl delete cluster where I add the same name and namespace flags. After that PR it will be possible to get a kubernetes & cluster api client that is scoped to the appropraite namespace and clustername so it will mean you don't need to pass them around everywhere and the complexity of this code can be reduced. That PR is likely to merge first so please integrate with it: https://github.com/kubernetes-sigs/cluster-api/pull/384/files

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The client object is not scoped to the provided namespace and cluster name only, right? I think we still need to pass in the cluster name and namespace.

glog.Exit(err)
os.Stdout.Sync()
fmt.Fprintf(os.Stderr, "ERROR: %v\n", err)
os.Exit(1)
Copy link
Contributor

@spew spew Jun 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These three lines should be reverted and we should leave the consistent "glog.Exit" across all commands (I think there is one inconsistency right now but we should not make it even more inconsistent). If there is an issue with that approach it should be addressed in its own PR and for all commands in clusterctl.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fmt print functions are used in validate_cluster_api_objects.go instead of glog. The reason is that we want to use unittest style of output, like the following.

Checking machine object "gce-master-test1"... PASS
Checking machine object "gce-node-4k648"... PASS

In the code, we do:

  1. Print Checking machine object "XYZ"...
  2. Actually do the check
  3. Print PASS if actually pass, otherwise FAIL on the same line.

So in step 1, we need to print some message without newline character after .... glog cannot do that (always adding a newline character). So I have to use fmt print. To be consistent, I used fmt print here, too (glog has its own formatting).

Let me know if you are aware of any other good ways to work around.

@@ -16,7 +16,11 @@ limitations under the License.

package testutil

import "sigs.k8s.io/cluster-api/pkg/apis/cluster/v1alpha1"
import (
Copy link
Contributor

@spew spew Jun 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

util packages, especially one as general purpose as "testutil" are an anti pattern -- and inevitably becoming a dumping ground for tons of stuff they shouldn't have. These methods are not yet being used by any other tests / code so there isn't a reason to add them to a util package. These are not things that belong in a package named "testutil" as they are specific to tests needing cluster api fixtures and not every tests. It's also extremely unlikely that the methods can be flexible enough to cover all use cases of all tests that would potentially need a cluster with error fixture or a machine with error fixture so they are out of place in a util package.

Instead, it's better to just put these methods in validate_cluster_api_objects_test.go where they are being used. Please move them there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -0,0 +1,242 @@
/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests in this file do a good job of verifying the error cases, however, validate_cluster_api_objects.go has a side effect which is a lot of string output. We should have tests that verify this output and catch when changes are made to the output. The 'fmt' library has methods which take an io.Writer. We should change validate_cluster_api_objects to take an io.Writer.

The CLI command can pass in os.Stdout and potentially os.Stderr as well if we want to support writing to a regular output and an error output. The tests would pass in a bytes.Buffer which can be used in place of the io.Writer interface: https://golang.org/pkg/bytes/#Buffer.Write. The buffer can be converted to string and then the output can be compared against what is expected. The expected output should probably be checked into golden files in the testdata folder so that we can reason about the user experience of this command / file.

Please add tests that verify the fmt output of at least two scenarios:

  1. Success (no errors / FAILs)
  2. Errors (at least 2 errors / FAILS in the output).

)

func TestGetClusterObject(t *testing.T) {
testClusterName := "test-cluster"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function has three groupings of test cases, No cluster, one cluster, and two clusters. Tests should be for a single case or a single table driven set of cases all in the same general area testing the same method. It will be extremely difficult for a future developer to understand and make changes to these tests.

Please refactor this into three different test methods.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

if err = validateMachineObjects(machines); err != nil {
return err
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please create an TODO issue and link it here for the missing verification of MachineDeployments and MachineSets.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need the verification of those two? Would verifying machines be enough?

Or put it in another way, what are the cases where machines are all ok, but MachineDeployments or MachineSets have errors?

if numOfClusters := len(clusters.Items); numOfClusters == 0 {
return nil, fmt.Errorf("Fail: No cluster exists in namespace %q.", namespace)
} else if numOfClusters > 1 {
return nil, fmt.Errorf("Fail: There are more than one cluster in namespace %q. Please specify --cluster-name.", namespace)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

grammar issue, please change: "There are more" -> "There is more"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@spew
Copy link
Contributor

spew commented Jun 24, 2018

/assign @spew

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jun 28, 2018
Copy link
Contributor Author

@wangzhen127 wangzhen127 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review. Updated the PR. Also adds release notes and test data for output tests. PTAL

glog.Exit(err)
os.Stdout.Sync()
fmt.Fprintf(os.Stderr, "ERROR: %v\n", err)
os.Exit(1)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fmt print functions are used in validate_cluster_api_objects.go instead of glog. The reason is that we want to use unittest style of output, like the following.

Checking machine object "gce-master-test1"... PASS
Checking machine object "gce-node-4k648"... PASS

In the code, we do:

  1. Print Checking machine object "XYZ"...
  2. Actually do the check
  3. Print PASS if actually pass, otherwise FAIL on the same line.

So in step 1, we need to print some message without newline character after .... glog cannot do that (always adding a newline character). So I have to use fmt print. To be consistent, I used fmt print here, too (glog has its own formatting).

Let me know if you are aware of any other good ways to work around.

kubeconfig = util.GetDefaultKubeConfigPath()
}
}
client, err := clientcmd.NewClusterApiClientForDefaultSearchPath(kubeconfig)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

return err
}

// TODO(wangzhen127): Also validate the cluster in addition to the cluster API objects.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the link to github issue now, though I saw other places are not following this rule.

return fmt.Errorf("Failed to create client for talking to the apiserver: %v", err)
}

if err = validation.ValidateClusterAPIObjects(client, vco.ClusterName, vco.Namespace); err != nil {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The client object is not scoped to the provided namespace and cluster name only, right? I think we still need to pass in the cluster name and namespace.

}
},
}

func init() {
validateClusterCmd.Flags().StringVarP(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

func validateMachineObjects(machines *v1alpha1.MachineList) error {
pass := true
for _, machine := range machines.Items {
fmt.Printf("Checking machine object %q... ", machine.Name)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

fmt.Printf("PASS\n")
}
if !pass {
return fmt.Errorf("Failed to validate machine objects.")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to "failed to validate some machine object." Just like to start with "fail". :)

for _, machine := range machines.Items {
fmt.Printf("Checking machine object %q... ", machine.Name)
if machine.Status.ErrorReason != nil || machine.Status.ErrorMessage != nil {
var reason common.MachineStatusError = ""
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ErrorReason has its own type MachineStatusError. Doing this means that we need type conversion. Is it ok to always assume ErrorReason being a string under the hood?

)

func TestGetClusterObject(t *testing.T) {
testClusterName := "test-cluster"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -16,7 +16,11 @@ limitations under the License.

package testutil

import "sigs.k8s.io/cluster-api/pkg/apis/cluster/v1alpha1"
import (
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

}
},
}

func init() {
validateClusterCmd.Flags().StringVarP(
&vco.Kubeconfig, "kubeconfig", "", "",
"The file path of the kubeconfig file for Cluster API management stack. If not specified, $KUBECONFIG environment variable or ${HOME}/.kube/config is used.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer if this said "The file path of the kubeconfig file for the cluster to validate." The kubeconfig doesn't actually specify anything about the cluster api stack directly, instead it will give access to the core api server, which knows about the cluster api server.

if err != nil {
return fmt.Errorf("failed to get the machines from the apiserver in namespace %q: %v", namespace, err)
}
if err = validateMachineObjects(w, machines); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can reduce the number of lines here by directly returning validateMachineObjects(w, machines).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


machines, err := client.ClusterV1alpha1().Machines(namespace).List(meta_v1.ListOptions{})
if err != nil {
return fmt.Errorf("failed to get the machines from the apiserver in namespace %q: %v", namespace, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of these errors and wrapped before returning, and some are not, it there a reason for that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's because |getClusterObject| and |validateClusterObject| are also functions inside this file. They wrap their errors by themselves already to provide the context. Line 40 is some function provided by other pkg, so I want to wrap it and provide some context here.

if cluster.Status.ErrorReason != "" || cluster.Status.ErrorMessage != "" {
fmt.Fprintf(w,"FAIL\n")
fmt.Fprintf(w,"\t[%v]: %s\n", cluster.Status.ErrorReason, cluster.Status.ErrorMessage)
return fmt.Errorf("failed to validate the cluster %q.", cluster.Name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would make more sense to say "cluster failed validation". To me, saying "failed to validate the cluster" implies that the validation code had an error, but in this case, the cluster itself was invalid.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

fmt.Fprintf(w,"\t[%v]: %s\n", reason, message)
return false
}
if machine.Status.NodeRef == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have plans later to check that the nodeRef points to an existing node, and that the node is in the ready state?

Copy link
Contributor Author

@wangzhen127 wangzhen127 Jul 11, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added now.

t.Run(testcase.name, func(t *testing.T) {
cluster, err := getClusterObject(client, testcase.clusterName, testcase.namespace)
if (testcase.expectErr && err == nil) || (!testcase.expectErr && err != nil) {
t.Fatalf("Unexpected returned error. Got: %v, Want Err: %v", err, testcase.expectErr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since expectErr is a boolean, this will read:
"Unexpected returned error. Got: error message, Want Err: true"
Is that the intended behavior? It seems like it would be more useful to print the error message that you expected

Copy link
Contributor Author

@wangzhen127 wangzhen127 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated. PTAL.


machines, err := client.ClusterV1alpha1().Machines(namespace).List(meta_v1.ListOptions{})
if err != nil {
return fmt.Errorf("failed to get the machines from the apiserver in namespace %q: %v", namespace, err)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's because |getClusterObject| and |validateClusterObject| are also functions inside this file. They wrap their errors by themselves already to provide the context. Line 40 is some function provided by other pkg, so I want to wrap it and provide some context here.

if err != nil {
return fmt.Errorf("failed to get the machines from the apiserver in namespace %q: %v", namespace, err)
}
if err = validateMachineObjects(w, machines); err != nil {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

if cluster.Status.ErrorReason != "" || cluster.Status.ErrorMessage != "" {
fmt.Fprintf(w,"FAIL\n")
fmt.Fprintf(w,"\t[%v]: %s\n", cluster.Status.ErrorReason, cluster.Status.ErrorMessage)
return fmt.Errorf("failed to validate the cluster %q.", cluster.Name)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

fmt.Fprintf(w,"\t[%v]: %s\n", reason, message)
return false
}
if machine.Status.NodeRef == nil {
Copy link
Contributor Author

@wangzhen127 wangzhen127 Jul 11, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added now.

@wangzhen127 wangzhen127 force-pushed the validate-cluster-api branch 3 times, most recently from 5f5cc57 to bdfde6a Compare July 13, 2018 21:31
var b bytes.Buffer
err := validateMachineObjects(&b, &machines, k8sClient)
if (testcase.expectErr && err == nil) || (!testcase.expectErr && err != nil) {
t.Fatalf("Unexpected returned error. Got: %v, Want Err: %v", err, testcase.expectErr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The text that gets printed from this line is a bit unclear to me.

In one case, we didn't expect an error but we got one, in that case, this will be printed:
"Unexpected returned error. Got: error message, Want Err: false"
It reads as if the "false" at the end should be an error that we DID expect to get.

In the other case, we did expect an error but didn't get one, in that case it will read:
"Unexpected returned error. Got: nil, Want Err: true"
The first sentence reads as if we got an error, but in fact the error was nil.

It might help to split these cases up so that the text can be different in these 2 cases, what do you think?

@mkjelland
Copy link
Contributor

lgtm, can you squash the commits?

@karan Would you mind taking another pass at this and adding the lgtm label if it looks good?

@karan
Copy link
Contributor

karan commented Jul 16, 2018

Yes. I'll take a look

@wangzhen127
Copy link
Contributor Author

I will squash now. Github is not showing my 3rd commit yet. Will let you know once I squash, so that all changes are in place.

@wangzhen127
Copy link
Contributor Author

Just rebased and squshed. And it seems that github is functioning correctly now. PTAL @karan

)

func ValidateClusterAPIObjects(w io.Writer, clusterApiClient *clientset.Clientset, k8sClient kubernetes.Interface, clusterName string, namespace string) error {
fmt.Fprintf(w, "Validating cluster API objects in namespace %q\n", namespace)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: where Cluster API is used as a noun, should be title case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

if err != nil {
return err
}
if err := validateClusterObject(w, cluster); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the cluster object you get above is not used anywhere else in this method. I recommend you getClusterObject in validateClusterObject itself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separating them makes testing much easier. So when testing validateClusterObject, I can just provide fake cluster objects. If I move getClusterObject into validateClusterObject, I will have to manipulate with the fake client, too.

To keep current tests, another way is to change to the following:

func ValidateClusterAPIObjects(...) {
    getAndValidateClusterObject(...)
    getAndValidateMachineObjects(...)
}
func getAndValidateClusterObject(...) {
    cluster := getClusterObject(...)
    validateClusterObject(cluster ...)
}
func getAndValidateMachineObjects(...) {
    machines := getMachineObjects(...)
    validateMachineObjects(machines ...)
}

But I think the above is actually making current code more complex.

return fmt.Errorf("failed to get the machines from the apiserver in namespace %q: %v", namespace, err)
}

return validateMachineObjects(w, machines, k8sClient)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Contributor Author

@wangzhen127 wangzhen127 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated. PTAL

)

func ValidateClusterAPIObjects(w io.Writer, clusterApiClient *clientset.Clientset, k8sClient kubernetes.Interface, clusterName string, namespace string) error {
fmt.Fprintf(w, "Validating cluster API objects in namespace %q\n", namespace)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

if err != nil {
return err
}
if err := validateClusterObject(w, cluster); err != nil {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separating them makes testing much easier. So when testing validateClusterObject, I can just provide fake cluster objects. If I move getClusterObject into validateClusterObject, I will have to manipulate with the fake client, too.

To keep current tests, another way is to change to the following:

func ValidateClusterAPIObjects(...) {
    getAndValidateClusterObject(...)
    getAndValidateMachineObjects(...)
}
func getAndValidateClusterObject(...) {
    cluster := getClusterObject(...)
    validateClusterObject(cluster ...)
}
func getAndValidateMachineObjects(...) {
    machines := getMachineObjects(...)
    validateMachineObjects(machines ...)
}

But I think the above is actually making current code more complex.

return fmt.Errorf("failed to get the machines from the apiserver in namespace %q: %v", namespace, err)
}

return validateMachineObjects(w, machines, k8sClient)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@wangzhen127
Copy link
Contributor Author

Ping @karan :)

@wangzhen127
Copy link
Contributor Author

Thanks! Just rebased and sqaushed all commits.

@karan
Copy link
Contributor

karan commented Jul 24, 2018

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 24, 2018
@karan
Copy link
Contributor

karan commented Jul 24, 2018

/approve

1 similar comment
@karan
Copy link
Contributor

karan commented Jul 24, 2018

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: karan, wangzhen127

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 24, 2018
@k8s-ci-robot k8s-ci-robot merged commit ecc34cd into kubernetes-sigs:master Jul 24, 2018
@wangzhen127 wangzhen127 deleted the validate-cluster-api branch July 24, 2018 20:41
jayunit100 pushed a commit to jayunit100/cluster-api that referenced this pull request Jan 31, 2020
…timeout

Enable custom boot timeout, VNC, & packer flags
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants