Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to configure kubeadm's ignored pre-flight errors via InitConfiguration and JoinConfiguration #75499

Merged
merged 2 commits into from
May 24, 2019
Merged

Add ability to configure kubeadm's ignored pre-flight errors via InitConfiguration and JoinConfiguration #75499

merged 2 commits into from
May 24, 2019

Conversation

marccarre
Copy link
Contributor

@marccarre marccarre commented Mar 20, 2019

What type of PR is this?
/kind feature
/sig cluster-lifecycle

What this PR does / why we need it:
This adds a more declarative way to configure ignored pre-flight errors -- via InitConfiguration and JoinConfiguration, in addition to the "legacy" CLI flag -- and therefore simplifies programmatic usage of kubeadm: just one configuration file to generate & to pass to kubeadm.

Which issue(s) this PR fixes:
Fixes #74246.
Fixes kubernetes/kubeadm/issues/1460.

Does this PR introduce a user-facing change?:

kubeadm's ignored pre-flight errors can now be configured via InitConfiguration and JoinConfiguration.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Mar 20, 2019
@k8s-ci-robot
Copy link
Contributor

Hi @marccarre. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. area/kubeadm labels Mar 20, 2019
@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 20, 2019
Copy link
Member

@neolit123 neolit123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the PR @marccarre
i think, this is pretty much what we want to do, except that it has to happen in v1beta2.

/assign @rosti @fabriziopandini
/hold

cmd/kubeadm/app/apis/kubeadm/v1beta1/types.go Outdated Show resolved Hide resolved
cmd/kubeadm/app/cmd/init.go Outdated Show resolved Hide resolved
@neolit123
Copy link
Member

@marccarre
please squash the commits into one.
/priority important-longterm

@k8s-ci-robot k8s-ci-robot added priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. and removed needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Mar 20, 2019
Copy link
Contributor

@rosti rosti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution @marccarre !
The thing is that we should not modify the existing v1beta1 config. We have to do that modification in the new v1beta2 config if we reach a consensus on that.
Also, my proposition is to have that field in the NodeRegistrationOptions only.

Let's hold this for now, until the time comes for it.
/hold

@marccarre
Copy link
Contributor Author

@rosti, @neolit123, thanks for your feedback ✨
I've addressed some of it (and will squash at the very end once everyone is happy about this PR), but now face build issues as per #75499 (comment). Apologies for being new to kubeadm/kubernetes' development, but any pointer on how to resolve this would be very much appreciated.

Copy link
Contributor

@rosti rosti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating this @marccarre !
However, we need to hold on until the new config (probably v1beta2) becomes available. More developments on this question are to come probably next week.
We need to add the new field in that config and the internal one (not in v1beta1 or v1alpha3).
I'll ping you for a rebase and more guidance once we are ready for this.
It's just that we need to do a few more things before we can proceed with this.

cmd/kubeadm/app/apis/kubeadm/types.go Outdated Show resolved Hide resolved
@marccarre
Copy link
Contributor Author

we need to hold on until the new config (probably v1beta2) becomes available

Sure, no worries. I'll wait then 🙂

I haven't dug into the code generation logic (yet), but I'm guessing some of the build issues I'm seeing will change/disappear once v1beta2 is added.

@rosti
Copy link
Contributor

rosti commented Mar 21, 2019

I haven't dug into the code generation logic (yet), but I'm guessing some of the build issues I'm seeing will change/disappear once v1beta2 is added.

Mostly yes.

@rosti
Copy link
Contributor

rosti commented Apr 30, 2019

@marccarre, with the merger of #76710 you are now able to continue working on this based on v1beta2.
In addition to that we need to verify the new field and make sure it cannot be set to "all" and therefore ignore all pre-flight errors from the config easily.
Thanks for taking this on!

@kad
Copy link
Member

kad commented May 14, 2019

I thought about this PR, and I think the better way to implement that will be to modify validation.ValidateIgnorePreflightErrors such, that it will take two arrays of strings as arguments. One from command line, one from config objects. Then, merging into one set can be done inside validation.ValidateIgnorePreflightErrors by iterating both of the arrays, practically getting rid of helper function union. Also it would be good to validate that array from config object will not contain 'all', as config object should not be used as "hide everything" workarround, instead of explicitly specifying what should be ignored.

@marccarre
Copy link
Contributor Author

@rosti, @fabriziopandini, @neolit123, @kad, I've just rebased this PR on top of latest master, added the necessary changes for v1beta2 and addressed the feedback given so far (thank you all! 🙇). PTAL 🙂

@kad
Copy link
Member

kad commented May 15, 2019

Please squash commits. One for implementation/code changes part, one commit for auto-generated stuff.

Copy link
Contributor

@rosti rosti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update @marccarre !
So far it looks great!
However, we need to deny the ability to specify "all" in the config file (but still permit it in the command line). Once that is done we can merge it.

cmd/kubeadm/app/apis/kubeadm/v1beta2/types.go Show resolved Hide resolved
cmd/kubeadm/app/apis/kubeadm/validation/validation.go Outdated Show resolved Hide resolved
cmd/kubeadm/app/cmd/init.go Show resolved Hide resolved
cmd/kubeadm/app/cmd/init_test.go Outdated Show resolved Hide resolved
@rosti
Copy link
Contributor

rosti commented May 15, 2019

/ok-to-test

@k8s-ci-robot k8s-ci-robot added the ok-to-test Indicates a non-member PR verified by an org member that is safe to test. label May 15, 2019
@k8s-ci-robot k8s-ci-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label May 15, 2019
@rosti
Copy link
Contributor

rosti commented May 15, 2019

I thought about this PR, and I think the better way to implement that will be to modify validation.ValidateIgnorePreflightErrors such, that it will take two arrays of strings as arguments. One from command line, one from config objects. Then, merging into one set can be done inside validation.ValidateIgnorePreflightErrors by iterating both of the arrays, practically getting rid of helper function union. Also it would be good to validate that array from config object will not contain 'all', as config object should not be used as "hide everything" workarround, instead of explicitly specifying what should be ignored.

That's one way of doing it.
I would like to keep things simple and scope of functionality clear, hence ValidateIgnorePreflightErrors should still perform only validation of errors, but with a bool param stating the origin of the ignored errors. This is where "all" from a config file should be checked.
Then a wrapper func, that takes a couple of sets (one from command line, another from the config) would call ValidateIgnorePreflightErrors twice. If the "all" keyword is present in the command line errors, it's returned as a set with a single element. Otherwise, the sets need to be combined.

@kad
Copy link
Member

kad commented May 15, 2019

Then a wrapper func, that takes a couple of sets (one from command line, another from the config) would call ValidateIgnorePreflightErrors twice. If the "all" keyword is present in the command line errors, it's returned as a set with a single element. Otherwise, the sets need to be combined.

This will require third iteration over the combined set with check, for mixture or all plus individual check, like it is currently performed in validation.ValidateIgnorePreflightErrors.

@marccarre
Copy link
Contributor Author

marccarre commented May 16, 2019

I hope to have addressed both your requirements.
PTAL, @rosti, @kad, and thanks again for your feedback! ✨
(Previous push had some tests failing, if there still are some after this push, I'll then look into this.)

Copy link
Member

@kad kad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small nits

cmd/kubeadm/app/apis/kubeadm/validation/validation.go Outdated Show resolved Hide resolved
return nil, err
}
// Also set the union of pre-flight errors to InitConfiguration, to provide a consistent view of the runtime configuration:
cfg.NodeRegistration.IgnorePreflightErrors = ignorePreflightErrorsSet.List()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also nit: do we have usage of this somewhere? Otherwise we are loosing prestine state as we read from config file, isn't it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is currently not used anywhere, but given the configuration object is passed around, and this field could eventually be used elsewhere as a result, I thought doing the above might reduce the chance of a bug/inconsistent behaviour. I can remove it if you'd rather keep the configuration file's initial state. Let me know.

Copy link
Contributor

@rosti rosti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @marccarre !
A couple of nits and we are there!

cmd/kubeadm/app/apis/kubeadm/validation/validation.go Outdated Show resolved Hide resolved
@rosti
Copy link
Contributor

rosti commented May 16, 2019

Ok, the fuzzer test is failing, because we don't have the IgnorePreflightErrors in v1beta1. To fix this, we must reintroduce the fuzzNodeRegistration in the fuzzer test and plug the IgnorePreflightErrors field.
#75261 is the PR that got it deleted. You can check with it how to reintroduce it.
The new version of it should roughly look like this:

func fuzzNodeRegistration(obj *kubeadm.NodeRegistrationOptions, c fuzz.Continue) {	
	c.FuzzNoCustom(obj)	

 	// Pinning values for fields that get defaults if fuzz value is empty string or nil (thus making the round trip test fail)	
	obj.IgnorePreflightErrors = nil
}

Also, the reset test needs fixing (missing parameter in the call to ValidateIgnorePreflightErrors).

Copy link
Contributor

@rosti rosti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @marccarre !
I am happy how things turned out. Can you squash your commits as a final step so we can merge it?
/approve
/hold
For the squash.

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: marccarre, rosti

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 22, 2019
Specifically, IgnorePreflightErrors in {Init,Join}Configuration's NodeRegistrationOptions can be used to achieve this.
See also: https://docs.google.com/document/d/1XnP67oO1i9VcDIpw42IzptnJsc5OQM-HTf8cVcjCR2w/edit
This commit contains only changes generated by the build process.
Nothing here was manually changed.

Changes made to:
```
cmd/kubeadm/app/apis/kubeadm/validation/BUILD
cmd/kubeadm/app/cmd/BUILD
```
were generated by running:
````
./hack/update-bazel.sh
```
@marccarre
Copy link
Contributor Author

Done: PR rebased on top of latest master and commits squashed.
Thanks again for the feedback! 👍

@rosti
Copy link
Contributor

rosti commented May 23, 2019

@kad can you do a final pass and lgtm if you are happy with this?
/assign @kad
/hold cancel
/test pull-kubernetes-e2e-gce

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 23, 2019
@kad
Copy link
Member

kad commented May 23, 2019

/lgtm
I have some doubts about some corner cases like in order of operations in upgrade steps (e.g. validating flags before/after trying to load cluster config), but I think it is not a big deal at the moment.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 23, 2019
@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

1 similar comment
@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@k8s-ci-robot k8s-ci-robot merged commit 73b8011 into kubernetes:master May 24, 2019
@marccarre marccarre deleted the issues/74246-more-decl-kubeadm-cli-args branch May 24, 2019 05:45
odinuge added a commit to odinuge/kubernetes that referenced this pull request Jun 1, 2019
This fixes possible problems when kubeadm upgrade can't load the
InitConfig properly. Some new code introduced in
kubernetes#75499 is placed between
the loading of the config and the error handling, hiding possible
errors.

This error cannot be ignored (as is the case now), since the cfg ptr.
returned from the configutil function will be nil in the case of an
error.

Signed-off-by: Odin Ugedal <odin@ugedal.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubeadm cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
7 participants