KEP 2551 - kubectl exit code normalization #2574

rikatz · 2021-03-16T14:48:59Z

xref Issue: #2551

apelisse · 2021-04-07T17:44:17Z

Note that some exit codes are not actual errors. You should definitely account for that.

rikatz · 2021-04-07T19:28:19Z

> Note that some exit codes are not actual errors. You should definitely account for that.

True!! We discussed this in today's sig-cli meeting, and I'll try to reduce the scope and number of error codes, for example:

* An error code that represents a problem on the client side
* A small number of error codes that represent apiserver errors (although @soltysh pointed out that forbidden and not found errors may not be well differentiated in some places, such as kubectl get, so maybe we have the 'forbidden' code for write operations)
* Codes over 128 to represent the exit code of something forked (see https://www.gnu.org/software/bash/manual/html_node/Exit-Status.html). For example, we can return 129 (128 + 1) if a plugin called from kubectl exits with 1 (and this may apply to diff as well!)

sftim · 2021-04-08T11:42:27Z

keps/sig-cli/2551-return-code-normalization/README.md

+
+### Goals
+
+* Document possible exit codes for kubectl 


Will we also cover expectations for plugins and their exit codes? Maybe that's a non-goal at this stage.

Hey Tim,

IMO we should not normalize plugin exit codes, but plugins fall into the 'external programs' category, for which we can say that the exit code is 128 + the plugin exit code (the same goes for kubectl diff, for example).

Now I'm wondering what happens if a plugin exits with a code of 128 or more, which gives 128 (from kubectl) + 128 (from the plugin) and no longer fits in the supported range.

The design should handle this case, for example: if the sum is > 255, return 255 or some other code inside the supported range.
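To make the overflow case concrete, here is a minimal sketch of the 128 + N mapping with a clamp at 255. The function and constant names are hypothetical and do not come from the KEP or from kubectl, and treating a negative code as 255 is an assumption of this sketch.

```go
package main

import "fmt"

const (
	externalOffset = 128 // offset discussed in this thread for external programs
	maxExitCode    = 255 // upper bound of the supported exit code range
)

// externalExitCode maps the exit code of a plugin (or another external
// program) to the code kubectl would return under the 128 + N idea.
// Hypothetical helper, not part of kubectl.
func externalExitCode(child int) int {
	if child == 0 {
		return 0 // success stays success
	}
	code := externalOffset + child
	if child < 0 || code > maxExitCode {
		return maxExitCode // clamp instead of wrapping around
	}
	return code
}

func main() {
	for _, child := range []int{0, 1, 127, 128, 200} {
		fmt.Printf("plugin exit %d -> kubectl exit %d\n", child, externalExitCode(child))
	}
}
```

One consequence of the clamp is that every plugin code of 127 or more collapses into 255, so the original code is no longer recoverable from the kubectl exit code alone.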

We can recommend but that's not a hard requirement.


soltysh · 2021-05-11T12:35:08Z

keps/sig-cli/2551-return-code-normalization/README.md

+
+### Goals
+
+* Document possible exit codes for kubectl 


We can recommend but that's not a hard requirement.


soltysh · 2021-05-11T12:37:39Z

keps/sig-cli/2551-return-code-normalization/README.md

+The majority of commands already are organized as the following:
+* Run Complete to complete missing information with defaults. This runs on the client side
+* Run Validation to check command syntax and missing arguments. This runs on the client side
+* Run the command itself. This might run on the client side, be dry-run, run an external command or call the APIServer.


This might actually change during the separation of cobra from the Options struct, so it would be nice to describe it in a more generic way.

Example: validation and execution.

@deejross will take a look at this

soltysh · 2021-05-11T12:38:22Z

keps/sig-cli/2551-return-code-normalization/README.md

+kubectl exec and run uses the pod exit code as its own exit code, we should figure out how we 
+should deal with this
+https://github.com/kubernetes/kubernetes/blob/v1.22.0-alpha.0/test/e2e/kubectl/kubectl.go#L496
+https://github.com/kubernetes/kubernetes/blob/v1.22.0-alpha.0/staging/src/k8s.io/kubectl/pkg/cmd/util/helpers.go#L178-L179


Yeah, that's a big one: how can we properly handle that?

Should this follow the 128+exit code rule proposed for plugins?

One other option is to start kubectl error codes at 201. This would allow exec to continue returning the pod exit code, assuming it's less than 201. If the pod exit code is 201 or greater, then set it to 255. Maybe we should apply the same logic to external commands as well, such as diff to keep it consistent. This would allow those commands to return their codes unaltered (if less than 201), and make kubectl generated error codes distinct to reduce confusion around the origin of an error. Thoughts on this approach?
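A minimal sketch of this pass-through rule, assuming kubectl's own codes start at 201 as proposed in the comment above. The names are hypothetical, and mapping negative codes to 255 is an extra assumption made here.

```go
package main

import "fmt"

const (
	kubectlCodeStart = 201 // first code reserved for kubectl's own errors (proposed above)
	overflowCode     = 255 // used when a pod or external code collides with that range
)

// passThroughExitCode returns a pod or external command exit code unaltered
// when it is below the kubectl range, and overflowCode otherwise.
// Hypothetical helper, not part of kubectl.
func passThroughExitCode(code int) int {
	if code < 0 || code >= kubectlCodeStart {
		return overflowCode
	}
	return code
}

func main() {
	for _, code := range []int{0, 42, 200, 201, 230} {
		fmt.Printf("pod exit %d -> kubectl exit %d\n", code, passThroughExitCode(code))
	}
}
```

Compared with the 128 + N rule, codes below 201 keep their original value, at the cost of 255 being ambiguous between a real exit code of 255 and a collision with the kubectl range.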

@soltysh wdyt about this pattern of using RC > 200? I was discussing with Ross here that this may be a good approach, but it may not be the one sig-cli wants.

How about having kubectl exec take a parameter --use-exit-code that defaults to true, and that you can force to false if you want success to mean “we were able to exec something”?
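As a rough illustration of that suggestion, the sketch below uses the standard library `flag` package as a stand-in for kubectl's real flag handling. The `--use-exit-code` flag does not exist in kubectl today, and `execExitCode` is a hypothetical helper.

```go
package main

import (
	"flag"
	"fmt"
)

// execExitCode decides what a hypothetical exec wrapper returns once the
// remote command has finished with remoteCode.
func execExitCode(remoteCode int, useExitCode bool) int {
	if useExitCode {
		return remoteCode // propagate the remote command's exit code
	}
	return 0 // success only means "we were able to exec something"
}

func main() {
	fs := flag.NewFlagSet("exec-sketch", flag.ContinueOnError)
	// Defaults to true, as suggested in the comment above.
	useExitCode := fs.Bool("use-exit-code", true, "propagate the remote command's exit code")
	if err := fs.Parse([]string{"--use-exit-code=false"}); err != nil {
		fmt.Println("flag error:", err)
		return
	}
	fmt.Println(execExitCode(3, *useExitCode))
}
```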

soltysh · 2021-05-11T12:39:25Z

keps/sig-cli/2551-return-code-normalization/README.md

+### Creating new error parser functions
+
+Another design solution is to create helper functions for each steps:
+* When running Complete, Validate or other client side steps, call cmdutil.CheckClientErr(err) and exits with some well defined client error code, mapped to ErrorCodeClient


As mentioned before, I'd describe this in a slightly more generic way, since this implementation might change.

@deejross will look into this

k8s-triage-robot · 2021-08-09T12:43:23Z

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue or PR as fresh with /remove-lifecycle stale
Mark this issue or PR as rotten with /lifecycle rotten
Close this issue or PR with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

rikatz · 2021-08-09T12:50:10Z

/remove-lifecycle stale
I will still work on this :D
/lifecycle active

k8s-triage-robot · 2021-11-14T17:11:17Z

/lifecycle stale

eddiezane · 2021-11-22T22:06:25Z

/remove-lifecycle stale

soltysh

A couple of nits, but mostly this looks good
/hold
for you to get PRR file added and nits addressed
/lgtm
/approve

soltysh · 2022-01-28T15:15:44Z

keps/sig-cli/2551-return-code-normalization/README.md

+for the developers to warn when a new deployment fails because of the lack of some permission, so those permissions can be 
+updated for the pipeline to work correctly.
+
+The developers are making a lot of changes, and they keeps asking for Bruce to look for every pipeline execution, even those that 


Suggested change

The developers are making a lot of changes, and they keeps asking for Bruce to look for every pipeline execution, even those that

The developers are making a lot of changes, and they keep asking for Bruce to look for every pipeline execution, even those that

soltysh · 2022-01-28T15:16:02Z

keps/sig-cli/2551-return-code-normalization/README.md

+wants to warn users when the apply command fails because of differences between the manifests.
+
+#### Story 2
+Bruce Wayne, the security administrator of the Gotham Inc Company is following the development of a new product. Bruce asked


Suggested change

Bruce Wayne, the security administrator of the Gotham Inc Company is following the development of a new product. Bruce asked

Bruce Wayne, the security administrator of the Gotham Inc company is following the development of a new product. Bruce asked

soltysh · 2022-01-28T15:18:02Z

keps/sig-cli/2551-return-code-normalization/README.md

+
+| Code  | Description                                                                                           |
+| ----- | ----------------------------------------------------------------------------------------------------- |
+| 1-200 | Reserved for exit codes from exec and external commands                                               |


I was wondering whether we need to reserve all 200 codes, or whether just the first 100 would be sufficient. But that's something we'll need to figure out empirically as we go.

Agreed. I discussed this with Ross: maybe the reservation could be sparser, or use a narrower range, rather than reserving all 200 codes.

Let's "close" this list before going to beta, for alpha let's leave it open.

soltysh · 2022-01-28T15:19:33Z

keps/sig-cli/2551-return-code-normalization/README.md

+| 203   | Client configuration error, invalid or missing configuration                                          |
+| 204   | Network failure, API could not be reached                                                             |
+| 205   | Authentication failure, identity could not be determined                                              |
+| 206   | Authorization failure, identity was determined, but does not have access to requested resource(s)     |


This and not found are hard to distinguish, because for security reasons the server returns 404 Not Found in both cases.

We can leave this as an UNRESOLVED tag so we are stating this is not solved yet

Exactly what I said above :)

soltysh · 2022-01-28T15:23:26Z

keps/sig-cli/2551-return-code-normalization/README.md

+
+### Creating new error parser functions
+
+Another design solution is to create helper functions for each steps:


I was thinking about this earlier this week. I think we could implement a set of new error codes inside kubectl that each command could use, and then have CheckErr return the new error codes when that env variable of yours is set, and just -1 in the backwards-compatible way, as it does today. That is close to what you're describing here.

Sorry @soltysh, is there any action needed on this? I've been a bit slow these last few days :)

No action required, this is more like a thought for you and @deejross to consider when implementing 😉

soltysh · 2022-01-28T15:24:42Z

keps/sig-cli/2551-return-code-normalization/README.md

+  CRI or CNI may require updating that component before the kubelet.
+-->
+
+## Production Readiness Review Questionnaire


You're missing a file in https://github.com/kubernetes/enhancements/blob/master/keps/prod-readiness/, and make sure to update the template to the current one.

rikatz · 2022-01-28T21:25:02Z

PRR added
/assign @johnbelamaric

johnbelamaric

Minor tweaks needed to PRR but looks good in general.


johnbelamaric · 2022-01-31T22:27:49Z

keps/sig-cli/2551-return-code-normalization/kep.yaml

+# The milestone at which this feature was, or is targeted to be, at each stage.
+milestone:
+  alpha: "v1.24"
+


you're missing some fields from the latest template, please update

johnbelamaric · 2022-01-31T22:28:44Z

keps/sig-cli/2551-return-code-normalization/README.md

+
+## Production Readiness Review Questionnaire
+
+### Feature Enablement and Rollback


I think you have an older version of the questions, please review the latest template and include any new questions as well. Thanks.

apelisse · 2022-02-01T00:34:15Z

keps/sig-cli/2551-return-code-normalization/kep.yaml

+owning-sig: sig-cli
+status: implementable
+creation-date: 2021-03-16
+reviewers:


I took a look, thanks!

rikatz · 2022-02-01T01:30:14Z

@johnbelamaric fixed based on your comments, thanks :)

soltysh

Some minor nits, but this is good, thx!
/lgtm
/approve

soltysh · 2022-02-01T17:05:39Z

keps/sig-cli/2551-return-code-normalization/README.md

+
+Items marked with (R) are required *prior to targeting to a milestone / release*.
+
+- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)


Nit: some of these checkboxes need to be checked.

soltysh · 2022-02-01T17:07:52Z

keps/sig-cli/2551-return-code-normalization/README.md

+
+| Code  | Description                                                                                           |
+| ----- | ----------------------------------------------------------------------------------------------------- |
+| 1-200 | Reserved for exit codes from exec and external commands                                               |


Let's "close" this list before going to beta, for alpha let's leave it open.

soltysh · 2022-02-01T17:08:09Z

keps/sig-cli/2551-return-code-normalization/README.md

+| 203   | Client configuration error, invalid or missing configuration                                          |
+| 204   | Network failure, API could not be reached                                                             |
+| 205   | Authentication failure, identity could not be determined                                              |
+| 206   | Authorization failure, identity was determined, but does not have access to requested resource(s)     |


Exactly what I said above :)

soltysh · 2022-02-01T17:08:59Z

keps/sig-cli/2551-return-code-normalization/README.md

+
+### Creating new error parser functions
+
+Another design solution is to create helper functions for each steps:


No action required, this is more like a thought for you and @deejross to consider when implementing 😉

johnbelamaric · 2022-02-01T20:07:10Z

/approve

k8s-ci-robot · 2022-02-01T20:07:35Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: johnbelamaric, rikatz, soltysh

Needs approval from an approver in each of these files:

~~keps/prod-readiness/OWNERS~~ [johnbelamaric]
~~keps/sig-cli/OWNERS~~ [johnbelamaric,soltysh]


soltysh · 2022-02-02T15:33:03Z

/hold cancel

k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Mar 16, 2021

k8s-ci-robot requested review from pwittrock and seans3 March 16, 2021 14:49

k8s-ci-robot added sig/cli Categorizes an issue or PR as relevant to SIG CLI. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Mar 16, 2021

soltysh self-assigned this Mar 16, 2021

rikatz mentioned this pull request Apr 7, 2021

Wrong error code thrown when list of a certain resource is empty kubernetes/kubectl#847

Open

sftim reviewed Apr 8, 2021


github-actions bot mentioned this pull request Apr 21, 2021

Week Ending April 14, 2021 dev-obs/actus#357

Open

soltysh reviewed May 11, 2021


k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 9, 2021

k8s-ci-robot added lifecycle/active Indicates that an issue or PR is actively being worked on by a contributor. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Aug 9, 2021

eddiezane mentioned this pull request Sep 16, 2021

failed label selector should not return code 0 kubernetes/kubectl#1115

Closed

k8s-ci-robot added lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. and removed lifecycle/active Indicates that an issue or PR is actively being worked on by a contributor. labels Nov 14, 2021

eddiezane linked an issue Nov 22, 2021 that may be closed by this pull request

kubectl return code normalization #2551

Open

4 tasks

k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 22, 2021

eddiezane mentioned this pull request Nov 22, 2021

[kubectl] copy non existent source in container returns zero return code kubernetes/kubernetes#106312

Closed

rikatz and others added 5 commits January 13, 2022 14:29

Initial KEP 2551 commit

8747442

Improve design details

d9d10a8

Add some more details, unresolved things, etc

1d73042

Add a note that impl needs to check max error code

12ef830

Add proposed exit codes

b5399ae

eddiezane mentioned this pull request Jan 18, 2022

kubectl diff exits with code 0 when files don't exist kubernetes/kubectl#1087

Closed

soltysh approved these changes Jan 28, 2022


k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. lgtm "Looks good to me", indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jan 28, 2022

PRR questionaire and kep review

6de8947

k8s-ci-robot assigned johnbelamaric Jan 28, 2022

Remove beta PRR

71cbf14

johnbelamaric reviewed Jan 31, 2022


apelisse reviewed Feb 1, 2022


Fix PRR review comments

d51c5c2

rikatz force-pushed the kubectl-err-normalization branch from bacba4f to d51c5c2 Compare February 1, 2022 01:31

soltysh approved these changes Feb 1, 2022


k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 1, 2022

k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 1, 2022

k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 2, 2022

k8s-ci-robot merged commit 9df0131 into kubernetes:master Feb 2, 2022

k8s-ci-robot added this to the v1.24 milestone Feb 2, 2022

lauchokyip mentioned this pull request Feb 2, 2022

add an option for kubectl diff to exit 0 when differences are found kubernetes/kubectl#1173

Closed

ardaguclu mentioned this pull request Feb 24, 2022

Enable set commands can pipe through apply even when fails kubernetes/kubernetes#106706

Closed

ardaguclu mentioned this pull request Mar 7, 2022

"kubectl set" fails to output yaml sections that don't match. kubernetes/kubernetes#106617

Closed

