Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kubectl debug: support debugging nodes #92310

Merged
merged 1 commit into from
Jul 13, 2020

Conversation

verb
Copy link
Contributor

@verb verb commented Jun 19, 2020

What type of PR is this?
/kind feature
/sig cli
/priority important-longterm

What this PR does / why we need it:

Which issue(s) this PR fixes:
Fixes #89306

Special notes for your reviewer: This PR includes commits from #90094 and should not merge prior to it. To see the changes introduced by this change alone, use the individual commits view.

Does this PR introduce a user-facing change?:

`kubectl alpha debug` now supports debugging nodes by creating a debugging container running in the node's host namespaces.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

- [KEP]: https://git.k8s.io/enhancements/keps/sig-cli/20190805-kubectl-debug.md#node-troubleshooting-with-privileged-containers

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/feature Categorizes issue or PR as related to a new feature. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. sig/cli Categorizes an issue or PR as relevant to SIG CLI. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jun 19, 2020
names[ref.Name] = ref
}
for i := range pod.Spec.InitContainers {
ref := &pod.Spec.Containers[i]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a typo here ?
In worst case, there may be index out of bounds exception.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @tedyu, this PR is based on #90094, and I think your comment is for the underlying commit. Could you follow up in #90094?

@soltysh soltysh self-assigned this Jun 22, 2020
Copy link
Contributor

@soltysh soltysh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few nits

The action taken by 'debug' varies depending on what resource is specified. Supported
actions include:

* Pod: Create a copy of an existing pod with certain attributes changed, for example
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd call it workload, since it's not a problem to get a pod from a deployment/daemon set/job, etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense, and it might be nice to support automatically resolving simple workloads to pods. Updated.

@@ -221,6 +241,11 @@ func (o *DebugOptions) Validate(cmd *cobra.Command) error {
return fmt.Errorf("invalid image pull policy: %s", o.PullPolicy)
}

// Target
if len(o.Target) > 0 && len(o.CopyTo) > 0 {
return fmt.Errorf("--target incompatible with --copy-to. Use --share-processes instead.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is incompatible

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

if err != nil {
return nil, "", err
}
klog.V(2).Infof("node debugPod create returned pod: %#v", newPod)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Debug message, I'd remove.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

TTY: o.TTY,
VolumeMounts: []corev1.VolumeMount{
{
MountPath: "/node",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: aybe just /host here, and host below

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM. I've updated this and the other messages to use "node" only in the context of the API object and "host" in the context of namespaces and filesystem. Now that you point it out, I realize this is more in line with how these terms are used in the rest of the API.

@verb verb force-pushed the cli-debug-node+aylei branch 2 times, most recently from d60f19b to c1086c6 Compare June 24, 2020 12:43
When called with a node target, `kubectl debug` will create a run-once
pod in the target node's namespaces.
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jul 9, 2020
@verb
Copy link
Contributor Author

verb commented Jul 9, 2020

/retest

@verb verb changed the title WIP: kubectl debug: support debugging nodes kubectl debug: support debugging nodes Jul 9, 2020
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 9, 2020
Copy link
Contributor

@soltysh soltysh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve
/milestone v1.19

@k8s-ci-robot k8s-ci-robot added this to the v1.19 milestone Jul 9, 2020
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 9, 2020
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: soltysh, verb

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 9, 2020
@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@verb
Copy link
Contributor Author

verb commented Jul 10, 2020

/retest

@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

2 similar comments
@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@verb
Copy link
Contributor Author

verb commented Jul 11, 2020

/retest

@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

1 similar comment
@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@joelsmith
Copy link
Contributor

/retest

1 similar comment
@joelsmith
Copy link
Contributor

/retest

@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

2 similar comments
@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@k8s-ci-robot
Copy link
Contributor

k8s-ci-robot commented Jul 12, 2020

@verb: The following test failed, say /retest to rerun all failed tests:

Test name Commit Details Rerun command
pull-kubernetes-e2e-kind-ipv6 c1086c6c5174ef2ee8c183407bc1f54932a5956e link /test pull-kubernetes-e2e-kind-ipv6

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubectl cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/cli Categorizes an issue or PR as relevant to SIG CLI. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support debugging nodes with kubectl debug
6 participants