Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support copying "options" in resolv.conf into pod sandbox when dnsPolicy is Default #54773

Merged
merged 1 commit into from
Nov 9, 2017
Merged

Conversation

phsiao
Copy link
Contributor

@phsiao phsiao commented Oct 30, 2017

What this PR does / why we need it:

This PR adds support for copying "options" from host's /etc/resolv.conf (or --resolv-conf) into pod's resolv.conf when dnsPolicy is Default. Being able to customize options is important because it is common to leverage options to fine-tune the behavior of DNS client.

Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes #42542

Special notes for your reviewer:

I originally wanted to also tackle the issue of copying options for when dnsPolicy is ClusterFirst, but with ability to "merge" with default options (ndots:5 more specifically) when it makes sense. I decided to leave it off for now because the "merging" may need more discussions. Happy to add that to this PR or create another PR for that if it makes sense and is clear what should be done. I think even when dnsPolicy is ClusterFirst it is important to allow customization.

Release note:

kubelet: add support for copying "options" from /etc/resolv.conf (or --resolv-conf if it is used) into pod's /etc/resolv.conf when dnsPolicy is Default.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Oct 30, 2017
@feiskyer
Copy link
Member

cc @kubernetes/sig-network-misc, any comments of this change?

@k8s-ci-robot k8s-ci-robot added the sig/network Categorizes an issue or PR as relevant to SIG Network. label Oct 30, 2017
@dims
Copy link
Member

dims commented Oct 30, 2017

/ok-to-test

@k8s-ci-robot k8s-ci-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Oct 30, 2017
@roffe
Copy link

roffe commented Oct 31, 2017

Just thinking loud here: Might this not potentially enable people to break their system in completely new ways ( thinking of musl for example )

Woudln't it make sense to have some checks of the resolv.conf being imported so it does not have to many search domains & servers etc

@roffe
Copy link

roffe commented Oct 31, 2017

scratch that comment. saw your using checkLimitsForResolvConf

I just have something nagging in the back of my head saying that i read somewhere that the limits glibc imposes are not the same for musl

@phsiao
Copy link
Contributor Author

phsiao commented Nov 1, 2017

ping @kubernetes/sig-network-misc

@k8s-ci-robot
Copy link
Contributor

@phsiao: Reiterating the mentions to trigger a notification:
@kubernetes/sig-network-misc

In response to this:

ping @kubernetes/sig-network-misc

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@thockin
Copy link
Member

thockin commented Nov 2, 2017

kubernetes/community#1276 seems like a better all around answer.

I'm wary of inheriting that stuff because, frankly, inheriting the host's DNS was probably a mistake. The config for the node is pretty unrelated to what the config should be for a pod.

Are you simply trying to set timeout?

@thockin thockin assigned bowei, thockin and MrHohn and unassigned feiskyer and yifan-gu Nov 2, 2017
@phsiao
Copy link
Contributor Author

phsiao commented Nov 2, 2017

For our use case, the host's DNS is more relevant than kube-dns, and it performs much better.

We don't use the kube-dns as pod's resolver, and we delegate cluster domain to kube-dns pods. Pods and VMs/baremetal servers are all first class systems in our environment, and our pods resolve non-k8s names more than they resolve k8s names today.

We actually prefer to use dnsPolicy: Default, but we can't inherit anything from options so none of the policy is ideal. We do need some tweaking to options to leverage our multiple resolvers without causing extended disruption during resolver maintenance.

I think having dnsPolicy: Default inheriting options does not violate what you described. It is not a default behavior, and the least surprise would be to get the /etc/resolv.conf file in its entirety but being able to copy over options is probably just as good.

@phsiao
Copy link
Contributor Author

phsiao commented Nov 2, 2017

I did look at kubernetes/community#1276 and I believe it is trying to address a different problem.

@thockin
Copy link
Member

thockin commented Nov 2, 2017

kubernetes/community#1276 should cover this use case. We should close this until we solve that larger design problem.

Alternatively, I mightbe OK to inherit options but only in Default policy, not ClusterFirst. What do you think?

@phsiao phsiao changed the title Support copying "options" in resolv.conf into pod sandbox Support copying "options" in resolv.conf into pod sandbox when dnsPolicy is Default Nov 2, 2017
@phsiao
Copy link
Contributor Author

phsiao commented Nov 2, 2017

Alternatively, I mightbe OK to inherit options but only in Default policy, not ClusterFirst. What do you think?

That is exactly what this PR is, the title was a little misleading, but the initial comment stated that.

@thockin
Copy link
Member

thockin commented Nov 2, 2017

Sorry, I skimmed too hard.

@bowei can you review or assign? I guess I am OK with inheriting options in Default policy. Unless you can think of reasons not to?

@bowei
Copy link
Member

bowei commented Nov 2, 2017

taking a look

if err != nil {
return nil, nil, false, err
return nil, nil, hostOptions, false, err
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return nil, nil, nil, false, er

@@ -429,7 +429,7 @@ func (kl *Kubelet) GenerateRunContainerOptions(pod *v1.Pod, container *v1.Contai
}
}

opts.DNS, opts.DNSSearch, useClusterFirstPolicy, err = kl.GetClusterDNS(pod)
opts.DNS, opts.DNSSearch, _, useClusterFirstPolicy, err = kl.GetClusterDNS(pod)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like resolv.conf options are not piped into pod yet. Are we planing to do that in follow-up PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe only the podSandboxConfig.DnsConfig returned by generatePodSandboxConfig() in pkg/kubelet/kuberuntime/kuberuntime_sandbox.go is used to generate resolv.conf.

This portion of GetClusterDNS() does not appear to be affecting the generation of resolv.conf, so the options were ignored. This portion of settings are used to set parameters for launching a container, and not in the container sandbox.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, looked around and confirmed this is correct.

@phsiao
Copy link
Contributor Author

phsiao commented Nov 3, 2017

/test pull-kubernetes-bazel-build
/test pull-kubernetes-unit

Copy link
Member

@MrHohn MrHohn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

Thanks, don't have more comments, seems to be a straightforward change.

@@ -429,7 +429,7 @@ func (kl *Kubelet) GenerateRunContainerOptions(pod *v1.Pod, container *v1.Contai
}
}

opts.DNS, opts.DNSSearch, useClusterFirstPolicy, err = kl.GetClusterDNS(pod)
opts.DNS, opts.DNSSearch, _, useClusterFirstPolicy, err = kl.GetClusterDNS(pod)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, looked around and confirmed this is correct.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 3, 2017
@k8s-github-robot k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 3, 2017
@phsiao
Copy link
Contributor Author

phsiao commented Nov 3, 2017

Saw test failures appear to be flakes, and squash the commits.

@MrHohn
Copy link
Member

MrHohn commented Nov 3, 2017

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 3, 2017
@phsiao
Copy link
Contributor Author

phsiao commented Nov 4, 2017

/assign @thockin

@phsiao
Copy link
Contributor Author

phsiao commented Nov 6, 2017

ping @thockin for approval.

k8s-github-robot pushed a commit that referenced this pull request Nov 7, 2017
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Clean up redundant DNS related codes

**What this PR does / why we need it**:
As #54773 (comment) described, resolv.conf setup for pod is handled by `generatePodSandboxConfig()`, though we have some redundant DNS related codes in `GenerateRunContainerOptions()` which seems to have no effect.

This PR cleans up the ineffective codes and rearranges the cluster DNS unit test and hopefully it would be less confusing.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #55201

**Special notes for your reviewer**:
cc @Random-Liu @phsiao 

**Release note**:

```release-note
NONE
```
@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 7, 2017
@k8s-github-robot k8s-github-robot removed lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Nov 7, 2017
@phsiao
Copy link
Contributor Author

phsiao commented Nov 7, 2017

/retest

@phsiao
Copy link
Contributor Author

phsiao commented Nov 7, 2017

@MrHohn please review when you have a chance. It is rebased to account for the changes in #55201.

@MrHohn
Copy link
Member

MrHohn commented Nov 7, 2017

@phsiao Thanks, this still LGTM.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 7, 2017
@thockin
Copy link
Member

thockin commented Nov 9, 2017

/approve

@k8s-github-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: MrHohn, phsiao, thockin

Associated issue: 42542

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 9, 2017
@k8s-github-robot
Copy link

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot
Copy link

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/network Categorizes an issue or PR as relevant to SIG Network. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[kubelet] ignore keyword "options" define in /etc/resolv.conf, only look for nameserver and search
10 participants