Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split networking out from node initialization into its own package #16269

Merged
merged 2 commits into from
Sep 16, 2017

Conversation

smarterclayton
Copy link
Contributor

@smarterclayton smarterclayton commented Sep 10, 2017

Node and networking are now completely independent initialization paths.

Builds off of #16268, only second commit is new

@deads2k second step towards removing the need for node to be tied together

@openshift-ci-robot openshift-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Sep 10, 2017
@openshift-merge-robot openshift-merge-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 10, 2017
@smarterclayton smarterclayton force-pushed the snip_network branch 2 times, most recently from a460007 to d97345b Compare September 10, 2017 01:53
}

// RunDNS starts the DNS server as soon as services are loaded.
func (c *NetworkConfig) RunDNS() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we start DNS inside the master today?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DNS runs on nodes and masters. Nodes are the primary, masters are only for backcompat.

c.ProxyConfig.ConfigSyncPeriod.Duration,
)

if c.EnableUnidling {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are definitely follow-up questions about the unidler, but this probably isn't making it worse.

Side question, where does the idler fall in the update order with controllers. Will we have to have a version where we have a controller running on its own resource and events?

InternalKubeInformers kinternalinformers.SharedInformerFactory

// ProxyConfig is the configuration for the kube-proxy, fully initialized
ProxyConfig *componentconfig.KubeProxyConfiguration
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// TODO, this should wire through on options if at all possible.

@@ -549,7 +327,7 @@ func firstEndpointIPWithNamedPort(endpoints *kapi.Endpoints, port int, portName
}

// TODO: more generic location
func includesNamedEndpointPort(ports []kapi.EndpointPort, port int, portName string) bool {
func includesNamedEndpointPort(ports []kapiv1.EndpointPort, port int, portName string) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not wrong, but why bother switching?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Allows us to not load an internal client for kubelet (completely). So the kubelet code path has no internal client logic at all. Given the desire to split, reduces by one the list of dependencies in NodeConfig (which is net good)

// DockerClient is a client to connect to Docker
DockerClient dockertools.Interface
// KubeletServer contains the KubeletServer configuration
KubeletServer *kubeletoptions.KubeletServer
// KubeletDeps are the injected code dependencies for the kubelet, fully initialized
KubeletDeps *kubelet.KubeletDeps
// ProxyConfig is the configuration for the kube-proxy, fully initialized
ProxyConfig *componentconfig.KubeProxyConfiguration
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

happy to see this go.

return nil, err
}

func New(options configapi.NodeConfig, server *kubeletoptions.KubeletServer) (*NodeConfig, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eventually, this should wire through on options like the scheduler and the upstream controllers, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I suspect this method should be split into two parts as we get there - node config to kubelet and proxy Options, then further refine node config with other defaults. I think there are a few things that can't be teased apart yet without duplicating a lot of subtle backwards compatibility on options.


// Build creates the core Kubernetes component configs for a given NodeConfig, or returns
// an error
func Build(options configapi.NodeConfig) (*kubeletoptions.KubeletServer, *componentconfig.KubeProxyConfiguration, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to build these separately. Any reason that they have to be built together?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, there is data calculated for kubelet server that is an input to proxy config. I'd prefer not to separate that in this step (we can do that once kubelet config moves to beta).

return nil, nil, kerrors.NewAggregate(err)
}

proxyconfig, err := buildKubeProxyConfig(options)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't look like it.

@deads2k
Copy link
Contributor

deads2k commented Sep 11, 2017

minor questions/comments about direction, I'm glad to see this moving forward.

@deads2k
Copy link
Contributor

deads2k commented Sep 11, 2017

lgtm. I'll leave tagging to you if there's an order or concern about dcut.

@smarterclayton
Copy link
Contributor Author

smarterclayton commented Sep 11, 2017 via email

@smarterclayton
Copy link
Contributor Author

@openshift/sig-networking should also take a look.

return nil, nil, err
}

// Initialize SDN before building kubelet config so it can modify option
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that comment doesn't really make sense here any more


if network.IsOpenShiftNetworkPlugin(server.NetworkPluginName) {
// set defaults for openshift-sdn
server.HairpinMode = componentconfig.HairpinNone
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be dropped; it gets set in the IsOpenShiftNetworkPlugin() block below as well. (It looks like it was redundant in the old code too, we just hadn't noticed.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

// TODO: SDN plugin depends on the Kubelet registering as a Node and doesn't retry cleanly,
// and Kubelet also can't start the PodSync loop until the SDN plugin has loaded.
if components.Enabled(ComponentKubelet) != components.Enabled(ComponentPlugins) {
return fmt.Errorf("the SDN plugin must be run in the same process as the kubelet")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm... so this check is going away, not just being moved somewhere else, right?

I think that's correct... the proxy filtering code doesn't depend on any of the node code these days. We should get QE to add a test for this case though. (The existing polarion test case OCP-12167 "The endpoints in EgressNetworkPolicy denying cidrSelector will be ignored", but with the node configured to run separate node and proxy processes.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I double checked locally and can't find any crossover.

@deads2k
Copy link
Contributor

deads2k commented Sep 15, 2017

Going to wait till next sprint.

We've branched.

/assign danwinship

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 15, 2017
First refactoring towards separating out the code that launches
networking from the code that launches the kubelet. Ensures that node
options (the part that converts nodeconfig to kubelet config) is
completely separate from how processes are started.
Node and networking are now completely independent initialization paths.
@smarterclayton
Copy link
Contributor Author

Updated

@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 15, 2017
@danwinship
Copy link
Contributor

/lgtm
Filed https://trello.com/c/bTEOBhpx to get the new multitenant+separate node/proxy test written

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Sep 15, 2017
@openshift-merge-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danwinship, smarterclayton

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@smarterclayton
Copy link
Contributor Author

/retest

termination message log not found

@openshift-merge-robot
Copy link
Contributor

/test all [submit-queue is verifying that this PR is safe to merge]

@openshift-ci-robot
Copy link

openshift-ci-robot commented Sep 16, 2017

@smarterclayton: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
ci/openshift-jenkins/cmd 9fc4265 link /test cmd

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-merge-robot
Copy link
Contributor

Automatic merge from submit-queue (batch tested with PRs 16269, 13282, 16386)

@openshift-merge-robot openshift-merge-robot merged commit 0a33b4c into openshift:master Sep 16, 2017
openshift-merge-robot added a commit that referenced this pull request Sep 22, 2017
Automatic merge from submit-queue (batch tested with PRs 16480, 16486, 16270, 16128, 16489)

Direct exec() the kubelet instead of launching in proc

If only the kubelet is launched by the node process, execve(2) instead of launching in process. Requires some corrections to the upstream flags to support round tripping.  Support `OPENSHIFT_ALLOW_UNSUPPORTED_KUBELET=<path>` to allow a kubelet binary that is not exactly equivalent (symlink or hardlink) to the current file.  If the kubelet binary cannot be found, print a warning and continue with the in-proc flow (so we don't break older users without the kubelet symlink).

To start:

```
$ openshift start node --config=... --enable=kubelet --loglevel=3
<will log, then exec kubelet>
... kubelet logs
```

Networking can be run separately with:

```
$ openshift start network --config=...
```

Did a quick sanity test against this, didn't hit any obvious issues.

Builds off #16269
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants