
Use the network luke #1508

Merged: 15 commits into kubernetes-sigs:master from use-the-network-luke, Apr 25, 2020

Conversation

@BenTheElder (Member) commented Apr 24, 2020

implement #148

NOTE: breaking change for docker backend, requires new node images, switches nodes from the default bridge to a kind network. Also causes clusters to start back up on reboot...
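for reference, once a cluster is up you can see the new bridge with standard docker commands (the network name "kind" is the one this PR creates):

# the "kind" bridge should now exist alongside docker's default networks
docker network ls

# shows the network config and which node containers are attached
docker network inspect kind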

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Apr 24, 2020
@k8s-ci-robot (Contributor):

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: BenTheElder

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 24, 2020
@BenTheElder (Member, Author):

this almost definitely:

  • doesn't fully work with ipv6 single stack (clusters should still come up, but host reboot / dockerd restart won't work, which is the status quo)
  • may entirely break podman ipv6 single stack (I don't think we'd tested this yet anyhow, cc @amwat)

on the docker node provider with ipv4 nodes this should work great.
it also properly resolves using the host DNS (#284) and probably fixes #1461 (need to confirm on a setup that has the issue)

both should be resolvable in a future iteration.
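for context on the DNS point: containers attached to a user-defined docker network get docker's embedded DNS resolver, which forwards to the host's configured resolvers. a quick way to see it (a sketch, assuming the usual kind-control-plane node container name):

# on a user-defined network docker rewrites resolv.conf to its embedded DNS
docker exec kind-control-plane cat /etc/resolv.conf
# expect a line like: nameserver 127.0.0.11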

@@ -19,6 +19,8 @@ package config

import (
	"bytes"
	"fmt"
	"net"
@BenTheElder (Member, Author):

this whole commit can be dropped but I'm not fighting with git at 2am and it's not going to show up in the git history anyhow.


# fixup IPs in manifests ...
# TODO: dual-stack LOL :this-is-fine:
curr_ip=$(hostname --ip-address)
@BenTheElder (Member, Author):

we'll want to swap this out with something else. this was sufficient for bootstrapping, but we'll need to change it to look at /kind/old-ip and get an IP of the same family.

not going to be too hard, but it wasn't relevant to the bigger problems. bash wrangling for ipv6 single-stack can be a follow-up, but should happen prior to releasing any official kind updates.
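a minimal sketch of that follow-up, assuming the old IP is stored at /kind/old-ip as described above (the family-matching logic here is illustrative, not this PR's code):

old_ip=$(cat /kind/old-ip)
if [[ "${old_ip}" == *:* ]]; then
  # old IP was IPv6: pick one of our IPv6 addresses
  curr_ip=$(hostname --all-ip-addresses | tr ' ' '\n' | grep ':' | head -n1)
else
  # old IP was IPv4: pick one of our IPv4 addresses
  curr_ip=$(hostname --all-ip-addresses | tr ' ' '\n' | grep -v ':' | head -n1)
fi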

// network because the default bridge is actually special versus any other
// docker network and lacks the embedded DNS
//
// for now this also makes it easier for apps to join the same network
@BenTheElder (Member, Author):

i.e. you can do docker run --network=kind --name=metallb-test-cluster-sidecar ... when doing fun test stuff.
Having a predictable default network makes that saner.

We could do one network per cluster, but not only is that more complex, it would also cause IP allocation issues more readily when you have lots of small clusters (which is much more the usage pattern than one cluster with a large number of nodes).

In the future this can be configurable, with this as the default, but it's roughly equivalent to the current state and keeps the PR focused on the necessary changes.
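as a concrete example of that sidecar pattern (the image and container name here are illustrative):

# a helper container on the same bridge as the nodes; reachable from the
# nodes by its container name thanks to docker's embedded DNS
docker run -d --network=kind --name=test-sidecar nginx:alpine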

Contributor:

also, this will modify the current behaviour where multiple clusters share the same network.
IIRC istio was using it

@BenTheElder (Member, Author):

hehe surprise surprise we did actually break ipv6
/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 24, 2020
@BenTheElder (Member, Author):

will fix ipv6 in the morning 😴

@neolit123 (Member) left a comment:

LGTM,
nice PR title!

// the network already exists if the lookup output matched its name
if string(out) == name+"\n" {
	return nil
}
return exec.Command("docker", "network", "create", "-d=bridge", name).Run()
Contributor:
TODO:

if ipv6 {
	return exec.Command("docker", "network", "create", "--ipv6", "-d=bridge", name).Run()
}

Contributor:

can we add a comment?

@BenTheElder (Member, Author):

I'll probably actually fix it 😅

@BenTheElder (Member, Author):

ohhhh ugh, if we want to support simultaneous ipv4 and ipv6 we're going to have to either:

  • always enable ipv6 on the kind bridge
  • require that users ensure the bridge exists w/ ipv6 before kind
  • require that users create an ipv6 cluster first before any ipv4
  • use a special network for ipv6 clusters

requiring that the ipv6 one is created first seems sanest?
that or having a kind-ipv6 network ...

always enabling ipv6 won't work because docker OOTB will fail with:

Error response from daemon: could not find an available, non-overlapping IPv6 address pool among the defaults to assign to the network

due to ipv6 not being configured on the daemon.
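concretely, the failure and the workaround look like this (the subnet value is the one used later in this PR):

# fails out of the box: the daemon has no default IPv6 address pools
docker network create --ipv6 -d=bridge kind

# works: supply an explicit subnet at creation time
docker network create --ipv6 -d=bridge --subnet=fc00:db8:1::/64 kind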

@BenTheElder (Member, Author):

i implemented:

require that users create an ipv6 cluster first before any ipv4

just to see how far CI gets.

Contributor:

can we just retry and fallback to an ipv4 only bridge?

@BenTheElder (Member, Author):

not sure what you mean.
ipv6 has to be enabled at network creation time.

the old default bridge has ipv6 IF the daemon has ipv6 configured, but it has no embedded DNS.

if the user creates a kind cluster with ipv4 then creates ipv6 we'll have created a v4 only bridge.

we could try to check if the daemon has v6 configured when creating v4 clusters and opportunistically enable v6 maybe.

let's see how CI does first.
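one possible shape for that check (illustrative; EnableIPv6 is a standard field in docker network inspect output):

# the default bridge reports whether the daemon created it with ipv6
docker network inspect -f '{{.EnableIPv6}}' bridge
# prints "true" only when ipv6 is configured on the daemon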

Contributor:

if err := exec.Command("docker", "network", "create", "--ipv6", "-d=bridge", name).Run(); err != nil {
	return exec.Command("docker", "network", "create", "-d=bridge", name).Run()
}
return nil

@BenTheElder (Member, Author):

this will report odd errors when the user was actually trying to do ipv6.
I will think on this some more and follow up

@BenTheElder (Member, Author) commented Apr 24, 2020:

I tossed in the subnet from wrapper.sh for now, technically it's customizable if you just create the network before using kind ;-)

edit: adjusted to not overlap 🙃

 if ipFamily == config.IPv6Family {
-	return exec.Command("docker", "network", "create", "-d=bridge", "--ipv6", name).Run()
+	return exec.Command("docker", "network", "create", "-d=bridge", "--ipv6", "--subnet=fc00:db8:1::/64", name).Run()
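to sanity-check which subnets the network actually got (a quick sketch, assuming the network is named kind):

docker network inspect -f '{{range .IPAM.Config}}{{.Subnet}} {{end}}' kind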
Contributor:
we can use a /104, that still leaves 2^24 addresses and we have a low risk of collisions, based on this kubernetes/kubernetes#90115 (comment)

@BenTheElder (Member, Author):

i'm using a /104 just above this now. i'm not sure this is the value we want to settle on.

@BenTheElder force-pushed the use-the-network-luke branch from ad77b29 to 31e8939 on April 24, 2020 at 21:17
@BenTheElder (Member, Author):

Subpath flake
/retest

@BenTheElder (Member, Author):

IPv6 green.
Not sure if reboot will work yet though.. haven't tested. I'd bet not quite.

@BenTheElder (Member, Author):

that flake is similar to the ones we were worried about in the pause pr ...
https://prow.k8s.io/view/gcs/kubernetes-jenkins/pr-logs/pull/kubernetes-sigs_kind/1508/pull-kind-e2e-kubernetes/1253795238921637893
/retest

@BenTheElder (Member, Author):

/hold cancel
IPv6 is not perfect but this is fairly reasonable now.

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 24, 2020
@amwat (Contributor) commented Apr 24, 2020:

FYI I tested podman ipv6 and it wasn't working before this PR either, so no regression there.

@BenTheElder (Member, Author):

this does appear to fix #1461; i tested with a host using iptables_nft

one outstanding issue I found, which @amwat confirmed, is this:

$ kubectl get po -A
NAMESPACE            NAME                                         READY   STATUS         RESTARTS   AGE
kube-system          coredns-66bff467f8-fxzqz                     1/1     Running        1          14m
kube-system          coredns-66bff467f8-kg9sf                     1/1     Running        1          14m
kube-system          etcd-kind-control-plane                      1/1     Running        1          14m
kube-system          kindnet-mr62p                                1/1     Running        1          14m
kube-system          kube-apiserver-kind-control-plane            1/1     Running        1          14m
kube-system          kube-controller-manager-kind-control-plane   1/1     Running        1          14m
kube-system          kube-proxy-gpp6v                             1/1     Running        1          14m
kube-system          kube-scheduler-kind-control-plane            1/1     Running        1          14m
local-path-storage   local-path-provisioner-774f7f8fdb-j8mmd      0/1     NodeAffinity   0          14m
local-path-storage   local-path-provisioner-774f7f8fdb-wxp7w      1/1     Running        0          9s

... a stale local-path-storage pod will be left around with broken NodeAffinity; a new one seems to come up healthy though.

haven't debugged it further yet. this is after one physical host reboot.
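in the meantime the stale pod can simply be deleted by hand (using the pod name from the output above):

kubectl -n local-path-storage delete pod local-path-provisioner-774f7f8fdb-j8mmd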

@aojea (Contributor) commented Apr 25, 2020:

/test all
before final lgtm

return "", errors.Wrap(err, "failed to get apiserver endpoint")
}
// TODO: check cluster IP family and return the correct IP
// This means IPv6 singlestack is broken on podman
@BenTheElder (Member, Author):

noting that per @amwat this never worked, leaving the TODO for the future though.

@@ -87,7 +87,7 @@ spec:
        app: local-path-provisioner
    spec:
      nodeSelector:
        node-role.kubernetes.io/master: ''
@BenTheElder (Member, Author):

this label is deprecated; however, we do want to run only on linux nodes, for our friends doing windows join hijinks.

@BenTheElder (Member, Author):

I think the local-path-storage issue should be resolved now.

@BenTheElder BenTheElder added this to the v0.8.0 milestone Apr 25, 2020
@BenTheElder (Member, Author):

:shipit:
will follow up regarding IPv6 improvements

@aojea (Contributor) commented Apr 25, 2020:

/lgtm
fantastic

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 25, 2020
@k8s-ci-robot k8s-ci-robot merged commit 1f9ac83 into kubernetes-sigs:master Apr 25, 2020
@BenTheElder BenTheElder deleted the use-the-network-luke branch April 25, 2020 07:58