Config requiring docker network that supports ipv6 which is not a default setting #9000

Closed
sophiajwitt opened this issue Jul 13, 2023 · 18 comments · Fixed by #9183
Labels
kind/bug Categorizes issue or PR as related to a bug. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@sophiajwitt

What steps did you take and what happened?

While following the rapid iterative development with Tilt tutorial (https://cluster-api.sigs.k8s.io/developer/tilt.html), I ran into an issue with the preconfigured cluster script ./hack/kind-install-for-capd.sh, as shown below:

Creating cluster "capi-test" ...
 ✓ Ensuring node image (kindest/node:v1.27.3) 🖼
 ✓ Preparing nodes 📦
 ✗ Writing configuration 📜
Deleted nodes: ["capi-test-control-plane"]
ERROR: failed to create cluster: failed to generate kubeadm config content: failed to get IPv6 address for node capi-test-control-plane

Workaround used:
editing the ./hack/kind-install-for-capd.sh file and removing

networking:
  ipFamily: dual

from the script

What did you expect to happen?

I expected the preconfigured script to work without editing out the networking section.
Is this a common workaround? Or are there other ways to configure docker to use IPv6 correctly?

Cluster API version

v1.4.4

Kubernetes version

No response

Anything else you would like to add?

No response

Label(s) to be applied

/kind bug
One or more /area label. See https://github.com/kubernetes-sigs/cluster-api/labels?q=area for the list of labels.

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jul 13, 2023
@killianmuldoon
Contributor

This was added as part of the work to test dualstack in Cluster API. I don't think we need to have dualstack on by default in the KIND setup though - if it's causing issues for folks we should revert.

Would you be able to open a PR to remove the networking section from that script?

@killianmuldoon
Contributor

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jul 13, 2023
@sbueringer
Member

sbueringer commented Jul 13, 2023

@killianmuldoon But we have it enabled in our e2e tests right? That means when we run our e2e tests against tilt we have a difference in setup if we just drop the setting?

(we could make it configurable via env var and default to the current behavior instead of just dropping the setting)

I would assume that, in general, the current setting works if IPv6 is available.
(we didn't have any complaints before)

@mdbooth
Contributor

mdbooth commented Jul 13, 2023

> @killianmuldoon But we have it enabled in our e2e tests right? That means when we run our e2e tests against tilt we have a difference in setup if we just drop the setting?
>
> (we could make it configurable via env var and default to the current behavior instead of just dropping the setting)
>
> I would assume in general the current setting works if IPv6 is available. (we didn't have any complaints before)

I hit the same issue and also used the workaround described above, btw. If it's relevant I don't have a functional ipv6 environment locally.

@sophiajwitt
Author

Sure! I will get started on that now

@killianmuldoon
Contributor

> @killianmuldoon But we have it enabled in our e2e tests right? That means when we run our e2e tests against tilt we have a difference in setup if we just drop the setting?

That's right, @sophiajwitt will you see if you can add a way to configure this instead of just dropping the setting?

@sophiajwitt
Author

Yes, will look into that

@mdbooth
Contributor

mdbooth commented Jul 13, 2023

On the environment variable, could we define something like TILT_USE_DUAL_STACK which, if set, enables the dual stack configuration? We could then add it to the tilt documentation. I'm guessing anybody working on IPv6 specifically is going to go looking for it in the docs if IPv6 isn't there, so discoverability should be good.
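A minimal sketch of how that opt-in could look, assuming the variable name TILT_USE_DUAL_STACK suggested above. The helper name render_kind_config is hypothetical, and the config shown is a trimmed-down stand-in for illustration, not the actual contents of ./hack/kind-install-for-capd.sh:

```shell
#!/usr/bin/env bash
# Sketch only. TILT_USE_DUAL_STACK is the name proposed in this thread;
# render_kind_config is a hypothetical helper, not part of the real script.

render_kind_config() {
  # Base kind cluster config (trimmed; the real script contains more settings).
  cat <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: capi-test
EOF
  # Append the dual-stack networking block only when the variable is set
  # to a non-empty value; otherwise kind's default networking is used.
  if [ -n "${TILT_USE_DUAL_STACK:-}" ]; then
    printf 'networking:\n  ipFamily: dual\n'
  fi
}
```

The rendered config could then be piped to kind, for example `render_kind_config | kind create cluster --config=-`. Defaulting to unset keeps the script working on pre-created networks without IPv6, at the cost of diverging from the e2e setup unless the variable is exported.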

@vincepri
Member

Could we check programmatically whether IPv6 networking is enabled in Docker, and therefore whether it should be enabled in kind? We can always fall back to single stack, but if all the checks pass, we could enable dual stack by default on most systems.

@mdbooth
Contributor

mdbooth commented Jul 13, 2023

EDIT: IGNORE THIS! Explanation below.

How about something like:

enable_dual_stack=0
# Try up to 5 times to reach a public host over IPv6.
for i in $(seq 0 4); do
    if ping -6 -c1 www.google.com; then
      enable_dual_stack=1
      echo "Functional IPv6 configuration detected. Enabling dual stack"
      break
    fi
done

# Quote the variable and compare numerically so the test is portable.
if [ "$enable_dual_stack" -eq 0 ]; then
  echo "Functional IPv6 configuration not detected."
fi

That's going to ping google.com with IPv6 up to 5 times. Should weed out misconfigured IPv6 as well as no IPv6.

@mdbooth
Contributor

mdbooth commented Jul 13, 2023

The problem is that due to a local MTU requirement we're pre-creating the kind network (as documented somewhere, I forget where). Specifically:

$ docker network create -o "com.docker.network.driver.mtu=1200" kind

The network this creates doesn't have IPv6 enabled. If I delete this network and let kind create one then hack/kind-install-for-capd.sh works correctly unmodified, including the dual stack configuration, despite me not having a functional IPv6 environment. Inspecting the kind network, I can see it has EnableIPv6: true:

$ docker network inspect kind | jq '.[0].EnableIPv6'
true

How about a note in the docs, something like:


If there is a pre-existing docker kind network configured without IPv6 support you may see:

ERROR: failed to create cluster: failed to generate kubeadm config content: failed to get IPv6 address for node capi-test-control-plane; is docker configured to use IPv6 correctly?

The docker kind network MUST be configured with IPv6 support. Note that this does not require a functioning IPv6 environment. Deleting the docker kind network will allow kind to create a new network with IPv6.


I suspect our actual problem of how to configure a docker kind network with a custom MTU is somewhat esoteric and not worth documenting in CAPI.

@anastaruno

anastaruno commented Jul 15, 2023

Defining the needed mtu value in a docker daemon config file (/etc/docker/daemon.json) worked for me. Like so:

{
   "mtu":1280
}

Then, running the script as usual creates both the cluster and the kind network. The network supports IPv6 and has the correct MTU. You can verify this by running docker network inspect kind.
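For anyone who wants to script that verification, here is a rough sketch. It assumes the MTU shows up in the network's Options under the key com.docker.network.driver.mtu (the same key used in the docker network create command earlier in this thread); check_kind_network is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Sketch only. check_kind_network is a hypothetical helper; the Options key
# is assumed to match the one used with `docker network create -o` above.

check_kind_network() {
  expected_mtu="$1"
  # Both calls fail (non-zero exit) if the kind network does not exist.
  ipv6="$(docker network inspect kind --format '{{.EnableIPv6}}')" || return 1
  mtu="$(docker network inspect kind --format '{{index .Options "com.docker.network.driver.mtu"}}')" || return 1
  echo "kind network: EnableIPv6=${ipv6} mtu=${mtu:-unset}"
  # Succeed only when IPv6 is enabled and the MTU matches what was requested.
  [ "$ipv6" = "true" ] && [ "$mtu" = "$expected_mtu" ]
}
```

Calling `check_kind_network 1280` after the script has run prints both values and returns non-zero if either one is not what was expected.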

Hope this helps!

@mdbooth
Contributor

mdbooth commented Jul 18, 2023

@sophiajwitt Shall we close this?

@killianmuldoon
Contributor

> @sophiajwitt Shall we close this?

I think we should wait for a PR to fix this issue before closing - otherwise it might be forgotten about.

@mdbooth
Contributor

mdbooth commented Jul 18, 2023

Do you want to inspect any pre-existing docker network to see if IPv6 is enabled? So proposed logic:

  • If kind network does not exist, enable dual stack and continue (because it will be created with IPv6 support)
  • If kind network exists, enable dual stack iff the network has IPv6 enabled

We would then use this to do some templating in ./hack/kind-install-for-capd.sh. We should log a message if we disable dual stack support.
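The proposed logic could be sketched roughly as follows. The function name kind_network_allows_dual_stack is hypothetical, and the sketch relies on the observation earlier in this thread that a network created by kind itself has EnableIPv6 set to true:

```shell
#!/usr/bin/env bash
# Sketch only. kind_network_allows_dual_stack is a hypothetical helper.
# Returns 0 when dual stack should be enabled, non-zero otherwise.

kind_network_allows_dual_stack() {
  local ipv6
  # `docker network inspect` exits non-zero when the network does not exist.
  # Caveat: any other docker failure is treated the same way here.
  if ! ipv6="$(docker network inspect kind --format '{{.EnableIPv6}}' 2>/dev/null)"; then
    # No pre-existing network: kind will create one with IPv6 support.
    return 0
  fi
  # Pre-existing network: enable dual stack only if it has IPv6 enabled.
  [ "$ipv6" = "true" ]
}

# Possible use in the install script (commented out so the sketch has no side effects):
# if ! kind_network_allows_dual_stack; then
#   echo "Existing kind network has no IPv6 support. Disabling dual stack."
# fi
```

Whether this kind of autodetection holds up across Docker versions and platforms is untested here, so it would probably need a manual override as well.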

@killianmuldoon
Contributor

killianmuldoon commented Jul 18, 2023

I think setting an env variable - and defaulting to having networking unset - is a fine solution to this - especially if we can't be certain that autodetection will work across systems and across time.

I expect the IPv6 case to be the minority case for end-users. We only introduced this recently, and it was only changed to keep the tilt setup in line with our upstream Prow setup. There are lots of other differences between these two environments, though, so I don't know how worthwhile it is to invest in keeping this specific networking config exactly the same.

@vincepri
Member

If we can't safely autodetect that a network is properly configured, we can go the environment variable route, and potentially just run some extra checks (if we can) for additional validation. I agree with @killianmuldoon that in the majority of cases, users don't care about ipv6, especially within kind.

@sbueringer
Member

sbueringer commented Jul 18, 2023

Fine for me. The idea was basically that we use the same configuration in dev and in Prow to minimize the differences in the environment. Dual stack tests are already hard to debug and even harder if the configuration is different.

Of course not an issue if "whoever is debugging" is aware of the differences.
