
feat: tailscale extension #154

Merged: 1 commit, Jun 28, 2023

Conversation

@btrepp (Contributor) commented Apr 28, 2023

This downloads and compiles Tailscale as a talos extension, running as a service.

Uses the host/tun device.

Motivations

The motivation is to enable/simplify discovery: by being in Tailscale, you can access your talos cluster from anywhere, without needing routes set up if you are behind NAT devices, as is common in hobby and lab setups.

In particular, I was chasing this for things like a NAS storage solution, so that I can use the tailnet IPs and DNS to mount NFS and other storage, which otherwise has a bit of a strange security story.

Extension overview

_out
├── manifest.yaml
└── rootfs
    └── usr
        └── local
            ├── bin
            │   ├── tailscale
            │   └── tailscaled
            ├── etc
            │   └── containers
            │       └── tailscale.yaml
            └── lib
                └── containers
                    └── tailscale
                        └── containerboot
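
For orientation, a minimal sketch of what the service spec at usr/local/etc/containers/tailscale.yaml could look like, following the Talos extension services format; the entrypoint, mounts and dependency list below are illustrative assumptions, not the exact contents of this PR:

name: tailscale
container:
  entrypoint: /usr/local/lib/containers/tailscale/containerboot
  environment:
    - PATH=/usr/local/bin:/usr/bin:/bin
    - TS_STATE_DIR=/var/lib/tailscale
  mounts:
    # tun device from the host, so tailscale0 shows up in the host network namespace
    - source: /dev/net/tun
      destination: /dev/net/tun
      type: bind
      options:
        - bind
        - rw
    # persistent node state (see Persistence below)
    - source: /var/lib/tailscale
      destination: /var/lib/tailscale
      type: bind
      options:
        - bind
        - rw
depends:
  - network:
      - addresses
      - connectivity
      - hostname
restart: always

containerboot (from upstream Tailscale) starts tailscaled and drives the login flow, which is where the join URL mentioned below comes from.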

Installation.

machine:
  install:
    extensions:
      - image: docker.io/btrepp/tailscale:1.40.0-v1.4.0-alpha.4-1-g65239da

talosctl upgrade
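
Spelled out with placeholders (the node address and installer version below are not from this PR), and with --preserve so /var is kept across the upgrade:

talosctl upgrade --nodes <node-ip> --image ghcr.io/siderolabs/installer:v1.4.0 --preserve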

Configure in Tailscale. The service logs a message with a URL to join the device.
I've chosen this approach as you don't need to manage a key, and you would be 'interacting' with the operator on upgrades anyway, so if you nuke /var you need to do this again.

  1. talosctl logs ext-tailscale
  2. Find the login URL and follow the instructions (a one-liner for pulling out the URL is sketched below this list)
  3. You need to do this for each talos machine
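
A hedged one-liner for step 2; the login URL format is an assumption about tailscaled's log output, so adjust the pattern if it differs:

talosctl logs ext-tailscale --nodes <node-ip> | grep -o 'https://login.tailscale.com/[^ ]*'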

Persistence.

It mounts /var/lib/tailscale as rw, as Tailscale requires persistent state so it doesn't create new nodes.
This could be an issue on upgrades: if you don't use --preserve, you will probably have wiped the secrets Tailscale needs.

I've accepted this as an 'okay' trade-off: on upgrades you are manually firing talosctl upgrade anyway, so your procedure will just have a second step, to register (and rename, if required) the nodes in Tailscale.

The other option is ephemeral nodes and storing the auth key, which may be better, but Tailscale auth keys do expire, so it would be a trade-off; a hypothetical sketch of that approach follows.
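
Purely as an illustration of that alternative, an environment file for containerboot might look like the following; the variable names come from upstream containerboot, but the file location and how it would be wired into the service are assumptions, not part of this PR:

# hypothetical environment file consumed by the extension
TS_AUTHKEY=tskey-auth-XXXXXXXXXXXX   # pre-authorized (possibly ephemeral) key, placeholder value
TS_STATE_DIR=/var/lib/tailscale      # keep node state between restarts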

Addresses

Shows the addresses registered on the machine.

talosctl get address
NODE           NAMESPACE   TYPE            ID                                                       VERSION   ADDRESS                                       LINK
100.92.247.2   network     AddressStatus   eth0/192.168.1.11/24                                     1         192.168.1.11/24                               eth0
100.92.247.2   network     AddressStatus   eth0/2403:580f:43f:0:dea6:32ff:fedd:480d/64              1         2403:580f:43f:0:dea6:32ff:fedd:480d/64        eth0
100.92.247.2   network     AddressStatus   eth0/fe80::dea6:32ff:fedd:480d/64                        2         fe80::dea6:32ff:fedd:480d/64                  eth0
100.92.247.2   network     AddressStatus   lo/127.0.0.1/8                                           1         127.0.0.1/8                                   lo
100.92.247.2   network     AddressStatus   lo/::1/128                                               1         ::1/128                                       lo
100.92.247.2   network     AddressStatus   tailscale0/100.92.247.2/32                               1         100.92.247.2/32                               tailscale0
100.92.247.2   network     AddressStatus   tailscale0/fd7a:115c:a1e0:ab12:4843:cd96:625c:f702/128   1         fd7a:115c:a1e0:ab12:4843:cd96:625c:f702/128   tailscale0
100.92.247.2   network     AddressStatus   tailscale0/fe80::bb45:dc6d:c2ec:51e6/64                  1         fe80::bb45:dc6d:c2ec:51e6/64                  tailscale0
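
As a quick sanity check (not part of the PR output above), the tailscale0 address can then be used to reach the Talos API from anywhere on the tailnet:

talosctl --endpoints 100.92.247.2 --nodes 100.92.247.2 version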

Limitations

  • MagicDNS was not set up initially; DNS works now
  • Currently can't get Kubernetes components to speak over the Tailscale IPs; this may be an ordering thing
  • Subnet router not configured initially. It would possibly be a great way of exposing multiple containers into the tailnet; I hardcoded this as a test and it works great: you can even get Tailscale to forward to kube-dns, basically making all the cluster services reachable from a Tailscale machine via DNS (a sketch follows this list)
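
A hedged sketch of the subnet-router idea from the last bullet; the CIDR is a placeholder for the cluster's service network, and whether it is passed via tailscale up or containerboot's TS_ROUTES is an implementation choice, not something this PR fixes:

# advertise the Kubernetes service CIDR into the tailnet (placeholder CIDR)
tailscale up --advertise-routes=10.96.0.0/12
# or, equivalently, via the containerboot environment:
# TS_ROUTES=10.96.0.0/12

The advertised routes still need to be approved in the Tailscale admin console.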

[Screenshot attached: 2023-04-28, 7:24 pm]

@btrepp force-pushed the feature/tailscale branch 2 times, most recently from 9522433 to 1bdefef on April 28, 2023 11:11
@btrepp changed the title from "Initial tailscale extension" to "feat: tailscale extension" on Apr 28, 2023
@btrepp marked this pull request as ready for review on April 28, 2023 11:26
@btrepp force-pushed the feature/tailscale branch 2 times, most recently from c36b9a2 to dda4e54 on April 29, 2023 11:25
@btrepp (Contributor, Author) commented Apr 29, 2023

There may be an issue as-is: while talosctl seems happy enough and you can use DNS between nodes fine, k8s isn't coming up healthy. kube-flannel seems to be struggling:

kubectl logs -n kube-system kube-flannel-cc925
Defaulted container "kube-flannel" out of: kube-flannel, install-config (init), install-cni (init)
I0429 11:45:06.035373       1 main.go:211] CLI flags config: {etcdEndpoints:http://127.0.0.1:4001,http://127.0.0.1:2379 etcdPrefix:/coreos.com/network etcdKeyfile: etcdCertfile: etcdCAFile: etcdUsername: etcdPassword: version:false kubeSubnetMgr:true kubeApiUrl: kubeAnnotationPrefix:flannel.alpha.coreos.com kubeConfigFile: iface:[] ifaceRegex:[] ipMasq:true ifaceCanReach: subnetFile:/run/flannel/subnet.env publicIP: publicIPv6: subnetLeaseRenewMargin:60 healthzIP:0.0.0.0 healthzPort:0 iptablesResyncSeconds:5 iptablesForwardRules:true netConfPath:/etc/kube-flannel/net-conf.json setNodeNetworkUnavailable:true useMultiClusterCidr:false}
W0429 11:45:06.035645       1 client_config.go:617] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
E0429 11:45:36.038711       1 main.go:228] Failed to create SubnetManager: error retrieving pod spec for 'kube-system/kube-flannel-cc925': Get "https://10.96.0.1:443/api/v1/namespaces/kube-system/pods/kube-flannel-cc925": dial tcp 10.96.0.1:443: i/o timeout

Unsure if that's related, but I suspect iptables changes for Tailscale may conflict somehow.
Not related after all: it seems I changed my talos cluster name at one point, and I needed to run 'upgrade-k8s' and reboot the nodes to get them to use the new endpoint/name.
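
For reference, a hedged form of that command; the control-plane address is a placeholder and the version matches the cluster shown below:

talosctl upgrade-k8s --nodes <controlplane-ip> --to 1.27.1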

Cluster is healthy now

kubectl get pods -A
NAMESPACE     NAME                           READY   STATUS    RESTARTS       AGE
kube-system   coredns-d779cc7ff-ljhch        1/1     Running   0              3m38s
kube-system   coredns-d779cc7ff-mvwvc        1/1     Running   0              3m54s
kube-system   kube-apiserver-rpi1            1/1     Running   0              12m
kube-system   kube-controller-manager-rpi1   1/1     Running   1 (12m ago)    12m
kube-system   kube-flannel-g5gq8             1/1     Running   4 (106s ago)   4m16s
kube-system   kube-flannel-mq8ng             1/1     Running   0              5m59s
kube-system   kube-flannel-t57vf             1/1     Running   602            2d3h
kube-system   kube-proxy-8shgb               1/1     Running   16 (14m ago)   2d3h
kube-system   kube-proxy-bdvqw               1/1     Running   0              5m59s
kube-system   kube-proxy-xpcjj               1/1     Running   0              105s
kube-system   kube-scheduler-rpi1            1/1     Running   1 (12m ago)    12m
kubectl get nodes
NAME   STATUS   ROLES           AGE     VERSION
rpi1   Ready    control-plane   2d3h    v1.27.1
rpi2   Ready    <none>          8m19s   v1.27.1
rpi3   Ready    <none>          9h      v1.27.1
talosctl get members
NODE           NAMESPACE   TYPE     ID     VERSION   HOSTNAME           MACHINE TYPE   OS               ADDRESSES
100.92.247.2   cluster     Member   rpi1   3         rpi1.localdomain   controlplane   Talos (v1.4.0)   ["100.92.247.2","192.168.1.11","2403:580f:43f:0:dea6:32ff:fedd:480d","fd7a:115c:a1e0:ab12:4843:cd96:625c:f702"]
100.92.247.2   cluster     Member   rpi2   3         rpi2.localdomain   worker         Talos (v1.4.0)   ["100.120.218.23","192.168.1.12","2403:580f:43f:0:dea6:32ff:fedd:482e","fd7a:115c:a1e0:ab12:4843:cd96:6278:da17"]
100.92.247.2   cluster     Member   rpi3   4         rpi3.localdomain   worker         Talos (v1.4.0)   ["100.120.163.105","192.168.1.13","2403:580f:43f:0:dea6:32ff:fedd:47dd","fd7a:115c:a1e0:ab12:4843:cd96:6278:a369"]

Review thread on network/tailscale/vars.yaml (outdated, resolved)
@frezbo (Member) left a comment

Could you also add a README?

@btrepp marked this pull request as draft on May 26, 2023 10:15
@btrepp force-pushed the feature/tailscale branch 2 times, most recently from 4e00b69 to 0b0810a on May 27, 2023 07:55
@btrepp marked this pull request as ready for review on May 27, 2023 07:56
@btrepp requested a review from frezbo on May 27, 2023 07:56
Review thread on network/tailscale/pkg.yaml (outdated, resolved)
@frezbo (Member) commented Jun 27, 2023

/ok-to-test

@frezbo (Member) commented Jun 27, 2023

@btrepp would it be possible to sign off the commit:

git pull to get the latest changes I pushed, then:

git commit -s --amend --no-edit
git push --force-with-lease

@btrepp (Contributor, Author) commented Jun 28, 2023

@frezbo should be done. Thanks for adding the environmentFile feature too! That will be awesome for future extensions.

Tailscale as a system service extension.
Creates network devices in the talos 'host'

Requires: siderolabs/talos#7408

Signed-off-by: beau trepp <beautrepp@gmail.com>
Signed-off-by: Noel Georgi <git@frezbo.dev>
@frezbo (Member) commented Jun 28, 2023

/m
