Migrate providerId from aws://zone/vmid to aws:///zone/vmid #260

Merged
1 commit merged into main from change-providerid
Aug 30, 2023

Conversation

ghost

@ghost ghost commented Aug 22, 2023

What type of PR is this?
/kind Bug

What this PR does / why we need it:
To be compatible with the Talos provider :)
Which issue(s) this PR fixes:
Fixes #259
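The change in the title amounts to moving the zone out of the authority part of the ID (`aws://zone/vmid`) and into the path (`aws:///zone/vmid`). As a rough illustration only, and not the code from this PR, the rewrite could be sketched in Python as below. The helper name `to_triple_slash` is hypothetical, and the sketch assumes every ID has exactly the two parts named in the title.

```python
def to_triple_slash(provider_id: str) -> str:
    """Rewrite aws://zone/vmid as aws:///zone/vmid (hypothetical helper)."""
    prefix = "aws://"
    if not provider_id.startswith(prefix):
        raise ValueError(f"unexpected provider ID: {provider_id!r}")
    # Stripping any extra leading slash makes the function idempotent:
    # the old and the new form both reduce to "zone/vmid" at this point.
    rest = provider_id[len(prefix):].lstrip("/")
    zone, sep, vm_id = rest.partition("/")
    if not (zone and sep and vm_id):
        raise ValueError(f"expected zone/vmid in provider ID: {provider_id!r}")
    return f"aws:///{zone}/{vm_id}"


if __name__ == "__main__":
    print(to_triple_slash("aws://eu-west-2a/i-c896acba"))
```

Applying it to an ID that already uses three slashes returns it unchanged, so it would be safe to run over a mix of old and new IDs.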

@ghost ghost marked this pull request as draft August 22, 2023 16:12
@ghost ghost requested review from jerome-jutteau and outscale-mdr August 23, 2023 07:35
@ghost ghost marked this pull request as ready for review August 23, 2023 07:59
@azert9

azert9 commented Aug 23, 2023

Hi, I just tested and I can confirm that this PR does solve the issue :)

@ghost
Author

ghost commented Aug 23, 2023

Hi @azert9,
Could you share the config you use to create a Talos cluster (MachineHealthCheck, OscMachineTemplate, MachineDeployment, TalosConfigTemplate, TalosControlPlane, Cluster, OscCluster, ClusterResourceSet?), without your AK/SK? We have only tested with the kubeadm bootstrapper.

@azert9

azert9 commented Aug 23, 2023

Here is the code that we use for testing (minus the cluster-api-provider-outscale secret):

apiVersion: v1
kind: Namespace
metadata:
  name: capi-cluster
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  labels:
    ccm: kapi-crs-ccm
    cni: kapi-crs-cni
  name: kapi
  namespace: capi-cluster
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
      - 10.42.0.0/16
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: TalosControlPlane
    name: kapi-control-plane
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: OscCluster
    name: kapi
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OscCluster
metadata:
  name: kapi
  namespace: capi-cluster
spec:
  network:
    clusterName: kapi
    internetService:
      clusterName: kapi
    loadBalancer:
      clusterName: kapi
      loadbalancername: kapi-k8s
      securitygroupname: kapi-load-balancer
    natService:
      clusterName: kapi
    net:
      clusterName: kapi
    subregionName: eu-west-2a
    securityGroups:
    - name: kapi-talos-control-plane
      description: Talos control plane nodes.
      securityGroupRules:
      - name: talos-control-plane-apid-from-cp
        flow: Inbound
        ipProtocol: tcp
        ipRange: "10.0.4.0/24"
        fromPortRange: 50000
        toPortRange: 50000
      - name: talos-control-plane-trustd-from-workers
        flow: Inbound
        ipProtocol: tcp
        ipRange: "10.0.3.0/24"
        fromPortRange: 50001
        toPortRange: 50001
      - name: talos-control-plane-trustd-from-cp
        flow: Inbound
        ipProtocol: tcp
        ipRange: "10.0.4.0/24"
        fromPortRange: 50001
        toPortRange: 50001
    - name: kapi-talos-workers
      description: Talos worker nodes.
      securityGroupRules:
      - name: talos-workers-apid-from-cp
        flow: Inbound
        ipProtocol: tcp
        ipRange: "10.0.4.0/24"
        fromPortRange: 50000
        toPortRange: 50000
    - name: kapi-load-balancer
      description: Kubernetes api load balancer.
      securityGroupRules:
      - name: kubernetes-api
        flow: Inbound
        ipProtocol: tcp
        ipRange: "0.0.0.0/0"
        fromPortRange: 6443
        toPortRange: 6443
    - name: kapi-talos-nodes
      description: Common rules for kubernetes nodes.
      securityGroupRules:
      # TODO
      - name: icmp
        flow: Inbound
        ipProtocol: icmp
        ipRange: "0.0.0.0/0"
        fromPortRange: 1
        toPortRange: 1
      - name: open-all-udp
        flow: Inbound
        ipProtocol: udp
        ipRange: "10.0.0.0/16"
        fromPortRange: 3000
        toPortRange: 60000
      - name: open-all-tcp
        flow: Inbound
        ipProtocol: tcp
        ipRange: "10.0.0.0/16"
        fromPortRange: 3000
        toPortRange: 60000
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: kapi-worker
  namespace: capi-cluster
spec:
  clusterName: kapi
  replicas: 1
  selector:
    matchLabels: null
  template:
    spec:
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: TalosConfigTemplate
          name: kapi-worker
      clusterName: kapi
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: OscMachineTemplate
        name: kapi-worker
      version: 1.27.4
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OscMachineTemplate
metadata:
  name: kapi-worker
  namespace: capi-cluster
spec:
  template:
    spec:
      node:
        clusterName: kapi
        image:
          name: talos148
        keypair:
          name: bootstrap
        vm:
          clusterName: kapi
          keypairName: bootstrap
          rootDisk:
            rootDiskIops: 1500
            rootDiskSize: 100
            rootDiskType: io1
          subregionName: eu-west-2a
          vmType: tinav5.c4r16p1
          securityGroupNames:
          - name: kapi-talos-nodes
          - name: kapi-talos-workers
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OscMachineTemplate
metadata:
  name: kapi-control-plane
  namespace: capi-cluster
spec:
  template:
    spec:
      node:
        clusterName: kapi
        image:
          name: talos148
        keypair:
          name: bootstrap
        vm:
          clusterName: kapi
          keypairName: bootstrap
          loadBalancerName: kapi-k8s
          role: controlplane
          rootDisk:
            rootDiskIops: 1500
            rootDiskSize: 100
            rootDiskType: io1
          subregionName: eu-west-2a
          vmType: tinav5.c4r16p1
          securityGroupNames:
          - name: kapi-talos-nodes
          - name: kapi-talos-control-plane
---
apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
kind: TalosConfigTemplate
metadata:
  name: kapi-worker
  namespace: capi-cluster
spec:
  template:
    spec:
      generateType: worker
      talosVersion: v1.4
      configPatches:
      - op: add
        path: /cluster/externalCloudProvider
        value:
          enabled: true
          manifests:
          - "https://raw.githubusercontent.com/outscale/cloud-provider-osc/v0.2.3/deploy/osc-ccm-manifest.yml"
      - op: add
        path: /machine/kubelet/extraArgs
        value:
          cloud-provider: external
      - op: add
        path: /machine/network
        value:
          disableSearchDomain: true
      - op: add
        path: /machine/kubelet/registerWithFQDN
        value: true
---
apiVersion: controlplane.cluster.x-k8s.io/v1alpha3
kind: TalosControlPlane
metadata:
  name: kapi-control-plane
  namespace: capi-cluster
spec:
  controlPlaneConfig:
    controlplane:
      generateType: controlplane
      talosVersion: v1.4
      configPatches:
      - op: add
        path: /cluster/externalCloudProvider
        value:
          enabled: true
          manifests:
          - "https://raw.githubusercontent.com/outscale/cloud-provider-osc/v0.2.0/deploy/osc-ccm-manifest.yml"
      - op: add
        path: /machine/kubelet/extraArgs
        value:
          cloud-provider: external
      - op: add
        path: /machine/network
        value:
          disableSearchDomain: true
      - op: add
        path: /machine/kubelet/registerWithFQDN
        value: true
      - op: replace
        path: /cluster/allowSchedulingOnControlPlanes
        value: false
  infrastructureTemplate:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: OscMachineTemplate
    name: kapi-control-plane
  replicas: 1
  version: 1.27.4

In addition, Talos control plane nodes must be reachable by the local management cluster during bootstrap. We achieve this by creating a bastion machine with a wireguard server and we connect to it with the appropriate routing rules (we can share more on this if needed).

@ghost
Author

ghost commented Aug 23, 2023

Where do you set the cluster-api-provider-outscale secret? (In the TalosConfigTemplate?)

@azert9

azert9 commented Aug 23, 2023

We set it manually in the management cluster, and we don't need it in the workload cluster.

@ghost
Author

ghost commented Aug 23, 2023

Sorry, I mean: how do you create the secret osc-secret for the CCM in the workload cluster?

@ghost
Author

ghost commented Aug 23, 2023

Can you please also share your WireGuard config with the routing rules? (PS: we have a feature, https://github.com/outscale/cluster-api-provider-outscale/blob/main/example/cluster-machine-template-bastion.yaml#L62, to create a bastion :) )

@azert9

azert9 commented Aug 23, 2023

Sorry, I mean: how do you create the secret osc-secret for the CCM in the workload cluster?

We create it manually as well

@azert9

azert9 commented Aug 23, 2023

Here is the code that we use for the bastion : bastion.tar.gz

You can set the VPC ID and the subnet ID as terraform variables. Terraform outputs a WireGuard config, which you can apply with wg-quick.

We saw the integrated bastion feature but we have yet to explore it in depth :)

@ghost
Author

ghost commented Aug 23, 2023

Which bootstrapper do you use for your management cluster, in case it matters for the WireGuard config? (I am trying with kind in a VM in a public cloud.)
I followed your instructions and created the file /etc/wireguard/wg0.conf from the terraform output, but I get:

E0823 15:57:54.196656       1 controller.go:324] "Reconciler error" err="failed to create cluster accessor: error creating client for remote cluster \"capo-cluster/kapi\": error getting rest mapping: failed to get API group resources: unable to retrieve the complete list of server APIs: v1: client rate limiter Wait returned an error: context deadline exceeded - error from a previous attempt: EOF" controller="machine" controllerGroup="cluster.x-k8s.io" controllerKind="Machine" Machine="capo-cluster/kapi-control-plane-gfgkq" namespace="capo-cluster" name="kapi-control-plane-gfgkq" reconcileID=38ec0ed6-62c6-4915-8bab-cc826012f000

I cannot access the cluster with the kubeconfig received from clusterctl to set the CCM secret.
How do I connect to the cluster?

@azert9

azert9 commented Aug 23, 2023

I use a Kind cluster on a local machine. I see "capo-cluster" in your log output, is this correct?

@azert9

azert9 commented Aug 23, 2023

I cannot access the cluster with the kubeconfig received from clusterctl to set the CCM secret.
How do I connect to the cluster?

You can retrieve the config from a secret:

kubectl -n capi-cluster get secret kapi-kubeconfig --template='{{.data.value}}' | base64 -d
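For anyone scripting this step, the same decoding can be sketched in Python. This is only an illustration: it assumes the secret was dumped with `kubectl -n capi-cluster get secret kapi-kubeconfig -o json`, and the function name `kubeconfig_from_secret` is made up for the example.

```python
import base64
import json


def kubeconfig_from_secret(secret_json: str) -> str:
    """Return the kubeconfig stored under data.value of a Secret dumped as JSON."""
    secret = json.loads(secret_json)
    # Secret payloads are base64-encoded; this mirrors the `base64 -d` step.
    encoded = secret["data"]["value"]
    return base64.b64decode(encoded).decode("utf-8")


if __name__ == "__main__":
    # Stand-in for real kubectl output, so the sketch runs without a cluster.
    sample = json.dumps(
        {"data": {"value": base64.b64encode(b"apiVersion: v1\nkind: Config\n").decode()}}
    )
    print(kubeconfig_from_secret(sample))
```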

@ghost
Author

ghost commented Aug 24, 2023

Hi
I am not sure whether Talos has been installed. I don't understand how cloud-init works with the Talos bootstrap, because the generated data is not in cloud-init format:

cluster:
  ca:
    crt: tata
    key: ""
  controlPlane:
    endpoint: https://kapi-k8s-544913272.eu-west-2.lbu.outscale.com:6443
  discovery:
    enabled: true
    registries:
      kubernetes:
        disabled: true
      service: {}
  externalCloudProvider:
    enabled: true
    manifests:
    - https://raw.githubusercontent.com/outscale/cloud-provider-osc/v0.2.3/deploy/osc-ccm-manifest.yml
  id: toto
  network:
    dnsDomain: cluster.local
    podSubnets:
    - 10.42.0.0/16
    serviceSubnets:
    - 10.96.0.0/12
  secret: tata
  token: tutu
debug: false
machine:
  ca:
    crt: toto
    key: ""
  certSANs: []
  features:
    apidCheckExtKeyUsage: true
    rbac: true
    stableHostname: true
  install:
    wipe: false
  kubelet:
    defaultRuntimeSeccompProfileEnabled: true
    disableManifestsDirectory: true
    extraArgs:
      cloud-provider: external
    image: ghcr.io/siderolabs/kubelet:v1.27.4
    registerWithFQDN: true
  network:
    disableSearchDomain: true
  registries: {}
  token: toto
  type: worker
persist: true
version: v1alpha1

@regisbelson

You need to build a Talos OMI; it doesn't use cloud-init. Here's how we build ours.
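To make the distinction concrete: cloud-init cloud-config user data starts with a `#cloud-config` line, whereas the document pasted above is a Talos machine config with top-level `cluster`, `machine` and `version` keys. A rough Python sketch for telling the two apart follows. It is a line-based heuristic written for this thread, not something Talos or cloud-init provides, and the function name `classify_user_data` is hypothetical.

```python
def classify_user_data(text: str) -> str:
    """Guess whether user data is cloud-init cloud-config or a Talos machine config."""
    lines = text.splitlines()
    first = lines[0].strip() if lines else ""
    if first == "#cloud-config":
        return "cloud-init"
    # Collect unindented "key:" names without a YAML parser (stdlib only).
    top_level = {
        line.split(":", 1)[0]
        for line in lines
        if line and not line[0].isspace() and not line.startswith("#") and ":" in line
    }
    if {"version", "machine", "cluster"} <= top_level:
        return "talos"
    return "unknown"


if __name__ == "__main__":
    talos_like = "cluster:\n  id: toto\nmachine:\n  type: worker\nversion: v1alpha1\n"
    print(classify_user_data(talos_like))
    print(classify_user_data("#cloud-config\npackages: []\n"))
```

An image that expects cloud-init would not act on the first kind of document, which is why a Talos-specific image is needed.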

@ghost
Author

ghost commented Aug 24, 2023

Thanks @regisbelson
I will try it :)

@ghost
Author

ghost commented Aug 24, 2023

@regisbelson @azert9 Is it possible to share this OMI with 027440686109?

@regisbelson

We just tried and the permissions button is grayed out. Also we have to uncheck "Mine Only" in the filters to make it appear in the list, if this is related somehow.

@ghost
Author

ghost commented Aug 24, 2023

If that is OK with you, you can make your OMI public with global_permission in Packer (https://developer.hashicorp.com/packer/plugins/builders/outscale/outscale-bsusurrogate#global_permission)

@ghost
Author

ghost commented Aug 24, 2023

I recreated the OMI following https://www.talos.dev/v1.5/talos-guides/install/cloud-platforms/aws/ and now I get some logs to debug.

@ghost
Author

ghost commented Aug 24, 2023

It works :), thank you

NAMESPACE      NAME                                 CLUSTER   NODENAME                                   PROVIDERID                     PHASE     AGE   VERSION
capu-cluster   kapi-control-plane-bzmwp             kapi      ip-10-0-4-81.eu-west-2.compute.internal    aws:///eu-west-2a/i-c896acba   Running   20m   v1.27.4
capu-cluster   kapi-worker-7857d9b577xcf6jn-pnhv9   kapi      ip-10-0-3-230.eu-west-2.compute.internal   aws:///eu-west-2a/i-7db45201   Running   23m   v1.27.4
root@ip-10-9-24-171:/home/outscale/test-talos/test/bastion# kubectl get pod -A
NAMESPACE     NAME                                                              READY   STATUS    RESTARTS      AGE
kube-system   coredns-67cbdd6d9d-5btl8                                          1/1     Running   0             13m
kube-system   coredns-67cbdd6d9d-pfxp9                                          1/1     Running   0             13m
kube-system   kube-apiserver-ip-10-0-4-81.eu-west-2.compute.internal            1/1     Running   0             12m
kube-system   kube-controller-manager-ip-10-0-4-81.eu-west-2.compute.internal   1/1     Running   2 (13m ago)   12m
kube-system   kube-flannel-2hz8w                                                1/1     Running   1 (11m ago)   13m
kube-system   kube-flannel-42skr                                                1/1     Running   1 (11m ago)   13m
kube-system   kube-proxy-hqdhq                                                  1/1     Running   0             13m
kube-system   kube-proxy-mmdwr                                                  1/1     Running   0             13m
kube-system   kube-scheduler-ip-10-0-4-81.eu-west-2.compute.internal            1/1     Running   4 (13m ago)   12m
kube-system   osc-cloud-controller-manager-msnwk                                1/1     Running   0             87s

@ghost ghost force-pushed the change-providerid branch from f0d10ab to 035ab8c Compare August 25, 2023 12:18
@ghost
Author

ghost commented Aug 25, 2023

@azert9 @regisbelson Are you OK with us adding your Talos config as an example in the cluster-api-provider-outscale repo?

@ghost ghost force-pushed the change-providerid branch from 035ab8c to bd4c025 Compare August 25, 2023 12:30
@ghost ghost force-pushed the change-providerid branch from bd4c025 to 0367c7b Compare August 25, 2023 12:39
@azert9

azert9 commented Aug 25, 2023

Sure! However beware that the security groups are very open, they might need some adjustments.
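One way to spot the rules worth revisiting is to flag any rule that is open to every address or spans a large port range. The Python sketch below uses the field names from the OscCluster spec shared earlier in the thread. The function name `is_wide_open` and the 100-port threshold are arbitrary assumptions. A flagged rule is not necessarily wrong: the Kubernetes API load balancer rule is presumably meant to be reachable from anywhere.

```python
import ipaddress


def is_wide_open(rule: dict, max_ports: int = 100) -> bool:
    """Flag a rule open to every address or covering more than max_ports ports."""
    network = ipaddress.ip_network(rule["ipRange"])
    port_span = rule["toPortRange"] - rule["fromPortRange"] + 1
    return network.prefixlen == 0 or port_span > max_ports


if __name__ == "__main__":
    # Three rules copied from the shared config (name, range and ports only).
    rules = [
        {"name": "kubernetes-api", "ipRange": "0.0.0.0/0",
         "fromPortRange": 6443, "toPortRange": 6443},
        {"name": "open-all-tcp", "ipRange": "10.0.0.0/16",
         "fromPortRange": 3000, "toPortRange": 60000},
        {"name": "talos-workers-apid-from-cp", "ipRange": "10.0.4.0/24",
         "fromPortRange": 50000, "toPortRange": 50000},
    ]
    for rule in rules:
        print(rule["name"], is_wide_open(rule))
```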

@ghost ghost requested a review from outscale-toa August 28, 2023 12:19
@ghost ghost merged commit 89fcc50 into main Aug 30, 2023
@jfbus jfbus deleted the change-providerid branch February 12, 2025 13:40
This pull request was closed.
Development

Successfully merging this pull request may close these issues.

[Bug]: mismatch with providerID set by cloud-provider-osc
4 participants