Talos install error - couldn't get current server API group list: - tls: internal error #1401

DavidIlie · 2024-04-02T10:49:44Z

Following the pathway to install Talos, I have an issue where the cluster does seem to set up but the workers do not join, and the masters show these errors:

"error refreshing pod status", which is related to TLS (tls: internal error),
and "controller failed" errors as well.

This is what I see in my terminal while setting it up:

onedr0p · 2024-04-02T12:26:49Z

I think this is expected due to etcd not being bootstrapped?

cluster-template/.taskfiles/Talos/Taskfile.yaml

Lines 55 to 63 in b6234fc

    bootstrap-install:
      desc: Install the Talos cluster
      dir: "{{.TALOS_DIR}}"
      cmds:
        - echo "Installing Talos... ignore the errors and be patient"
        - until talhelper gencommand bootstrap | bash; do sleep 10; done
        - sleep 10
      preconditions:
        - { msg: "Missing talhelper config file", sh: "test -f {{.TALHELPER_CONFIG_FILE}}" }

It doesn't seem like the above was run. What point in the task commands did you get to, and were there any errors on the client side?
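One way to see whether etcd was actually bootstrapped is to ask a control-plane node directly. This is only a rough sketch: `check_etcd` is a hypothetical helper name, and it assumes `talosctl` is already pointed at the talosconfig generated for this cluster and that the node IP you pass is one of your control-plane nodes.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of the template): print the etcd service
# state and the etcd member list for one control-plane node.
check_etcd() {
  local node="${1:-}"
  if [[ -z "$node" ]]; then
    echo "usage: check_etcd <control-plane-ip>" >&2
    return 2
  fi
  # State of the etcd service on that node.
  talosctl --nodes "$node" service etcd
  # Members of the etcd cluster as seen from that node.
  talosctl --nodes "$node" etcd members
}
```

Usage would be something like `check_etcd <control-plane-ip>`. If the service never reaches a running state, the bootstrap command most likely did not complete.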

bojanraic · 2024-04-02T17:24:36Z

In addition to @onedr0p's comments, when I tried to install Talos manually, it took some time for the master to be ready.
In both of the screenshots, uptime is only a few minutes so maybe it hasn't finished bootstrapping yet.
I went back to k3s in the meantime, but please update us on your Talos install progress via this template and I may take another stab at it when time permits.
Good luck!

DavidIlie · 2024-04-02T17:34:39Z

I left it running the whole night yesterday and the same thing happened. I am also fairly sure that all the scripts are running. Before the node first reboots/loads, there are errors regarding something like an "admin" certificate.

Any ideas?

onedr0p · 2024-04-02T18:22:27Z

Maybe give it another shot when you have a moment? Not sure what happened here, to be honest. It could be a ton of different issues, from misconfig to network issues to anything else really :/

The important bits of the config that can really go wrong if not set right are the network and disk selectors.

DavidIlie · 2024-04-03T09:48:04Z

Disk selectors work, I believe. Data is being written to the disk, and I believe the network is working on all the nodes.

The error is just the "tls: internal error" every time the masters try to fetch something from their own localhost IP

DavidIlie · 2024-04-03T09:51:07Z

The bootstrap first begins with these errors in the console:

But I believe that's when the nodes get rebooted, because the node then boots and continues until kubelet is healthy, but the error comes back:

And then my terminal tries to connect to the VIP and nothing happens.

onedr0p · 2024-04-03T12:51:10Z

I saw this in your previous config (sorry this is all I have to go on from #1398 (comment))

    networkInterfaces:
      - deviceSelector:
          hardwareAddr: ""

That should be the node's MAC address. Are you sure this is populated? It should be in xx:xx:xx:xx:xx:xx format and be unique per node.

cluster-template/config.sample.yaml

Line 57 in 28ae26d

    #   talos_nic: ""      # (Required: Talos) MAC address of the NIC for this node (talosctl get links -n <ip> --insecure)
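The format and uniqueness requirements for talos_nic can be checked by hand before running the bootstrap task. The following is a minimal sketch in bash; `is_valid_mac` and `find_duplicate_macs` are hypothetical helper names, not part of the template or its validation.

```bash
#!/usr/bin/env bash

# Hypothetical helper: succeeds only if the argument looks like
# xx:xx:xx:xx:xx:xx (six colon-separated hex pairs).
is_valid_mac() {
  local mac="${1:-}"
  local re='^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$'
  [[ "$mac" =~ $re ]]
}

# Hypothetical helper: prints any MAC address that appears more than once
# in the argument list (compared case-insensitively, printed in lowercase).
find_duplicate_macs() {
  printf '%s\n' "$@" | tr 'A-F' 'a-f' | sort | uniq -d
}
```

For example, `is_valid_mac ""` fails, which is the empty-value case shown in the config above, and `find_duplicate_macs` prints nothing when every node has its own address.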

onedr0p · 2024-04-03T18:28:35Z

I added validation on talos_nic here to hopefully catch this for other people in the future.

DavidIlie · 2024-04-03T20:43:56Z

I already populated those, I just redacted them when I sent it here. Every single value is present

DavidIlie · 2024-04-03T21:05:54Z

2024-04-04.00-02-53.mp4

This is a recording of what happens

onedr0p · 2024-04-03T21:25:45Z

I wonder if you need to use a different type of network selector in the Talos/talhelper config or change something in the NIC settings on the VM in Proxmox?

I just hand-held someone through the whole repo who is using bare-metal nodes, and we had success after figuring out they were not setting the correct value for talos_nic, which led me to commit validation on that.

DavidIlie · 2024-04-03T21:28:34Z

Do you have an example of what I would need to do?

onedr0p · 2024-04-03T21:32:39Z

I am probably not the best person to ask about that as I do not use any hypervisors in my life right now 😄

Maybe a good start is to review the Talos Proxmox docs and see if everything lines up there and with the rendered config here.

onedr0p · 2024-04-03T21:42:43Z

Keep in mind there are a bunch of different network selectors you can use, so maybe the MAC address is not the best choice with PVE? I dunno.

Repository owner locked and limited conversation to collaborators Apr 5, 2024

onedr0p converted this issue into discussion #1407 Apr 5, 2024
