bug: Unable to provision multiple nodes using Vagrant #133

Open
aubaugh opened this issue Feb 8, 2023 · 2 comments
Labels
bug Something isn't working

Comments


aubaugh commented Feb 8, 2023

Summary

I have a Vagrantfile that provisions three AlmaLinux 8 boxes via libvirt and uses the Ansible provisioner to apply this role.

I have no agent nodes, so I'm not tainting the server nodes. Provisioning a single-node cluster works without issue, but things break as soon as I specify multiple server nodes in my Ansible inventory.

In High Availability mode, the Create keepalived config file task fails with the following error:

An exception occurred during task execution. To see the full traceback, use -vvv.
The error was: ansible.errors.AnsibleUndefinedVariable: 'ansible.vars.hostvars.HostVarsVars object' has no attribute 'ansible_default_ipv4'.
'ansible.vars.hostvars.HostVarsVars object' has no attribute 'ansible_default_ipv4'

I was able to get past this issue by changing the below line to {{ hostvars[host].ansible_host }}

{{ hostvars[host]['ansible_default_ipv4']['address'] }}
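
A middle ground (just a sketch, not something I've verified in this setup) is to keep the fact-based address but fall back to ansible_host when facts for that host haven't been gathered:

{{ hostvars[host]['ansible_default_ipv4']['address'] | default(hostvars[host]['ansible_host']) }}

That only hides the symptom, though; the real problem is that facts for the other servers are never collected when Vagrant runs the playbook one machine at a time (see the note after the Vagrantfile below).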

Issue Type

Bug Report

Ansible Version

ansible [core 2.14.1]
  config file = None
  configured module search path = ['/home/austin/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /home/austin/.local/lib/python3.11/site-packages/ansible
  ansible collection location = /home/austin/.ansible/collections:/usr/share/ansible/collections
  executable location = /home/austin/.local/bin/ansible
  python version = 3.11.1 (main, Dec 11 2022, 15:18:51) [GCC 10.2.1 20201203] (/usr/bin/python3)
  jinja version = 3.1.2
  libyaml = True

Steps to Reproduce

Have libvirt set up and the vagrant-libvirt plugin installed, along with Vagrant, Ansible, and this role.

Below are the three files necessary when running vagrant up:

Vagrantfile

NODES = [
    { hostname: "controller1", ip: "192.168.111.2", ram: 4096, cpu: 2 },
    { hostname: "controller2", ip: "192.168.111.3", ram: 4096, cpu: 2 },
    { hostname: "controller3", ip: "192.168.111.4", ram: 4096, cpu: 2 }
]

Vagrant.configure(2) do |config|
  NODES.each do |node|
    config.vm.define node[:hostname] do |config|
      config.vm.hostname = node[:hostname]
      config.vm.box = "almalinux/8"
      config.vm.network :private_network, ip: node[:ip]

      config.vm.provider :libvirt do |domain|
        domain.memory = node[:ram]
        domain.cpus = node[:cpu]
      end

      config.vm.provision :ansible do |ansible|
        ansible.playbook = "playbooks/provision.yml"
        ansible.inventory_path = "inventory/hosts.ini"
      end
    end
  end
end
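
A note on the likely root cause: Vagrant's Ansible provisioner runs ansible-playbook once per machine with --limit set to just that machine, so when the role templates the keepalived config it cannot see facts (ansible_default_ipv4) for the other controllers. A common workaround, sketched below against the NODES list above (untested in this exact setup), is to provision only from the last machine and run against all hosts:

      # Sketch only: run Ansible once, against every machine, instead of once per machine
      if node[:hostname] == NODES.last[:hostname]
        config.vm.provision :ansible do |ansible|
          ansible.playbook = "playbooks/provision.yml"
          ansible.inventory_path = "inventory/hosts.ini"
          ansible.limit = "all"
        end
      end

With ansible.limit = "all", facts are gathered for every controller in a single run, so hostvars[host]['ansible_default_ipv4'] should be defined when the keepalived template is rendered.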

playbooks/provision.yml

- hosts: all
  become: true
  vars:
    rke2_channel: stable
    rke2_servers_group_name: rke2_servers
    rke2_agents_group_name: rke2_agents
    rke2_ha_mode: true
  roles:
  - lablabs.rke2

inventory/hosts.ini

[rke2_servers]
controller1 ansible_host=192.168.111.2 rke2_type=server
controller2 ansible_host=192.168.111.3 rke2_type=server
controller3 ansible_host=192.168.111.4 rke2_type=server

[rke2_agents]

[k8s_cluster:children]
rke2_servers
rke2_agents

Expected Results

For three server nodes to be provisioned after running vagrant up

Actual Results

All three servers fail while provisioning with the rke2 Ansible role.
aubaugh added the bug label on Feb 8, 2023

Larswa commented Nov 22, 2024

I know this is an old issue, but in case anyone else ends up here: I had this same problem.

In my case it came down to the etcd instances binding to the NAT network interface on my guests, which meant the etcd nodes couldn't reach each other on that network.
I tried playing with etcd configuration settings (https://etcd.io/docs/v3.4/op-guide/configuration/), such as --listen-peer-urls, but never found a combination that really worked. I ended up giving up on Vagrant for multi-control-plane testing and went with Multipass instead, since I was using Ubuntu servers anyway.

Even on Windows with Hyper-V, that worked really well.
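
For anyone who wants to stay on Vagrant/libvirt, one thing worth trying (a sketch only, I have not verified it fixes the etcd peering) is to pin RKE2 to the private-network address on each node via /etc/rancher/rke2/config.yaml, so it advertises that address rather than the NAT one. node-ip and advertise-address are standard RKE2 server options; whether this role exposes them directly is something to check in its variables. For controller1 from the inventory above:

# /etc/rancher/rke2/config.yaml on controller1 (use each node's own 192.168.111.x address)
node-ip: 192.168.111.2
advertise-address: 192.168.111.2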


Larswa commented Nov 22, 2024

Take a look at the etcd configuration and the logs from the etcd pod/containers, and see whether any of the etcd URLs in use are 10.0.2.15, which seems to be the default NAT eth0 interface address on all the Vagrant guests I've been playing with. If that is the case, then this is likely your issue.

sudo cat /var/lib/rancher/rke2/server/db/etcd/config
sudo /var/lib/rancher/rke2/bin/crictl --runtime-endpoint unix:///run/k3s/containerd/containerd.sock ps
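
For a quick check (same path as the first command above), something like:

sudo grep -n "10.0.2.15" /var/lib/rancher/rke2/server/db/etcd/config

Any matches mean etcd is advertising the NAT address.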
