
PARKED - switch from add_host to dynamic inventory #804

Closed
wants to merge 18 commits

Conversation

@IPvSean (Contributor) commented Apr 28, 2020

SUMMARY

This PR switches from the add_host method to the aws_ec2 dynamic inventory plugin.

PROBLEM STATEMENT WITH ADD_HOST
  • add_host required a separate loop for each instance type. With 50 students, node1 alone required 50 loop iterations; with four node types (ansible, node1, node2, node3) across 50 students, that meant 200 iterations. It also required a fact-gathering call (e.g. ec2_instance_info) once per instance type, four times in this case (see example below).
  • the add_host tasks were also long: 13+ lines per host, which added up to a 57+ line add_host section per workshop type, or roughly 360 lines of Ansible playbook dealing solely with inventory creation. The dynamic inventory approach shrinks six files of 360+ lines and multiple tasks into a single 44-line file (and without the security workshop it would be closer to 30 lines); a minimal sketch follows this list.
  • the templating with add_host was horrendous. We could not use Ansible's natural parallelism to template files onto each device until inventory was sorted out, so we had to loop through the facts for each instance type and map them to the correct student using tags; only the subsequent roles were clean again (see example below).
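For illustration, a minimal sketch of what the single aws_ec2 inventory file could look like (the filters and tag names here are assumptions, not the exact file from this PR):

# workshop_inventory/aws_ec2.yml -- hypothetical sketch
plugin: aws_ec2
regions:
  - us-east-1
filters:
  instance-state-name: running
  # assumed tag applied by the provisioner to every workshop instance
  "tag:Workshop": my-workshop
keyed_groups:
  # one group per distinct Student tag value: student1, student2, ...
  - key: tags.Student
    prefix: ''
    separator: ''
hostnames:
  - tag:Name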

There may be an issue with this method, as outlined in ansible/ansible#69227.

I am going to work on a workaround.

ISSUE TYPE
  • Feature Pull Request
COMPONENT NAME
  • provisioner
ADDITIONAL INFORMATION

IPvSean added 8 commits April 28, 2020 19:31
My favorite person ever @liquidat
windows, network, rhel_90 and rhel are all working with dynamic inventory now
make sure workshop_inventory directory is present, failing zuul tests
@cigamit (Contributor) commented Apr 29, 2020

This will break workshops when VMs fail to tag because of AWS rate limiting (with add_host they still get set up properly).

This will also break the ability to provision multiple workshops at once: the second workshop will overwrite "{{ playbook_dir }}/workshop_inventory/aws_ec2.yml", which stops you from being able to tear down the first workshop, since teardown relies on that inventory file.
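For illustration, a task shaped like the following (the src and file layout are assumptions, not taken from this PR) writes every workshop's inventory to the same fixed path, so a second provision run clobbers the first workshop's file:

- name: template out aws_ec2 dynamic inventory config
  template:
    src: aws_ec2.yml.j2
    # fixed destination: a second workshop overwrites the first one's inventory
    dest: "{{ playbook_dir }}/workshop_inventory/aws_ec2.yml"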

@IPvSean (Contributor, Author) commented Apr 29, 2020

> This will also break the ability to provision multiple workshops at once: the second workshop will overwrite "{{ playbook_dir }}/workshop_inventory/aws_ec2.yml", which stops you from being able to tear down the first workshop, since teardown relies on that inventory file.

This makes sense to me; I have been thinking about how to solve this with Sloane.

> This will break workshops when VMs fail to tag because of AWS rate limiting (with add_host they still get set up properly).

The instances get tagged with ec2_tag, not through inventory. I am confused here.

@IPvSean (Contributor, Author) commented Apr 29, 2020

add_host method:
ansible-playbook provision_lab.yml -e @sean.yml  275.34s user 62.35s system 16% cpu 33:15.19 total

dynamic inventory method:
ansible-playbook provision_lab.yml -e @sean.yml  226.91s user 57.68s system 15% cpu 30:10.12 total

Time savings with dynamic inventory, from initial testing with 2 students, is 3 minutes and 5 seconds.

making provisioner more stable
Comment on lines +30 to +32
{% for n in range(1, student_total + 1) %}
student{{ n }}: "'student{{ n }}' == tags.Student"
{% endfor %}


Suggested change:

keyed_groups:
  - key: tags.Student
    prefix: ''
    separator: ''
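With keyed_groups, the plugin derives one group per distinct tags.Student value on its own, so the config no longer needs to be templated against student_total. Assuming two students, ansible-inventory --graph against such a config would show groups roughly like this (host names are illustrative):

@all:
  |--@student1:
  |  |--example-student1-node1
  |--@student2:
  |  |--example-student2-node1
  |--@ungrouped: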

@cigamit (Contributor) commented Apr 29, 2020

> This will break workshops when VMs fail to tag because of AWS rate limiting (with add_host they still get set up properly).

> The instances get tagged with ec2_tag, not through inventory. I am confused here.

Currently, the ec2 tagging can fail because of rate limiting. It's a known problem that doesn't affect the running of the workshop (the add_host tasks ensure the hosts still end up in the running inventory). They really need to fix the module, as some of the others do, to detect rate limiting and retry.

With the move to a dynamic inventory, though, the playbook won't know about these hosts, since they no longer match the filter. Thus they don't get added to the running inventory and don't get set up. The only people who will notice are the students, when they go to use their lab environment and can't connect to the nodes. The playbook run itself will pass, because it sets up every host it "knows" about.
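One possible playbook-level mitigation (an illustrative sketch, not from this PR; created_instances and the tag values are assumptions) is to retry the tagging task itself when rate limiting makes it fail:

- name: tag instances, retrying on failure (e.g. AWS rate limiting)
  ec2_tag:
    region: "{{ ec2_region }}"
    resource: "{{ item.id }}"
    tags:
      Student: "student{{ idx + 1 }}"
  loop: "{{ created_instances.instances }}"
  loop_control:
    index_var: idx
  register: tag_result
  # retry each item until the tag call succeeds
  until: tag_result is succeeded
  retries: 5
  delay: 10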

@IPvSean (Contributor, Author) commented Apr 30, 2020

> Currently, the ec2 tagging can fail because of rate limiting. It's a known problem that doesn't affect the running of the workshop (the add_host tasks ensure the hosts still end up in the running inventory). They really need to fix the module, as some of the others do, to detect rate limiting and retry.

But we use tags to drive the add_host method:

---
- name: grab facts for node1 node
  ec2_instance_info:
    region: "{{ ec2_region }}"
    filters:
      instance-state-name: running
      "tag:Workshop_node1": "{{ec2_name_prefix}}-node1"
  register: node1_node_facts
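(For context, a minimal sketch of the add_host task those facts would feed, assuming the register name and fields above; this is illustrative, not the exact task from the repo:)

- name: add node1 instances to inventory
  add_host:
    name: "{{ item.tags.Name }}"
    ansible_host: "{{ item.public_ip_address }}"
    groups: node1
    # carry the Student tag along as a host var
    student: "{{ item.tags.Student }}"
  loop: "{{ node1_node_facts.instances }}"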

Are you saying count_tag works no matter what, while the ec2_tag module gets rate limited?

IPvSean added 3 commits April 30, 2020 08:18
also removing old templates
fix windows workshop, accidentally removed this template
@cigamit (Contributor) commented Apr 30, 2020

We do an ec2 call to create the instances (but we apply no tags at creation, not even the same tag we use for count_tag... we should change that!) and then we run those through a loop and apply the ec2_tag tasks.

The instances I see that are untagged have zero tags applied to them, so yes, they wouldn't even be picked up by our ec2_instance_info call; it seems they aren't being set up even today. I wish we could reproduce this reliably; then we could check the hosts and see exactly what is going on.

For reference, in Skylight we don't use the ec2_instance_info call at all. We already have all the information we need about the hosts from the registered ec2 variables, so why make more API calls to get the same data?
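A minimal sketch of that pattern (module options and names are assumptions, not Skylight's actual tasks): the ec2 module's registered result already carries the instance data, so hosts can be added straight from it with no info call:

- name: create node1 instances
  ec2:
    region: "{{ ec2_region }}"
    image: "{{ node_ami }}"
    instance_type: t2.medium
    count: "{{ student_total }}"
    wait: true
  register: node1_instances

- name: add hosts straight from the creation result
  add_host:
    name: "node1-student{{ idx + 1 }}"
    ansible_host: "{{ item.public_ip }}"
    groups: node1
  loop: "{{ node1_instances.instances }}"
  loop_control:
    index_var: idx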

@IPvSean (Contributor, Author) commented Apr 30, 2020

> For reference, in Skylight we don't use the ec2_instance_info call at all. We already have all the information we need about the hosts from the registered ec2 variables, so why make more API calls to get the same data?

Ah, this makes way more sense to me now. I need to noodle on this... we are on the same wavelength now.

@cigamit (Contributor) commented Apr 30, 2020

I will point out that you use the info call because you use that variable in places such as the inventory template, and you use the tags from it too, which aren't present in the registered ec2 variable, since you tag later.

@IPvSean (Contributor, Author) commented May 1, 2020

This PR is parked; it will wait until we at least have an import_inventory.

@IPvSean changed the title from "switch from add_host to dynamic inventory" to "PARKED - switch from add_host to dynamic inventory - DO NOT CLOSE" on May 1, 2020
@IPvSean closed this on May 2, 2020
@IPvSean changed the title from "PARKED - switch from add_host to dynamic inventory - DO NOT CLOSE" to "PARKED - switch from add_host to dynamic inventory" on May 2, 2020
@IPvSean deleted the gh-pages branch on May 2, 2020 13:11