tfrobot: easy mass deployer for VMs #1504

despiegk · 2024-01-02T10:53:45Z

As a system administrator, I want to deploy a large number of virtual machines across different node groups with specific hardware and network configurations.

Acceptance Criteria

Node Group Creation:

The system should allow the creation of node groups with specified attributes such as number of nodes, minimum cores, memory, SSD and HDD storage, IP availability (both IPv4 and IPv6), and region.
Each node group should be able to specify whether the machines must be dedicated and certified, the minimum number of nodes in the farm, and the minimum number of nodes in that farm that satisfy the same rules.
The system should be able to handle a specified bandwidth for each node group.
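
The per-node part of these criteria can be sketched as a simple filter. This is a minimal illustration, not the tfrobot implementation: the `NodeGroup` field names follow the example spec in this issue, while the `Node` class, `matches` and `select_nodes` are hypothetical names introduced here. Farm-level rules (region, `min_nodes_farm`, `min_nodes_apply_rules`, `min_bw`) are left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NodeGroup:
    """Per-node requirements of one node group (field names from the example spec)."""
    name: str
    nrnodes: int
    nrcores_min: int = 0
    mem_min: int = 0  # GB, as in the example spec
    ssd_min: int = 0
    hdd_min: int = 0
    pubip4: bool = False
    pubip6: bool = False
    certified: bool = False
    dedicated: bool = False


@dataclass
class Node:
    """Hypothetical view of a candidate node; not the real grid API."""
    node_id: int
    cores: int
    mem: int
    ssd: int
    hdd: int
    has_ip4: bool
    has_ip6: bool
    certified: bool
    rentable: bool  # assumption: the node can be reserved as a dedicated machine


def matches(group: NodeGroup, node: Node) -> bool:
    """Return True if the node satisfies every per-node requirement of the group."""
    return (
        node.cores >= group.nrcores_min
        and node.mem >= group.mem_min
        and node.ssd >= group.ssd_min
        and node.hdd >= group.hdd_min
        and (node.has_ip4 or not group.pubip4)
        and (node.has_ip6 or not group.pubip6)
        and (node.certified or not group.certified)
        and (node.rentable or not group.dedicated)
    )


def select_nodes(group: NodeGroup, candidates: List[Node]) -> List[Node]:
    """Pick the first nrnodes matching candidates, or fail with a clear message."""
    picked = [n for n in candidates if matches(group, n)]
    if len(picked) < group.nrnodes:
        raise ValueError(
            f"node group {group.name!r}: found {len(picked)} matching nodes, "
            f"need {group.nrnodes}"
        )
    return picked[: group.nrnodes]
```

Raising when too few nodes match gives the caller a message that can be recorded against the node group in the output file.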

Virtual Machine Deployment:

VMs should be deployable within the created node groups.
Each VM specification should include the number of VMs, the associated node group, CPU cores, memory, SSD capacity and mount point, HDD attachment, public IPs (IPv4 and IPv6), file list (flist), root size, and the associated SSH key. VM deployment should be flexible, for example by letting the user specify the number of VMs.

SSH Key Management:

SSH keys should be manageable, allowing the specification of a key name and public key.
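
One way to read this criterion: each VM refers to a key by name, and the deployer resolves that name to a public key before deployment. The sketch below assumes the `sshkey` and `vms` entries are already parsed into dictionaries shaped like the example spec; `resolve_ssh_keys` is a hypothetical helper, not part of tfrobot.

```python
def resolve_ssh_keys(sshkeys, vms):
    """Map each VM name to the public key it references, failing on unknown key names."""
    by_name = {key["name"]: key["pubkey"] for key in sshkeys}
    resolved = {}
    for vm in vms:
        key_name = vm["sshkey"]
        if key_name not in by_name:
            raise KeyError(
                f"VM {vm['name']!r} references unknown SSH key {key_name!r}"
            )
        resolved[vm["name"]] = by_name[key_name]
    return resolved
```

Checking the references up front means a misspelled key name is reported before any VM is deployed.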

Output Specification:

Upon successful deployment, the system should output a YAML file with details of the deployed node groups and VMs, including their names and public IP addresses.
In case of errors, the output file should include the details of the affected node group or VM and the corresponding error message.
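
A rough sketch of how such an output document could be assembled. It assumes the `ok` / `error` layout shown in the example result in this issue, with each error entry carrying both a name and a message. `build_output` is a hypothetical helper, and JSON is used only because it needs no third-party library; a YAML dump of the same structure would work the same way.

```python
import json


def build_output(successes, failures):
    """Build the result document: one 'ok' section and one 'error' section.

    successes: list of dicts with 'name' and optional 'pubip4' / 'pubip6'
    failures:  list of (name, message) tuples
    """
    return [
        {
            "ok": [
                {
                    "name": s["name"],
                    "pubip4": s.get("pubip4", ""),
                    "pubip6": s.get("pubip6", ""),
                }
                for s in successes
            ]
        },
        {"error": [{"name": name, "msg": msg} for name, msg in failures]},
    ]


# Example with made-up values (addresses are from documentation ranges).
result = build_output(
    [{"name": "mymachine_1", "pubip4": "203.0.113.10", "pubip6": "2001:db8::10"}],
    [("mydb", "no node with enough free memory")],
)
print(json.dumps(result, indent=2))
```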

Parallel Processing:

The deployment process should be optimized for parallel execution to ensure efficient and quick deployment across multiple node groups and VMs using Batch calls.
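
The issue asks for batch calls; as a stand-in, the sketch below uses a plain thread pool to show the general shape of deploying many VMs concurrently while keeping per-VM errors separate. `deploy_all` and `fake_deploy` are hypothetical names, and `fake_deploy` only simulates a deployment.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def deploy_all(vm_names, deploy_one, max_workers=8):
    """Run deploy_one(name) for every VM in parallel.

    Returns (ok, errors): ok maps VM name to the deploy result,
    errors maps VM name to the error message of the failed deployment.
    """
    ok, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(deploy_one, name): name for name in vm_names}
        for future in as_completed(futures):
            name = futures[future]
            try:
                ok[name] = future.result()
            except Exception as exc:  # one failed VM must not abort the others
                errors[name] = str(exc)
    return ok, errors


def fake_deploy(name):
    """Simulated deployment: fails for 'mydb', returns a made-up IP otherwise."""
    if name == "mydb":
        raise RuntimeError("no capacity left in group_a")
    return "203.0.113.10"


ok, errors = deploy_all(["mymachine_1", "mymachine_2", "mydb"], fake_deploy)
```

Collecting failures per VM instead of raising on the first one matches the requirement that errors end up in the output file next to the successful deployments.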

Error Handling:

The system should robustly handle errors, providing meaningful error messages in the output file for any issues encountered during the deployment process.

Example Scenario

A user inputs the YAML or JSON configuration for deploying several VMs across different regions with specific hardware and network requirements. The system processes this configuration, setting up the node groups and VMs as specified. Once the deployment is complete, the system outputs a YAML file containing details of each deployed VM and node group, including their public IP addresses. In case of any errors, the system provides detailed error messages to facilitate troubleshooting. The entire process runs efficiently in parallel to minimize deployment time.

example spec for mass deployment

- nodegroup:
    - name: 'group_a'
      # number of nodes to find
      nrnodes: 5
      # minimum number of logical cores
      nrcores_min: 10
      # minimum memory in GB
      mem_min: 32
      ssd_min: 2000
      hdd_min: 30000
      # full machine capacity is available, i.e. the node can be made dedicated
      dedicated: true
      pubip4: true
      pubip6: true
      # comma-separated list of regions
      # list see: https://apps.who.int/gho/data/node.searo-metadata.UNREGION?lang=en
      region: "UN_Africa,UN_Eastern_Asia"
      certified: true
      # minimum number of nodes in the farm
      min_nodes_farm: 5
      # minimum number of nodes in the same farm that satisfy the rules above
      min_nodes_apply_rules: 3
      # minimum achievable bandwidth (open question: do we know this?)
      min_bw: 100
- sshkey:
    - name: 'despiegk'
      pubkey: ''
- vms:
    - name: 'mymachine_${nr}'
      nrvm: 4
      nodegroup: 'group_a'
      # logical cores of the machine
      nrcpu: 10
      # memory in MB
      mem: 2000
      # SSD capacity in GB
      ssd:
        - capacity: 200
          mount: '/mydata'
      # true means all HDDs are given raw to the VM
      hdd_attached: true
      pubip4: true
      pubip6: true
      flist: ...
      rootsize: 200
      sshkey: 'despiegk'
    - name: 'mydb'
      nodegroup: 'group_a'
      # all_capacity means the VM gets access to all SSDs, all memory, all cores and all HDDs.
      # For SSDs, "all" means a directory with no size limit is created on each SSD
      # and given to the VM as /data/1... for each SSD.
      all_capacity: true
      pubip4: true
      pubip6: true
      flist: ...
      sshkey: 'despiegk'
      rootsize: 100

The spec can be given in YAML or JSON.
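
To illustrate the JSON form and the `${nr}` placeholder in VM names, here is a small sketch. It assumes the JSON mirrors the YAML layout (a list of single-key sections), that a missing `nrvm` means one VM, and that numbering starts at 1; none of this is confirmed by the issue. `expand_vms` is a hypothetical helper.

```python
import json
from string import Template

# Trimmed JSON version of the example spec; only the fields needed here.
SPEC_JSON = """
[
  {"vms": [
    {"name": "mymachine_${nr}", "nrvm": 4, "nodegroup": "group_a"},
    {"name": "mydb", "nodegroup": "group_a"}
  ]}
]
"""


def expand_vms(vm_spec):
    """Turn one VM spec entry into nrvm concrete VMs with ${nr} filled in."""
    count = vm_spec.get("nrvm", 1)  # assumption: no nrvm means a single VM
    expanded = []
    for nr in range(1, count + 1):  # assumption: numbering starts at 1
        vm = {k: v for k, v in vm_spec.items() if k != "nrvm"}
        vm["name"] = Template(vm_spec["name"]).safe_substitute(nr=nr)
        expanded.append(vm)
    return expanded


spec = json.loads(SPEC_JSON)
vms = [
    vm
    for section in spec
    for vm_spec in section.get("vms", [])
    for vm in expand_vms(vm_spec)
]
names = [vm["name"] for vm in vms]
print(names)
```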

The result is YAML or JSON and gives the following info:

- ok:
    - name: 'group_a'
      pubip4: '333.333.333.333'
      pubip6: '...'
- error:
    - name: 'group_a'
      msg: ''

despiegk · 2024-01-02T11:09:28Z

maybe need to add something with uptime requirement for node

despiegk · 2024-01-30T08:34:17Z

we need to be able to delete what we deployed as well

despiegk added the type_story label Jan 2, 2024

rawdaGastan mentioned this issue Jan 4, 2024

Add mass deployer pkg threefoldtech/tfgrid-sdk-go#604

Closed

xmonader self-assigned this Jan 11, 2024

xmonader added this to 3.14.x Jan 11, 2024

xmonader moved this to In Progress in 3.14.x Jan 11, 2024

xmonader changed the title ~~easy deployer VM~~ tfrobot: easy mass deployer for VMs Jan 17, 2024

ramezsaeed mentioned this issue Jan 30, 2024

3.13 Testplan #1489

Closed

16 tasks

xmonader mentioned this issue Jan 31, 2024

🐞 [Bug] TFRobot: Add more details in the output file threefoldtech/tfgrid-sdk-go#724

Closed

A-Harby mentioned this issue Feb 5, 2024

TFRobot: missing requirements threefoldtech/tfgrid-sdk-go#756

Closed

xmonader added this to 3.13.x Feb 26, 2024

xmonader removed this from 3.14.x Feb 26, 2024

xmonader added this to the 3.13 milestone Feb 26, 2024

xmonader closed this as completed Feb 26, 2024

github-project-automation bot moved this to Done in 3.13.x Feb 26, 2024
