Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ephemeral os disk #674

Merged
merged 2 commits into from
Nov 24, 2022
Merged

Ephemeral os disk #674

merged 2 commits into from
Nov 24, 2022

Conversation

HappyTobi
Copy link
Contributor

@HappyTobi HappyTobi commented Oct 28, 2022

Hi all,

the pr related to the issue #671
Still open:
robocop fixed (1)

Checklist:

Please check each of the boxes below for which you have completed the corresponding task:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added tests that prove my fix is effective or that my feature works
  • All unit tests pass locally (after my changes)
  • Rubocop reports zero offenses (after my changes)

Please include below the summary portions of the output from the following 2 scripts:

pushd src/bosh_azure_cpi
  ./bin/test-unit
  ./bin/rubocop_check
popd

NOTE: Please see how to setup dev environment and run unit tests in docs/development.md.

Unit Test output:

1031 examples, 0 failures

Rubocop output:

Offenses:

lib/cloud/azure/vms/vm_manager.rb:22:5: C: Metrics/AbcSize: Assignment Branch Condition size for create is too high. [<85, 167, 47> 193.2/192]
    def create(bosh_vm_meta, location, vm_props, disk_cids, network_configurator, env, agent_settings, network_spec, config) ...
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

190 files inspected, 2 offenses detected, 1 offense auto-correctable

Changelog

Add Azure Ephemeral OS disk:
https://learn.microsoft.com/en-us/azure/virtual-machines/ephemeral-os-disks

Update bosh-cloud config to use ephemeral os disk:

"ephemeral_disk": {
      "size": 20480,
      "use_root_disk": true
    },
    "root_disk": {
      "placement": "cache-disk"
    }

New attribute at root_disk, placement
The placement can be set to: remote (default), cache-disk resource-disk

The disk calculation are change when use_root_disk is set to true and placement is different than remote.
The behaviour is that the size of the root and ephemeral disk are added together.

"ephemeral_disk": {
      "size": 20480,
      "use_root_disk": true
    },
    "root_disk": {
      "size": 30720,
      "placement": "cache-disk"
    }

will result in a ephemeral os disk with the size of 51200 placed on the vm cache disk.

Regards

@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Oct 28, 2022

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: HappyTobi / name: Tobias Schug (60fcfe0)

@HappyTobi
Copy link
Contributor Author

@rkoster @beyhan feel free to review my pr.

Thx

@beyhan beyhan requested review from a team, beyhan, bgandon and rkoster and removed request for a team and bgandon November 3, 2022 15:41
@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Nov 4, 2022

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: HappyTobi / name: Tobias Schug (f724d3e)

@HappyTobi HappyTobi force-pushed the ephemeral_os_disk branch 4 times, most recently from c8b6adb to f724d3e Compare November 4, 2022 14:50
@HappyTobi HappyTobi requested review from beyhan and removed request for rkoster November 10, 2022 11:27
@rkoster rkoster requested review from a team and danielfor and removed request for a team November 10, 2022 15:49
Copy link
Member

@beyhan beyhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have only the comment about the TODO otherwise it looks good to me.

src/bosh_azure_cpi/lib/cloud/azure/restapi/azure_client.rb Outdated Show resolved Hide resolved
@HappyTobi HappyTobi requested review from beyhan and removed request for danielfor November 11, 2022 13:46
@beyhan beyhan requested a review from danielfor November 11, 2022 14:08
@aramprice
Copy link
Member

Would it be possible to squash down the commits in this PR? The merge from master, and various fix commits will not make git history easier to comprehend. -Thanks!

@HappyTobi
Copy link
Contributor Author

I can squash them if you want, or some one else do it later when the pr will be merged.

@rkoster
Copy link
Contributor

rkoster commented Nov 17, 2022

@HappyTobi could you squash the commits as @aramprice has asked? Also, Aram did you just look at the commit messages or did you perform a full review? If so could you add your review to github so we can merge?

@aramprice
Copy link
Member

@rkoster the shape of this looks good to me but I don't have enough context on Azure to do a deep dive on the actual implementation.

@HappyTobi
Copy link
Contributor Author

@rkoster rebase is done

@klakin-pivotal
Copy link
Contributor

I'm the wrong one to give the PR a detailed evaluation, but I do have one question:

What happens if the largest temporary root disk that Azure is willing to provide is smaller than what we ask Azure for?

For context about my concern, notice that Size limit for OS disk row of the first table here https://learn.microsoft.com/en-us/azure/virtual-machines/ephemeral-os-disks, says:

Cache size or temp size for the VM size or 2040 GiB, whichever is smaller. For the cache or temp size in GiB, see DS, ES, M, FS, and GS

To see the maximum temporary storage limit for each machine type, go looking at the various machine sizes here https://learn.microsoft.com/en-us/azure/virtual-machines/sizes-general. (In particular, note that extremely budget-conscious customers might choose a Standard_B2s, which has a temporary storage limit of 8GB, or maybe a Standard_DS2_v2 which has a 14GB limit.)

Also, notice that some machine types don't have a local temporary disk. What happens if the VM size selected has no local temporary disk? https://learn.microsoft.com/en-us/azure/virtual-machines/azure-vms-no-temp-disk

@HappyTobi
Copy link
Contributor Author

@klakin-pivotal
Thx for asking.

I'm the wrong one to give the PR a detailed evaluation, but I do have one question:

What happens if the largest temporary root disk that Azure is willing to provide is smaller than what we ask Azure for?

You will receive an error that the disk cache / resource disk is to big for the select vm-type. (Azure api error response)

For context about my concern, notice that Size limit for OS disk row of the first table here https://learn.microsoft.com/en-us/azure/virtual-machines/ephemeral-os-disks, says:

Cache size or temp size for the VM size or 2040 GiB, whichever is smaller. For the cache or temp size in GiB, see DS, ES, M, FS, and GS

To see the maximum temporary storage limit for each machine type, go looking at the various machine sizes here https://learn.microsoft.com/en-us/azure/virtual-machines/sizes-general. (In particular, note that extremely budget-conscious customers might choose a Standard_B2s, which has a temporary storage limit of 8GB, or maybe a Standard_DS2_v2 which has a 14GB limit.)

Also, notice that some machine types don't have a local temporary disk. What happens if the VM size selected has no local temporary disk? https://learn.microsoft.com/en-us/azure/virtual-machines/azure-vms-no-temp-disk

You also receive an error while creating the VM.

I think you have to think about disks and vm-types upfront before you start to deploy.
The default is like before so no one will run into an issue without changing anything explicit.

Lest take a look at the DSv2-series:
https://learn.microsoft.com/en-us/azure/virtual-machines/dv2-dsv2-series#dsv2-series
Standard_DS4_v2.
Here you can choose between: resource-disk with a max size of 56GB or a cache-disk with 344GB max size.
So it's like i mentioned before, it's important to check the sizes upfront.

With the implementations it's also possible to place the os_disk only to the resource or cache-disk and the ephemeral-disk to a "default" remote one, so the user have the complete choice / freedom to configure the disk for there use-case.

@klakin-pivotal
Copy link
Contributor

Cool. If Azure sends back a response that causes the Bosh VM Creation task to fail, then that handles that concern.

Do you have examples of the error responses that we get back from Azure?

I'm concerned that they may not written well enough for an operator to understand what has gone wrong and how to correct it. If this is the case, we would be required to detect these error conditions and emit a better error message.

@HappyTobi
Copy link
Contributor Author

@klakin-pivotal
sorry for the late reply.
The error message looks like:

Error message: {
  "error": {
    "code": "NotSupported",
    "message": "OS disk of Ephemeral VM with size greater than 50 GB is not allowed for VM size Standard_D2s_v3 when the DiffDiskPlacement is CacheDisk. Please refer to https://aka.ms/Ephemeral for more details."
  }
}

Think the message is fine to everyone understand what's wrong and where they get more information if required.

@rkoster
Copy link
Contributor

rkoster commented Nov 24, 2022

Thanks @HappyTobi

@rkoster rkoster merged commit 13019a1 into cloudfoundry:master Nov 24, 2022
jpalermo added a commit that referenced this pull request Dec 4, 2022
Originally added in response to issue #472, which doesn't have a lot of details, but does mention
Virtual Machine Scale Sets. This doesn't seem to be a fully implemented feature and there is no
recent progress on making a real feature.

The config disk code is current breaking due to new api changes introduce as part of PR #674
Rather than fix those failures, it makes more sense to remove an unused feature.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: No status
Development

Successfully merging this pull request may close these issues.

5 participants