Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update azurerm_linux|windows_virtual_machine_scale_set - Add support for specialized custom image in vmss #7524

Conversation

ArcturusZhang
Copy link
Contributor

@ArcturusZhang ArcturusZhang commented Jun 29, 2020

Fixes #7772

@katbyte katbyte added enhancement service/vmss Virtual Machine Scale Sets labels Jun 29, 2020
Copy link
Contributor

@tombuildsstuff tombuildsstuff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi @ArcturusZhang

Thanks for this PR.

Taking a look through it appears we're making a number of assumptions about the behaviour of the API, which has changed in the past - as such we need to look this information up. We also need to add callouts that there's risks when using a Specialized images which aren't immediately apparent - as such can we add this to the disclaimers block?

Thanks!

@@ -869,14 +898,20 @@ func resourceArmLinuxVirtualMachineScaleSetRead(d *schema.ResourceData, meta int
d.Set("source_image_id", storageImageId)
}

disablePassword := true // default value of `disable_password_authentication` to avoid unexpected diff
provisionVMAgent := true // default value of `provision_vm_agent` to avoid unexpected diff
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

defaulting this to true is an unsafe default - we should default these to false unless we can show otherwise - the VM API should be exposing this information either way, if it's not, that's a bug in the Compute API

Copy link
Contributor Author

@ArcturusZhang ArcturusZhang Jul 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the schema definition, these two attributes are optional and have default value of true, and the biggest reason I have set these default values here is that in the SDK, these two attributes are in the OsProfile block, which is nil in the response when using specialized image (and OsProfile has to be nil in the request).
And we can take two different scenarios to view this:

  1. The user is not using the specialized image (the use case before this PR). If they are not specifying these two attributes, we will send the default value of these two (which is true according to the schema definition). And if the service is not making anything wrong, it should return the same values as the input. Therefore this should be consistency.
  2. The user is using the specialized image (the use case that is introducing in this PR). In this case, this indeed is problematic... I am not quite so sure how we could elegantly deal with this. Because these two attributes come from the OsProfile block, these two attributes are forbidden to set when using a specialized image. And these two attributes are optional with default values of true therefore absence of these two is the same as explicitly assigning those to true. And because of the same reason, I just skipped the checking of these two in L420-436. This would look so weird if I check these two are absent (I have to check these two are set to true). Similarly, if the user does not set these two in the config, they will have the value of their default (aka true). In the response, these two values will not be returned, if we regard this as false, in this case we will never get an empty plan after apply.

Do you have any recommendation on this situation?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, whilst the Compute API can opt to not require this information at Provisioning time - this information should be available from the image - so it should be able to be returned in the Get response once the machine has been provisioned?

From our side, assuming those fields have a default value has security implications - and as such we can't assume they're set to true. For example, if a user is using tooling (such as Sentinel or OpenPolicyAgent) to confirm that these flags are set to true - we'd assume this value is set to true even when it may not be.

As such, I'd suggest reaching out to the Compute Team here since this feels like a bug in the Compute API?

Copy link
Contributor Author

@ArcturusZhang ArcturusZhang Jul 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Emmm... maybe I am not fully getting the point.

But in this case, if a VM is provisioned using a specialized image, the OSProfile block must be nil, and when you get it, you will also get a nil OSProfile block. These two parameters are in the OSProfile.

I am making these two true because when you apply a VM provisioned by a specialized image, the service will not return a valid value for OSProfile and therefore no valid values for disablePassword and provisionVMAgent. When absence, terraform is assuming those as false. And now when you run terraform plan, you will get diff (since their default values are true)

Therefore rather than a bug in compute API, I suppose this should be a shortage of terraform that we cannot assign null to a bool attribute. And I am considering this as a workaround for the unnecessary diff.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But in this case, if a VM is provisioned using a specialized image, the OSProfile block must be nil, and when you get it, you will also get a nil OSProfile block. These two parameters are in the OSProfile.

My point here being even if the API doesn't require sending that information during creation - but this information should be returned from the API during the read, that's a bug in the Azure API.

Unfortunately assuming that these two fields are set to true is misleading and could lead to a subtle security issue where users believe this functionality is enabled (for example, from the image) but it's not.

Whilst not setting this value in Terraform will make the field null - the Azure API itself needs to return this information so that we can confirm to users this security functionality is enabled, else there's a potential security issue here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm in agreement with @tombuildsstuff - assumptions on a security posture must be avoided. For context, the issue isn't just procedural interaction with the API, it's the real-world implication of a key security setting. I've worked in environments where this kind of potential exposure could result in severe repercussions; both in business reputation/finance as well as potentially legal in extreme cases.

@@ -918,6 +955,8 @@ func resourceArmWindowsVirtualMachineScaleSetRead(d *schema.ResourceData, meta i
d.Set("source_image_id", storageImageId)
}

enableAutomaticUpdates := true // default value of `enable_automatic_updates` to avoid unexpected diff
provisionVMAgent := true // default value of `provision_vm_agent` to avoid unexpected diff
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(as above) we cannot assume these are enabled by default - there's false-positives here - we need to default these to false and parse them out of the API

@ArcturusZhang
Copy link
Contributor Author

ArcturusZhang commented Jul 2, 2020

Hi @tombuildsstuff I have resolved some of the comments. As for the default values of true I have left some explanations, please have a look, thank you!

@tombuildsstuff
Copy link
Contributor

@ArcturusZhang since the API doesn't return the data we need here unfortunately we're blocked on this one - as such I'm going to close this PR for the moment until there's a means of retrieving this information.

@ghost
Copy link

ghost commented Sep 6, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 hashibot-feedback@hashicorp.com. Thanks!

@ghost ghost locked and limited conversation to collaborators Sep 6, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
documentation enhancement service/vmss Virtual Machine Scale Sets size/XL upstream/microsoft Indicates that there's an upstream issue blocking this issue/PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

azurerm_windows_virtual_machine fails to create when referencing a "specialised" shared gallery image.
4 participants