Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VMware - Secure Boot & Virtual TPM #10225

Open
wants to merge 2 commits into
base: develop
Choose a base branch
from

Conversation

stejskalleos
Copy link
Contributor

@stejskalleos stejskalleos commented Jun 27, 2024

Add support for Secure Boot and TPM
Inspired by https://github.com/ananace/foreman_vmware_advanced
Required FOG PR: fog/fog-vsphere#305

@stejskalleos stejskalleos changed the title VMware - Secure Boot & Virtual TPM [WIP] VMware - Secure Boot & Virtual TPM Jun 27, 2024
@stejskalleos stejskalleos marked this pull request as draft June 27, 2024 08:11
Copy link
Member

@chris1984 chris1984 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One comment, looks like tests are failing since we don't have fog-vsphere built with your change.

app/views/compute_resources_vms/form/vmware/_base.html.erb Outdated Show resolved Hide resolved
Copy link
Member

@ekohl ekohl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the firmware we have a field auto. Wouldn't it make more sense to add a value there for UEFI + SecureBoot? The default is auto, which derives it from the PXE loader. We should enhance that to automatically detect if secureboot was desired.

@nofaralfasi
Copy link
Contributor

With the firmware we have a field auto. Wouldn't it make more sense to add a value there for UEFI + SecureBoot? The default is auto, which derives it from the PXE loader. We should enhance that to automatically detect if secureboot was desired.

I didn't know that's how the Automatic option works. I'll have to adjust my recent changes accordingly.

@ekohl
Copy link
Member

ekohl commented Jul 18, 2024

It works by deriving the firmware_type from the pxe_loader:

def firmware_type(pxe_loader)
case pxe_loader
when 'None'
:none
when /UEFI/
:uefi
else
:bios
end
end
end

This is then used in the firmware_mapping:

def firmware_mapping(firmware_type)
return 'efi' if firmware_type == :uefi
'bios'
end

And in parsing the arguments this is used to determine the firmware:

args[:firmware] = firmware_mapping(firmware_type) if args[:firmware] == 'automatic'

In my libvirt EFI patch I copied this pattern:
https://github.com/theforeman/foreman/pull/9965/files#diff-525a301145c328b1c4d2e6dd7668a9ce4b9f64edb7b2876a4cdebeb5ee2689f9R162-R177

You can see I already added a TODO there to include secure boot.

This is completely lost in #10209.

@nofaralfasi nofaralfasi changed the title [WIP] VMware - Secure Boot & Virtual TPM VMware - Secure Boot & Virtual TPM Jul 21, 2024
@nofaralfasi nofaralfasi marked this pull request as ready for review July 21, 2024 12:09
@nofaralfasi
Copy link
Contributor

Requires: fog/fog-vsphere#305

@nofaralfasi
Copy link
Contributor

UI changes in my commit:

  • Added a new firmware type UEFI Secure Boot for Secure Boot.
  • Removed the Secure Boot checkbox (replaced it with the new firmware type).
  • Added a label helper indicating that TPM is only compatible with EFI firmware.
  • Hid the TPM option from the UI when it is not relevant.

image

Copy link
Member

@ekohl ekohl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test failure looks related. I think you need to delete (and then handle) virtual_tpm in parse_args.

app/views/compute_resources_vms/form/vmware/_base.html.erb Outdated Show resolved Hide resolved
@nofaralfasi
Copy link
Contributor

Test failure looks related. I think you need to delete (and then handle) virtual_tpm in parse_args.

I thought it's because we're missing the fog/fog-vsphere#305 PR.

@ekohl
Copy link
Member

ekohl commented Jul 22, 2024

No, it's because it tries to assign it to the host model now instead of transforming it to a vsphere api parameter

@chris1984
Copy link
Member

@nofaralfasi did you want to address @ekohl concerns before I test, that way we can test it just once and get this in faster once the code looks good.

@ekohl
Copy link
Member

ekohl commented Jul 22, 2024

concerns before I test, that way we can test it just once and get this in faster once the code looks good.

I assure you, you'll see the exact same failures as the automated tests so there's no point in testing manually.

@nofaralfasi
Copy link
Contributor

No, it's because it tries to assign it to the host model now instead of transforming it to a vsphere api parameter

Where do you see it in the tests result? I still think it's because of the missing PR from fog-vsphere.

@ekohl
Copy link
Member

ekohl commented Jul 24, 2024

Oh yes, I think I misread that. Apologies for that. Perhaps you can modify the PR to actually point to the PR so we can see if the tests would pass otherwise.

@nofaralfasi
Copy link
Contributor

I have another idea on how to implement this. Moving it to WIP until it's ready.

@ekohl
Copy link
Member

ekohl commented Jul 30, 2024

@nofaralfasi and I had some discussion and I came up with ekohl@7071624 as some possible improvements.

@nofaralfasi nofaralfasi requested a review from a team as a code owner July 31, 2024 12:15
@nofaralfasi
Copy link
Contributor

nofaralfasi commented Jul 31, 2024

Explanation of the Update:
Creating a new CR machine differs from creating a bare-metal machine, which means we currently lack a way to validate the VM form fields submitted by the user.

The recent updates have introduced a scenario where setting the firmware type to BIOS and enabling the new Virtual TPM option results in an error, as TPM is not compatible with BIOS.
To address this, the best solution would be to catch this error before sending the data to fog-vsphere and raise an error for the user, while preserving the form with its current values.

Future Improvements:
This part of the code would benefit from future enhancements, such as providing better descriptions of the fields (e.g., explaining what it means to set the firmware to Automatic) and adding validation of the data before it is sent to fog-vsphere. Unfortunately, we lack this functionality not only for the parameters added in this PR but also for the existing ones. If we want to refactor it, we should do it all together, which would require a larger PR.

app/models/compute_resources/foreman/model/vmware.rb Outdated Show resolved Hide resolved
@@ -49,6 +51,12 @@ end %>
{ :disabled => images.empty?, :label => _('Image'), :label_size => "col-md-2" } %>
</div>

<%= checkbox_f f, :virtual_tpm, { :help_inline => _("Add Virtual TPM module to the VM."),
:label => _('Virtual TPM'),
:label_help => _("Only compatible with EFI firmware."),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here it will also be confusing if we have both EFI and UEFI. A user may assume that it's incompatible with UEFI Secure Boot while in practice it's not.

You may solve it like this:

Suggested change
:label_help => _("Only compatible with EFI firmware."),
:label_help => _("Only compatible with (U)EFI firmware."),

But I'd argue to standardize everywhere on UEFI. Wikipedia also redirects from EFI to UEFI.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you think that’s clearer, I’ll proceed with the change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd still prefer to align everything on 1 term so it's clear it's the same under the hood, just secure boot or not.

test 'chooses BIOS firmware when no pxe loader is set and firmware is automatic' do
attrs_in = HashWithIndifferentAccess.new('firmware' => 'automatic')
attrs_out = {:firmware => "bios"}
assert_equal attrs_out, @cr.parse_args(attrs_in)
end
end

context 'virtual_tpm' do
test 'sets virtual_tpm to true when firmware is EFI and virtual_tpm is enabled' do
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd also expect test cases that cover BIOS firmware and UEFI secure boot

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to test the create_vm method. Testing only parse_args is not sufficient.

bundler.d/vmware.rb Outdated Show resolved Hide resolved
@nofaralfasi
Copy link
Contributor

In progress: improve testing for virtual_tpm.

end
end

if args[:firmware]&.start_with?('uefi')
Copy link
Contributor

@sbernhard sbernhard Aug 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me, this is pretty hard to understand why the firmware type is changed from uefi to efi. Maybe a comment why this needs to be done would help for the furture.
Additionally, I think the code can be simplified:

args[:firmware] = 'efi' if args[:firmware]&.start_with?('uefi')
args[:secure_boot] = args[:firmware] == 'uefi_secure_boot'

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that comment should describe the block above (automatic firmware) as well. The handling as a whole is not intuitive so a comment can make sense.

Copy link
Member

@ekohl ekohl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree the current code isn't great at relaying information.

What happens now is the orchestration calls setCompute:

def setCompute
logger.info "Adding Compute instance for #{name}"
if compute_attributes.nil?
failure _("Failed to find compute attributes, please check if VM %s was deleted") % name
return false
end
# TODO: extract the merging into separate class in combination
# with ComputeAttributesMerge and InterfacesMerge http://projects.theforeman.org/issues/14536
final_compute_attrs = compute_attributes.merge(compute_resource.host_compute_attrs(self))
self.vm = compute_resource.create_vm(final_compute_attrs)
rescue => e
# Workaround bug when the exception itself contains unresolved string placeholder
# Example: Foreman could not find a required vSphere resource. Check if Foreman has the required permissions and the resource exists. Reason: %s
# See: https://projects.theforeman.org/issues/32273
begin
failure _("Failed to create a compute %{compute_resource} instance %{name}: %{message}\n ") % { :compute_resource => compute_resource, :name => name, :message => e.message }, e
rescue => e2
logger.warn "Got #{e2.class} when accessing #{e.class} message attribute, falling back to message_untranslated containing #{e.message_untranslated}"
failure _("Failed to create a compute %{compute_resource} instance %{name}: %{message}\n ") % { :compute_resource => compute_resource, :name => name, :message => e.message_untranslated }
end
end

This calls the create_vm which calls parse_args. This means you can't really use the errors framework to get specific field level errors and only can raise an exception. Worse, we're actually performing expensive API calls (in parse_network) when we could fail early.

Still, parse_args can raise an exception and that should bubble up. You can then also add tests without having to test the whole create_vm method.

My thought process behind it is that you should fail as early as possible. Longer term I'd suggest that we should refactor setCompute to call something like normalize_args that effectively does that parse_args does now and returns early. Before any connections to the real compute resource have been made.

@nofaralfasi
Copy link
Contributor

nofaralfasi commented Aug 1, 2024

This calls the create_vm which calls parse_args. This means you can't really use the errors framework to get specific field level errors and only can raise an exception. Worse, we're actually performing expensive API calls (in parse_network) when we could fail early.

Before that, we are calling the new_vm method also from here compute_object. So if we raise the error from parse_args, the exception will result in breaking the whole page.

Update:
I can catch the exception in the compute_object method, which will break the form and display an error message to the user. I was trying to avoid this because I consider it an edge case and didn't want an error that breaks the whole form. However, this approach ensures that the exception is presented in a user-friendly manner, and it's been handled before sending the values to VMware.

My thought process behind it is that you should fail as early as possible...

I agree, but as I mentioned before, I propose adding validations for all the values before we send them to VMware.

@ekohl
Copy link
Member

ekohl commented Aug 1, 2024

Before that, we are calling the new_vm method also from here compute_object. So if we raise the error from parse_args, the exception will result in breaking the whole page.

Ah, this is something I missed altogether. Then I think your approach is good enough now and we need do something more to untangle the mess.

These are the reasons the host creation/edit form is so complex and why so many attempts to improve it end up failing.

Copy link
Member

@ekohl ekohl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given what you told me, I think the previous version with the raising in create_vm was indeed better. I'd prefer to revert back to that, or try what I suggest here in this review. I don't know if this works, but my untested theory is that the error should show up in the form field with TPM itself.


args[:virtual_tpm] = ActiveRecord::Type::Boolean.new.cast(args[:virtual_tpm])
if args[:firmware] == 'bios' && args[:virtual_tpm]
raise ArgumentError, _('TPM is not compatible with BIOS firmware. Please change Firmware or disable TPM.')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given what you pointed out, perhaps we can go back to errors.add(:virtual_tpm, _('TPM is not compatible with BIOS firmware. Please change Firmware or disable TPM.'))

Then later (I'll add a comment there too) we can check for errors

@@ -522,7 +548,6 @@ def create_vm(args = { })
clone_vm(args)
else
vm = new_vm(args)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then you can raise the error here (probably with some message):

Suggested change
vm = new_vm(args)
vm = new_vm(args)
raise ArgumentError if errors.any?

@nofaralfasi nofaralfasi marked this pull request as ready for review August 8, 2024 09:51
stejskalleos and others added 2 commits August 8, 2024 15:34
- Added a new firmware type for Secure Boot.
- Handles TPM conflict with BIOS.
- Removed unnecessary methods from the VMware model.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants