Encryption failing on Ubuntu 16.04 #334

Closed
russcam opened this issue Feb 3, 2017 · 21 comments

Comments

@russcam

russcam commented Feb 3, 2017

I'm building Azure Disk Encryption into an ARM template and have configured encryption of both the OS disk and the attached data disks within the template. Encryption is set to happen after VM provisioning and software installation, which use the newer Script VM Extension for Linux 2.0 used in the Quickstart templates (Azure/azure-quickstart-templates#2340).

The template deployment reports success, but the encryption operation fails on the OS disk and the data disks are left unencrypted (I guess the process fails on the OS disk and never gets to the data disks).

Here's a snippet of the extension log at /var/log/azure/Microsoft.Azure.Security.AzureDiskEncryptionForLinux/0.1.0.999283/extension.log (I can provide the full one if needed)

2017/02/03 00:16:54 [Microsoft.Azure.Security.AzureDiskEncryptionForLinux-1.0]: [StatusReport (0)] op: EnableEncryptionOSVolume
2017/02/03 00:16:54 [Microsoft.Azure.Security.AzureDiskEncryptionForLinux-1.0]: [StatusReport (0)] status: error
2017/02/03 00:16:54 [Microsoft.Azure.Security.AzureDiskEncryptionForLinux-1.0]: [StatusReport (0)] code: 19
2017/02/03 00:16:54 [Microsoft.Azure.Security.AzureDiskEncryptionForLinux-1.0]: [StatusReport (0)] msg: Failed to encrypt OS volume with error: Attempt #1 to unmount /oldroot failed with error: Command umount /oldroot failed with return code 32
2017/02/03 00:16:54 stdout:
2017/02/03 00:16:54
2017/02/03 00:16:54 stderr:
2017/02/03 00:16:54 umount: /oldroot: target is busy
2017/02/03 00:16:54         (In some cases useful info about processes that
2017/02/03 00:16:54          use the device is found by lsof(8) or fuser(1).)
2017/02/03 00:16:54 , stack trace: Traceback (most recent call last):
2017/02/03 00:16:54   File "/var/lib/waagent/Microsoft.Azure.Security.AzureDiskEncryptionForLinux-0.1.0.999283/main/oscrypto/ubuntu_1604/Ubuntu1604EncryptionStateMachine.py", line 166, in start_encryption
2017/02/03 00:16:54     self.enter_unmount_oldroot()
2017/02/03 00:16:54   File "/var/lib/waagent/Microsoft.Azure.Security.AzureDiskEncryptionForLinux-0.1.0.999283/transitions/transitions/core.py", line 222, in trigger
2017/02/03 00:16:54     return self.machine.process(f)
2017/02/03 00:16:54   File "/var/lib/waagent/Microsoft.Azure.Security.AzureDiskEncryptionForLinux-0.1.0.999283/transitions/transitions/core.py", line 526, in process
2017/02/03 00:16:54     return trigger()
2017/02/03 00:16:54   File "/var/lib/waagent/Microsoft.Azure.Security.AzureDiskEncryptionForLinux-0.1.0.999283/transitions/transitions/core.py", line 247, in _trigger
2017/02/03 00:16:54     if t.execute(event):
2017/02/03 00:16:54   File "/var/lib/waagent/Microsoft.Azure.Security.AzureDiskEncryptionForLinux-0.1.0.999283/transitions/transitions/core.py", line 148, in execute
2017/02/03 00:16:54     self._change_state(event_data)
2017/02/03 00:16:54   File "/var/lib/waagent/Microsoft.Azure.Security.AzureDiskEncryptionForLinux-0.1.0.999283/transitions/transitions/core.py", line 159, in _change_state
2017/02/03 00:16:54     event_data.machine.get_state(self.dest).enter(event_data)
2017/02/03 00:16:54   File "/var/lib/waagent/Microsoft.Azure.Security.AzureDiskEncryptionForLinux-0.1.0.999283/transitions/transitions/core.py", line 48, in enter
2017/02/03 00:16:54     event_data.machine.callback(oe, event_data)
2017/02/03 00:16:54   File "/var/lib/waagent/Microsoft.Azure.Security.AzureDiskEncryptionForLinux-0.1.0.999283/transitions/transitions/core.py", line 518, in callback
2017/02/03 00:16:54     func(*event_data.args, **event_data.kwargs)
2017/02/03 00:16:54   File "/var/lib/waagent/Microsoft.Azure.Security.AzureDiskEncryptionForLinux-0.1.0.999283/main/oscrypto/ubuntu_1604/Ubuntu1604EncryptionStateMachine.py", line 114, in on_enter_state
2017/02/03 00:16:54     super(Ubuntu1604EncryptionStateMachine, self).on_enter_state()
2017/02/03 00:16:54   File "/var/lib/waagent/Microsoft.Azure.Security.AzureDiskEncryptionForLinux-0.1.0.999283/main/oscrypto/OSEncryptionStateMachine.py", line 65, in on_enter_state
2017/02/03 00:16:54     self.state_objs[self.state].enter()
2017/02/03 00:16:54   File "/var/lib/waagent/Microsoft.Azure.Security.AzureDiskEncryptionForLinux-0.1.0.999283/main/oscrypto/ubuntu_1604/encryptstates/UnmountOldrootState.py", line 134, in enter
2017/02/03 00:16:54     self.command_executor.Execute('umount /oldroot', True)
2017/02/03 00:16:54   File "/var/lib/waagent/Microsoft.Azure.Security.AzureDiskEncryptionForLinux-0.1.0.999283/main/CommandExecutor.py", line 70, in Execute
2017/02/03 00:16:54     raise Exception(msg)
2017/02/03 00:16:54 Exception: Command umount /oldroot failed with return code 32
2017/02/03 00:16:54 stdout:
2017/02/03 00:16:54
2017/02/03 00:16:54 stderr:
2017/02/03 00:16:54 umount: /oldroot: target is busy
2017/02/03 00:16:54         (In some cases useful info about processes that
2017/02/03 00:16:54          use the device is found by lsof(8) or fuser(1).)
2017/02/03 00:16:54
2017/02/03 00:16:54 , stack trace: Traceback (most recent call last):
2017/02/03 00:16:54   File "/var/lib/waagent/Microsoft.Azure.Security.AzureDiskEncryptionForLinux-0.1.0.999283/main/handle.py", line 1522, in daemon_encrypt
2017/02/03 00:16:54     os_encryption.start_encryption()
2017/02/03 00:16:54   File "/var/lib/waagent/Microsoft.Azure.Security.AzureDiskEncryptionForLinux-0.1.0.999283/main/oscrypto/ubuntu_1604/Ubuntu1604EncryptionStateMachine.py", line 184, in start_encryption
2017/02/03 00:16:54     raise Exception(message)
2017/02/03 00:16:54 Exception: Attempt #1 to unmount /oldroot failed with error: Command umount /oldroot failed with return code 32
2017/02/03 00:16:54 stdout:
2017/02/03 00:16:54
2017/02/03 00:16:54 stderr:
2017/02/03 00:16:54 umount: /oldroot: target is busy
2017/02/03 00:16:54         (In some cases useful info about processes that
2017/02/03 00:16:54          use the device is found by lsof(8) or fuser(1).)

Checking the encryption status through the Azure PowerShell SDK correlates with the problem in the log:

Get-AzureRmVmDiskEncryptionStatus -ResourceGroupName "encrypted-cluster" -VMName "data-0"
Get-AzureRmVmDiskEncryptionStatus : Long running operation failed with status 'Failed'.
ErrorCode: VMExtensionProvisioningError
ErrorMessage: VM has reported a failure when processing extension 'AzureDiskEncryptionForLinux'. Error message: "Failed to encrypt OS volume with error: Attempt #1 to unmount /oldroot failed with 
error: Command umount /oldroot failed with return code 32
stdout:
stderr:
umount: /oldroot: target is busy
        (In some cases useful info about processes that
         use the device is found by lsof(8) or fuser(1).)
, stack trace: Traceback (most recent call last):
  File "/var/lib/waagent/Microsoft.Azure.Security.AzureDiskEncryptionForLinux-0.1.0.999283/main/oscrypto/ubuntu_1604/Ubuntu1604EncryptionStateMachine.py", line 166, in start_encryption
    self.enter_unmount_oldroot()
  File "/var/lib/waagent/Microsoft.Azure.Security.AzureDiskEncryptionForLinux-0.1.0.999283/transitions/transitions/core.py", line 222, in trigger
    return self.machine.process(f)
  File "/var/lib/waagent/Microsoft.Azure.Security.AzureDiskEncryptionForLinux-0.1.0.999283/transitions/transitions/core.py", line 526, in process
    return trigger()
  File "/var/lib/waagent/Microsoft.Azure.Security.AzureDiskEncryptionForLinux-0.1.0.999283/transitions/transitions/core.py", line 247, in _trigger
    if t.execute(event):
  File "/var/lib/waagent/Microsoft.Azure.Security.AzureDiskEncryptionForLinux-0.1.0.999283/transitions/transitions/core.py", line 148, in execute
    self._change_state(event_data)
  File "/var/lib/waagent/Microsoft.Azure.Security.AzureDiskEncryptionForLinux-0.1.0.999283/transitions/transitions/core.py", line 159, in _change_state
    event_data.machine.get_state(self.dest).enter(event_data)
  File "/var/lib/waagent/Microsoft.Azure.Security.AzureDiskEncryptionForLinux-0.1.0.999283/transitions/transitions/core.py", line 48, in enter
    event_data.machine.callback(oe, event_data)
  File "/var/lib/waagent/Microsoft.Azure.Security.AzureDiskEncryptionForLinux-0.1.0.999283/transitions/transitions/core.py", line 518, in callback
    func(*event_data.args, **event_data.kwargs)
  File "/var/lib/waagent/Microsoft.Azure.Security.AzureDiskEncryptionForLinux-0.1.0.999283/main/oscrypto/ubuntu_1604/Ubuntu1604EncryptionStateMachine.py", line 114, in on_enter_state
    super(Ubuntu1604EncryptionStateMachine, self).on_enter_state()
  File "/var/lib/waagent/Microsoft.Azure.Security.AzureDiskEncryptionForLinux-0.1.0.999283/main/oscrypto/OSEncryptionStateMachine.py", line 65, in on_enter_state
    self.state_objs[self.state].enter()
  File "/var/lib/waagent/Microsoft.Azure.Security.AzureDiskEncryptionForLinux-0.1.0.999283/main/oscrypto/ubuntu_1604/encryptstates/UnmountOldrootState.py", line 134, in enter
    self.command_executor.Execute('umount /oldroot', True)
  File "/var/lib/waagent/Microsoft.Azure.Security.AzureDiskEncryptionForLinux-0.1.0.999283/main/CommandExecutor.py", line 70, in Execute
    raise Exception(msg)
Exception: Command umount /oldroot failed with return code 32

My understanding is that it is possible to encrypt the disks on a running VM based on the examples in the quickstart templates. The data disks are RAID0ed as part of the script that installs the software.

Should the encryption happen before the VM Script extension runs and the software is deployed? Or, if it can happen afterwards, is this a bug in the encryption process?
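
The `umount` error in the log points to lsof(8) and fuser(1) for finding what keeps /oldroot busy. As a rough sketch of the same idea, the Python below scans `/proc` for processes whose working directory, root, executable, or open file descriptors resolve to a path under a mount point. The names `is_under` and `holders` are made up for illustration and are not part of the extension. It does not look at memory-mapped files, and it needs root to see other users' processes.

```python
import os


def is_under(path, mount_point):
    """Return True if path is the mount point itself or lies beneath it."""
    prefix = mount_point.rstrip("/")
    return path == prefix or path.startswith(prefix + "/")


def holders(mount_point, proc_root="/proc"):
    """Yield (pid, kind, target) for each process link that points under mount_point.

    proc_root is a parameter only so the function can be exercised against
    a fake directory tree; on a real system leave it at /proc.
    """
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        base = os.path.join(proc_root, entry)
        links = [(kind, os.path.join(base, kind)) for kind in ("cwd", "root", "exe")]
        fd_dir = os.path.join(base, "fd")
        try:
            links += [("fd", os.path.join(fd_dir, fd)) for fd in os.listdir(fd_dir)]
        except OSError:
            pass  # process exited, or we lack permission to list its fds
        for kind, link in links:
            try:
                target = os.readlink(link)
            except OSError:
                continue  # link vanished or is unreadable
            if is_under(target, mount_point):
                yield int(entry), kind, target
```

Running `for pid, kind, target in holders("/oldroot"): print(pid, kind, target)` as root right after a failed attempt should list candidates to stop before retrying.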

@marty-buly

Similar issue here running encryption of OS + Data via PowerShell on Ubuntu 16.04:

2017/03/09 18:46:02 [Microsoft.Azure.Security.AzureDiskEncryptionForLinux-1.0]: 1447: [Info] Command /bin/mount /dev/sdc1 /mnt/azure_bek_disk -t vfat failed with return code 32
2017/03/09 18:46:02 stdout:
2017/03/09 18:46:02
2017/03/09 18:46:02 stderr:
2017/03/09 18:46:02 mount: /dev/sdc1 is already mounted or /mnt/azure_bek_disk busy
2017/03/09 18:46:02 /dev/sdc1 is already mounted on /mnt/azure_bek_disk
2017/03/09 18:46:02
2017/03/09 18:46:02 [Microsoft.Azure.Security.AzureDiskEncryptionForLinux-1.0]: 1447: [Info] /var/lib/azure_disk_encryption_config/azure_crypt_mount does not exist
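
In this log, `mount` returns 32 because /dev/sdc1 is already mounted on /mnt/azure_bek_disk, and the line is logged at `[Info]` level, so it may not be the actual failure. A minimal sketch of checking for an existing mount before calling `mount`, assuming `/proc/mounts`-style input (`find_mount` is a hypothetical helper, not the extension's code):

```python
def find_mount(mounts_text, device):
    """Return where device is mounted according to /proc/mounts-style text, or None.

    Each line is "device mountpoint fstype options dump pass". Mount points
    containing spaces are escaped as \\040 in /proc/mounts; this sketch
    returns them still escaped.
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == device:
            return fields[1]
    return None
```

On a live VM this could be called as `find_mount(open("/proc/mounts").read(), "/dev/sdc1")`, and the mount step skipped when the result already equals the intended target.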

@24X7

24X7 commented Apr 3, 2017

Is this getting worked on?

@marty-buly

marty-buly commented Apr 3, 2017 via email

@anniehedgpeth

anniehedgpeth commented May 17, 2017

I tried increasing the memory, and that did not help. I'm getting a similar error whether I run it from PowerShell or the Azure CLI. Could it be related?

Command

az vm encryption enable --aad-client-id <aad-client-id> --disk-encryption-keyvault <dek> -n <vmname> -g <rgname> --aad-client-secret <aad-client-secret> --volume-type All

Error

VM has reported a failure when processing extension 'AzureDiskEncryptionForLinux'. Error message: "Failed to enable the extension with error: 'NoneType' object has no attribute 'getheader', stack trace: Traceback (most recent call last):
  File "/var/lib/waagent/Microsoft.Azure.Security.AzureDiskEncryptionForLinux-0.1.0.999297/main/handle.py", line 671, in enable_encryption
    DiskEncryptionKeyFileName=extension_parameter.DiskEncryptionKeyFileName)
  File "/var/lib/waagent/Microsoft.Azure.Security.AzureDiskEncryptionForLinux-0.1.0.999297/main/KeyVaultUtil.py", line 63, in create_kek_secret
    bearerHeader = result.getheader("www-authenticate")
AttributeError: 'NoneType' object has no attribute 'getheader'

Updated to add: This may have been an issue with my Key Vault settings and/or SPN.

I've opened a new issue for my error.

@vsukhin

vsukhin commented May 24, 2017

I'm using PowerShell and see the same issue:

Set-AzureRmVMDiskEncryptionExtension : Long running operation failed with status 'Failed'.
ErrorCode: VMExtensionProvisioningError
ErrorMessage: VM has reported a failure when processing extension 'AzureDiskEncryptionForLinux'. Error message:
"Failed to enable the extension with error: 'NoneType' object has no attribute 'getheader', stack trace: Traceback
(most recent call last):
File "/var/lib/waagent/Microsoft.Azure.Security.AzureDiskEncryptionForLinux-0.1.0.999297/main/handle.py", line 671,
in enable_encryption
DiskEncryptionKeyFileName=extension_parameter.DiskEncryptionKeyFileName)
File "/var/lib/waagent/Microsoft.Azure.Security.AzureDiskEncryptionForLinux-0.1.0.999297/main/KeyVaultUtil.py", line
63, in create_kek_secret
bearerHeader = result.getheader("www-authenticate")
AttributeError: 'NoneType' object has no attribute 'getheader'

@vsukhin

vsukhin commented May 25, 2017

It's an exception in the HTTPUtil.Call method for KeyVaultURL. I tried the call manually from inside the VM and it works fine.
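
The traceback shows `result.getheader("www-authenticate")` being called on `None`, which suggests the HTTP call returned no response object and the real failure was hidden behind the `AttributeError`. As an illustration only (this is not the extension's code, and `get_bearer_challenge` is a made-up name), a guard like the following would surface a more useful message:

```python
def get_bearer_challenge(result):
    """Return the www-authenticate header from an HTTP response object.

    result is expected to expose getheader(name), as http.client.HTTPResponse
    does. A missing response or header raises a descriptive error instead of
    an AttributeError on None.
    """
    if result is None:
        raise RuntimeError(
            "No HTTP response from Key Vault; check the vault URL, "
            "network access and credentials"
        )
    header = result.getheader("www-authenticate")
    if header is None:
        raise RuntimeError("Key Vault response had no www-authenticate header")
    return header
```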

@olivierba

Same issue here on a VM with 14 GB of RAM; I tried several times without success.

[AzureDiskEncryption] 2943: [Info] Attempt #11 to unmount /oldroot
[AzureDiskEncryption] 2943: [Info] Attempt #11 to unmount /oldroot failed with error: Could not unmount /oldroot in 10 attempts, stack trace: Traceback (most recent call last):
File "/var/lib/waagent/Microsoft.Azure.Security.AzureDiskEncryptionForLinux-0.1.0.999297/main/oscrypto/ubuntu_1604/Ubuntu1604EncryptionStateMachine.py", line 168, in start_encryption
raise Exception("Could not unmount /oldroot in 10 attempts")
Exception: Could not unmount /oldroot in 10 attempts

I'm mounting the data disk at /opt instead of /mnt via cloud-init.

@psinger

psinger commented Jun 30, 2017

I encrypted two different VMs, and I am unable to bring either of them back up.

Checking the log, it appears to end in the error umount: /oldroot/mnt: not mounted.

Any ideas?

@vsukhin

vsukhin commented Jun 30, 2017 via email

@psinger

psinger commented Jun 30, 2017

@vsukhin Can you please elaborate?

@olivierba

Hi,
I've solved the issue by not activating basic metrics in the diagnostic settings. According to support, it installs an extension on the VM and that interferes with the encryption process.

@psinger

psinger commented Jun 30, 2017

Basic metric is already off :/

@vsukhin

vsukhin commented Jun 30, 2017 via email

@psinger

psinger commented Jun 30, 2017

That's not really a solution for me because I can't even remove the encryption.

@vsukhin

vsukhin commented Jun 30, 2017 via email

@makhdumi

It seems that Disk Encryption for Linux is not really implemented and is unsafe. It seems way easier to just run the two cryptsetup commands and mount the volume yourself.

#497
#496

@psinger

psinger commented Nov 30, 2017

It seems to work quite well with vanilla Ubuntu.

@makhdumi

It doesn't seem like KEK is implemented, based on the feedback I've gotten from Azure support, and the permissions on the keyfile are open to everyone. I really recommend not using this extension.

@ejarvi
Collaborator

ejarvi commented Mar 29, 2018

This initial issue appears to have been due to incorrect key vault settings, or perhaps an incorrect format being passed through the ARM template. If this occurred, it would certainly surface in the form of the extension error logs provided.

With respect to the other issues brought up here, they seem to have a similar end result (unfriendly and unhelpful error messages from the extension) but different root causes, so I will start chipping away at those on separate threads.

In the hopes that it will help shed more light on how to avoid this type of error in the future, here are some links on setting up key vault, the recommended workflow, and common troubleshooting tips:

PowerShell script demonstrating how to set up the necessary key vault prerequisites:
https://github.com/Azure/azure-powershell/blob/master/src/ResourceManager/Compute/Commands.Compute/Extension/AzureDiskEncryption/Scripts/AzureDiskEncryptionPreRequisiteSetup.ps1

Recommended Workflow:
https://docs.microsoft.com/en-us/azure/security/azure-security-disk-encryption-faq#what-is-the-recommended-azure-disk-encryption-workflow-for-Linux

Troubleshooting Guide:
https://docs.microsoft.com/en-us/azure/security/azure-security-disk-encryption-tsg

@ejarvi closed this as completed Mar 29, 2018

@darrell-tethr

This problem appears to have been caused by low memory. I had the minimum amount of RAM required (7 GB) on the VM but still got the /oldroot errors. Try stopping the Elasticsearch service on the ES Data Node before running encryption on the OS and data disks. It worked for me.

Example--
monit stop elasticsearch

@ejarvi
Collaborator

ejarvi commented Nov 2, 2018

Thanks for closing the loop on this. I suspect that, even with 7 GB of total RAM, this problem can still be triggered during the OS encryption stage if the disk layout differs from the gallery image, or if available memory is low because of other active memory use even though total memory is high.
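
Since the distinction here is between total and available memory, a pre-flight check on `MemAvailable` from `/proc/meminfo` may be more telling than the VM size alone. The sketch below is illustrative: `meminfo_kib` and `enough_available` are hypothetical helpers, and the threshold is a placeholder to tune, not a documented requirement of the extension.

```python
def meminfo_kib(text):
    """Parse /proc/meminfo-style text into a dict of field name -> value in KiB."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def enough_available(text, required_gib):
    """Return True if MemAvailable is at least required_gib GiB.

    Returns False when the field is missing (kernels older than 3.14
    do not report MemAvailable).
    """
    available = meminfo_kib(text).get("MemAvailable")
    if available is None:
        return False
    return available >= required_gib * 1024 * 1024
```

For example, `enough_available(open("/proc/meminfo").read(), 4)` could gate the encryption step, with memory-hungry services such as Elasticsearch stopped first if it returns False.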
