Fix device expansion when VM is powered off #9111
Since this code takes different code paths for device expansion depending on if
For example, I create two pools:
Then I did the following:
Here we can see that the sizes of the pools still match:
Additionally, I then destroyed the two pools, and then recreated them with
Also worth noting, this is a port of this delphix-os commit (originally authored by @grwilson) which never made it upstream into illumos; we've been running with the delphix-os commit since at least August 2014.
@behlendorf any thoughts on this one?
Functionally this looks good to me. Just one question, and a small nit below. You mentioned that the label is getting rewritten with the correct values which is what's causing this issue. Do you know exactly what in your environment is doing this when powering off/on?
lib/libefi/rdwr_efi.c
		}
	}
	VERIFY3U(data_start + data_size, ==, resv_start);
	VERIFY3U(limit, >=, resv_start);
nit: Technically I believe we should be using `verify()` here for portability since this is built solely as a library. That said, returning `VT_EINVAL` as an error would probably be better since we are working with values read from disk and the vtoc could be damaged/incorrect. It would be better to handle the error rather than crash.
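To illustrate the suggestion above, here is a minimal sketch of what handling the error might look like. The helper `check_layout` and the constant `SKETCH_VT_EINVAL` are hypothetical names invented for this example; real code would use the `VT_EINVAL` definition from the library headers, and nothing here is claimed to match the code that was eventually merged.

```c
#include <stdint.h>

/*
 * Stand-in for the library's VT_EINVAL so this sketch compiles on its
 * own; the value is an assumption, real code would use the header's.
 */
#define	SKETCH_VT_EINVAL	(-4)

/*
 * Hypothetical helper: validate layout values read from disk and
 * report an error instead of asserting, since the label may be
 * damaged or incorrect.
 */
static int
check_layout(uint64_t data_start, uint64_t data_size,
    uint64_t resv_start, uint64_t limit)
{
	/* The data partition must end exactly where the reserved one starts. */
	if (data_start + data_size != resv_start)
		return (SKETCH_VT_EINVAL);

	/* The reserved partition must start at or before the limit. */
	if (limit < resv_start)
		return (SKETCH_VT_EINVAL);

	return (0);
}
```

The two conditions are the same ones the `VERIFY3U` calls express; the only difference is that a violation is returned to the caller rather than aborting the process.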
I can try to make this change if you'd like it in this PR.
I don't know for sure, but my guess is this is done by the ESX hypervisor.
@behlendorf in illumos we had seen that ESX was updating the label during this type of expansion. It caught us by surprise that it did that.
When running on an ESXi based VM, I've found that "zpool online -e" will
not expand the zpool, if the disk was expanded in ESXi while the VM was
powered off.

For example, take the following scenario:

 1. VM running on top of VMware ESXi
 2. ZFS pool created with a given device "sda" of size 8GB
 3. VM powered off
 4. Device "sda" size expanded to 16GB
 5. VM powered on
 6. "zpool online -e" used on device "sda"

In this situation, after (2) the zpool will be roughly 8GB in size.
After (6), the expectation is the zpool's size will expand to roughly
16GB in size; i.e. expand to the new size of the "sda" device.
Unfortunately, I've seen that after (6), the zpool size does not change.

What's happening is after (5), the EFI label of the "sda" device will be
such that fields "efi_last_u_lba", "efi_last_lba", and "efi_altern_lba"
all reflect the new size of the disk; i.e. "33554398", "33554431", and
"33554431" respectively.

Thus, the check that we perform in "efi_use_whole_disk":

    if ((efi_label->efi_altern_lba == 1) ||
        (efi_label->efi_altern_lba >= efi_label->efi_last_lba)) {

This will return true, and then we return from the function without
having expanded the size of the zpool/device.

In contrast, if we remove steps (3) and (5) in the sequence above, i.e.
the device is expanded while the VM is powered on, things change. In
that case, the fields "efi_last_u_lba" and "efi_altern_lba" do not
change (i.e. they still reflect the old 8GB device size), but the
"efi_last_lba" field does change (i.e. it now reflects the new 16GB
device size). Thus, when we evaluate the same conditional in
"efi_use_whole_disk", it'll return false, so the zpool is expanded.

Taking all of this into account, this PR updates "efi_use_whole_disk" to
properly expand the zpool when the underlying disk is expanded while the
VM is powered off.

Signed-off-by: Prakash Surya <prakash.surya@delphix.com>
@behlendorf I've squashed this to a single commit (with your feedback) and rebased onto master.
Codecov Report
@@ Coverage Diff @@
## master #9111 +/- ##
==========================================
- Coverage 79.11% 79.03% -0.09%
==========================================
Files 400 400
Lines 121790 121806 +16
==========================================
- Hits 96357 96265 -92
- Misses 25433 25541 +108
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: Don Brady <don.brady@delphix.com>
Signed-off-by: Prakash Surya <prakash.surya@delphix.com>
Closes openzfs#9111
Motivation and Context
When running on an ESXi based VM, I've found that "zpool online -e" will
not expand the zpool, if the disk was expanded in ESXi while the VM was
powered off.
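For example, take the following scenario:
 1. VM running on top of VMware ESXi
 2. ZFS pool created with a given device "sda" of size 8GB
 3. VM powered off
 4. Device "sda" size expanded to 16GB
 5. VM powered on
 6. "zpool online -e" used on device "sda"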
In this situation, after (2) the zpool will be roughly 8GB in size.
After (6), the expectation is the zpool's size will expand to roughly
16GB in size; i.e. expand to the new size of the "sda" device.
Unfortunately, I've seen that after (6), the zpool size does not change.
What's happening is after (5), the EFI label of the "sda" device will be
such that fields "efi_last_u_lba", "efi_last_lba", and "efi_altern_lba"
all reflect the new size of the disk; i.e. "33554398", "33554431", and
"33554431" respectively.
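The quoted LBA values are consistent with simple disk arithmetic. The sketch below is only an illustration under stated assumptions: 512-byte sectors, "16GB" meaning 16 GiB, and the common GPT layout in which the backup partition entry array (32 sectors) plus the backup header (1 sector) occupy the last 33 sectors of the disk. The helper names are hypothetical and are not part of libefi.

```c
#include <stdint.h>

/* Assumptions for illustration only; not taken from libefi. */
#define	SKETCH_SECTOR_SIZE	512ULL	/* assumed logical sector size */
#define	SKETCH_GPT_BACKUP	33ULL	/* 32 entry sectors + 1 header */

/* LBA of the last sector on a disk of the given size in bytes. */
static uint64_t
last_lba_for_size(uint64_t bytes)
{
	return (bytes / SKETCH_SECTOR_SIZE - 1);
}

/* Last LBA usable by partitions, just below the backup GPT area. */
static uint64_t
last_usable_lba(uint64_t last_lba)
{
	return (last_lba - SKETCH_GPT_BACKUP);
}
```

Under these assumptions, `last_lba_for_size(16ULL << 30)` gives 33554431 and `last_usable_lba(33554431)` gives 33554398, matching the values reported for "efi_last_lba" and "efi_last_u_lba".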
Thus, the check that we perform in "efi_use_whole_disk":
    if ((efi_label->efi_altern_lba == 1) ||
        (efi_label->efi_altern_lba >= efi_label->efi_last_lba)) {
This will return true, and then we return from the function without
having expanded the size of the zpool/device.
In contrast, if we remove steps (3) and (5) in the sequence above, i.e.
the device is expanded while the VM is powered on, things change. In
that case, the fields "efi_last_u_lba" and "efi_altern_lba" do not
change (i.e. they still reflect the old 8GB device size), but the
"efi_last_lba" field does change (i.e. it now reflects the new 16GB
device size). Thus, when we evaluate the same conditional in
"efi_use_whole_disk", it'll return false, so the zpool is expanded.
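The difference between the two cases can be modeled in isolation. This is a reduced sketch, not the libefi code: `struct label_sketch` is a hypothetical stand-in holding only the three fields discussed, the condition mirrors the pre-patch check `(efi_altern_lba == 1) || (efi_altern_lba >= efi_last_lba)`, and the 8GB values in the powered-on case are assumed (8 GiB with 512-byte sectors) rather than reported in the description.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the label; only the fields discussed here. */
struct label_sketch {
	uint64_t efi_last_lba;
	uint64_t efi_last_u_lba;
	uint64_t efi_altern_lba;
};

/* Mirrors the pre-patch early-return condition described above. */
static bool
old_check_skips_expansion(const struct label_sketch *l)
{
	return ((l->efi_altern_lba == 1) ||
	    (l->efi_altern_lba >= l->efi_last_lba));
}

/* Expanded while powered off: all three fields show the 16GB size. */
static const struct label_sketch expanded_powered_off = {
	.efi_last_lba = 33554431,
	.efi_last_u_lba = 33554398,
	.efi_altern_lba = 33554431,
};

/*
 * Expanded while powered on: only efi_last_lba reflects 16GB. The
 * other two values are assumed 8GB-era values, not from the report.
 */
static const struct label_sketch expanded_powered_on = {
	.efi_last_lba = 33554431,
	.efi_last_u_lba = 16777182,
	.efi_altern_lba = 16777215,
};
```

With these inputs the old condition is true for `expanded_powered_off` (so expansion is skipped) and false for `expanded_powered_on` (so expansion proceeds), which is the asymmetry the PR addresses.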
Taking all of this into account, this PR updates "efi_use_whole_disk" to
properly expand the zpool when the underlying disk is expanded while the
VM is powered off.
Signed-off-by: Prakash Surya <prakash.surya@delphix.com>