disk lookup by LUN ID can match a wrong disk #2034
Comments
@andyzhangx - there's an older, similar issue which was closed without a solution: #584
@RomanBednar thanks for the reminder, I will check and try to fix it in Nov.
@RomanBednar can you set udev rule on every node? thus |
close this issue since |
@andyzhangx Sorry for the late reply, we were waiting for customer feedback. They're currently using this set of udev rules and still hit the issue: https://github.com/Azure/WALinuxAgent/blob/master/config/66-azure-storage.rules And the
If I understand it right, the reason to have the udev rules was so that the driver matches the disk in But when the driver logs this warning: |
@RomanBednar that's incorrect, if azuredisk-csi-driver/pkg/azuredisk/azure_common_linux.go Lines 120 to 131 in 1054758
This azure disk csi driver will always find disk under |
@andyzhangx I might have described it wrong, you're right there is no matching done for devices under What we have under Maybe there's a different issue that I don't see yet? Because we have the right udev rules, the correct path is populated and still we're getting a wrong disk. |
What happened:
When the driver tries to match a disk by LUN ID in findDiskByLunWithConstraint, it can end up matching the wrong disk if the system is configured with multiple iSCSI disks with the same suffix, for example /sys/bus/scsi/devices/6:0:0:0 and /sys/bus/scsi/devices/4:0:1:0. Both have the :0 suffix, which is what the driver uses for matching.
In the case we observed, 6:0:0:0 was the disk expected to match, but 4:0:1:0 was matched instead. This is probably because there are two search paths: if the LUN ID 0 lookup fails for /dev/disk/azure/scsi1/ but succeeds for /dev/disk/by-id/, it might return an incorrect disk, as there are no further checks.
The disk with 4:0:1:0 in our case appears to be a BitLocker Encryption Key (BEK) Volume, which contained partitions and caused mount-utils to fail on format detection, which returns "unknown data, probably partitions" in such a case.
Not sure why the code is written this way, but would it be possible to change this so that the /dev/disk/azure/scsi1/ path is searched first and any fallback to by-id is more sophisticated, to ensure we match the correct disk? The current code will match almost anything under by-id, which is not safe, as shown above.
What you expected to happen:
The driver should match the correct disk on systems with multiple iSCSI disks that share the same suffix.
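To make the ambiguity and the requested behavior concrete, here is a minimal Go sketch. It is not the driver's code: matchBySuffix, pickDisk, the azureLinks map, and the sample device names are all hypothetical and only illustrate the idea. The sketch assumes the udev-populated links under /dev/disk/azure/scsi1/ are trusted first, and that a fallback by LUN alone should refuse to guess when more than one SCSI device carries that LUN.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// scsiAddr is a parsed host:channel:target:lun name, as used for
// entries under /sys/bus/scsi/devices (for example "6:0:0:0").
type scsiAddr struct {
	Host, Channel, Target, LUN int
}

// parseSCSIAddr splits a name such as "4:0:1:0" into its four fields.
func parseSCSIAddr(name string) (scsiAddr, error) {
	parts := strings.Split(name, ":")
	if len(parts) != 4 {
		return scsiAddr{}, fmt.Errorf("not a SCSI address: %q", name)
	}
	var n [4]int
	for i, p := range parts {
		v, err := strconv.Atoi(p)
		if err != nil {
			return scsiAddr{}, fmt.Errorf("bad field %q in %q: %w", p, name, err)
		}
		n[i] = v
	}
	return scsiAddr{Host: n[0], Channel: n[1], Target: n[2], LUN: n[3]}, nil
}

// matchBySuffix reproduces the ambiguity described in the issue:
// comparing only the trailing ":<lun>" accepts every device with that LUN.
func matchBySuffix(names []string, lun int) []string {
	suffix := ":" + strconv.Itoa(lun)
	var out []string
	for _, n := range names {
		if strings.HasSuffix(n, suffix) {
			out = append(out, n)
		}
	}
	return out
}

// pickDisk is a hypothetical lookup order. azureLinks stands in for the
// links found under /dev/disk/azure/scsi1/, keyed by LUN (an assumption
// made for this sketch). Only when no such link exists does it fall back
// to the SCSI device names, and then it returns an error on ambiguity.
func pickDisk(azureLinks map[int]string, scsiNames []string, lun int) (string, error) {
	if link, ok := azureLinks[lun]; ok {
		return link, nil
	}
	var candidates []string
	for _, name := range scsiNames {
		addr, err := parseSCSIAddr(name)
		if err != nil {
			continue // skip entries that are not host:channel:target:lun
		}
		if addr.LUN == lun {
			candidates = append(candidates, name)
		}
	}
	switch len(candidates) {
	case 0:
		return "", fmt.Errorf("no disk found for LUN %d", lun)
	case 1:
		return candidates[0], nil
	default:
		return "", fmt.Errorf("ambiguous match for LUN %d: %v", lun, candidates)
	}
}

func main() {
	names := []string{"6:0:0:0", "4:0:1:0", "6:0:0:1"}
	fmt.Println(matchBySuffix(names, 0))
	fmt.Println(pickDisk(map[int]string{0: "/dev/disk/azure/scsi1/lun0"}, names, 0))
	fmt.Println(pickDisk(nil, names, 0))
}
```

With the two devices from this report, the suffix match returns both names, while the sketched fallback reports an error instead of silently picking one of them. Which extra property should disambiguate the fallback (host number, vendor string, or something else) is left open here, since the issue does not say.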
How to reproduce it:
Not sure - in our case it happened during an upgrade, when the previous volumes had been provisioned with the in-tree plugin, but we don't have any evidence of when the BEK Volume was added. And as @jsafrane pointed out, the in-tree plugin uses the same device discovery, so it's unclear why we did not hit this issue before.
Anything else we need to know?:
Environment: