Alpine Linux Cannot Navigate Azure Files Mounts Reliably (SMB client issue in Alpine image) #1325
Comments
could you provide more details about azure file? what's the azure file storage class? Have you tried using premium file? |
I would reference the linked thread for more details. We're using Azure Files Standard. Azure Files Premium breaks our cost structure due to the 100 GiB minimum quota requirement. Per the musl team in the thread I linked to, the issue I'm reporting does not appear in Ubuntu containers using the GNU C library (glibc) but does appear in Alpine-based containers using musl libc. The root cause appears to be somewhere in the kernel: it caches data in some way that causes unexpected results when iterating while deleting 64+ files on a mounted NFS or SMB share. The issue does not happen with glibc because it uses much larger reads for directory iteration, which seems to effectively prevent the kernel from buffering the directory listing. |
It would be useful to get more information about this. Many performance optimizations for metadata went in after the 4.18 kernel (especially around the 5.0 kernel) for SMB3 queries, but even on 4.15 there are a few obvious things worth trying. It is also possible that this bug has been fixed in the last three years (and the fix is in more recent kernels), but in the meantime can you try some potential workarounds? Have you tried setting the mount option "actimeo=0" (to disable caching and see if the "can't remove directory" issue goes away)? The reverse, caching directory entries for longer (SMB3 defaults to a short dentry cache lifetime of only 1 second) by e.g. setting actimeo=60, would be useful to see how that affects your workload.
In addition, there are valid cases where "rm -rf" would fail. For example, if the application left one of the files in that directory tree open, then it cannot be deleted from the server until the file is closed (although it can be marked for deletion on close). The NFS client on Linux can work around this with a strategy called 'silly-rename', so if you do turn out to have this problem (i.e. the application forgot to close one or more of the files before deleting them), there may be a 'silly-rename' strategy for the SMB3 client that we can add to cifs.ko to work around the application problem in a similar way. One way to check whether "application forgot to close a file" is related to your problem is to run "lsof +D" on that directory tree before you run "rm -rf" to ensure no files are open in it. |
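The two caching experiments and the open-file check suggested above could be scripted roughly as follows. This is only a sketch: the helper names cifs_opts and remove_if_unused are hypothetical, the base mount options are copied from the repro mount command elsewhere in this thread, and credentials are deliberately left out.

```shell
#!/bin/sh
# Hypothetical helper: print a CIFS option string with the given
# attribute-cache timeout in seconds (0 for the "no caching" experiment,
# 60 for the "cache longer" experiment).
cifs_opts() {
    printf 'vers=3.0,dir_mode=0777,file_mode=0777,cache=strict,actimeo=%s\n' "$1"
}

# Hypothetical helper: refuse to delete a tree while lsof reports open
# files under it. If lsof is not installed, the check is skipped.
remove_if_unused() {
    dir=$1
    if command -v lsof >/dev/null 2>&1 && [ -n "$(lsof +D "$dir" 2>/dev/null)" ]; then
        echo "open files under $dir; not removing" >&2
        return 1
    fi
    rm -rf -- "$dir"
}
```

The output of cifs_opts 0 (or cifs_opts 60) would be passed to mount -o together with the username and password options, and remove_if_unused would replace a bare "rm -rf" in the failing workload to rule out the open-file case.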
Per the info in this article, we are able to demonstrate |
I tried |
I tried an experiment just now with this old Ubuntu kernel (4.15) mounted with SMB3 to Azure and didn't see a problem with either 256 or 2048 files (see below) using the test script mentioned earlier in the post. Reproducing this problem may require a more complex setup with containers (or perhaps a very old kernel missing some fixes?).
root@smf-old-ubuntu:/mnt/smftestshares# ~/test.sh 256
Trying to delete test files...
root@smf-old-ubuntu:/mnt/smftestshares# ls
Trying to delete test files...
root@smf-old-ubuntu# uname -a |
@smfrench: As I mentioned, the bug is not reproducible on Ubuntu because it uses the GNU C library (glibc) instead of musl libc. glibc works around the kernel bug by doing large reads to avoid caching; musl does small reads. You would need to try this in an Alpine container. |
I was wrong, it could repro, I forgot to
|
Here is how to repro this issue in your local environment; it is directly related to an SMB client issue (no AKS cluster needed):
mkdir /tmp/test
sudo mount -t cifs //accountname.file.core.windows.net/test /tmp/test -o vers=3.0,username=accountname,password=…,dir_mode=0777,file_mode=0777,cache=strict,actimeo=30
wget -O /tmp/test/test.sh https://raw.githubusercontent.com/andyzhangx/demo/master/debug/test.sh
docker run -it -v /tmp/test:/var/www/html/data/ --name alpine alpine:3.10 sh
# cd /var/www/html/data/
/var/www/html/data # ./test.sh 128
Creating '128' test files...
Trying to delete test files...
DELETED: 66 BEFORE: 128 AFTER: 62
DELETED: 63 BEFORE: 62 AFTER: 0
We are already looping in SMB experts to take a look at this issue. Also, the result is the same on an AKS node (Ubuntu 18.04, 5.0.0-1036-azure) running the alpine:3.10 image:
./test.sh 128
Creating '128' test files...
Trying to delete test files...
DELETED: 66 BEFORE: 128 AFTER: 62
DELETED: 63 BEFORE: 62 AFTER: 0 |
Whereas with an Ubuntu image, all files are deleted in a single pass:
# docker run -it -v /tmp/test:/var/www/html/data/ --name ubuntu ubuntu:16.04 sh
# cd /var/www/html/data/
# ./test.sh 128
Creating '128' test files...
Trying to delete test files...
DELETED: 129 BEFORE: 128 AFTER: 0
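For readers who cannot fetch the linked script, a create-then-delete test of this general shape can be sketched as below. This is a hypothetical stand-in, not the contents of the linked test.sh: the function name single_pass_test, the file naming, and the way the counts are computed are all assumptions, so its numbers will not match the output above exactly.

```shell
#!/bin/sh
# Hypothetical stand-in for a create-then-delete test; NOT the linked test.sh.
# Usage: single_pass_test <mount-dir> <file-count>
single_pass_test() {
    dir=$1
    count=$2
    mkdir -p "$dir/test"
    i=1
    while [ "$i" -le "$count" ]; do
        : > "$dir/test/file_$i"      # create an empty file
        i=$((i + 1))
    done
    before=$(find "$dir/test" -type f | wc -l)
    # Single pass: find iterates the directory while deleting from it,
    # which is the pattern that misbehaves on the affected mounts.
    find "$dir/test" -type f -delete
    after=$(find "$dir/test" -type f | wc -l)
    # $(( )) strips any padding that wc -l may add on some systems.
    echo "BEFORE: $((before)) AFTER: $((after)) DELETED: $((before - after))"
}
```

On a local filesystem this should report AFTER: 0 after one pass; a non-zero AFTER on the CIFS mount inside an Alpine container would correspond to the partial deletion shown in the Alpine runs.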
|
Action required from @Azure/aks-pm |
@VybavaRamadoss @RenaShahMSFT could you help? |
When we debugged this back in May, didn't it show that the bug was in the Alpine version of ls, not in the network fs client(s)? The SMB3 client (and presumably the NFS client as well) was returning the expected files, and delete worked fine, but the Alpine library (unlike the libc library called by ls) had a bug. It seemed to be related to the Alpine library not restarting the search properly when the directory contents changed because some files were removed in the middle of a directory search. |
@smfrench No, the issue is that there is a kernel bug that glibc avoids by doing greedy/large reads. Alpine does smaller reads of directory listings to limit memory consumption. So it is more accurate to say that Alpine does not work around the kernel bug while glibc does. But it is hard to say whether the glibc developers were aware that they were working around the bug or whether it was just coincidental. |
Could you confirm if this issue should still be open then if it's specific to Alpine? |
This issue has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs within 15 days of this comment. |
This issue will now be closed because it hasn't had any activity for 15 days after being marked stale. GuyPaddock, feel free to comment again within the next 7 days to reopen, or open a new issue after that time if you still have a question, issue, or suggestion. |
I did some digging into this issue and also discussed this in the linux-cifs mailing list. The documentation on readdir reads (https://pubs.opengroup.org/onlinepubs/9699919799/functions/readdir_r.html): So the different filesystems are left free to choose their own behaviour when this happens. This issue is seen quite often in Alpine, because it uses musl libc, which seems to send much smaller buffers down to VFS to read the dirents into. However, the main issue here is the implementation of rm used here (I don't know if this is the default GNU version of coreutils). It depends on the undefined behaviour of Linux VFS, where it should not. When doing recursive readdirs (where it knows that the directory has changed), it should rewind back to position 0 and start the next readdir again. This way, the problem can be fixed; and to me, that sounds like the right way to fix this problem. |
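The "rewind and read again" idea can be approximated from a shell script as a workaround. POSIX leaves it unspecified whether readdir returns entries that were added or removed after the directory was opened or rewound, so the sketch below never trusts a single listing: it re-lists until the directory is really gone. The function name remove_tree_with_rescan and the default limit of 10 passes are assumptions made for illustration; this is not a fix to rm itself.

```shell
#!/bin/sh
# Hypothetical helper: delete a tree by re-listing it until nothing is left.
# Each find invocation opens the directory afresh, which stands in for
# rewinding to position 0 after the directory contents have changed.
# Usage: remove_tree_with_rescan <dir> [max-passes]
remove_tree_with_rescan() {
    dir=$1
    max_passes=${2:-10}
    pass=0
    while [ -d "$dir" ] && [ "$pass" -lt "$max_passes" ]; do
        pass=$((pass + 1))
        # Delete whatever this pass can see; skipped entries are picked
        # up by the next pass.
        find "$dir" -mindepth 1 -delete 2>/dev/null || true
        # rmdir only succeeds once the directory is actually empty.
        rmdir "$dir" 2>/dev/null || true
    done
    # Succeed only if the directory no longer exists.
    [ ! -d "$dir" ]
}
```

With the behaviour reported in this thread (about half of the entries removed per pass), a 128-file directory would need only a few passes, so a small pass limit is enough to avoid looping forever when deletion fails for another reason, such as a file that is still open.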
When using Azure Files with Alpine-Linux-based containers on AKS, you may observe strange behavior when applications attempt to navigate folders containing more than 62 files. In fact, commands like "rm -rf" from the CLI will fail with "rm: can't remove 'test': Directory not empty".
A copious amount of additional information (including repro steps, environment, etc.) is available here:
https://gitlab.alpinelinux.org/alpine/aports/issues/10960
I'm posting a link to this issue here for two reasons:
Our nodes are currently running the following kernel version:
4.15.0-1063-azure #68-Ubuntu SMP Fri Nov 8 09:30:20 UTC 2019 x86_64 Linux
With this version of Kubernetes: