Hang in traverse - how to debug? #555
Comments
For that backtrace, only thread 2 is doing anything (the others are blocked in g_cond_wait), so the next steps would be |
Just did...
and now waiting(!) |
|
I pressed CTRL-C in the
That's "just" the directory where I call rmlint from (cwd of the call to rmlint), not the path of the file in question (?)
|
Then either GDB is confused or the path is corrupt. If it is definitely blocked in the kernel (something like |
I have to humbly ask for a translation if possible ;)
The original files were copied from a failing disk. This could be(?) a root of the problem perhaps(?) |
In GDB, run |
Is this the relevant pointer?:
|
Yep, it's definitely in a kernel syscall. Running |
Thanks for all of your help :)
Stack
i.e. the files are empty...(?)
dmesg
Does this look like the issue? I have seen the same message sporadically over the past week. I have had uas problems (even leading to device offlining) with other Raspberry Pis. It is a known problem area. Downgrading to usb2(!) can help, as can simply rebooting and unplugging/replugging the drives(!), and kernel updates help too. I checked the kernel version on this machine and see it is very old.
gdb
|
Dunno why there are no kernel stacks in /proc (maybe it's an optional feature or your kernel has limited stack unwinding?) but I would definitely attribute the hang to those kernel messages. Not an rmlint issue. |
Will close this as it really appears to be my system - thanks 😊 for the great input and help |
Investigating this:
So I went back to "old school" debugging and placed a logging output in traverse.c:
And now I see that rmlint chokes on AppContainerUserCertRead:
This is a pipe/FIFO:
You mentioned excluding such files from rmlint: Originally posted by @cebtenzzre in #555 (comment) |
I am able to reproduce the issue like this:
I am not surprised that you are one of the first people to hit this, because it requires that the non-default unstripped binary search is enabled, and that it encounters a FIFO, and that the FIFO is marked executable (apparently because it came from a Windows system). Here's the fix:
 lib/utilities.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/utilities.c b/lib/utilities.c
index 30d82607..00299728 100644
--- a/lib/utilities.c
+++ b/lib/utilities.c
@@ -467,7 +467,7 @@ bool rm_util_is_nonstripped(_UNUSED const char *path, _UNUSED RmStat *statp) {
 #if HAVE_LIBELF
     g_return_val_if_fail(path, false);
-    if(statp && (statp->st_mode & (S_IXUSR | S_IXGRP | S_IXOTH)) == 0) {
+    if(!S_ISREG(statp->st_mode) || (statp->st_mode & (S_IXUSR | S_IXGRP | S_IXOTH)) == 0) {
         return false;
     }
|
Only attempt to open regular files. It is entirely possible that we encounter an executable FIFO, which causes open() to block indefinitely if it is not opened for writing by another process. Fixes sahib#555
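For anyone who wants to see the idea behind the commit in isolation, here is a minimal standalone sketch. It is not rmlint code: the helper names `is_executable_regular_file` and `open_if_regular_executable` are made up for illustration, and it uses plain `stat()` rather than rmlint's `RmStat`. It only shows the general pattern of checking `S_ISREG` before calling a blocking `open()`.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of rmlint): true only for regular files
 * that have at least one execute bit set. */
static bool is_executable_regular_file(const char *path) {
    struct stat st;

    assert(path != NULL);

    if (stat(path, &st) != 0) {
        return false; /* cannot stat: treat as "not a candidate" */
    }
    if (!S_ISREG(st.st_mode)) {
        return false; /* FIFO, socket, device, directory, ... */
    }
    return (st.st_mode & (S_IXUSR | S_IXGRP | S_IXOTH)) != 0;
}

/* Hypothetical helper: only open the path if the check above passes.
 * Returns a file descriptor, or -1 if the path was skipped or open failed. */
static int open_if_regular_executable(const char *path) {
    if (!is_executable_regular_file(path)) {
        return -1; /* never reaches open(), so a FIFO cannot block us here */
    }
    return open(path, O_RDONLY);
}
```

With this check in place, an executable FIFO is skipped before `open()` is ever called. Note that there is still a small window between `stat()` and `open()` in which the path could be replaced; passing `O_NONBLOCK` to `open()` would be one way to guard against that as well, though that is not what the patch above does.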
Your very fast patch fixed the issue for me :) |
I am running master (2.10.1) and develop plus a few patches, and I am getting a hang during the initial traverse with both versions.
Traversing (316260 usable files / 0 + 0 ignored files / folders)
Always at the same file count.
After this, the drives spin down and nothing appears to be happening. The process continues to run, waiting for something. CPU usage is low.
I checked RAM use - no problems there. 4GB RAM, ca. 2GB used. (Raspberry Pi 4B)
With -vvv set, the last message I (always) see is: WARNING: Added big file /srv/path/Scans BU Dec 2014/Africa_Turmoil_WEB.pdf
Would it be this file causing the problem? (probably more likely something afterwards)
Running gdb (develop version compiled with DEBUG=1) when I get to the "same place" I do a CTRL-C (not sure if this is useful)
I am not sure if these backtraces help - I am VERY rusty at debugging.
Any hints on how to hunt this problem down appreciated :)