v2api: fsstress crashes #372

rfjakob · 2020-09-18T11:05:42Z

These are pretty hard to reproduce, but they do happen after hours of running fsstress in a loop. We seem to have a race in the LOOKUP/FORGET reference counting. I'm working on finding the race, and will use this ticket to track progress.

I can repro both in loopback and also in gocryptfs (v2api branch). I had two failure modes so far:

2020/09/15 23:36:23 unknown node 3245
panic: unknown node 3245

goroutine 13312812 [running]:
log.Panicf(0x55b2fe1082ac, 0xf, 0xc0003c7db8, 0x1, 0x1)
	log/log.go:358 +0xc2
github.com/hanwen/go-fuse/v2/fs.(*rawBridge).inode(0xc0000fa5a0, 0xcad, 0x0, 0x0, 0x0)
	github.com/hanwen/go-fuse/v2@v2.0.4-0.20200908172753-0b6cbc515082/fs/bridge.go:292 +0x17d
github.com/hanwen/go-fuse/v2/fs.(*rawBridge).GetXAttr(0xc0000fa5a0, 0xc0008621e0, 0xc00097d158, 0xc0031e2b40, 0x13, 0x0, 0x0, 0x0, 0xc0000da0b8)
[...]

The FUSE operation in the backtrace varies.

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x38 pc=0x56021a70cf2a]

goroutine 6044480 [running]:
github.com/hanwen/go-fuse/v2/fs.(*rawBridge).releaseFileEntry(0xc0001725a0, 0x1763, 0x25, 0x0, 0x0)
	github.com/hanwen/go-fuse/v2@v2.0.4-0.20200908172753-0b6cbc515082/fs/bridge.go:775 +0xba
github.com/hanwen/go-fuse/v2/fs.(*rawBridge).Release(0xc0001725a0, 0xc001190360, 0xc000a86a98)
	github.com/hanwen/go-fuse/v2@v2.0.4-0.20200908172753-0b6cbc515082/fs/bridge.go:734 +0x6c
github.com/hanwen/go-fuse/v2/fuse.doRelease(0xc0001c2000, 0xc000a86900)
	github.com/hanwen/go-fuse/v2@v2.0.4-0.20200908172753-0b6cbc515082/fuse/opcode.go:379 +0x50
[...]

fsstress logs for reference (does not include FUSE debug output):
fsstress-loopback.log
fsstress-gocryptfs.log

The text was updated successfully, but these errors were encountered:

rfjakob · 2020-09-23T18:24:31Z

I had collect-crashes.sh running overnight with FUSE debug output.

It caught eight crashes, logfiles tar'ed up here:
fsstress.log.2020-09-23.tar.gz

This is the go-fuse commit being tested: rfjakob@f61e9a6 . Only some debug output added compared to master.

rfjakob · 2020-09-24T12:16:56Z

Tracked it down a little bit. Seems like the kernel does not like it when MKDIR hands out a recycled inode number before the previous RMDIR finishes. There is an extra FORGET 1 that causes a lookupCount underflow.

*03:09:08.680484 rx 553441476: FORGET i581 {Nlookup=4}  req=0xc000290900
*03:09:08.680490 removeRef: i581 lookupCount=0 nl=0 n=0xc0001aae00
*03:09:08.680493 -- 553441476:     done (no reply)
*03:09:09.694282 rx 553501112: MKDIR i307 {0777 (00)} ["d12XXXXXXXXXXXXX"] 17b req=0xc000117680
*03:09:09.694350 addNewChild: i581 lookupCount=1 nl=1 addr=0xc0002f25b0
*03:09:09.694376 tx 553501112:     OK, {i581 g1 tE=1s tA=1s {M040755 SZ=4096 L=2 1026:1026 B8*4096 i0:581 A 1600909749.692894 M 1600909749.692894 C 1600909749.692894}}
*03:09:09.699302 rx 553501486: RMDIR i307 ["d12XXXXXXXXXXXXX"] 17b req=0xc0001f4480
*03:09:09.699492 rx 553501516: MKDIR i214 {0777 (00)} ["d22XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"] 72b req=0xc00033efc0
*03:09:09.699539 addNewChild: i581 lookupCount=2 nl=2 addr=0xc0002f25b0
*03:09:09.699550 tx 553501516:     OK, {i581 g1 tE=1s tA=1s {M040755 SZ=4096 L=2 1026:1026 B8*4096 i0:581 A 1600909749.697894 M 1600909749.697894 C 1600909749.697894}}
*03:09:09.699724 rx 553501564: LOOKUP i214 ["d22XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"] 72b req=0xc000c77680
*03:09:09.699751 tx 553501564:     OK, {i581 g1 tE=1s tA=1s {M040755 SZ=4096 L=2 1026:1026 B8*4096 i0:581 A 1600909749.697894 M 1600909749.697894 C 1600909749.697894}}
*03:09:09.699813 rx 553501578: READDIRPLUS i214 {Fh 46 [0 +4096)  L 0 NONBLOCK,DIRECTORY,0x8000}  req=0xc000c77680
*03:09:09.699902 addNewChild: i581 lookupCount=4 nl=4 addr=0xc0002f25b0
*03:09:09.699959 tx 553501578:     OK,  2408b data "\x00\x00\x00\x00\x00\x00\x00\x00"...
*03:09:09.700010 rx 553501604: FORGET i581 {Nlookup=1}  req=0xc000116000
*03:09:09.700016 -- 553501604:     done (no reply)
*03:09:09.699348 tx 553501486:     OK
*03:09:09.701691 rx 553501722: FORGET i581 {Nlookup=4}  req=0xc0003158c0
*03:09:09.701695 ino 581 lookupCount underflow: lookupCount=3, decrement=4

hanwen · 2020-09-24T12:48:39Z

why is it handing out a recycled number? Is this the loopback FS ?

rfjakob · 2020-09-24T13:32:36Z

Yes loopback on ext4, and ext4 recycles inode numbers

Survived 12 hours of fsstress testing. Fixes hanwen#372 . BUG: Leaks memory (grew to 12GB RSS during fsstress testing). Change-Id: Ibb36a886f15d48727daa10b9717ea88e45a6b8af

Also add a wrapper script, fsstress.collect-crashes.sh, to collect the debug output. hanwen/go-fuse#372

Using the inode numbers as the nodeid causes problems when the fs reuses inode numbers. This is the case with any overlay filesystem that is backed by ext4 like the loopback example or gocryptfs. We already had the expSleep() and re-add hack, 7090b02 fs: wait out apparent inode type change 68f7052 fs: addNewChild(): handle concurrent FORGETs to mitigate some of the problems (at the risk of deadlocking forever), but I could not find a way to work around the following case uncovered by fsstress: The kernel expects a fresh nodeid from MKDIR (see hanwen#372 for a debug log). This is now guaranteed by passing the O_EXCL to addNewChild(). However, this also means that the hard link detection must happen in addNewChild() as opposed to newInodeUnlocked() before. The expSleep and re-add hacks are no longer needed and have been dropped. This survived 24 hours (42587 iterations) of fsstress testing. Tested was the loopback example running on top of ext4 on Linux 5.8.10. Fixes hanwen#372 . Change-Id: Ibb36a886f15d48727daa10b9717ea88e45a6b8af

Using the inode numbers as the nodeid causes problems when the fs reuses inode numbers. This is the case with any overlay filesystem that is backed by ext4 like the loopback example or gocryptfs. We already had the expSleep() and re-add hack, 7090b02 fs: wait out apparent inode type change 68f7052 fs: addNewChild(): handle concurrent FORGETs to mitigate some of the problems (at the risk of deadlocking forever), but I could not find a way to work around the following case uncovered by fsstress: The kernel expects a fresh nodeid from MKDIR (see hanwen#372 for a debug log). This is now guaranteed by passing the O_EXCL to addNewChild(). However, this also means that the hard link detection must happen in addNewChild() as opposed to newInodeUnlocked() before. The expSleep and re-add hacks are no longer needed and have been dropped. This survived 24 hours (42587 iterations) of fsstress testing. Tested was the loopback example running on top of ext4 on Linux 5.8.10. Fixes hanwen#372 . v2: Rename inoMap -> stableAttrs, nodeidMap -> kernelNodeIds acc. to feedback from Han-Wen. Change-Id: Ibb36a886f15d48727daa10b9717ea88e45a6b8af

rfjakob mentioned this issue Sep 24, 2020

Add support for FIDO2 tokens rfjakob/gocryptfs#505

Closed

rfjakob added a commit to rfjakob/gocryptfs that referenced this issue Oct 3, 2020

tests: fsstress-gocryptfs.bash: add DEBUG option

66449bf

Also add a wrapper script, fsstress.collect-crashes.sh, to collect the debug output. hanwen/go-fuse#372

hanwen closed this as completed in 1d0096e Nov 4, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

v2api: fsstress crashes #372

v2api: fsstress crashes #372

rfjakob commented Sep 18, 2020 •

edited

Loading

rfjakob commented Sep 23, 2020 •

edited

Loading

rfjakob commented Sep 24, 2020

hanwen commented Sep 24, 2020

rfjakob commented Sep 24, 2020

v2api: fsstress crashes #372

v2api: fsstress crashes #372

Comments

rfjakob commented Sep 18, 2020 • edited Loading

rfjakob commented Sep 23, 2020 • edited Loading

rfjakob commented Sep 24, 2020

hanwen commented Sep 24, 2020

rfjakob commented Sep 24, 2020

rfjakob commented Sep 18, 2020 •

edited

Loading

rfjakob commented Sep 23, 2020 •

edited

Loading