Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invalid memory address or nil pointer dereference, in 1.4.3 on Debian testing #260

Closed
lechner opened this issue Aug 28, 2018 · 11 comments
Closed

Comments

@lechner
Copy link
Contributor

lechner commented Aug 28, 2018

Hi,

Gocryptfs gets a real workout here on kerberized NFSv4. It recently panicked for the first time. To be sure, the latest Linux kernel for Debian testing (4.17.0-3-amd64 #1 SMP Debian 4.17.17-1 (2018-08-18) x86_64 GNU/Linux) introduces other serious problems related to networking and NFS, but perhaps this unexpected error should be caught. I am still on version 1.4.3. Thank you!

Kind regards,
Felix

Aug 28 02:50:02 lechner-desktop gocryptfs[2381]: checkAndPadHole: Fstat failed: stat /acct/lechner/home.wallace.gocryptfs/q_9gL4t52f0DyjAvu8kECQ==/5wrvV6x4cDaun6HaytQw5A==/7EwZI0Zth4qXXaU5aXaoNs8NI9tDr1Hx5WSWhZn03-E=/6WneSjhUaLdlTtXuNlcz3Q==/pnruj-DrlwNH5StDw_Y2Yw==/hCO4faTzPtCbUUTOHFLvhwb15IJAdSweA7wMp7Y0Vfs=/pckyYw3XQipQTuFjZshrxg==/VrZrnj3rfRJbQMPOwKi7Jz44yRhh_tglZGSm9WGShn_CkG_96EzpDfMWD-0Xp6Qb: permission denied
Aug 28 02:50:02 lechner-desktop gocryptfs[2381]: ino3674214 fh7999: statPlainSize: stat /acct/lechner/home.wallace.gocryptfs/q_9gL4t52f0DyjAvu8kECQ==/5wrvV6x4cDaun6HaytQw5A==/7EwZI0Zth4qXXaU5aXaoNs8NI9tDr1Hx5WSWhZn03-E=/JmviWj0hSQ7xFNM1x-S9QA==: permission denied
Aug 28 02:50:02 lechner-desktop gocryptfs[2381]: checkAndPadHole: Fstat failed: stat /acct/lechner/home.wallace.gocryptfs/q_9gL4t52f0DyjAvu8kECQ==/5wrvV6x4cDaun6HaytQw5A==/7EwZI0Zth4qXXaU5aXaoNs8NI9tDr1Hx5WSWhZn03-E=/6WneSjhUaLdlTtXuNlcz3Q==/_18jMQTdeD-atPYOrn_0IQ==/Onqxf2fWTmcVJ7Rm7gjnDQ==/kYiq5tok_C6EKcnvU7iK2Q==/y1rQzR7cVg7X5VGhW49GDtyjvngj0pyWbLUKrarSvJpNYc0vI9YJIskgwViEw1ZU: permission denied
Aug 28 02:50:08 lechner-desktop gocryptfs[2381]: checkAndPadHole: Fstat failed: stat /acct/lechner/home.wallace.gocryptfs/q_9gL4t52f0DyjAvu8kECQ==/5wrvV6x4cDaun6HaytQw5A==/7EwZI0Zth4qXXaU5aXaoNs8NI9tDr1Hx5WSWhZn03-E=/nNQo_3GpS7FEMf17Z-ver4uDuVpgc-6bAzmo82w3rDI=: permission denied
Aug 28 02:50:08 lechner-desktop gocryptfs[2381]: checkAndPadHole: Fstat failed: stat /acct/lechner/home.wallace.gocryptfs/q_9gL4t52f0DyjAvu8kECQ==/5wrvV6x4cDaun6HaytQw5A==/7EwZI0Zth4qXXaU5aXaoNs8NI9tDr1Hx5WSWhZn03-E=/0zEgZ9S7OdyrKFJzPGi7IREFjtKv5dsxCsp7QQR2wc4=: permission denied
Aug 28 02:50:13 lechner-desktop gocryptfs-2381-logger: panic: runtime error: invalid memory address or nil pointer dereference
Aug 28 02:50:13 lechner-desktop gocryptfs-2381-logger: [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x5ac73b]
Aug 28 02:50:13 lechner-desktop gocryptfs-2381-logger: 
Aug 28 02:50:13 lechner-desktop gocryptfs-2381-logger: goroutine 258741 [running]:
Aug 28 02:50:13 lechner-desktop gocryptfs-2381-logger: github.com/hanwen/go-fuse/fuse/pathfs.(*pathInode).GetAttr(0xc424222cc0, 0xc4233fa100, 0x6932c0, 0xc4245c5480, 0xc4233fa198, 0x0)
Aug 28 02:50:13 lechner-desktop gocryptfs-2381-logger: #011/build/gocryptfs-hTLzpi/gocryptfs-1.4.3/obj-x86_64-linux-gnu/src/github.com/hanwen/go-fuse/fuse/pathfs/pathfs.go:608 +0x9b
Aug 28 02:50:13 lechner-desktop gocryptfs-2381-logger: github.com/hanwen/go-fuse/fuse/nodefs.(*rawBridge).GetAttr(0xc4200aa3f0, 0xc4233fa180, 0xc4233fa0f0, 0x523a1e)
Aug 28 02:50:13 lechner-desktop gocryptfs-2381-logger: #011/build/gocryptfs-hTLzpi/gocryptfs-1.4.3/obj-x86_64-linux-gnu/src/github.com/hanwen/go-fuse/fuse/nodefs/fsops.go:142 +0x98
Aug 28 02:50:13 lechner-desktop gocryptfs-2381-logger: github.com/hanwen/go-fuse/fuse.doGetAttr(0xc4200ee000, 0xc4233fa000)
Aug 28 02:50:13 lechner-desktop gocryptfs-2381-logger: #011/build/gocryptfs-hTLzpi/gocryptfs-1.4.3/obj-x86_64-linux-gnu/src/github.com/hanwen/go-fuse/fuse/opcode.go:242 +0x61
Aug 28 02:50:13 lechner-desktop gocryptfs-2381-logger: github.com/hanwen/go-fuse/fuse.(*Server).handleRequest(0xc4200ee000, 0xc4233fa000, 0xc4233fa000)
Aug 28 02:50:13 lechner-desktop gocryptfs-2381-logger: #011/build/gocryptfs-hTLzpi/gocryptfs-1.4.3/obj-x86_64-linux-gnu/src/github.com/hanwen/go-fuse/fuse/server.go:404 +0x279
Aug 28 02:50:13 lechner-desktop gocryptfs-2381-logger: github.com/hanwen/go-fuse/fuse.(*Server).loop(0xc4200ee000, 0xc422637d01)
Aug 28 02:50:13 lechner-desktop gocryptfs-2381-logger: #011/build/gocryptfs-hTLzpi/gocryptfs-1.4.3/obj-x86_64-linux-gnu/src/github.com/hanwen/go-fuse/fuse/server.go:376 +0x158
Aug 28 02:50:13 lechner-desktop gocryptfs-2381-logger: created by github.com/hanwen/go-fuse/fuse.(*Server).readRequest
Aug 28 02:50:13 lechner-desktop gocryptfs-2381-logger: #011/build/gocryptfs-hTLzpi/gocryptfs-1.4.3/obj-x86_64-linux-gnu/src/github.com/hanwen/go-fuse/fuse/server.go:284 +0x2ae
@rfjakob
Copy link
Owner

rfjakob commented Aug 28, 2018

Wow. These "Fstat failed" messages mean that we have an open file descriptor to the file, but could still not read the attributes of the file. But in any case, gocryptfs should not crash!

The crash is inside the go-fuse library, do you run Debian testing?
Debian version 0.0_git20171124.0.14c3015-2 looks like it matches the backtrace:
pathfs.go:608
fsops.go:142

@lechner
Copy link
Contributor Author

lechner commented Aug 28, 2018

Yes, we recently uploaded a new go-fuse but I believe my gocryptfs executable was built with 0.0_git20171124.0.14c3015-4. That would be a different Debian revision than you mentioned, but according to the change log they are identical (except for some contact information).

For what is worth, I think NFS refused to authenticate for a moment, possibly after a Kerberos ticket renewal. Then Gocryptfs crashed. Some type of blocking would probably be preferable. Thank you!

@rfjakob
Copy link
Owner

rfjakob commented Aug 28, 2018

I'll see if I can spot the problem next week or the week after - i guess the chance of getting this to reproduce is slim?

@lechner
Copy link
Contributor Author

lechner commented Aug 28, 2018

I have only had the kernel for two days, so we may get lucky. (I can also expire the ticket manually.) What are we looking for? Thank you!

@lechner
Copy link
Contributor Author

lechner commented Aug 29, 2018

It happened again. Here is the log, this time from syslog:

Aug 28 16:12:03 lechner-desktop kernel: [35766.879463] gocryptfs       D    0  2740      1 0x00000000
Aug 28 16:12:03 lechner-desktop kernel: [35766.879703] INFO: task gocryptfs:2793 blocked for more than 120 seconds.
Aug 28 16:12:03 lechner-desktop kernel: [35766.879714] gocryptfs       D    0  2793      1 0x00000000
Aug 28 16:13:22 lechner-desktop gocryptfs-1857-logger: panic: runtime error: invalid memory address or nil pointer dereference
Aug 28 16:13:22 lechner-desktop gocryptfs-1857-logger: [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x5ac73b]
Aug 28 16:13:22 lechner-desktop gocryptfs-1857-logger: 
Aug 28 16:13:22 lechner-desktop gocryptfs-1857-logger: goroutine 117934 [running]:
Aug 28 16:13:22 lechner-desktop gocryptfs-1857-logger: github.com/hanwen/go-fuse/fuse/pathfs.(*pathInode).GetAttr(0xc4202956b0, 0xc424488340, 0x6932c0, 0xc421490d80, 0xc4244883d8, 0x0)
Aug 28 16:13:22 lechner-desktop gocryptfs-1857-logger: #011/build/gocryptfs-hTLzpi/gocryptfs-1.4.3/obj-x86_64-linux-gnu/src/github.com/hanwen/go-fuse/fuse/pathfs/pathfs.go:608 +0x9b
Aug 28 16:13:22 lechner-desktop gocryptfs-1857-logger: github.com/hanwen/go-fuse/fuse/nodefs.(*rawBridge).GetAttr(0xc4200f03c0, 0xc4244883c0, 0xc424488330, 0x523a1e)
Aug 28 16:13:22 lechner-desktop gocryptfs-1857-logger: #011/build/gocryptfs-hTLzpi/gocryptfs-1.4.3/obj-x86_64-linux-gnu/src/github.com/hanwen/go-fuse/fuse/nodefs/fsops.go:142 +0x98
Aug 28 16:13:22 lechner-desktop gocryptfs-1857-logger: github.com/hanwen/go-fuse/fuse.doGetAttr(0xc4200dc000, 0xc424488240)
Aug 28 16:13:22 lechner-desktop gocryptfs-1857-logger: #011/build/gocryptfs-hTLzpi/gocryptfs-1.4.3/obj-x86_64-linux-gnu/src/github.com/hanwen/go-fuse/fuse/opcode.go:242 +0x61
Aug 28 16:13:22 lechner-desktop gocryptfs-1857-logger: github.com/hanwen/go-fuse/fuse.(*Server).handleRequest(0xc4200dc000, 0xc424488240, 0xc424488240)
Aug 28 16:13:22 lechner-desktop gocryptfs-1857-logger: #011/build/gocryptfs-hTLzpi/gocryptfs-1.4.3/obj-x86_64-linux-gnu/src/github.com/hanwen/go-fuse/fuse/server.go:404 +0x279
Aug 28 16:13:22 lechner-desktop gocryptfs-1857-logger: github.com/hanwen/go-fuse/fuse.(*Server).loop(0xc4200dc000, 0xc42144f001)
Aug 28 16:13:22 lechner-desktop gocryptfs-1857-logger: #011/build/gocryptfs-hTLzpi/gocryptfs-1.4.3/obj-x86_64-linux-gnu/src/github.com/hanwen/go-fuse/fuse/server.go:376 +0x158
Aug 28 16:13:22 lechner-desktop gocryptfs-1857-logger: created by github.com/hanwen/go-fuse/fuse.(*Server).readRequest
Aug 28 16:13:22 lechner-desktop gocryptfs-1857-logger: #011/build/gocryptfs-hTLzpi/gocryptfs-1.4.3/obj-x86_64-linux-gnu/src/github.com/hanwen/go-fuse/fuse/server.go:284 +0x2ae

Here is the corresponding /var/log/messages, which looks mostly identical:

Aug 28 16:12:03 lechner-desktop kernel: [35766.879463] gocryptfs       D    0  2740      1 0x00000000
Aug 28 16:12:03 lechner-desktop kernel: [35766.879714] gocryptfs       D    0  2793      1 0x00000000
Aug 28 16:13:22 lechner-desktop gocryptfs-1857-logger: panic: runtime error: invalid memory address or nil pointer dereference
Aug 28 16:13:22 lechner-desktop gocryptfs-1857-logger: [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x5ac73b]
Aug 28 16:13:22 lechner-desktop gocryptfs-1857-logger: 
Aug 28 16:13:22 lechner-desktop gocryptfs-1857-logger: goroutine 117934 [running]:
Aug 28 16:13:22 lechner-desktop gocryptfs-1857-logger: github.com/hanwen/go-fuse/fuse/pathfs.(*pathInode).GetAttr(0xc4202956b0, 0xc424488340, 0x6932c0, 0xc421490d80, 0xc4244883d8, 0x0)
Aug 28 16:13:22 lechner-desktop gocryptfs-1857-logger: #011/build/gocryptfs-hTLzpi/gocryptfs-1.4.3/obj-x86_64-linux-gnu/src/github.com/hanwen/go-fuse/fuse/pathfs/pathfs.go:608 +0x9b
Aug 28 16:13:22 lechner-desktop gocryptfs-1857-logger: github.com/hanwen/go-fuse/fuse/nodefs.(*rawBridge).GetAttr(0xc4200f03c0, 0xc4244883c0, 0xc424488330, 0x523a1e)
Aug 28 16:13:22 lechner-desktop gocryptfs-1857-logger: #011/build/gocryptfs-hTLzpi/gocryptfs-1.4.3/obj-x86_64-linux-gnu/src/github.com/hanwen/go-fuse/fuse/nodefs/fsops.go:142 +0x98
Aug 28 16:13:22 lechner-desktop gocryptfs-1857-logger: github.com/hanwen/go-fuse/fuse.doGetAttr(0xc4200dc000, 0xc424488240)
Aug 28 16:13:22 lechner-desktop gocryptfs-1857-logger: #011/build/gocryptfs-hTLzpi/gocryptfs-1.4.3/obj-x86_64-linux-gnu/src/github.com/hanwen/go-fuse/fuse/opcode.go:242 +0x61
Aug 28 16:13:22 lechner-desktop gocryptfs-1857-logger: github.com/hanwen/go-fuse/fuse.(*Server).handleRequest(0xc4200dc000, 0xc424488240, 0xc424488240)
Aug 28 16:13:22 lechner-desktop gocryptfs-1857-logger: #011/build/gocryptfs-hTLzpi/gocryptfs-1.4.3/obj-x86_64-linux-gnu/src/github.com/hanwen/go-fuse/fuse/server.go:404 +0x279
Aug 28 16:13:22 lechner-desktop gocryptfs-1857-logger: github.com/hanwen/go-fuse/fuse.(*Server).loop(0xc4200dc000, 0xc42144f001)
Aug 28 16:13:22 lechner-desktop gocryptfs-1857-logger: #011/build/gocryptfs-hTLzpi/gocryptfs-1.4.3/obj-x86_64-linux-gnu/src/github.com/hanwen/go-fuse/fuse/server.go:376 +0x158
Aug 28 16:13:22 lechner-desktop gocryptfs-1857-logger: created by github.com/hanwen/go-fuse/fuse.(*Server).readRequest
Aug 28 16:13:22 lechner-desktop gocryptfs-1857-logger: #011/build/gocryptfs-hTLzpi/gocryptfs-1.4.3/obj-x86_64-linux-gnu/src/github.com/hanwen/go-fuse/fuse/server.go:284 +0x2ae

@rfjakob
Copy link
Owner

rfjakob commented Aug 29, 2018

Unfortunately I'm abroad without my pc right now and can't really do much for another 10 days or so. Maybe try v1.6 and see if the same thing happened? You could just download the release and replace your gocryptfs binary.

@lechner
Copy link
Contributor Author

lechner commented Aug 29, 2018

Please do not worry. I am using an older kernel. Have a good trip! Thank you!

@rfjakob
Copy link
Owner

rfjakob commented Aug 29, 2018

Thanks :)

rfjakob added a commit to rfjakob/go-fuse that referenced this issue Sep 8, 2018
gocryptfs user Felix Lechner reported a nil pointer dereference
in GetAttr: rfjakob/gocryptfs#260

The crash is in line

	n.setClientInode(fi.Ino)

because fi is nil.

This can happen when file.GetAttr() returns an error code other than
ENOSYS and EBADF. For gocryptfs, this can only happen when an open
file descriptor breaks. In this case it was triggered by a failing
NFS volume.

Fix the crash by erroring out for error codes that are not handled
by the retry-by-path logic.
@rfjakob
Copy link
Owner

rfjakob commented Sep 8, 2018

I think I found it. Can you test rfjakob/go-fuse@b11e293 ? I have attached a static gocryptfs v1.6 build that includes the patch, but it looks like it should also apply cleanly to the go-fuse version in Debian.

gocryptfs.zip (why zip? github does not support tar.gz/xz attachments!)

sha256sum gocryptfs
d3c343334330571822304b9ec9a3a278cadbb833ecbd95a3677d33ab36699e68  gocryptfs

@lechner
Copy link
Contributor Author

lechner commented Oct 2, 2018

Thank you for sending the patched version. I cannot conclusively test the patch, as NFSv4 in recent Debian kernels no longer withdraws access rights when a Kerberos ticket is deleted, but your patched version has been running flawlessly. Most notably, there have been no more crashes, compared to one or two a day before. I recommend the adoption of this fix upstream. Thank you for your excellent customer support!

@rfjakob
Copy link
Owner

rfjakob commented Oct 2, 2018

Great, thanks for testing! Maybe you can comment "tested and fixed the issue for me" at hanwen/go-fuse#232 ?

@lechner lechner closed this as completed Oct 2, 2018
hanwen pushed a commit to hanwen/go-fuse that referenced this issue Oct 8, 2018
gocryptfs user Felix Lechner reported a nil pointer dereference
in GetAttr: rfjakob/gocryptfs#260

The crash is in line

	n.setClientInode(fi.Ino)

because fi is nil.

This can happen when file.GetAttr() returns an error code other than
ENOSYS and EBADF. For gocryptfs, this can only happen when an open
file descriptor breaks. In this case it was triggered by a failing
NFS volume.

Fix the crash by erroring out for error codes that are not handled
by the retry-by-path logic.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants