zfs receive -> unable to handle kernel NULL pointer dereference #1518
Any idea what the system was doing at the time? Got a reliable way to reproduce it?
The only purpose of this system is receiving zfs streams at regular intervals.
I talked too fast :-(

    [240595.185428] ---[ end trace 38f5ebf420aad262 ]---

I will roll back 9bdf8c5 to see what happens ...
@edillmann The stack trace above following zfsdev_ioctl+0x493 looks suspiciously bogus to me. I have to wonder whether something in the zfs_ioc_recv call chain is smashing the stack. The 0x493 offset is, itself, suspiciously large compared to the size of the zfsdev_ioctl function. I'll check out all the lower function calls and see if there's any obvious stack reduction that needs to be done.
@edillmann I think the problem has to do with some conflicts related to the linux-ization introduced by 3558fd7. I'm going to investigate further.
@dweeezil thank you for tracking that ...
@edillmann It appears this problem occurs during an "online" receive: one in which the file system on the receiving side is mounted. Does your receiving system keep its received file systems mounted? The code handling this immediately follows the comment "online recv" in zfs_ioctl.c.

I still can't duplicate your problem; however, there's clearly something going wrong with the "wrap-up" code for online receives, in which the mounted file system (on the receiving side) is briefly unmounted and re-mounted in order to make it match the received file system. I think the call to zfs_resume_fs is running into a NULL pointer and that's what's causing your panic.

Possibly related: I can trigger an ASSERT rather easily in my debug-enabled module when doing the equivalent rollback operation (which similarly requires the remounting operation). I'm continuing to investigate.
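For reference, that wrap-up has roughly the following shape (a paraphrase using illumos-style names; the ZoL identifiers differ slightly after the linux-ization, so this is a sketch rather than the exact source):

    /* Paraphrase of the "online recv" wrap-up in zfs_ioc_recv():
     * if the receiving dataset is mounted, I/O to it is suspended
     * around the final DMU processing and then resumed. */
    if (getzfsvfs(tofs, &zfsvfs) == 0) {
            /* online recv */
            int end_err;

            error = zfs_suspend_fs(zfsvfs);
            /* If the suspend fails, dmu_recv_end() will likely
             * also fail and clean up after itself. */
            end_err = dmu_recv_end(&drc);
            if (error == 0)
                    error = zfs_resume_fs(zfsvfs, tofs);
            error = error ? error : end_err;
    } else {
            error = dmu_recv_end(&drc);
    }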
@dweeezil I confirm that the receiving file systems are mounted.
It seems that issue #1536 is not related to this problem, and now that I've got a small patch to deal with it, I can continue investigating this NULL dereference (which, so far, I've not been able to duplicate).
@edillmann I've tried to make sense out of this stack and as far as I can tell, the actual trace is:
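    dmu_objset_pool          (faulting RIP)
    zfs_sb_teardown
    zfs_umount
    zfs_resume_fs
    zfs_ioc_recv
    zfsdev_ioctl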
The problem occurs as part of the remounting in zfs_ioc_recv. The actual work is performed in zfs_resume_fs by code that was added in 7b3e34b. Apparently, prior to this commit, rollbacks on mounted file systems didn't work properly. Receives to mounted file systems (so-called "online recvs") likely shared a similar problem because they both rely on much of the same code.

After the stream is received during an online recv, I/O is suspended to the receiving file system while the DMU finishes dealing with the stream; then I/O is resumed. That's where your NULL dereference is occurring. For some reason, an error during the resuming phase causes it to try to unmount the file system, and the panic occurs during the unmounting. I've got no idea why this is happening to you.

The patch does, however, add a bunch of new pool logging that might be useful. You should grab the output of "zpool history -i" on the receiving side. The logging in there might give us a clue as to what happened and/or how far it got. I can also whip up a patch to log the cause of the failure that is triggering the umount operation. That would at least tell us what part is failing that presumably shouldn't be.
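To make the failing path concrete, the zfs_resume_fs error handling has roughly this shape (a paraphrase, not the exact source):

    /* Paraphrase of the zfs_resume_fs() error path: if the objset
     * cannot be re-owned after the receive, z_os is left NULL and the
     * file system is forcibly unmounted. The teardown then passes the
     * NULL z_os to dmu_objset_pool(), matching the posted oops. */
    err = dmu_objset_own(osname, DMU_OST_ZFS, B_FALSE, zsb, &zsb->z_os);
    if (err) {
            zsb->z_os = NULL;               /* re-own failed */
    } else {
            /* re-setup the SA framework, remap cached znodes, ... */
    }

    if (err) {
            /* Unguarded force-unmount of the receiving file system. */
            (void) zfs_umount(zsb->z_sb);
    }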
@dweeezil Regarding the stack above, you can usually ignore the lines which begin with ?'s. They are purely speculative and just indicate that the address for that particular symbol was found on the stack; they are likely return addresses from a call path recently walked and not yet overwritten. So the stack really looks like this:
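    dmu_objset_pool+0x6/0x30 [zfs]      (faulting RIP)
    zfs_umount+0x30/0xc0 [zfs]
    zfs_resume_fs+0xfe/0x2a0 [zfs]
    zfs_ioc_recv+0x9ee/0xce0 [zfs]
    zfsdev_ioctl+0x493/0x4c0 [zfs]
    do_vfs_ioctl+0x99/0x560
    sys_ioctl+0x91/0xb0
    system_call_fastpath+0x16/0x1b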
I also think there's a very good chance that

I suspect wrapping the above code in a

The first ASSERT() you mentioned is also somewhat troubling to me; I'd expect there to be a reference. However, I suppose that since the umount isn't originating from the VFS, we have no reason to believe the inode cache has already been pruned. Normally the VFS will take care of this before calling zfs_umount().
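For context, zfs_umount() is normally reached through the VFS superblock operations, roughly like this (paraphrased from ZoL's zpl_super.c; approximate):

    /* Paraphrased from zpl_super.c (approximate): in a normal unmount
     * the VFS has already pruned the dentry and inode caches by the
     * time ->put_super runs, so zfs_umount() sees a quiesced
     * superblock. */
    static void
    zpl_put_super(struct super_block *sb)
    {
            int error;

            error = -zfs_umount(sb);
            ASSERT3S(error, <=, 0);
    }

In the crashing case, zfs_resume_fs() calls zfs_umount() directly, so none of that VFS-side cleanup has happened.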
Hi,

For now I unmounted (and set canmount=false) on the receiving side, and the problem did not occur again.

To help analyze this problem I took a zpool history -i; the following pattern is repeating:

    2013-06-12.09:27:17 zfs rollback -r bfs/vms/lxc@zo20130612013742

Thank you for the time you are spending tracking this problem down.

Regards,
@edillmann Were you still getting the NULL dereference after rolling back 9bdf8c5? FYI, that commit is an older version of my pull request #1496 but the only difference is the commit log message and authorship (the code should be the same). |
@behlendorf I find it somewhat coincidental that if the dmu_objset_own() fails in zfs_resume_fs, it does

I noticed that in both Illumos and FreeBSD, the umount call has a protective "if" statement:
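    /* illumos zfs_vfsops.c, zfs_resume_fs() error path (approximate) */
    if (vn_vfswlock(zfsvfs->z_vfs->vfs_vnodecovered) == 0)
            (void) dounmount(zfsvfs->z_vfs, MS_FORCE, CRED());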
Do we need something similar? Maybe a comparison of zsb->z_sb->s_fs_info against zsb? |
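A minimal sketch of such a guard, assuming the s_fs_info comparison suggested above (hypothetical; not necessarily the patch that was eventually merged):

    /* Hypothetical guard: only attempt the forced unmount if the
     * dataset is genuinely mounted, i.e. the superblock exists and
     * still points back at this zfs_sb_t. */
    if (zsb->z_sb != NULL && zsb->z_sb->s_fs_info == zsb)
            (void) zfs_umount(zsb->z_sb);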
@dweeezil I'm not getting the NULL dereference any more since I set canmount to false and unmounted the receiving side filesystems. |
Yes, it absolutely does. By definition

Adding a similar check around the

@dweeezil Can you put together a patch for this, and @edillmann can you verify it solves the issue?
@edillmann Could you please try dweeezil/zfs@e5a4d3c. It's a small patch which should protect against this type of NULL dereference in the few places it can occur. Of course, you'll have to re-mount your file systems on the receiving system while testing this. It should layer on top of the 9bdf8c5 you had been using, or you can use its replacement commit, dweeezil/zfs@8592f00, which differs only in the log message.

@behlendorf If this works, should I roll it into my pull request? It's not clear to me why my #1496 patch should have caused this problem, in which case this patch ought to be turned into a separate pull request based on master.
@dweeezil This does look like a long standing issue so I think a separate pull request is the most appropriate. |
@behlendorf I just submitted this as pull request #1543 and backed it out of #1496.

@edillmann You can either grab the original patch I mentioned or dweeezil/zfs@47a432c, which is from the standalone pull request #1543.

Urgh... I hate typos in commit logs but I also hate re-writing commits. The log needs s/NULl/NULL/.
@dweeezil thanks for this work; the version is deployed and I will wait several days before closing this issue.
This was merged long ago and since we haven't heard anything I'm closing this as fixed. |
From time to time I get hit by the following BUG:
The kernel is vanilla 3.8.13, with the latest zfs commit plus the following patches:
20ad617
9bdf8c5
9e300f3
[158179.598272] ---[ end trace 0a7e959cc3d5f499 ]---
[158194.436990] BUG: unable to handle kernel NULL pointer dereference at (null)
[158194.438576] IP: [] dmu_objset_pool+0x6/0x30 [zfs]
[158194.440137] PGD 159d01067 PUD 1a23ab067 PMD 0
[158194.441630] Oops: 0000 [#5] SMP
[158194.443068] Modules linked in: xt_nat(F) iptable_filter(F) ipt_MASQUERADE(F) iptable_nat(F) nf_conntrack_ipv4(F) nf_defrag_ipv4(F) nf_nat_ipv4(F) nf_nat(F) nf_conntrack(F) ip_tables(F) x_tables(F) ip_gre(F) gre(F) bridge(F) stp(F) llc(F) microcode(F) aesni_intel(F) aes_x86_64(F) xts(F) lrw(F) gf128mul(F) ablk_helper(F) ghash_clmulni_intel(F) cryptd(F) i915(F) kvm_intel(F) kvm(F) video(F) drm_kms_helper(F) drm(F) lp(F) i2c_algo_bit(F) parport(F) psmouse(F) serio_raw(F) gpio_ich(F) lpc_ich(F) coretemp(F) zfs(PO) zunicode(PO) zavl(PO) zcommon(PO) znvpair(PO) spl(O) zlib_deflate(F) raid10(F) raid456(F) async_raid6_recov(F) async_memcpy(F) async_pq(F) async_xor(F) xor(F) async_tx(F) ahci(F) libahci(F) e1000e(F) raid6_pq(F) raid1(F) raid0(F) multipath(F) linear(F) [last unloaded: ipmi_msghandler]
[158194.454385] CPU 0
[158194.454404] Pid: 87148, comm: zfs Tainted: PF D O 3.8.13-lxc #1 /DH67BL
[158194.457678] RIP: 0010:[] [] dmu_objset_pool+0x6/0x30 [zfs]
[158194.459374] RSP: 0018:ffff8801d4fdfb80 EFLAGS: 00010202
[158194.461028] RAX: ffff88017e459710 RBX: ffff8802029d0000 RCX: 00000000000000fe
[158194.462695] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
[158194.464372] RBP: ffff8801d4fdfbd0 R08: 0000000000000000 R09: 0000000000000000
[158194.466057] R10: 0000000000000000 R11: ffffffffa00e683b R12: 0000000000000010
[158194.467733] R13: ffff880203ac2c00 R14: 0000000000000001 R15: ffff8802029d0400
[158194.469401] FS: 00007f3359da3b80(0000) GS:ffff88021f200000(0000) knlGS:0000000000000000
[158194.471078] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[158194.472755] CR2: 0000000000000000 CR3: 0000000195439000 CR4: 00000000000407f0
[158194.474450] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[158194.476128] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[158194.477792] Process zfs (pid: 87148, threadinfo ffff8801d4fde000, task ffff88017e459710)
[158194.479473] Stack:
[158194.481138] ffff8801d4fdfbd0 ffffffffa0215aed ffffffffa01b5b81 ffff8802029d0000
[158194.482868] ffff8802029d0378 ffff8802029d0000 0000000000000010 ffff880203ac2c00
[158194.484612] ffff88017e459710 ffff8802029d0400 ffff8801d4fdfbf8 ffffffffa0216120
[158194.486351] Call Trace:
[158194.488093] [] ? zfs_sb_teardown+0x4d/0x370 [zfs]
[158194.489856] [] ? dsl_dataset_own+0x71/0x80 [zfs]
[158194.491646] [] zfs_umount+0x30/0xc0 [zfs]
[158194.493448] [] zfs_resume_fs+0xfe/0x2a0 [zfs]
[158194.495247] [] zfs_ioc_recv+0x9ee/0xce0 [zfs]
[158194.497044] [] ? dsl_prop_get_ds+0x16e/0x230 [zfs]
[158194.498819] [] ? __kmalloc+0x55/0x170
[158194.500598] [] ? kmem_alloc_debug+0x198/0x3b0 [spl]
[158194.502389] [] ? kmem_alloc_debug+0x198/0x3b0 [spl]
[158194.504168] [] ? kmem_alloc_debug+0x198/0x3b0 [spl]
[158194.505942] [] ? dsl_dataset_drop_ref+0x12/0x20 [zfs]
[158194.507678] [] ? dsl_dataset_rele+0x42/0x120 [zfs]
[158194.509347] [] zfsdev_ioctl+0x493/0x4c0 [zfs]
[158194.510968] [] do_vfs_ioctl+0x99/0x560
[158194.512567] [] sys_ioctl+0x91/0xb0
[158194.514166] [] ? sys_write+0x52/0xa0
[158194.515794] [] system_call_fastpath+0x16/0x1b
[158194.517376] Code: f8 04 00 00 48 c7 c6 80 9c 23 a0 48 c7 c7 58 1f 24 a0 e8 ee 5c f4 ff e9 5c ff ff ff 66 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 <48> 8b 07 48 89 e5 48 85 c0 74 0f 48 8b 00 48 85 c0 74 07 48 8b
[158194.520965] RIP [] dmu_objset_pool+0x6/0x30 [zfs]
[158194.522694] RSP
[158194.524346] CR2: 0000000000000000