.zfs/snapshot automount broken #3030

Closed
edillmann opened this issue Jan 22, 2015 · 32 comments
Labels
Type: Building Indicates an issue related to building binaries
Milestone

Comments

@edillmann
Contributor

Here is my setup :
kernel-3.18.3 vanilla
zfs master@b0cf0676c0beb5dcb149774a3264580a18304ac1
spl master@54cccfc2e30fa84463c056e8ad04b2be9448999e

When I try to access a filesystem snapshot, it does not work and I get a kernel message like:

ZFS: Unable to automount rpool/backup/test@D16 at /rpool/backup/test/.zfs/snapshot/D16: 512

@Perseid

Perseid commented Jan 22, 2015

I've got the same issue, logging the same kernel message. I'm running a current Arch this time with Kernel 3.18.2-2 and - according to the zfs and spl package info - zfs master@d958324f and spl master@03a78353.

I would guess it is not related to #2841 (comment) as I cannot access any snapshot. I would also guess it is not directly related to #1768 (comment) as it also fails on the first access of the directory directly after boot.

The symptom looks like this from userspace:

$ ls /$anyZfsFsPath/.zfs/snapshot/$anyValidSnapshotName/ -la
ls: cannot access /$anyZfsFsPath/.zfs/snapshot/$anyValidSnapshotName/.: Input/output error
ls: cannot access /$anyZfsFsPath/.zfs/snapshot/$anyValidSnapshotName/..: Input/output error
total 0
d????????? ? ? ? ?            ? .
d????????? ? ? ? ?            ? ..
$ ls /$anyZfsFsPath/.zfs/snapshot/$anyValidSnapshotName/ -la
dr-xr-xr-x 1 root root 0 Jan 22 18:43 /$anyZfsFsPath/.zfs/snapshot/$anyValidSnapshotName/

@edillmann
Contributor Author

In this case it seems to be a duplicate of #2841, as the snapshots did not disappear (I can clone or send them); it just seems to be a mount problem.

@mgmartin

Same issue here. I confirmed the latest spl/zfs code works fine on a 3.14 kernel but produces the "ZFS: Unable to automount" error on 3.18 kernels. The error comes from the call_usermodehelper in zfs_ctldir.c ~line 850.

A manual mount.zfs works fine for me when mounting any snapshot; only the automount under .zfs/snapshot breaks.
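
For anyone puzzling over the trailing 512 in the kernel message, here is a minimal Python sketch. It assumes the number is the raw wait status that call_usermodehelper hands back for the mount helper; the thread does not confirm this. Under that reading the helper's exit code sits in the high byte, so 512 would mean the helper exited with code 2. The function name is made up for illustration.

```python
def helper_exit_code(wait_status: int) -> int:
    """Extract the exit code from a Linux-style wait status.

    Assumption: the value printed by ZFS is a raw wait status, where the
    low 7 bits hold a terminating signal (0 if the process exited normally)
    and bits 8-15 hold the exit code.
    """
    if wait_status & 0x7F != 0:
        raise ValueError("process was killed by a signal, no exit code")
    return (wait_status >> 8) & 0xFF


print(helper_exit_code(512))
```

In the mount(8) exit code convention, 2 denotes a system error, though whether mount.zfs follows that convention in this failure is not established here.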

@behlendorf behlendorf added this to the 0.6.4 milestone Feb 5, 2015
@behlendorf behlendorf added the Type: Building Indicates an issue related to building binaries label Feb 5, 2015
@behlendorf
Contributor

@mgmartin Thanks for the added detail. It sounds like this is an issue specific to 3.18, which will need to be investigated.

@edillmann
Contributor Author

I gave 3.19.0-rc7 a try and this issue disappeared.

@behlendorf behlendorf modified the milestones: 0.6.5, 0.6.4 Feb 5, 2015
@Mic92
Contributor

Mic92 commented Feb 7, 2015

Happens with 3.19.0-rc7 too.

@Ringdingcoder

I get the same with 3.18.6, while it works with 3.17.8. So definitely 3.18. (zfs/spl 0.6.3-1.2)

@behlendorf
Contributor

I was able to reproduce this as well on 3.18 but I haven't had a chance to investigate. It would be great if someone has the time to dig in to why the mounts are failing.

@janlam7

janlam7 commented Mar 31, 2015

Doing a git bisect led me to kernel commit bafc9b754f752ea798c39f9b099a228fd56604e0. I hope it helps.

Full bisect log is:

Bisecting: a merge base must be tested
[bfe01a5ba2490f299e1d2d5508cbbbadd897bbe9] Linux 3.17
Bisecting: 6249 revisions left to test after this (roughly 13 steps)
[4d9708ea5e5a45973df7cf965805fdfb185dd5bf] Merge tag 'media/v3.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Bisecting: 3131 revisions left to test after this (roughly 12 steps)
[88ed806abb981cc8ec61ee7fab93ecfe63521ebf] Merge tag 'md/3.18' of git://neil.brown.name/md
Bisecting: 1538 revisions left to test after this (roughly 11 steps)
[faafcba3b5e15999cf75d5c5a513ac8e47e2545f] Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Bisecting: 770 revisions left to test after this (roughly 10 steps)
[fd9879b9bb3258ebc27a4cc6d2d29f528f71901f] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux
Bisecting: 417 revisions left to test after this (roughly 9 steps)
[5ff0b9e1a1da58b584aa4b8ea234be20b5a1164b] Merge tag 'xfs-for-linus-3.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs
Bisecting: 200 revisions left to test after this (roughly 8 steps)
[d0ca47575ab3b41bb7f0fe5feec13c6cddb2913a] Merge branch 'parisc-3.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Bisecting: 124 revisions left to test after this (roughly 7 steps)
[5e40d331bd72447197f26525f21711c4a265b6a6] Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Bisecting: 61 revisions left to test after this (roughly 6 steps)
[6889e783cd68b79f8330ad4d10a2571c67c3f7df] Merge branch 'xfs-misc-fixes-for-3.18-3' into for-next
Bisecting: 30 revisions left to test after this (roughly 5 steps)
[2ec3a12a667847d303d4d0c0576d5ff388052b48] cachefiles_write_page(): switch to __kernel_write()
Bisecting: 15 revisions left to test after this (roughly 4 steps)
[c143c2333c48f1430231b31a8c17e074b9b504eb] vfs: Remove d_drop calls from d_revalidate implementations
Bisecting: 7 revisions left to test after this (roughly 3 steps)
[7af1364ffa64db61e386628594836e13d2ef04b5] vfs: Don't allow overwriting mounts in the current mount namespace
Bisecting: 3 revisions left to test after this (roughly 2 steps)
[b3ca406f2755c20cea1cc1169672c56dd03c266c] autofs - remove obsolete d_invalidate() from expire
Bisecting: 1 revision left to test after this (roughly 1 step)
[3ccb354d641d910309b916b9c856e2a82ced7237] vfs: Document the effect of d_revalidate on d_find_alias
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[bafc9b754f752ea798c39f9b099a228fd56604e0] vfs: More precise tests in d_invalidate
bafc9b754f752ea798c39f9b099a228fd56604e0 is the first bad commit
commit bafc9b754f752ea798c39f9b099a228fd56604e0
Author: Eric W. Biederman <ebiederm@xmission.com>
Date: Thu Feb 13 07:54:28 2014 -0800

vfs: More precise tests in d_invalidate

The current comments in d_invalidate about what and why it is doing
what it is doing are wildly off-base.  Which is not surprising as
the comments date back to last minute bug fix of the 2.2 kernel.

The big fat lie of a comment said: If it's a directory, we can't drop
it for fear of somebody re-populating it with children (even though
dropping it would make it unreachable from that root, we still might
repopulate it if it was a working directory or similar).

[AV] What we really need to avoid is multiple dentry aliases of the
same directory inode; on all filesystems that have ->d_revalidate()
we either declare all positive dentries always valid (and thus never
fed to d_invalidate()) or use d_materialise_unique() and/or d_splice_alias(),
which take care of alias prevention.

The current rules are:
- To prevent mount point leaks dentries that are mount points or that
  have childrent that are mount points may not be be unhashed.
- All dentries may be unhashed.
- Directories may be rehashed with d_materialise_unique

check_submounts_and_drop implements this already for well maintained
remote filesystems so implement the current rules in d_invalidate
by just calling check_submounts_and_drop.

The one difference between d_invalidate and check_submounts_and_drop
is that d_invalidate must respect it when a d_revalidate method has
earlier called d_drop so preserve the d_unhashed check in
d_invalidate.

Reviewed-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

:040000 040000 8cc719dd130833db305075c8c6a717db54796dbe 07a430172c871cd7de5dc5f31cd3bb23baed30a0 M fs
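
As a side note on reading the log above: the "roughly N steps" figures follow from repeated halving. The Python sketch below parses a few of the progress lines and compares git's estimate with the number of halvings needed to bring the remaining count to zero. It matches every line in this particular log; git computes its estimate differently, so this is an approximation and not git's algorithm. The names `BISECT_LINE` and `parse_bisect_progress` are illustrative.

```python
import re

# Matches lines such as:
#   Bisecting: 6249 revisions left to test after this (roughly 13 steps)
BISECT_LINE = re.compile(
    r"Bisecting: (\d+) revisions? left to test after this \(roughly (\d+) steps?\)"
)


def parse_bisect_progress(log: str):
    """Return (revisions_left, estimated_steps) pairs from git bisect output."""
    return [(int(left), int(steps)) for left, steps in BISECT_LINE.findall(log)]


# A few lines copied from the bisect log in this thread.
SAMPLE = """\
Bisecting: 6249 revisions left to test after this (roughly 13 steps)
Bisecting: 770 revisions left to test after this (roughly 10 steps)
Bisecting: 1 revision left to test after this (roughly 1 step)
Bisecting: 0 revisions left to test after this (roughly 0 steps)
"""

for left, steps in parse_bisect_progress(SAMPLE):
    # int.bit_length() counts how many integer halvings reach zero.
    print(left, steps, left.bit_length())
```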

@Bronek

Bronek commented Apr 10, 2015

I'm seeing the same problem, running kernel 3.18.11. Will report back after reverting https://lkml.org/lkml/2014/2/25/117

Duh, this commit cannot be reverted; there have been too many changes in d_invalidate since. For one, its return type is now void. I guess this means we need to look for a bug in ZoL.

@Ringdingcoder

FWIW, I applied the reverse patch of 3ccb354d641..c143c2333c4 from mainline (this contains the changeset which @janlam7 found with his bisection) to a Fedora 3.18.7 kernel, and it started working again.

EDIT: Unfortunately, this patch does not apply cleanly on 3.19 anymore.

@zrav

zrav commented Apr 18, 2015

Between 3.18 and >=3.19 the problem behaves differently, too:

  • On 3.18 the snapdirs are unable to be automounted
  • On 3.19+ they are automounted, but when you cd into a directory contained in a snapdir you get an unreachable path (no dmesg error). This causes a lot of problems for file tools.
Linux ubuntu1404 3.19.0-031900-generic #201504091832 SMP Thu Apr 9 17:35:46 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
root@ubuntu1404:~# cd /tank/data/.zfs/snapshot/mySnap/
root@ubuntu1404:/tank/data/.zfs/snapshot/mySnap# cd dir1/
root@ubuntu1404:(unreachable)/dir1#

@Bronek

Bronek commented Apr 18, 2015

@Ringdingcoder thanks for the hint, I applied c143c2333c4..3ccb354d641 on top of vanilla 3.18.11 and it works again. It's only a workaround (unless upstream does the same - doubtful) but it works.

behlendorf added a commit to behlendorf/zfs that referenced this issue Apr 24, 2015
Commit torvalds/linux@bafc9b7 caused snapshots automounted by ZFS to
be immediately unmounted when the dentry was revalidated.  This was
a consequence of ZFS invaliding all snapdir dentries to ensure that
negative dentries didn't mask new snapshots.  This patch modifies the
behavior such that only negative dentries are invalidated.  This solves
the issue and may result in a performance improvement.

In addition, the MNT_SHRINKABLE flag is now set after the automount
succeeds by travesing down in to the snapshot.  This is much cleaner
than being forced to do it post in a getattr.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue openzfs#3030
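
To make the commit message above easier to follow, here is a toy Python model of the two policies. It is only a sketch under loud assumptions: the classes `Dentry` and `SnapdirCache` are hypothetical, a "negative dentry" is modeled as an entry with no inode, and invalidating an entry is assumed to drop any mount on it (the behavior the message attributes to kernels with torvalds/linux@bafc9b7). None of this is the actual ZFS or VFS code.

```python
from dataclasses import dataclass


@dataclass
class Dentry:
    name: str
    inode: object = None   # None models a negative dentry (name not found)
    mounted: bool = False  # True once a snapshot is automounted on it


class SnapdirCache:
    """Toy name cache for .zfs/snapshot (hypothetical, for illustration)."""

    def __init__(self):
        self.entries = {}

    def lookup(self, name, snapshots):
        entry = self.entries.get(name)
        if entry is None:
            # Cache a positive entry if the snapshot exists, else a negative one.
            entry = Dentry(name, inode=name if name in snapshots else None)
            self.entries[name] = entry
        return entry

    def revalidate(self, entry, only_negative):
        """Return True to keep the entry, False if it was invalidated."""
        if only_negative and entry.inode is not None:
            return True  # patched policy: positive dentries stay valid
        # Invalidation: in this model the mount on the entry goes with it.
        entry.mounted = False
        del self.entries[entry.name]
        return False


cache = SnapdirCache()
snaps = {"D16"}

old = cache.lookup("D16", snaps)
old.mounted = True
cache.revalidate(old, only_negative=False)   # old policy drops the mount

new = cache.lookup("D16", snaps)
new.mounted = True
cache.revalidate(new, only_negative=True)    # patched policy keeps it
print(old.mounted, new.mounted)
```

Negative entries are still dropped under both policies, which is what lets a snapshot created later become visible on the next lookup.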
@behlendorf
Contributor

@janlam7 thanks for bisecting this! Narrowing it down to the specific kernel commit helped tremendously. I've opened pull request #3344 to address this issue for 3.18 and newer kernels. I've verified it resolves the issue for both older and newer kernels, but additional testing is always welcome.

@mgmartin

Thanks Brian! The mounts are working now in a 3.18 kernel I'm testing. However, I can cause a NULL pointer dereference and subsequent kernel panic by:

  1. typing "mount" while in the .../.zfs/snapshot
  2. trying to ls a non-existent snapshot folder from any other folder (e.g ls /test/.zfs/snapshot/non-existant )

Sorry, no debug line numbers; I'm still trying to figure out how to get more debug info. Hopefully it's easy enough for you to reproduce.

[  155.476904] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
[  155.477010] IP: [<ffffffffa0481c21>] zfsctl_mount_snapshot+0x51/0x450 [zfs]
[  155.477010] PGD b8462067 PUD b841d067 PMD 0 
[  155.477010] Oops: 0000 [#1] SMP 
[  155.477010] Modules linked in: loop netconsole joydev mousedev hid_generic usbhid hid mac_hid cfg80211 rfkill crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel processor fbcon evdev aes_x86_64 psmouse lrw gf128mul serio_raw ppdev glue_helper pvpanic bitblit button ablk_helper cryptd microcode parport_pc parport pcspkr cirrus syscopyarea fbcon_rotate sysfillrect fbcon_ccw fbcon_ud fbcon_cw softcursor tileblit sysimgblt ttm drm_kms_helper drm i2c_piix4 fb fbdev intel_agp intel_gtt i2c_core sch_fq_codel zfs(PO) zunicode(PO) zcommon(PO) znvpair(PO) spl(O) zavl(PO) xfs libcrc32c sd_mod virtio_balloon sr_mod cdrom virtio_scsi virtio_net ata_generic pata_acpi atkbd ata_piix libps2 libata ehci_pci scsi_mod uhci_hcd crc32c_intel ehci_hcd floppy usbcore virtio_pci virtio_ring usb_common virtio i8042 serio
[  155.477010] CPU: 2 PID: 600 Comm: mount Tainted: P        W  O   3.18.10-mgm #2
[  155.477010] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140617_173321-var-lib-archbuild-testing-x86_64-tobias 04/01/2014
[  155.477010] task: ffff8800379c1a40 ti: ffff8800bb8fc000 task.ti: ffff8800bb8fc000
[  155.477010] RIP: 0010:[<ffffffffa0481c21>]  [<ffffffffa0481c21>] zfsctl_mount_snapshot+0x51/0x450 [zfs]
[  155.477010] RSP: 0018:ffff8800bb8ffba8  EFLAGS: 00010246
[  155.477010] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff8800379c1a40
[  155.477010] RDX: ffff88013946a2a0 RSI: ffffffffa04c5fa0 RDI: ffff8800bb8ffd50
[  155.477010] RBP: ffff8800bb4aef00 R08: 0000000000000000 R09: ffff88013fc9a820
[  155.477010] R10: ffffffffa03678df R11: ffffea0002d71500 R12: 0000000000000000
[  155.477010] R13: 0000000000000000 R14: ffff8800bb8ffd50 R15: ffff8800bb4aef00
[  155.477010] FS:  00007fda2a4e1780(0000) GS:ffff88013fc80000(0000) knlGS:0000000000000000
[  155.477010] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  155.477010] CR2: 0000000000000028 CR3: 00000000b844a000 CR4: 00000000001006e0
[  155.477010] Stack:
[  155.477010]  000000000251a855 ffffffff81177997 0000000000000000 0000000000000000
[  155.477010]  ffff8800bb4aef00 0000000000000000 ffffffffa04dfc34 ffffffffa04dfc3c
[  155.477010]  ffff8800379e8000 ffffffff8117892e 0000000000000000 ffffffff8117a13d
[  155.477010] Call Trace:
[  155.477010]  [<ffffffff81177997>] ? __d_instantiate+0x27/0xf0
[  155.477010]  [<ffffffff8117892e>] ? d_rehash+0x3e/0x50
[  155.477010]  [<ffffffff8117a13d>] ? d_splice_alias+0xcd/0x1c0
[  155.477010]  [<ffffffffa04b296b>] ? zpl_snapdir_automount+0xb/0x20 [zfs]
[  155.477010]  [<ffffffff8116ca30>] ? follow_managed+0x150/0x3a0
[  155.477010]  [<ffffffff8116ed6e>] ? lookup_slow+0x6e/0xb0
[  155.477010]  [<ffffffff8116fe43>] ? path_lookupat+0x713/0x850
[  155.477010]  [<ffffffff8117039f>] ? filename_lookup.isra.28+0x1f/0x70
[  155.477010]  [<ffffffff81172c86>] ? user_path_at_empty+0x56/0xc0
[  155.477010]  [<ffffffff8104889f>] ? kvm_clock_read+0x1f/0x30
[  155.477010]  [<ffffffff81015475>] ? sched_clock+0x5/0x10
[  155.477010]  [<ffffffff81167d8a>] ? vfs_fstatat+0x5a/0xc0
[  155.477010]  [<ffffffff81168430>] ? SyS_newlstat+0x20/0x50
[  155.477010]  [<ffffffff8101a265>] ? syscall_trace_enter_phase1+0x115/0x180
[  155.477010]  [<ffffffff8147bc65>] ? int_check_syscall_exit_work+0x34/0x3d
[  155.477010]  [<ffffffff8147ba09>] ? system_call_fastpath+0x12/0x17
[  155.477010] Code: 00 00 48 89 84 24 88 00 00 00 31 c0 48 c7 44 24 30 34 fc 4d a0 48 c7 44 24 38 3c fc 4d a0 48 c7 44 24 28 00 00 00 00 4c 8b 6d 30 <49> 8b 45 28 48 c7 44 24 40 00 00 00 00 48 c7 44 24 48 00 00 00 
[  155.477010] RIP  [<ffffffffa0481c21>] zfsctl_mount_snapshot+0x51/0x450 [zfs]
[  155.477010]  RSP <ffff8800bb8ffba8>
[  155.477010] CR2: 0000000000000028
[  155.495833] ---[ end trace e31e58e8003080e4 ]---
[  155.497570] Kernel panic - not syncing: Fatal exception
[  155.498557] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
[  155.498557] drm_kms_helper: panic occurred, switching back to text console
[  155.498557] Rebooting in 60 seconds..

@behlendorf
Contributor

Thanks for the quick test. I thought there might be some rough edges still to address. I didn't get a chance to do much testing. When I get some time I'll run down those panics.

@mgmartin

@behlendorf I tracked the issue down to path->dentry->d_inode being null. The fault occurs when ITOZSB(ip) is called while ip is null. A quick fix was to add a check in zpl_snapdir_automount() (zpl_ctldir.c) so that zfsctl_mount_snapshot is not called if the inode is null.

+       if (path->dentry->d_inode == NULL)
+               return (NULL);
        error = -zfsctl_mount_snapshot(path, 0);

Doing this produces a "Too many levels of symbolic links" error, but avoids the fault. I'm not very familiar with the code, so there's probably a cleaner way.
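
The first oops appears consistent with this diagnosis. In its Code: line the bytes marked `<49> 8b 45 28` are the faulting instruction. The Python sketch below hand-decodes only that one encoding (REX.W 8B /r with an 8-bit displacement and no SIB byte); the helper name and register table are illustrative, and mapping offset 0x28 to a particular struct field is an assumption that depends on the kernel build.

```python
REGS = [
    "rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
    "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15",
]


def decode_mov_load(insn: bytes):
    """Decode 'mov disp8(base), dest' encoded as REX.W 8B /r, mod=01, no SIB.

    Returns (dest_register, base_register, displacement).
    """
    rex, opcode, modrm, disp8 = insn
    assert rex & 0xF0 == 0x40 and rex & 0x08, "expected a REX.W prefix"
    assert opcode == 0x8B, "expected MOV r64, r/m64"
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    assert mod == 0b01 and rm != 0b100, "expected disp8 addressing without SIB"
    reg |= (rex & 0x04) << 1  # REX.R extends the destination register
    rm |= (rex & 0x01) << 3   # REX.B extends the base register
    disp = disp8 - 256 if disp8 > 127 else disp8  # disp8 is signed
    return REGS[reg], REGS[rm], disp


dest, base, disp = decode_mov_load(bytes.fromhex("498b4528"))
print(f"mov {disp:#x}(%{base}), %{dest}")
```

The decoded load reads from r13 plus 0x28, and the register dump shows R13 as zero, which lines up with the reported CR2 of 0000000000000028.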

@behlendorf
Contributor

@mgmartin returning EISDIR from zfsctl_mount_snapshot() should work as a stop gap for now. I'm working through a slightly cleaner fix now.

kernelOfTruth pushed a commit to kernelOfTruth/zfs that referenced this issue Apr 30, 2015
@Bronek

Bronek commented May 24, 2015

OK, so the new kernel 3.18.14 came out and I tried to apply the patch made from git diff c143c2333c4..3ccb354d641 (on Linux mainline) on top of this version, which failed. I found that this was due to the following two changes in release 3.18.14:

linux-stable.git (git)-[master] % git log v3.18.13..v3.18.14 --reverse --stat --full-diff -- fs/namespace.c
commit 6b1353cb2664c93997561fa4f46f055da51f5ee7
Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Wed Dec 24 07:20:01 2014 -0600

    mnt: Improve the umount_tree flags

    [ Upstream commit e819f152104c9f7c9fe50e1aecce6f5d4bf06d65 ]

    - Remove the unneeded declaration from pnode.h
    - Mark umount_tree static as it has no callers outside of namespace.c
    - Define an enumeration of umount_tree's flags.
    - Pass umount_tree's flags in by name

    This removes the magic numbers 0, 1 and 2 making the code a little
    clearer and makes it possible for there to be lazy unmounts that don't
    propagate.  Which is what __detach_mounts actually wants for example.

    Cc: stable@vger.kernel.org
    Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
    Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

 fs/namespace.c | 31 ++++++++++++++++---------------
 fs/pnode.h     |  1 -
 2 files changed, 16 insertions(+), 16 deletions(-)

commit 1c1cf82e193a887f37672b0d7b2657c76613ee16
Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Wed Dec 24 07:35:10 2014 -0600

    mnt: Don't propagate umounts in __detach_mounts

    [ Upstream commit 8318e667f176f7ea34451a1a530634e293f216ac ]

    Invoking mount propagation from __detach_mounts is inefficient and
    wrong.

    It is inefficient because __detach_mounts already walks the list of
    mounts that where something needs to be done, and mount propagation
    walks some subset of those mounts again.

    It is actively wrong because if the dentry that is passed to
    __detach_mounts is not part of the path to a mount that mount should
    not be affected.

    change_mnt_propagation(p,MS_PRIVATE) modifies the mount propagation
    tree of a master mount so it's slaves are connected to another master
    if possible.  Which means even removing a mount from the middle of a
    mount tree with __detach_mounts will not deprive any mount propagated
    mount events.

    Cc: stable@vger.kernel.org
    Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
    Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

 fs/namespace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

The latter of those two is interesting, and I gather that the bug it fixes could possibly be responsible for the issue reported here, and possibly also for #3257. I will try to reproduce this issue on kernel 3.18.14, also with patch #3344, and see how it works.

Will report back later

@Bronek

Bronek commented May 24, 2015

FWIW, the above two commits belong to merge torvalds/linux@8f502d5. The other stable Linux version (aside from 3.18.14) where this has been merged is 4.0.2.

@Bronek

Bronek commented May 24, 2015

So it is not enough to simply upgrade from 3.18.13 to 3.18.14 and drop the patch made from git diff c143c2333c4..3ccb354d641; I got this:

root@gdansk /data/.zfs/snapshot # ls
20150111-0915  GMT-20150322-1355  GMT-20150404-1400  GMT-20150411-2145  GMT-20150419-2125  GMT-20150428-2000  GMT-20150507-2055
20150113-2310  GMT-20150323-1950  GMT-20150405-1540  GMT-20150412-2020  GMT-20150425-1750  GMT-20150430-2210  GMT-20150514-2240
20150125-1915  GMT-20150328-2125  GMT-20150406-1715  GMT-20150415-1920  GMT-20150426-2215  GMT-20150501-2005  GMT-20150517-1740
20150205-2010  GMT-20150402-2200  GMT-20150408-0055  GMT-20150418-1240  GMT-20150427-1940  GMT-20150506-1925  GMT-20150524-1630
root@gdansk /data/.zfs/snapshot # ls -al GMT-20150524-1630
ls: cannot access GMT-20150524-1630/.: Input/output error
ls: cannot access GMT-20150524-1630/..: Input/output error
total 0
d????????? ? ? ? ?            ? .
d????????? ? ? ? ?            ? ..
root@gdansk /data/.zfs/snapshot # cd GMT-20150524-1630
root@gdansk . # ls -al
total 0
dr-xr-xr-x 1 root root 0 May 24 16:34 .
dr-xr-xr-x 2 root root 2 May 24 16:30 ..
root@gdansk . # cd ~
root@gdansk ~ # dmesg | tail -2
[  493.352597] ZFS: Unable to automount zdata@GMT-20150524-1630 at /data/.zfs/snapshot/GMT-20150524-1630: 512
[  497.596039] ZFS: Unable to automount zdata@GMT-20150524-1630 at /data/.zfs/snapshot/GMT-20150524-1630: 512

I will now try #3344

@mgmartin

@Bronek I've been running with #3344 to 3.18.14 and snapshot mounting works for me.

You'll still get a panic if you try to access a non-existent snapshot, as I mentioned above. However, #3344 applied cleanly and compiled with the 3.18.14 kernel.

@Bronek

Bronek commented May 24, 2015

Ok so #3344 kind of works for me:

root@gdansk ~ # ls -al /home/.zfs/snapshot/GMT-20150524-1630
total 42
drwxr-xr-x  5 root  root   5 Apr  8 00:57 .
dr-xr-xr-x  3 root  root   3 May 24 17:25 ..
drwxr-xr-x  2 root  root   5 Apr  8 00:58 LAN
drwxr-xr-x 14 maker maker 22 May 24 16:29 maker
drwxr-xr-x  2 maker maker 33 May 24 15:30 pkgs

However, when I replaced ls -al with cd I got the following:

[  754.272085] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
[  754.279971] IP: [<ffffffffa033d57c>] zfsctl_mount_snapshot+0x5c/0x4e0 [zfs]
[  754.286971] PGD a2afff067 PUD a25c7b067 PMD 0 
[  754.291469] Oops: 0000 [#1] PREEMPT SMP 
[  754.295456] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp xt_multiport ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables iTCO_wdt iTCO_vendor_support mxm_wmi ext4 crc16 mbcache jbd2 coretemp x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd pcspkr serio_raw sb_edac joydev evdev mousedev edac_core mac_hid igb i2c_i801 ptp pps_core hwmon i2c_algo_bit i2c_core mei_me mei lpc_ich ioatdma dca tpm_tis tpm wmi processor shpchp button sch_fq_codel hid_generic usbhid hid sd_mod zfs(PO)
[  754.367616]  zunicode(PO) zcommon(PO) znvpair(PO) spl(O) zavl(PO) atkbd libps2 nvme megaraid_sas ahci libahci ehci_pci ehci_hcd isci libsas xhci_pci xhci_hcd mpt2sas libata raid_class scsi_transport_sas usbcore scsi_mod usb_common i8042 serio bridge stp llc vhost_net tun vhost macvtap macvlan
[  754.392840] CPU: 1 PID: 3394 Comm: zsh Tainted: P           O   3.18.14-1-ARCH #1
[  754.400321] Hardware name: Supermicro X9DA7/E/X9DA7/E, BIOS 3.0a 07/02/2014
[  754.407283] task: ffff880a3247a840 ti: ffff880a25ce4000 task.ti: ffff880a25ce4000
[  754.414764] RIP: 0010:[<ffffffffa033d57c>]  [<ffffffffa033d57c>] zfsctl_mount_snapshot+0x5c/0x4e0 [zfs]
[  754.424182] RSP: 0018:ffff880a25ce7b58  EFLAGS: 00010246
[  754.429494] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff880df140f2e0
[  754.436629] RDX: ffff880a37f34840 RSI: ffffffffa0385300 RDI: ffff880a25ce7d20
[  754.443757] RBP: ffff880a25ce7c28 R08: ffff880a3247a840 R09: 0000000000000000
[  754.450883] R10: ffffffffa0226d42 R11: 0000000000000001 R12: ffff880a37f34840
[  754.458011] R13: ffff880a25ce7d20 R14: 0000000000000000 R15: ffff880a25ce7d20
[  754.465139] FS:  00007fcfed9bd700(0000) GS:ffff88103fc20000(0000) knlGS:0000000000000000
[  754.473227] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  754.478966] CR2: 0000000000000028 CR3: 0000000a2afee000 CR4: 00000000001427e0
[  754.486092] Stack:
[  754.488107]  ffff880a25ce7b88 ffffffff811e8d54 ffff880a25ce7b88 ffffffff811e74e4
[  754.495569]  0000000000000000 ffff880a37f34840 ffff880a25ce7bd8 0000000000000000
[  754.503025]  ffffffffa039ed59 ffffffffa039ed61 ffff880a37f34840 ffff880a3247a840
[  754.510481] Call Trace:
[  754.512936]  [<ffffffff811e8d54>] ? d_instantiate+0x54/0x80
[  754.518504]  [<ffffffff811e74e4>] ? d_rehash+0x54/0x60
[  754.523653]  [<ffffffffa0370d10>] ? zpl_snapdir_rename+0x6b0/0xec0 [zfs]
[  754.530361]  [<ffffffffa0370640>] zle_decompress+0x320/0x340 [zfs]
[  754.536538]  [<ffffffff811da65f>] follow_managed+0x13f/0x370
[  754.542198]  [<ffffffff811dc9b8>] lookup_slow+0x78/0xc0
[  754.547417]  [<ffffffff811ddf8a>] path_lookupat+0x75a/0x8c0
[  754.552986]  [<ffffffff81185c64>] ? do_wp_page+0xf4/0x9b0
[  754.558385]  [<ffffffff811e03d0>] ? getname_flags+0x30/0x130
[  754.564047]  [<ffffffff811de116>] filename_lookup.isra.7+0x26/0x80
[  754.570229]  [<ffffffff811e0e03>] user_path_at_empty+0x63/0xd0
[  754.576068]  [<ffffffff81060a74>] ? __do_page_fault+0x2e4/0x610
[  754.581985]  [<ffffffff811e0e81>] user_path_at+0x11/0x20
[  754.587304]  [<ffffffff811d566a>] vfs_fstatat+0x6a/0xd0
[  754.592531]  [<ffffffff811f0aec>] ? mntput_no_expire+0x2c/0x160
[  754.598450]  [<ffffffff811d5b83>] SyS_newstat+0x33/0x60
[  754.603678]  [<ffffffff81060dc2>] ? do_page_fault+0x22/0x30
[  754.609257]  [<ffffffff81557118>] ? page_fault+0x28/0x30
[  754.614570]  [<ffffffff81555249>] system_call_fastpath+0x12/0x17
[  754.620573] Code: a0 65 48 8b 04 25 28 00 00 00 48 89 45 c8 31 c0 48 c7 85 78 ff ff ff 61 ed 39 a0 48 c7 85 68 ff ff ff 00 00 00 00 4d 8b 74 24 30 <49> 8b 46 28 48 c7 45 80 00 00 00 00 48 c7 45 88 00 00 00 00 48 
[  754.640540] RIP  [<ffffffffa033d57c>] zfsctl_mount_snapshot+0x5c/0x4e0 [zfs]
[  754.647606]  RSP <ffff880a25ce7b58>
[  754.651093] CR2: 0000000000000028
[  754.654746] ---[ end trace b1fd8855c529028b ]---

This is on vanilla kernel 3.18.14, zfs version 0.6.4.1, with only #3344 applied and nothing else. I can easily reproduce this "Oops".

Furthermore, on computer restart (after two "Oopses" like the above) I got the following:

[ 1432.904647] BUG: Dentry ffff880a37f5bd40{i=0,n=.svn}  still in use (1) [unmount of zfs zfs]
[ 1432.913002] ------------[ cut here ]------------
[ 1432.917633] WARNING: CPU: 13 PID: 3851 at fs/dcache.c:1288 umount_check+0x7c/0x90()
[ 1432.925302] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp xt_multiport ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables iTCO_wdt iTCO_vendor_support mxm_wmi ext4 crc16 mbcache jbd2 coretemp x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd pcspkr serio_raw sb_edac joydev evdev mousedev edac_core mac_hid igb i2c_i801 ptp pps_core hwmon i2c_algo_bit i2c_core mei_me mei lpc_ich ioatdma dca tpm_tis tpm wmi processor shpchp button sch_fq_codel hid_generic usbhid hid sd_mod zfs(PO) zunicode(PO) zcommon(PO) znvpair(PO) spl(O) zavl(PO) atkbd libps2 nvme megaraid_sas ahci libahci ehci_pci ehci_hcd isci libsas xhci_pci xhci_hcd mpt2sas libata raid_class scsi_transport_sas usbcore scsi_mod usb_common i8042 serio bridge stp llc vhost_net tun vhost macvtap macvlan
[ 1433.022812] CPU: 13 PID: 3851 Comm: umount Tainted: P      D    O   3.18.14-1-ARCH #1
[ 1433.030645] Hardware name: Supermicro X9DA7/E/X9DA7/E, BIOS 3.0a 07/02/2014
[ 1433.037613]  0000000000000000 000000006bbc4e9b ffff880a24bfbce8 ffffffff8154f71d
[ 1433.045092]  0000000000000000 0000000000000000 ffff880a24bfbd28 ffffffff810726e1
[ 1433.052557]  ffff880a24bfbd08 ffff880a37f5bd40 ffffffff811e8bf0 ffff880a37f348d0
[ 1433.060031] Call Trace:
[ 1433.062486]  [<ffffffff8154f71d>] dump_stack+0x4e/0x71
[ 1433.067636]  [<ffffffff810726e1>] warn_slowpath_common+0x81/0xa0
[ 1433.073649]  [<ffffffff811e8bf0>] ? d_invalidate+0x130/0x130
[ 1433.079314]  [<ffffffff810727fa>] warn_slowpath_null+0x1a/0x20
[ 1433.085154]  [<ffffffff811e8c6c>] umount_check+0x7c/0x90
[ 1433.090473]  [<ffffffff811e6731>] d_walk+0xc1/0x310
[ 1433.095367]  [<ffffffff811e89ca>] do_one_tree+0x2a/0x50
[ 1433.100598]  [<ffffffff811e956f>] shrink_dcache_for_umount+0x2f/0x90
[ 1433.106951]  [<ffffffff811d276c>] generic_shutdown_super+0x2c/0x100
[ 1433.113223]  [<ffffffff811d2b26>] kill_anon_super+0x16/0x30
[ 1433.118814]  [<ffffffffa03739ae>] zpl_vap_init+0xaae/0xbd0 [zfs]
[ 1433.124831]  [<ffffffff811d2f19>] deactivate_locked_super+0x49/0x60
[ 1433.131097]  [<ffffffff811d336c>] deactivate_super+0x6c/0x80
[ 1433.136764]  [<ffffffff811f0703>] cleanup_mnt+0x43/0xa0
[ 1433.141995]  [<ffffffff811f07b2>] __cleanup_mnt+0x12/0x20
[ 1433.147402]  [<ffffffff8108ee0c>] task_work_run+0xbc/0xe0
[ 1433.152811]  [<ffffffff81014d75>] do_notify_resume+0x95/0xa0
[ 1433.158484]  [<ffffffff815554c0>] int_signal+0x12/0x17
[ 1433.163627] ---[ end trace b1fd8855c529028d ]---
[ 1433.168253] BUG: Dentry ffff880a37f34840{i=0,n=.svn}  still in use (1) [unmount of zfs zfs]
[ 1433.176602] ------------[ cut here ]------------
[ 1433.181229] WARNING: CPU: 13 PID: 3851 at fs/dcache.c:1288 umount_check+0x7c/0x90()
[ 1433.188882] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp xt_multiport ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables iTCO_wdt iTCO_vendor_support mxm_wmi ext4 crc16 mbcache jbd2 coretemp x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd pcspkr serio_raw sb_edac joydev evdev mousedev edac_core mac_hid igb i2c_i801 ptp pps_core hwmon i2c_algo_bit i2c_core mei_me mei lpc_ich ioatdma dca tpm_tis tpm wmi processor shpchp button sch_fq_codel hid_generic usbhid hid sd_mod zfs(PO) zunicode(PO) zcommon(PO) znvpair(PO) spl(O) zavl(PO) atkbd libps2 nvme megaraid_sas ahci libahci ehci_pci ehci_hcd isci libsas xhci_pci xhci_hcd mpt2sas libata raid_class scsi_transport_sas usbcore scsi_mod usb_common i8042 serio bridge stp llc vhost_net tun vhost macvtap macvlan
[ 1433.286146] CPU: 13 PID: 3851 Comm: umount Tainted: P      D W  O   3.18.14-1-ARCH #1
[ 1433.293970] Hardware name: Supermicro X9DA7/E/X9DA7/E, BIOS 3.0a 07/02/2014
[ 1433.300937]  0000000000000000 000000006bbc4e9b ffff880a24bfbce8 ffffffff8154f71d
[ 1433.308409]  0000000000000000 0000000000000000 ffff880a24bfbd28 ffffffff810726e1
[ 1433.315884]  ffff880a24bfbd08 ffff880a37f34840 ffffffff811e8bf0 ffff880a25af58a0
[ 1433.323356] Call Trace:
[ 1433.325810]  [<ffffffff8154f71d>] dump_stack+0x4e/0x71
[ 1433.330951]  [<ffffffff810726e1>] warn_slowpath_common+0x81/0xa0
[ 1433.336959]  [<ffffffff811e8bf0>] ? d_invalidate+0x130/0x130
[ 1433.342623]  [<ffffffff810727fa>] warn_slowpath_null+0x1a/0x20
[ 1433.348465]  [<ffffffff811e8c6c>] umount_check+0x7c/0x90
[ 1433.353781]  [<ffffffff811e6731>] d_walk+0xc1/0x310
[ 1433.358663]  [<ffffffff811e89ca>] do_one_tree+0x2a/0x50
[ 1433.363899]  [<ffffffff811e956f>] shrink_dcache_for_umount+0x2f/0x90
[ 1433.370252]  [<ffffffff811d276c>] generic_shutdown_super+0x2c/0x100
[ 1433.376525]  [<ffffffff811d2b26>] kill_anon_super+0x16/0x30
[ 1433.382110]  [<ffffffffa03739ae>] zpl_vap_init+0xaae/0xbd0 [zfs]
[ 1433.388122]  [<ffffffff811d2f19>] deactivate_locked_super+0x49/0x60
[ 1433.394388]  [<ffffffff811d336c>] deactivate_super+0x6c/0x80
[ 1433.400053]  [<ffffffff811f0703>] cleanup_mnt+0x43/0xa0
[ 1433.405288]  [<ffffffff811f07b2>] __cleanup_mnt+0x12/0x20
[ 1433.410690]  [<ffffffff8108ee0c>] task_work_run+0xbc/0xe0
[ 1433.416090]  [<ffffffff81014d75>] do_notify_resume+0x95/0xa0
[ 1433.421754]  [<ffffffff815554c0>] int_signal+0x12/0x17
[ 1433.426891] ---[ end trace b1fd8855c529028e ]---
[ 1433.431879] VFS: Busy inodes after unmount of zfs. Self-destruct in 5 seconds.  Have a nice day...
[  OK  ] Unmounted /home.
[  OK  ] Unmounted /data/users.
         Unmounting /data...
[  OK  ] Unmounted /data.
[  OK  ] Reached target Unmount All Filesystems.
[  OK  ] Stopped target Local File Systems (Pre).
[  OK  ] Stopped Create Static Device Nodes in /dev.
         Stopping Create Static Device Nodes in /dev...
[  OK  ] Stopped Remount Root and Kernel File Systems.
         Stopping Remount Root and Kernel File Systems...
[  OK  ] Reached target Shutdown.

If no "Oops" occurs while the computer is running (including after ls -al /home/.zfs/snapshot/GMT-20150524-1630, which worked fine), then shutdown/reboot looks normal.

@Bronek

Bronek commented May 25, 2015

I tried troubleshooting this "Oops" and found it to be the result of the shell trying to access a non-existent subdirectory of the snapshot directory. I am using grml-zsh which, among other things, looks for a .svn or .git subdirectory in the current and parent directories when building the command prompt. It probes for these two inside the snapshot subdirectory I changed into and, if they are not found, next probes inside the parent directory (i.e. .zfs/snapshot), and so on. That probe of .zfs/snapshot triggers the error.

A minimal test to reproduce this is /sbin/sh -c "ls /data/.zfs/snapshot/dummy-not-here"

It is the same under kernel 4.0.4 (I didn't try 4.0.2) with #3344 applied. That is actually progress compared to the behaviour without this patch, since in all "normal" cases automounting works and there is no kernel panic. The case described above is a bit of a corner case.
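
To make the probing pattern described above concrete, here is a minimal sketch of which paths such a prompt helper would touch. It is not grml-zsh code: the function name probe_paths and the fixed marker list are assumptions for illustration, and the sketch only builds path strings without touching the filesystem.

```python
import posixpath

# Marker directories the prompt helper is described as looking for.
MARKERS = (".svn", ".git")


def probe_paths(start):
    """Return every marker path probed while walking from start up to /."""
    probes = []
    current = posixpath.normpath(start)
    while True:
        for marker in MARKERS:
            probes.append(posixpath.join(current, marker))
        parent = posixpath.dirname(current)
        if parent == current:  # reached the top, nothing left to walk
            break
        current = parent
    return probes
```

Calling probe_paths("/home/.zfs/snapshot/GMT-20150524-1630") yields, among others, /home/.zfs/snapshot/.svn, which is a lookup of a snapshot that does not exist and matches the n=.svn dentry in the log above.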

@mgmartin

@Bronek the panic from accessing non-existent snapshots was seen and diagnosed earlier in these threads--see my comment and quick fix above posted on Apr 25

@Bronek

Bronek commented May 25, 2015

@mgmartin I was probably confused. The system continues to operate, which is not what I'd call a "kernel panic". It just kills the user process and registers a kernel bug (and then a leaked dentry on shutdown). I thought that was different from your experience.

@mgmartin

@Bronek no problem. I typically set my kernels to panic on oops and often interchange the two terms.

@janlam7

janlam7 commented Jun 23, 2015

I just installed a 4.0.5 kernel and the snapshots seem to automount again without problems.
I'm running the master branch of zfs. Is that expected?
Oh, after waiting a few minutes it panics when unmounting.

@Bronek

Bronek commented Jun 23, 2015

@janlam7 The kernel panic is tracked in #3257, but this is (almost certainly) one and the same bug.

kernelOfTruth pushed a commit to kernelOfTruth/zfs that referenced this issue Jul 25, 2015
Commit torvalds/linux@bafc9b7 caused snapshots automounted by ZFS to
be immediately unmounted when the dentry was revalidated.  This was
a consequence of ZFS invalidating all snapdir dentries to ensure that
negative dentries didn't mask new snapshots.  This patch modifies the
behavior such that only negative dentries are invalidated.  This solves
the issue and may result in a performance improvement.

In addition, the MNT_SHRINKABLE flag is now set after the automount
succeeds by traversing down into the snapshot.  This is much cleaner
than being forced to do it post in a getattr.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue openzfs#3030
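
The difference between invalidating every snapdir dentry and invalidating only the negative ones can be illustrated with a toy cache. This is a sketch under loud assumptions: SnapdirCache and its methods are invented for illustration, the real kernel dcache and ZFS code are far more involved, and "mounted" merely stands in for a positive dentry with a snapshot mounted on it.

```python
class SnapdirCache:
    """Toy name -> entry cache; None marks a negative (not found) entry."""

    def __init__(self):
        self.entries = {}

    def lookup(self, name, existing_snapshots):
        # A cached answer, positive or negative, is returned as-is.
        if name not in self.entries:
            found = name in existing_snapshots
            self.entries[name] = "mounted" if found else None
        return self.entries[name]

    def invalidate_all(self):
        # Old behaviour in this model: positive entries are dropped too.
        self.entries.clear()

    def invalidate_negative(self):
        # Patched behaviour in this model: only negative entries go away.
        self.entries = {
            name: entry
            for name, entry in self.entries.items()
            if entry is not None
        }
```

In this model a negative entry for a name keeps masking a snapshot created later until it is invalidated, while invalidate_negative() leaves the entry for an already mounted snapshot in place.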
kernelOfTruth pushed a commit to kernelOfTruth/zfs that referenced this issue Aug 6, 2015
behlendorf added a commit to behlendorf/zfs that referenced this issue Aug 30, 2015
Re-factor the .zfs/snapshot auto-mounting code to take into account
changes made to the upstream kernels, and to lay the groundwork for
enabling access to .zfs snapshots via NFS clients.  This patch makes
the following core improvements.

* All actively auto-mounted snapshots are now tracked in two global
trees which are indexed by snapshot name and objset id respectively.
This allows for fast lookups of any auto-mounted snapshot
without needing access to the parent dataset.

* Snapshot entries are added to the tree in zfsctl_snapshot_mount().
However, they are now removed from the tree in the context of the
unmount process.  This eliminates the need for complicated error logic
in zfsctl_snapshot_unmount() to handle unmount failures.

* References are now taken on the snapshot entries in the tree to
ensure they always remain valid while a task is outstanding.

* The MNT_SHRINKABLE flag is set on the snapshot vfsmount_t right
after the auto-mount succeeds.  This allows the kernel to unmount
idle auto-mounted snapshots if needed, removing the need for the
zfsctl_unmount_snapshots() function.

* Snapshots in active use will not be automatically unmounted.  As
long as at least one dentry is revalidated every zfs_expire_snapshot/2
seconds the auto-unmount expiration timer will be extended.

* Commit torvalds/linux@bafc9b7 caused snapshots auto-mounted by ZFS
to be immediately unmounted when the dentry was revalidated.  This
was a consequence of ZFS invalidating all snapdir dentries to ensure that
negative dentries didn't mask new snapshots.  This patch modifies the
behavior such that only negative dentries are invalidated.  This solves
the issue and may result in a performance improvement.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue openzfs#3589
Issue openzfs#3344
Issue openzfs#3295
Issue openzfs#3257
Issue openzfs#3243
Issue openzfs#3030
Issue openzfs#2841
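
The bookkeeping described in the commit message above (two indexes over the same entries, references held while an entry is in use, and an expiration deadline that is pushed out on revalidation) can be sketched as a small model. Everything here is hypothetical: SnapEntry, SnapRegistry and their methods are illustrative names rather than the ZFS data structures, plain dictionaries stand in for the trees, and resetting the deadline to a full expire_seconds on each revalidation is an assumption about how the extension works.

```python
import time


class SnapEntry:
    """One auto-mounted snapshot (toy stand-in, not the ZFS structure)."""

    def __init__(self, name, objsetid, expire_at):
        self.name = name
        self.objsetid = objsetid
        self.expire_at = expire_at
        self.refs = 1  # reference owned by the registry itself


class SnapRegistry:
    """Tracks mounted snapshots by name and by objset id."""

    def __init__(self, expire_seconds, clock=time.monotonic):
        self.expire_seconds = expire_seconds
        self.clock = clock  # injectable so the model can be tested
        self.by_name = {}
        self.by_objsetid = {}

    def add(self, name, objsetid):
        entry = SnapEntry(name, objsetid, self.clock() + self.expire_seconds)
        self.by_name[name] = entry
        self.by_objsetid[objsetid] = entry
        return entry

    def _hold(self, entry):
        # Lookups hand out a reference so the entry stays valid for the caller.
        if entry is not None:
            entry.refs += 1
        return entry

    def find_by_name(self, name):
        return self._hold(self.by_name.get(name))

    def find_by_objsetid(self, objsetid):
        return self._hold(self.by_objsetid.get(objsetid))

    def release(self, entry):
        entry.refs -= 1

    def remove(self, entry):
        # Called from the unmount path in this model; drops the registry's ref.
        del self.by_name[entry.name]
        del self.by_objsetid[entry.objsetid]
        self.release(entry)

    def revalidate(self, entry):
        # Assumed policy: any revalidation restarts the full expiry window.
        entry.expire_at = self.clock() + self.expire_seconds

    def expired(self, entry):
        return self.clock() >= entry.expire_at
```

With this policy, an entry that is revalidated at least once every expire_seconds/2 never reaches its deadline, which mirrors the "snapshots in active use will not be automatically unmounted" bullet in spirit only.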
@behlendorf
Contributor

I've posted a complete fix for this issue in #3718. It's been tested with 4.1 kernels, but additional testing on a wider range of kernels would be very helpful.

@mgmartin

NICE! Working well for me after applying the diffs to the latest spl/zfs git on a 4.1.6 kernel. No issues looking for non-existent snapshots, and no issues mounting and looking through ~53 snapshots.

Thanks!

@Bronek

Bronek commented Aug 31, 2015

#3718 works for me, too, thanks @behlendorf !

behlendorf added a commit to behlendorf/zfs that referenced this issue Aug 31, 2015
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#3589
Closes openzfs#3344
Closes openzfs#3295
Closes openzfs#3257
Closes openzfs#3243
Closes openzfs#3030
Closes openzfs#2841
tomgarcia pushed a commit to tomgarcia/zfs that referenced this issue Sep 11, 2015
JKDingwall pushed a commit to JKDingwall/zfs that referenced this issue Aug 11, 2016

Conflicts:
	config/kernel.m4
	module/zfs/zfs_ctldir.c