silent corruption for thousands files gives input/output error but cannot be detected with scrub - at least for openzfs 2.0.0 #11443

Closed
shuther opened this issue Jan 8, 2021 · 42 comments
Labels
Status: Stale (no recent activity), Type: Defect (incorrect behavior, e.g. crash, hang)

Comments

@shuther commented Jan 8, 2021

System information

Type                  Version/Name
Distribution Name     FreeNAS/FreeBSD and Arch Linux
Distribution Version
Linux Kernel
Architecture
ZFS Version           2.0.0
SPL Version

Describe the problem you're observing

The issue raised for a previous version of zfs (#10697) is still there.

The issue appears in FreeNAS (FreeBSD 12) using OpenZFS 2.0; the problem was confirmed using SystemRescue (https://github.com/nchevsky/systemrescue-zfs), which ships a more recent version of ZFS. This sounds more like a ZFS issue than a platform-specific one.
The impacted files are read-only (write once, read many times; I never changed them, and they should have been part of the pool for at least one year). The affected files (about 17,000) are all part of the same dataset; they are mainly .jpg files, but not exclusively, while the dataset contains mostly music.

I have many files that I can't read (cat, less, etc. report Input/output error); ls still lists the files, however. I replaced a disk recently; I'm not sure whether I should try a resilver, or whether the replacement was the cause of the problem. I checked old snapshots, and the problem is present there as well.
Reading the content of a file with zdb appears to succeed.
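
A minimal sketch of that kind of zdb inspection, assuming the dataset name from this report and using the inode number that stat reports further below (25594), which on ZFS normally corresponds to the object id:

zdb -dddddd voluz/media/music 25594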

cp returns Bad address, and dmesg reports the error (nothing is logged for cat, etc.):
vm_fault: pager read error

zfs send also returns Input/output error; this is an easy way to spot the problem.
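
A sketch of using this as a cheap check, assuming the snapshot name shown in the zpool status output below; the stream is discarded so nothing is written:

zfs send -v voluz/media/music@auto-20210104.0900-2w > /dev/null; echo "exit status: $?"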

The pool is healthy and scrub did not report any issue. However, after executing the zfs send, one error appeared, tied to a snapshot (the one I used with zfs send); I guess the send stopped after the first error.

root@freenas:~ # zpool status -v
  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0B in 00:00:56 with 0 errors on Tue Jan  5 03:45:56 2021
config:

	NAME          STATE     READ WRITE CKSUM
	freenas-boot  ONLINE       0     0     0
	  ada0p2      ONLINE       0     0     0

errors: No known data errors

  pool: vollog
 state: ONLINE
  scan: scrub repaired 0B in 00:00:43 with 0 errors on Sun Dec 20 00:00:44 2020
config:

	NAME        STATE     READ WRITE CKSUM
	vollog      ONLINE       0     0     0
	  ada1p2    ONLINE       0     0     0

errors: No known data errors

  pool: voluz
 state: ONLINE
status: One or more devices has experienced an error resulting in data
	corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
	entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  scan: scrub repaired 0B in 10:39:18 with 0 errors on Wed Jan  6 01:16:40 2021
config:

	NAME        STATE     READ WRITE CKSUM
	voluz       ONLINE       0     0     0
	  raidz2-0  ONLINE       0     0     0
	    da0p2   ONLINE       0     0     0
	    da1p2   ONLINE       0     0     0
	    da2p2   ONLINE       0     0     0
	    da3p2   ONLINE       0     0     0
	    da4p2   ONLINE       0     0     0
	logs	
	  nvd0p2    ONLINE       0     0     0
	cache
	  nvd0p3    ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        voluz/media/music@auto-20210104.0900-2w:/EAC-Pending/The Who - Quadrophenia [MFSL]/Disc 2/02 - Sea And Sand.flac

Note that the slog/cache were added very recently (the issue was likely present before I added them).

The files are pretty old, as shown below, but it is as if the inode changed recently. I replaced one disk with a larger one around that time (mid-November, I think; I'm not sure how to check the exact date as I don't see it under zpool history), so I wonder if it is related.

root@freenas:/mnt/voluz # stat "./media/music/APE/Aerosmith/1974_20-_20Get_20Your_20Wings/CD.jpg"
1334434238368977443 25594 -rwxrwxrwx 1 webdav webdav 18446744073709551615 99445 "Sep 20 09:26:08 2012" "Jun  8 19:11:23 2012" "Oct 19 18:14:21 2020" "Sep 20 09:26:08 2012" 65536 242 0x800 ./media/music/APE/Aerosmith/1974_20-_20Get_20Your_20Wings/CD.jpg
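
Regarding the timing of the disk replacement: a sketch, assuming the pool name from this report, of pulling timestamped internal events out of the pool history (-i includes internally logged events, -l adds user/host/time details):

zpool history -il voluz | grep -Ei 'replace|attach|resilver|scrub'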

Describe how to reproduce the problem

Run zfs send on an affected snapshot, or cat an affected file.

Include any warning/errors/backtraces from the system logs

On FreeBSD, nothing, except when using cp (vm_fault: pager read error).

@ghost commented Jan 8, 2021

We're currently investigating this issue here: https://jira.ixsystems.com/browse/NAS-108627

@jstenback (Contributor) commented Jan 9, 2021

FWIW I'm seeing this on one of my pools as well. So far I've only ever seen the errors in snapshots, never on actual file data. There are no signs of any actual hardware problems, and this all started after upgrading to 2.0. The pattern I'm seeing is that after a system boot things are fine for a number of days, then the errors start showing up. This is a system with ECC RAM and 8 disks in a 4x2 mirrored configuration; it was solid for ~4 years before the upgrade to 2.0. My other pool is also running 2.0 with no signs of this issue. It also has ECC RAM; the two differences that come to mind are the number of disks (an 11-disk RAIDZ3 config) and that no 2.0 features are enabled (i.e. I didn't run zpool upgrade -a on that system).

Scrub has never detected anything on the pool showing these errors, but does clear them once I delete the snapshots where they happen. I scrub weekly. These are both Linux systems.

@aerusso (Contributor) commented Jan 9, 2021

@shuther Could you please confirm that the bug here is the same as in #10697 ? I.e., are you also experiencing this same issue on the 0.8.x branch?

@jstenback Could you provide some more details about the corruption you're experiencing (i.e., what is your workload, how frequently are you taking snapshots, what are the exact errors you are getting, relevant logs, etc.) ? And can you also confirm that the errors you are getting only showed up after upgrading to 2.0? If this is indeed the case, could you please open another issue, since it sounds like there may be two underlying problems, and it will make diagnosing the problem easier if we can figure out what is a regression, and what is a preexisting bug.

@shuther (Author) commented Jan 10, 2021

For me, it is different from #10697. The files were saved via FreeNAS/Samba from Windows back in 2012 (as per the last-modified times); iSCSI was not involved. The files were written from a Windows VM running on the same server (so I assume no network was involved). Given the number of files (>17,000, all from one dataset), I don't think the problem happened when I stored the files; I suspect ZFS caused the corruption later on (resilver, scrub, ...). I did not write to the files using ZFS 2.0.
The issue may have started with 0.8.x, but I can confirm it on 2.0 on both the FreeBSD and Linux versions. I upgraded the pool, so I can't go back. The corruption is on the original data as well as in the snapshots (so this could be a different issue than @jstenback's). Again, no issue is detected by scrub (which is quite a concern).

@jstenback (Contributor)

Sure. The system I'm seeing these errors on is my main home system that runs a number of misc things, but nothing particularly heavy. I snapshot hourly, and for one dataset I snapshot every 10 min. The second system I mentioned does little other than replicate the main one as a backup. Thus far every error I've seen has come from the backup system issuing zfs send commands on the main one. The nature of what I see in the logs is that zfs send fails, and in the few instances where I've been able to capture zpool events output when this has happened I see:

        class = "ereport.fs.zfs.data"
        ena = 0x8349236911f00401
        detector = (embedded nvlist)
                version = 0x0
                scheme = "zfs"
                pool = 0x943e97b2e4488663
        (end detector)
        pool = "storage"
        pool_guid = 0x943e97b2e4488663
        pool_state = 0x0
        pool_context = 0x0
        pool_failmode = "wait"
        zio_err = 0x5
        zio_flags = 0x180
        zio_stage = 0x1000000
        zio_pipeline = 0x1780002
        zio_delay = 0x0
        zio_timestamp = 0x0
        zio_delta = 0x0
        zio_priority = 0x2
        zio_objset = 0x7bcd
        zio_object = 0x0
        zio_level = 0x0
        zio_blkid = 0x2822c
        time = 0x5fe81594 0x1cd7f772 
        eid = 0x1089a

The output in zpool status -v lists errors in storage/home/blah@xyz:<0x0> (or <0x2> or <0x20>). If I delete said snapshot, the error changes to something like <0xba8e>:<0x0>, and zfs send of snapshots spanning the failed one starts succeeding again. No other errors of any kind have been spotted so far.

I did indeed only start seeing this behavior with 2.0. Happy to open a new issue if folks believe there's enough information here to move forward on what I have alone.
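
For reference, once the snapshot is gone the <0xba8e>:<0x0> form is <objset id>:<object id>. A sketch, assuming the pool name from the event dump above, of mapping the objset id back to a dataset: convert the hex id to decimal (0xba8e = 47758) and look for it in the dataset listing that zdb prints (exact output format may vary):

zdb -d storage | grep "ID 47758"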

@aerusso (Contributor) commented Jan 10, 2021

@jstenback Thanks for the follow up. Are you also getting any errors reported from the zpool status command? Are there any read errors on any of your files? You can test this relatively easily:

find . -type f -exec sh -c 'cat "$0" >/dev/null || printf "%s\n" "$0"' {} \;

(That's a mostly untested command.)

You might also do that in a healthy snapshot and an unhealthy snapshot, and see if they give errors.

I would think the above command is almost equivalent to a (partial) scrub, but the discrepancy between the two is apparently part of the reported bug.
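
A sketch of running the same check inside a specific snapshot via the hidden .zfs/snapshot directory; the mountpoint and snapshot name are taken from this report and would need adjusting:

cd /mnt/voluz/media/music/.zfs/snapshot/auto-20210104.0900-2w
find . -type f -exec sh -c 'cat "$0" >/dev/null || printf "%s\n" "$0"' {} \;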

@aerusso (Contributor) commented Jan 10, 2021

Thanks @shuther, the background information is certainly helpful for everyone. What was the order of events, i.e., how long after the zpool upgrade did it take before you noticed any problems? Did the zpool replace occur under 2.0 or 0.8? Did the zpool upgrade occur before or after the replace?

@shuther (Author) commented Jan 11, 2021

@aerusso I just noticed the issue now, a bit by chance, so I can't confirm when it happened the first time. The last resilver/replace was under 2.0. The zpool upgrade was also under 2.0; I added a slog/cache at that time as well. What is strange to me is that the problem is only under one dataset covering about 20% of the used pool (I have about 50 different datasets), mainly pictures (in fact very small files), and it is not an active one (read-only use). I don't use dedup; the only thing a bit particular is that I use NFSv4 ACLs. Maybe we can focus on why scrub is not detecting the problem; let me know if I can help in any way.

NB: I just checked some files in my backup; the files were fine in 2017, and they were no longer fine in December 2020. The last (inode-related) change reported by stat (October 19th 2020) seems to make sense, as that would have been under ZFS 0.8.x: I upgraded FreeNAS to 12.0-RELEASE on Oct 27th, from FreeNAS 11.3-U5 installed on Oct 1st.

@ghost commented Jan 11, 2021

The issue may have started with 0.8.x but I can confirm it on 2.0 on Freebsd/Linux versions.

Can you elaborate on the FreeBSD/Linux versions you are using? Are these both TrueNAS (CORE and SCALE?) or have you used ZFS from upstream FreeBSD/Linux distros?

@shuther (Author) commented Jan 12, 2021

I run FreeNAS (release versions only; I usually upgrade within two weeks of a new release showing up) as the main platform, up to TrueNAS Core 12 - no development versions.
I tried ZFS 2.0.0 on Arch Linux (live system from https://github.com/nchevsky/systemrescue-zfs/releases/tag/v7.01%2B2.0.0)

@ghost commented Jan 12, 2021

If I understand correctly, the silent corruption is caused by the "async DMU" patch shipped with FreeNAS v12 but not yet merged into OpenZFS 2.0 (#10377).

I don't know why the FreeNAS maintainers decided to ship code marked "work in progress" (according to the GitHub tags) in a RELEASE version.

@ghost commented Jan 12, 2021

@shuther can you boot the debug kernel in TrueNAS and see if you can get a panic out of it?

@ghost commented Jan 12, 2021

@emichael @willymilkytron iXsystems are working on upstreaming the async DMU and async CoW features. The code in our branch is regularly rebased to keep the upstream commit history clean, which is why the commit in our repo appears to have been made only a few days ago. The pull requests to land these features upstream are from us and that integration effort is still in progress, but the code itself has been around for years. We ported the features to OpenZFS 2.0 for TrueNAS 12.

Given the report in this issue that the problem also is present with OpenZFS 2.0 on Arch Linux (SystemRescue-ZFS), it doesn't seem to me that the async DMU/async CoW code is to blame. We did initially suspect it, but now it's not clear where the problem lies.

@ghost commented Jan 12, 2021

The trouble we're having right now is trying to reproduce the problem. If anyone manages to come up with a simple test case that will reproduce this issue from scratch it will be a great help.

@beren12 (Contributor) commented Jan 12, 2021

Were these files accessible until the upgrade? When were they last backed up, so you could have an idea of when the corruption happened?

@shuther (Author) commented Jan 13, 2021

@freqlabs With the debug kernel, it crashes during boot (when it imports the pool) and just reboots. I may need some guidance to capture the error messages (and avoid the automatic reboot).

Somebody else (https://www.truenas.com/community/threads/freenas-now-truenas-is-no-longer-stable.89445/post-624526) is facing what seems to be a corrupted superblock. Not sure if it is related.

The corruption happened between October and last week.

@ghost commented Jan 13, 2021

@shuther Please download the debug archive from System -> Advanced -> Save Debug, and provide the contents of ixdiagnose/textdump/panic.txt and ixdiagnose/textdump/ddb.txt here.

@ghost commented Jan 14, 2021

freqlabs commented 2 days ago (edited)
Given the report in this issue that the problem also is present with OpenZFS 2.0 on Arch Linux (SystemRescue-ZFS), it doesn't seem to me that the async DMU/async CoW code is to blame. We did initially suspect it, but now it's not clear where the problem lies.

It may be unclear where the problem lies, but yesterday someone called "Ryan Moeller" attached a new version of openzfs-debug.ko compiled without the "async dmu" patch in https://jira.ixsystems.com/browse/NAS-108627 and was asking the reporter to boot their production system with it.

Who do we trust here? It still seems to me that it seems to you the async DMU/async CoW code is to blame here.

@shuther (Author) commented Jan 14, 2021

@freqlabs see attached. I did not find any ddb.txt, could you confirm the path?
I used as a reference: [/var/db/system/ixdiagnose/ixdiagnose]

debug.zip

@rbrewer123

I'm seeing the same symptom as @jstenback: zpool status -x -v shows a corrupted snapshot, but no read, write, or checksum errors. I ran a scrub, which never fully completed; it was stuck for 1-2 hours showing 2 minutes remaining. No errors were reported by the scrub, but curiously, zpool status was now reporting corruption in a second snapshot in addition to the first. I rebooted, but systemd hung waiting for ZFS to stop, so I finally power cycled.
After booting, the original corrupted snapshot was still listed as corrupted, but the second was no longer listed at all.

I was able to destroy the offending snapshot, but now zpool status shows corruption in <0x425f>:<0x0>.

The pool was created with openzfs 2.0 about 2 weeks ago on 2 brand new 2 TB drives. It is a single mirrored vdev with encryption and lz4 compression. No other devices are on that pool.

I use pyznap to take snapshots every 15 minutes on the pool.

The machine is my home desktop computer and is mostly lightly loaded aside from doing its zfs send/recv backups and backing up one other computer from over the network.

My versions are:
AMD Ryzen with 16 GB RAM (not ECC)
NixOS 20.09
Linux kernel 5.4.85
zfs-2.0.0-1
zfs-kmod-2.0.0-1

I have the following traceback in my logs which may be related:

Jan 13 22:41:02 ryz kernel: VERIFY3(0 == remove_reference(hdr, NULL, tag)) failed (0 == 1)
Jan 13 22:41:02 ryz kernel: PANIC at arc.c:3790:arc_buf_destroy()
Jan 13 22:41:02 ryz kernel: Showing stack for process 7666
Jan 13 22:41:02 ryz kernel: CPU: 1 PID: 7666 Comm: z_rd_int Tainted: P           OE     5.4.85 #1-NixOS
Jan 13 22:41:02 ryz kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B450 Gaming-ITX/ac, BIOS P1.70 12/17/2018
Jan 13 22:41:02 ryz kernel: Call Trace:
Jan 13 22:41:02 ryz kernel:  dump_stack+0x66/0x90
Jan 13 22:41:02 ryz kernel:  spl_panic+0xd4/0xfc [spl]
Jan 13 22:41:02 ryz kernel:  ? __wake_up_common_lock+0x8a/0xc0
Jan 13 22:41:02 ryz kernel:  ? __cv_broadcast+0x26/0x30 [spl]
Jan 13 22:41:02 ryz kernel:  ? zei_add_range+0x140/0x140 [zfs]
Jan 13 22:41:02 ryz kernel:  ? zfs_zevent_post+0x24d/0x280 [zfs]
Jan 13 22:41:02 ryz kernel:  arc_buf_destroy+0xf4/0x100 [zfs]
Jan 13 22:41:02 ryz kernel:  arc_read_done+0x242/0x480 [zfs]
Jan 13 22:41:02 ryz kernel:  zio_done+0x403/0x1030 [zfs]
Jan 13 22:41:02 ryz kernel:  zio_execute+0x86/0xe0 [zfs]
Jan 13 22:41:02 ryz kernel:  taskq_thread+0x2ca/0x4b0 [spl]
Jan 13 22:41:02 ryz kernel:  ? wake_up_q+0x60/0x60
Jan 13 22:41:02 ryz kernel:  ? zio_taskq_member.isra.0.constprop.0+0x60/0x60 [zfs]
Jan 13 22:41:02 ryz kernel:  kthread+0xfb/0x130
Jan 13 22:41:02 ryz kernel:  ? task_done+0x90/0x90 [spl]
Jan 13 22:41:02 ryz kernel:  ? kthread_park+0x90/0x90
Jan 13 22:41:02 ryz kernel:  ret_from_fork+0x22/0x40
Jan 13 22:41:02 ryz zed[29943]: eid=25260 class=authentication pool='rpool' priority=4 err=0 flags=0x10000a0 bookmark=50886:0:-1:0

I have several more tracebacks from hung tasks. I can get those together too if they might be helpful.

@ghost commented Jan 14, 2021

@shuther Thanks, I was able to get the information I need from that.

panic:

VERIFY((dc->dc_flags & DMU_CTX_FLAG_ASYNC) || zfs_refcount_count(&dc->dc_holds) == 1) failed

backtrace:

db:0:kdb.enter.default>  bt
Tracing pid 1645 tid 100759 td 0xfffff8020b6c7740
kdb_enter() at kdb_enter+0x37/frame 0xfffffe011b6a90c0
vpanic() at vpanic+0x197/frame 0xfffffe011b6a9110
spl_panic() at spl_panic+0x3a/frame 0xfffffe011b6a9170
dmu_issue() at dmu_issue+0x163/frame 0xfffffe011b6a91b0
dmu_read_uio_dbuf() at dmu_read_uio_dbuf+0x79/frame 0xfffffe011b6a92b0
zfs_freebsd_read() at zfs_freebsd_read+0x552/frame 0xfffffe011b6a9360
VOP_READ_APV() at VOP_READ_APV+0x75/frame 0xfffffe011b6a9390
nfsvno_read() at nfsvno_read+0x36d/frame 0xfffffe011b6a9460
nfsrvd_read() at nfsrvd_read+0x579/frame 0xfffffe011b6a96c0
nfsrvd_dorpc() at nfsrvd_dorpc+0x122e/frame 0xfffffe011b6a9890
nfssvc_program() at nfssvc_program+0x5d3/frame 0xfffffe011b6a9a60
svc_run_internal() at svc_run_internal+0xd99/frame 0xfffffe011b6a9ba0
svc_thread_start() at svc_thread_start+0xb/frame 0xfffffe011b6a9bb0
fork_exit() at fork_exit+0x7e/frame 0xfffffe011b6a9bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe011b6a9bf0
--- trap 0xc, rip = 0x8002e1b2a, rsp = 0x7fffffffd578, rbp = 0x7fffffffd810 ---

@ghost commented Jan 14, 2021

@willymilkytron like I said, it's not clear yet what the problem is. We're seeing that reverting the async DMU and async CoW patch seems to be helping in some cases, but we're also seeing that people like @rbrewer123 are experiencing similar issues without the async patches.

@behlendorf (Contributor)

@rbrewer123 thanks for the additional information. Thus far I haven't been able to reproduce the issue, but I'll set up a system and workload similar to yours to try and reproduce it. Is there anything else special about your configuration, perhaps a non-default module option? For your backups, are you doing normal or raw zfs sends?

At a minimum, the stack trace does indicate an issue in the ARC reference counting. If you're in a position to do so, rebuilding ZFS with debugging enabled will enable additional sanity checks throughout the code and may help us isolate the issue.

./configure --enable-debug && make
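
For a source checkout on a conventional distro the full sequence is roughly the following (a sketch based on the upstream build instructions; NixOS packaging, as used later in this thread, wraps these steps differently):

git clone https://github.com/openzfs/zfs.git && cd zfs
./autogen.sh
./configure --enable-debug
make -s -j$(nproc)
sudo make install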

@rbrewer123

@behlendorf I don't think I have any special module options. For backups, pyznap takes snapshots every 15 minutes on the affected pool, and once per day sends them to another pool (mounted locally on the same machine). Both pools are encrypted and I don't use raw sends.

As for workload, I've been fiddling with my backups in the last few days, so it's possible I was either running a deduplication program on my main pool (the affected pool) and the backup pool (a read-heavy workload), or backing up the networked machine (a write-heavy load, though it's over a wifi link that only gets about 3 MBps).

I'll try to build zfs with debugging to see if anything else turns up.

@rbrewer123

@behlendorf according to my notes I'm pretty sure this is exactly how I created my pool:

zpool create -o ashift=12 -O mountpoint=none -O relatime=on -O acltype=posixacl -O xattr=sa -O compression=lz4 -O encryption=aes-256-gcm -O keyformat=passphrase rpool mirror <dev1_last_component> <dev2_last_component>

The devices are a partition on each of 2 2TB consumer-market SATA HDDs at 7200RPM.

After creating the pool, I used something like:

zfs send -L -R oldpool@xfer-2020-12-30 | zfs recv -v -u -x encryption rpool/new

to load it up with a replication stream from my previous non-encrypted iteration of the pool which I think was created under zfs 0.7.x or 0.8.x and was working fine for a couple years. To clarify, all datasets are now encrypted on the new pool. The old pool contained a bunch of snapshots of its own, but the corrupted snapshot(s) were created a week or two after moving to the new pool.

@mabod commented Jan 15, 2021

according to my notes I'm pretty sure this is exactly how I created my pool:

FYI: The zpool create command that was used is the first entry in zpool history
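
For example (a sketch, using the pool name from this thread; the first history entry is the create command):

zpool history rpool | head -n 3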

@rbrewer123

@mabod, thanks for the tip! Apparently I forgot that this pool started life as a single device, and then I attached the second device a day later after loading it up and scrubbing.

@rbrewer123

@behlendorf is there an easy way to verify on a running system if my zfs kernel module was compiled with --enable-debug?

@behlendorf (Contributor)

@rbrewer123 yes, there is. When you load the kmod it'll print (DEBUG mode) after the loaded module line in dmesg output. Something like this:

ZFS: Loaded module v2.0.1-1 (DEBUG mode), ZFS pool version 5000, ZFS filesystem version 5
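
So on a running system a quick check (a sketch; the message may have rotated out of the kernel ring buffer) is:

dmesg | grep -m1 "ZFS: Loaded module"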

@ghost commented Jan 22, 2021

FYI for the TrueNAS users, 12.0-U1.1 was released with a workaround for this issue.

@shuther (Author) commented Jan 25, 2021

@freqlabs I upgraded to 12.0-U1.1. My files still report input/output error; are you saying that I should not see any new corruption (but my 17,000 files are already corrupted and I should delete them)? Is there a recommended approach to recover the files (maybe using zdb), as my guess (but I could be wrong) is that the content of the files is OK?
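
Not a supported recovery path, but a sketch of how the on-disk data could be examined with zdb if the blocks really are intact: dump the object's block pointers, then read a block by its DVA. The object number 25594 is the inode from the stat output earlier in this thread; the vdev:offset:size triple below is a made-up placeholder that would come from the block pointer dump, and compressed blocks additionally need the logical size and the 'd' flag (lsize/psize:dr):

zdb -dddddd voluz/media/music 25594
zdb -R voluz 0:2f1a000:20000:r > block.bin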

@ghost commented Jan 25, 2021

@shuther OK, then unfortunately your issue must be different from the one we fixed. There should not have been any actual corruption on disk, only in the buffers passed to applications. This may still be the case but for a different reason. I would like to believe the scrub is correct. Can you not read the files or detect errors via scrub with an older version of ZFS?

@shuther (Author) commented Jan 25, 2021

@freqlabs scrub returns no issue. I can't read the files (input/output error); I can't downgrade zfs (I enabled some features). Not sure how I can help further.

@IvanVolosyuk commented Jan 25, 2021 via email

@shuther (Author) commented Jan 26, 2021

@IvanVolosyuk I am able to see the content of the file (partially), as per #10697 (comment). Also, changing the ACL does not help (getfacl was returning the same thing for every file), and disabling ACLs with zfs set acltype=off voluz/media/music did not help either.
My guess is that it is something else. I am quite concerned that the file is "corrupted" in the middle and zpool status is not returning any error.
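
One way to see where in a file the read starts failing (a sketch; the path is the example from the stat output earlier) is to read it sequentially and note how many blocks dd copies before the I/O error:

dd if="/mnt/voluz/media/music/APE/Aerosmith/1974_20-_20Get_20Your_20Wings/CD.jpg" of=/dev/null bs=128k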

@rbrewer123

@behlendorf thanks for the tip. A quick update: I was able to upgrade my ZFS kernel module to 2.0.1 a week or so ago. My system got another corrupted snapshot a few days ago that is also permanent like the first, so (as expected) it's still happening with 2.0.1. It's possible I was lightly using the system at the time of that corruption, I'm not sure. I finally wrangled NixOS into building the module with debugging and am running with that as of today.

@behlendorf (Contributor)

@rbrewer123 thanks for the update. Definitely let me know if you encounter any problems. Unfortunately, I've thus far not been able to reproduce the issue.

@rbrewer123

@behlendorf here's a quick update. I'm currently running openzfs 2.0.1 with --enable-debug for the kernel module. The corrupted snapshot issue has happened 2 more times in the last couple of days while I was running with the debug module. Alas, the module did not emit any debug messages for the 2 newest occurrences.

I'm noticing that the userspace zfs command dumps core during my backup process when it encounters a corrupted snapshot. I haven't figured out whether these core dumps are actually going anywhere, and if so, where to find them. If I find one I'll see what I can do, though I imagine a smoking gun from the kernel module would be of more use.
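
On a systemd-based distro such as NixOS the dumps are usually captured by systemd-coredump; a sketch of locating and opening one, assuming systemd-coredump is active and gdb is installed:

coredumpctl list zfs
coredumpctl info zfs    # metadata and a backtrace summary for the newest matching dump
coredumpctl gdb zfs     # open the newest zfs core dump in gdb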

@jstenback (Contributor)

FWIW I also see core dumps from zfs send when attempting to send a corrupted snapshot. Unfortunately my core files get auto purged so I don't have one handy right now, but if I see one again I can attempt to investigate. I unfortunately am not running a debug build here :(

@rbrewer123

@behlendorf I got another corrupted snapshot, noticed during my backup procedure. I ran zpool status -xv and the process hung. I looked in my logs to find this:

Feb 10 12:21:15 ryz kernel: VERIFY(HDR_EMPTY_OR_LOCKED(hdr)) failed
Feb 10 12:21:15 ryz kernel: PANIC at arc.c:1649:arc_hdr_clear_flags()
Feb 10 12:21:15 ryz kernel: Showing stack for process 471
Feb 10 12:21:15 ryz kernel: CPU: 1 PID: 471 Comm: z_rd_int Tainted: P           O      5.4.92 #1-NixOS
Feb 10 12:21:15 ryz kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B450 Gaming-ITX/a>
Feb 10 12:21:15 ryz kernel: Call Trace:
Feb 10 12:21:15 ryz kernel:  dump_stack+0x66/0x81
Feb 10 12:21:15 ryz kernel:  spl_panic+0xd4/0xfc [spl]
Feb 10 12:21:15 ryz kernel:  ? _cond_resched+0x15/0x30
Feb 10 12:21:15 ryz kernel:  ? mutex_lock+0xe/0x30
Feb 10 12:21:15 ryz kernel:  ? zfs_refcount_remove_many+0x1cd/0x250 [zfs]
Feb 10 12:21:15 ryz kernel:  ? dbuf_rele_and_unlock+0x30d/0x7c0 [zfs]
Feb 10 12:21:15 ryz kernel:  ? _cond_resched+0x15/0x30
Feb 10 12:21:15 ryz kernel:  ? mutex_lock+0xe/0x30
Feb 10 12:21:15 ryz kernel:  ? _cond_resched+0x15/0x30
Feb 10 12:21:15 ryz kernel:  ? mutex_lock+0xe/0x30
Feb 10 12:21:15 ryz kernel:  ? _cond_resched+0x15/0x30
Feb 10 12:21:15 ryz kernel:  arc_hdr_l2hdr_destroy+0x13f/0x1a0 [zfs]
Feb 10 12:21:15 ryz kernel:  arc_hdr_destroy+0x44e/0x460 [zfs]
Feb 10 12:21:15 ryz kernel:  arc_read_done+0x644/0xb80 [zfs]
Feb 10 12:21:15 ryz kernel:  zio_done+0x36e/0x1eb0 [zfs]
Feb 10 12:21:15 ryz kernel:  ? spa_config_exit+0xc2/0x180 [zfs]
Feb 10 12:21:15 ryz kernel:  zio_execute+0xde/0x290 [zfs]
Feb 10 12:21:15 ryz kernel:  taskq_thread+0x2cd/0x5c0 [spl]
Feb 10 12:21:15 ryz kernel:  ? wake_up_q+0x60/0x60
Feb 10 12:21:15 ryz kernel:  ? zio_taskq_member.isra.0.constprop.0+0x60/0x60 [zfs]
Feb 10 12:21:15 ryz kernel:  kthread+0xfb/0x130
Feb 10 12:21:15 ryz kernel:  ? taskq_thread_should_stop+0x70/0x70 [spl]
Feb 10 12:21:15 ryz kernel:  ? kthread_park+0x90/0x90
Feb 10 12:21:15 ryz kernel:  ret_from_fork+0x22/0x40
Feb 10 12:21:15 ryz zed[12329]: eid=28194 class=authentication pool='rpool' priority=0 err=5 flags=0x80 bo>
Feb 10 12:21:15 ryz zed[12333]: eid=28195 class=data pool='rpool' priority=0 err=5 flags=0x80 bookmark=926>
Feb 10 12:23:34 ryz kernel: INFO: task z_rd_int:471 blocked for more than 122 seconds.
Feb 10 12:23:34 ryz kernel:       Tainted: P           O      5.4.92 #1-NixOS
Feb 10 12:23:34 ryz kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 10 12:23:34 ryz kernel: z_rd_int        D    0   471      2 0x80004000
Feb 10 12:23:34 ryz kernel: Call Trace:
Feb 10 12:23:34 ryz kernel:  __schedule+0x266/0x6c0
Feb 10 12:23:34 ryz kernel:  schedule+0x39/0xa0
Feb 10 12:23:34 ryz kernel:  spl_panic+0xfa/0xfc [spl]
Feb 10 12:23:34 ryz kernel:  ? _cond_resched+0x15/0x30
Feb 10 12:23:34 ryz kernel:  ? mutex_lock+0xe/0x30
Feb 10 12:23:34 ryz kernel:  ? zfs_refcount_remove_many+0x1cd/0x250 [zfs]
Feb 10 12:23:34 ryz kernel:  ? dbuf_rele_and_unlock+0x30d/0x7c0 [zfs]
Feb 10 12:23:34 ryz kernel:  ? _cond_resched+0x15/0x30
Feb 10 12:23:34 ryz kernel:  ? mutex_lock+0xe/0x30
Feb 10 12:23:34 ryz kernel:  ? _cond_resched+0x15/0x30
Feb 10 12:23:34 ryz kernel:  ? mutex_lock+0xe/0x30
Feb 10 12:23:34 ryz kernel:  ? _cond_resched+0x15/0x30
Feb 10 12:23:34 ryz kernel:  arc_hdr_l2hdr_destroy+0x13f/0x1a0 [zfs]
Feb 10 12:23:34 ryz kernel:  arc_hdr_destroy+0x44e/0x460 [zfs]
Feb 10 12:23:34 ryz kernel:  arc_read_done+0x644/0xb80 [zfs]
Feb 10 12:23:34 ryz kernel:  zio_done+0x36e/0x1eb0 [zfs]
Feb 10 12:23:34 ryz kernel:  ? spa_config_exit+0xc2/0x180 [zfs]
Feb 10 12:23:34 ryz kernel:  zio_execute+0xde/0x290 [zfs]
Feb 10 12:23:34 ryz kernel:  taskq_thread+0x2cd/0x5c0 [spl]
Feb 10 12:23:34 ryz kernel:  ? wake_up_q+0x60/0x60
Feb 10 12:23:34 ryz kernel:  ? zio_taskq_member.isra.0.constprop.0+0x60/0x60 [zfs]
Feb 10 12:23:34 ryz kernel:  kthread+0xfb/0x130
Feb 10 12:23:34 ryz kernel:  ? taskq_thread_should_stop+0x70/0x70 [spl]
Feb 10 12:23:34 ryz kernel:  ? kthread_park+0x90/0x90
Feb 10 12:23:34 ryz kernel:  ret_from_fork+0x22/0x40

There are a bunch more hung tasks following that in the logs. The next morning I had to power cycle because my machine was unresponsive.

@behlendorf (Contributor)

@rbrewer123 thanks! Unfortunately, that stack is most likely unrelated since it's something we have observed rarely with prior releases.

stale bot commented Feb 17, 2022

This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.

The stale bot added the Status: Stale label on Feb 17, 2022 and closed this issue as completed on Jun 19, 2022.