ZFS incremental recv fails with 'dataset is busy' error #1761
Comments
Oh, forgot to mention. The 'dataset is busy' error seems to occur (usually, not sure if it's 100% of the time) when the server is in the middle of a zfs send on another zpool / zvol. In this case, it was running an incremental send of arch/archive to another server when the recv failed on pool/detroitrepl. I've tried to set them up to run at different times, but occasionally they run over.
More verbose error message on the receiving side. This is after the 'dataset is busy' error noted above. Manually created an incremental snapshot file (chicago_snap1.snap), copied it over and attempted to apply it locally to the receive side.
@olw2005 It would be interesting to see your experiment above without the "-F" option on the receive side. If the zvol on the receive side has been touched in any way since the last receive, the -F causes an automatic rollback to be attempted and that may be what's failing. It would also be interesting to see "zpool history -i" on the receiving side because that will show the underlying operations of an automatic rollback if it was required.
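A minimal sketch of what that looks like on the receiving side; the pool and snapshot file names are the ones already mentioned in this thread, but the exact commands are a suggestion rather than a transcript of the actual session:

# Apply the incremental stream without -F, so no automatic rollback is
# attempted; any divergence then surfaces as an explicit error instead of a
# failed rollback.
zfs recv pool/detroitrepl < chicago_snap1.snap
# Inspect the pool's internal history; a rollback triggered by an earlier
# "zfs recv -F" would show up here as internal operations.
zpool history -i pool | tail -n 200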
@olw2005 Do you run automatic snapshots on the pool receiving the snapshot?
@dweeezil I have since rebooted the node, so I will try it without -F next time it errors out. May take a few days. It's there "just in case", but the zvol on the receiving end doesn't get touched at all. Was not aware of the -i option for history. 38k lines of output, but here is a small subset from around the time of the error (---> indicates when the buffer error pops in the log, the zfs send side fails, and the 'dataset is busy' error pops for the pool/detroitrepl zvol):

ARCH zpool:
POOL zpool:

Shortly after I restarted the machine on the morning of 1 Oct, the cron job on the other server ran a "zfs send -I" with multiple snapshots and the server in question had the same error (buffer I/O in dmesg / /var/log/messages and a 'dataset is busy' error from zfs) part way through the transmission. Interestingly, it did receive and process about 3 of the incremental snapshots before it locked up. Here's the history -i from that timeframe:

ARCH:
POOL:
@prakashsurya Yes, we are running automatic snapshots on both ends. We're doing cross-zfs-send/recv's between servers in Chicago and Detroit (ctc-san2 and dtc-san2). ctc-san2 contains pool/chicago and pool/detroitrepl. dtc-san2 contains pool/detroit and pool/chicagorepl. (It also contains another zvol on a separate pool, arch/archive.) The non-repl shares are running automated snapshots in conjunction with zfs send / receive via cron. The "repl" shares are the targets of the send/recv. A little complicated, I know... Make sense?
My guess is you're running into the same issue I was: the snapshot creation is causing the
@prakashsurya Thanks for the response! I actually read that bug report earlier, but assumed it applied to inbound (zfs recv'd) snapshots to the same dataset that was being "zfs sent". To be clear, are you saying that applies to any dataset in the same pool? E.g. zfs send of pool/zvolA in conjunction with zfs recv of pool/zvolB.
@olw2005 From what I understand, #1590 can be triggered by creating a snapshot on pool/zvolB, one that is not in the receive stream, before or during the receive of pool/zvolA. So, if you have automatic snapshots occurring on pool/zvolB, then I would not be surprised if you hit that issue. I don't know for certain (I would need to read the code), but my guess is the snapshot is manipulating the on-disk structures such that the incremental receive stream cannot apply cleanly onto pool/zvolB (because pool/zvolA does not contain the same snapshot that was created independently on pool/zvolB).
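A small sketch that reproduces the scenario described above on a throwaway pool (the pool and dataset names are hypothetical); an incremental receive into a dataset that has grown a snapshot of its own is normally refused unless -F rolls that snapshot back:

# Source zvol and its first snapshot, replicated once to create the target.
zfs create -V 1G testpool/zvolA
zfs snapshot testpool/zvolA@snap1
zfs send testpool/zvolA@snap1 | zfs recv testpool/zvolB
# An "independent" snapshot appears on the target, e.g. from an automatic
# snapshot job.
zfs snapshot testpool/zvolB@auto-1
# The next incremental now lands on a target whose newest snapshot is not in
# the stream; without -F this receive should fail.
zfs snapshot testpool/zvolA@snap2
zfs send -i @snap1 testpool/zvolA@snap2 | zfs recv testpool/zvolB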
Another occurrence, this time on the Chicago server (ctc-san2). Log files and other assorted output below. I tried doing a manual "zfs recv" without the -F; no joy. Three items of note:
I have to reboot the host ctc-san2 now so it can catch up with the Detroit side. But appreciate any other suggestions to try.

Assorted Log File spewage:

Tue Oct 15 20:46:01 EDT 2013 -> pool/detroit@hour-20131015204601 Snapshot creation.
Tue Oct 15 21:46:01 EDT 2013 -> pool/detroit@hour-20131015214601 Snapshot creation.
Wed Oct 16 16:42:02 EDT 2013 -> pool/detroit@4hour-20131016164202 Snapshot creation.
Wed Oct 16 16:46:01 EDT 2013 -> pool/detroit@hour-20131016164601 Snapshot creation.

CHICAGO SEND-SIDE TIMESTAMPED LOGS FROM REPLICATION SCRIPT
Tue Oct 15 20:01:01 CDT 2013 -> pool/chicago@hour-20131015200101 Snapshot creation.

CTC-SAN2 /var/log/messages -- Snapshot @4hour-20131015204201 received, detroit@hour-20131015204601 is not.
Oct 15 19:59:19 ctc-san2 kernel: zd16:

CTC-SAN2 zpool history -i
2013-10-15.19:46:22 [internal destroy txg:4657566] dataset = 58075
2013-10-15.20:46:36 [internal replay_inc_sync txg:4658302] dataset = 60901
2013-10-15.21:46:37 [internal replay_inc_sync txg:4659034] dataset = 60996
2013-10-16.15:46:20 [internal destroy txg:4672181] dataset = 60398

TRYING ZFS RECV MANUALLY (w/o -F)...
[root@dtc-san2 ~]# zfs send -i pool/detroit@4hour-20131015204201 pool/detroit@hour-20131015204601 > detroit_inc_snap1.zfssnap
[root@ctc-san2 ~]# gzip -d detroit_inc_snap1.zfssnap.gz
Looks like this is similar to issue #2104, which mentions a fix in the latest master. Just compiled and installed the latest master of spl & zfs from git this morning on one of our secondary nodes. (Note: I ran into the uname -r build problem referenced in another recent issue when trying to compile zfs. [Still running CentOS 6.5.] Fixed by commenting out the four require lines in zfs/scripts/kmodtool. The spl RPMs compiled w/o incident.) Afterward the system survived three consecutive zfs incremental sends (followed by zfs destroy operations on old snapshots) running concurrently with a zfs recv (again followed by zfs destroy of old snaps). Previously this would almost certainly have resulted in a 'dataset busy' lockup and/or a kernel oops. I'd like to wait a few days before declaring it resolved, but it certainly looks encouraging. Well done, gents!
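For anyone wanting to repeat that test, a rough sketch of the build-from-git steps being described; the repository URLs and the make rpm target are the usual zfsonlinux ones of that era, not copied from the actual session, and the kmodtool edit is the workaround mentioned above:

# Build and install SPL first, then ZFS, from the current master branches.
git clone https://github.com/zfsonlinux/spl.git
cd spl && ./autogen.sh && ./configure && make rpm
yum localinstall *.x86_64.rpm
cd ..
git clone https://github.com/zfsonlinux/zfs.git
cd zfs
# Workaround for the uname -r build problem on CentOS 6.x mentioned above:
# comment out the offending require lines in scripts/kmodtool before building.
./autogen.sh && ./configure && make rpm
yum localinstall *.x86_64.rpm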
@olw2005 Thanks for following up with the positive feedback. OK, well then if I don't hear anything back in the next few weeks we'll treat this issue as resolved.
I got a "cannot receive incremental stream: dataset is busy" error on my VMwarepool. The pool consists of about 25 zvols, each with 5 snapshots. The zvols are sparse and compression is on. No dedup. I created a full stream with zfs send -R pool@snap > file.zfs, about 1.5 TB. When receiving the file.zfs with zfs recv -vF newpool, I persistently get these "dataset is busy" errors with a full stop, but sometimes it restores 3 zvols, sometimes 5. The ZFS version is today's HEAD, kernel 3.11, openSUSE 13.1. There is no output in messages or any kernel stack. What can I do to obtain more detailed error information? In zpool history -i there are no errors mentioned. Any ideas?
@pyavdr I have an educated guess as to what might be causing the EBUSY errors on ZVOLs. When ZVOLs are created, entries for them are created in sysfs, causing udev to run against the new block devices. If you set the zvol_inhibit_dev module option, the device nodes aren't created at all, which keeps udev out of the picture while the streams are being received.
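If it helps, a minimal sketch of setting that option persistently; the parameter name is the one discussed in the follow-ups below, and the file path follows the modprobe.d convention already shown earlier in this issue:

# Persist the option for the next module load / reboot. This suppresses the
# /dev/zd* block devices entirely, so it only makes sense when the received
# zvols don't need to be used directly on the receiving host.
echo "options zfs zvol_inhibit_dev=1" >> /etc/modprobe.d/zfs.conf
# Check the live value once the zfs module is (re)loaded.
cat /sys/module/zfs/parameters/zvol_inhibit_dev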
@pyavdr Did you set only zvol_inhibit_dev=1 or "options zfs zvol_inhibit_dev=1"?
After setting "options zvol_inhibit_dev=1" (thanks to sopmot and Brian) I see the zfs recv process running, but after about 160 GB it slows down to under 1 MB/s (the max was 150 MB/s I/O read from disk) and stays there for a long time. The zfs process hangs at 12% CPU, with about 5 KB/s of I/O from time to time. The command is lzop < file.zfs | zfs recv -vF newpool (full stream), so pretty simple. file.zfs lives on a single hard disk (1.4 TB, compressed, ext4) and the pool is a raidz1 with 5 disks. The data is sparse zvols of VM OS images, compressed, no dedup; ZFS version is today's HEAD. After 80 minutes there is activity again, more than 100 MB/s. After a short period of time it stalls again. It looks like it runs without any 'dataset is busy' error, but performance is down to 3.5 MB/s for some zvols while others run at 130 MB/s. There are tens of minutes with no I/O activity. Despite those long periods of inactivity it runs through and finishes correctly. @dweeezil Today I got no chance to catch some stack traces, but I will try that again.
@pyavdr When the process is blocked, you should try to get some stack traces. Start with cat /proc/<pid>/stack on the blocked zfs process. Also, while the process is blocked, you might gather some of the kstats in /proc/spl/kstat/zfs.
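A sketch of one way to capture that while the receive is stalled; the PID lookup, sample count, and interval here are placeholders, and the kstat path is the usual SPL location rather than anything quoted from this thread:

# Sample the kernel stack of the blocked zfs recv process every few seconds.
RECV_PID=$(pgrep -n -x zfs)   # assumes the newest 'zfs' process is the stalled recv
for i in 1 2 3 4 5 6 7 8 9 10; do
  date
  cat /proc/${RECV_PID}/stack
  sleep 5
done
# Save a snapshot of the SPL/ZFS kstats for before/after comparison.
grep -r . /proc/spl/kstat/zfs/ > kstats.$(date +%s).txt 2>/dev/null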
@dweeezil
linux-ts3r:~ # cat /proc/4936/stack
(sampled repeatedly while the receive was stalled; the stack output itself is not preserved here)
I can't speak to the above issues, but running the code version(s) outlined in #1795 I am no longer getting the "dataset is busy" error [or associated kernel panics].
Setting …
Yes. This should be fixed, I'll close it out.
I am having the same problem. For now I use zvol_inhibit_dev=1 as a workaround, but it is only acceptable because I don't need to actually use the zvols on the destination server.
Dear All,
First let me say, keep up the great work! Happy to see my tax dollars funding something useful.
Brief problem description:
We have two servers set up to send rotating incremental snapshots of the local zvol to each other (zfs send -I). In the main this works well, but about once a week the zfs recv fails with a 'dataset is busy' error.
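For context, a minimal sketch of the kind of cron-driven cycle just described; the dataset names and receiving host match the layout described later, but the snapshot naming, ssh transport, and script itself are illustrative rather than our actual script:

#!/bin/sh
# Hourly replication sketch: snapshot the local zvol and send everything since
# the most recent existing snapshot to the receiving host (pruning of old
# snapshots omitted for brevity).
SRC=pool/detroit            # local zvol being replicated
DST=pool/detroitrepl        # target dataset on the receiving host
REMOTE=ctc-san2             # receiving host
PREV=$(zfs list -H -t snapshot -o name -s creation -r ${SRC} | tail -1)
NEW=${SRC}@hour-$(date +%Y%m%d%H%M%S)
zfs snapshot "${NEW}"
zfs send -I "${PREV}" "${NEW}" | ssh "${REMOTE}" zfs recv -F "${DST}"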
Dmesg output is:
Buffer I/O error on device zd16, logical block 0
Buffer I/O error on device zd16, logical block 0
Buffer I/O error on device zd16, logical block 0
Expected dmesg output:
zd16:
GPT:Primary header thinks Alt. header is not at the end of the disk.
GPT:34358689719 != 34359738367
GPT:Alternate GPT header not at the end of the disk.
GPT:34358689719 != 34359738367
GPT: Use GNU Parted to correct GPT errors.
p1
(The zvol is used as the disk for a DRBD resource. DRBD uses the last bit of the disk to store its own metadata, presenting a slightly smaller disk to the layers above it. Thus the spurious GPT warning is normal / expected.)
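As a sanity check (assuming DRBD 8.x internal metadata and the sizing formula from its user guide, Ms = ceil(Cs / 2^18) * 8 + 72 sectors, neither of which is stated in this report), the gap GPT complains about matches the space DRBD reserves at the end of the device:

# 16 TiB zvol = 34359738368 sectors; DRBD metadata per the formula above:
echo $(( (34359738368 / 262144) * 8 + 72 ))   # 1048648 sectors
# Gap between where GPT expects its alternate header and where it found it:
echo $(( 34359738367 - 34358689719 ))         # 1048648 sectors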
After this error, all subsequent zfs recv's on that particular zvol [pool/chicagorepl] fail with the same error. The only way to clear the error is to reboot the server. Meanwhile, the two other zvols [arch/archive and pool/detroit] on the server continue to work fine, creating / sending / destroying snapshots without issue.
Server Configuration / Setup Info:
HP DL380p Gen8, 192 GB RAM, four external disk chassis [48 disks], two SSDs set up as L2ARC cache devices
CentOS 6.4 with recent updates
zfs & spl 0.6.2
[root@dtc-san2 ~]# uname -a
Linux dtc-san2.stc.ricplc.com 2.6.32-358.18.1.el6.x86_64 #1 SMP Wed Aug 28 17:19:38 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
[root@dtc-san2 ~]# cat /etc/modprobe.d/zfs.conf
options zfs zfs_nocacheflush=1 zfs_arc_max=154618822656 zfs_arc_min=1073741824
[root@dtc-san2 ~]# zpool status
pool: arch
state: ONLINE
scan: none requested
config:
errors: No known data errors
pool: pool
state: ONLINE
scan: none requested
config:
errors: No known data errors
[root@dtc-san2 ~]# zfs list
NAME USED AVAIL REFER MOUNTPOINT
arch 19.8T 4.20T 65.8K none
arch/archive 19.8T 4.20T 19.1T -
pool 20.8T 40.6T 64.5K none
pool/chicagorepl 5.98T 40.6T 4.77T -
pool/detroit 14.6T 40.6T 13.8T -
pool/vdistore 177G 40.6T 177G -
Abbreviated zpool history:
History for 'arch':
...
2013-09-30.14:16:07 zfs snapshot arch/archive@hour-20130930141601
2013-09-30.14:16:33 zfs send -I arch/archive@hour-20130930131601 arch/archive@hour-20130930141601
2013-09-30.14:28:10 zfs destroy arch/archive@hour-20130930101601
2013-09-30.15:16:07 zfs snapshot arch/archive@hour-20130930151601
2013-09-30.15:16:31 zfs send -I arch/archive@hour-20130930141601 arch/archive@hour-20130930151601
History for 'pool':
...
2013-09-30.12:05:11 zfs recv -F pool/chicagorepl
2013-09-30.12:05:33 zfs destroy pool/chicagorepl@hour-20130930070101
2013-09-30.12:14:13 zfs destroy pool/detroit@hour-20130930064601
2013-09-30.12:14:20 zfs destroy pool/detroit@hour-20130930074601
2013-09-30.12:42:06 zfs snapshot pool/detroit@4hour-20130930124201
2013-09-30.12:42:32 zfs destroy pool/detroit@4hour-20130929124201
2013-09-30.12:46:06 zfs snapshot pool/detroit@hour-20130930124601
2013-09-30.12:46:49 zfs send -I pool/detroit@hour-20130930114601 pool/detroit@hour-20130930124601
2013-09-30.12:54:15 zfs destroy pool/detroit@hour-20130930084601
2013-09-30.13:20:01 zfs recv -F pool/chicagorepl
2013-09-30.13:20:20 zfs destroy pool/chicagorepl@hour-20130930080101
2013-09-30.13:46:07 zfs snapshot pool/detroit@hour-20130930134601
2013-09-30.13:46:53 zfs send -I pool/detroit@hour-20130930124601 pool/detroit@hour-20130930134601
2013-09-30.14:00:29 zfs destroy pool/detroit@hour-20130930094601
2013-09-30.14:01:23 zfs destroy pool/chicagorepl@4hour-20130929125601
----> 2013-09-30.14:02:40 zfs recv -F pool/chicagorepl
2013-09-30.14:46:06 zfs snapshot pool/detroit@hour-20130930144601
2013-09-30.14:46:47 zfs send -I pool/detroit@hour-20130930134601 pool/detroit@hour-20130930144601
The line marked with ----> is where it went south. Note that the subsequent snapshot and send operations on "pool/detroit" and "arch/archive" had no errors...
If I can provide more information, please let me know. I'm an engineer / IT guy, not a developer, but I can read and follow instructions well. =) I have purposefully left the system as-is. It is non-primary storage, so it can be rebooted / recompiled / whatever as needed. Looking forward to hearing from you!
Regards,
Owen L. Wieck
Ricardo, Inc.