ZFS stuck #1620
Comments
@dweeezil I did of course pull dweeezil/zfs@a0dc667. Regards,
@edillmann I've been meaning to try running the Illumos 2882/2883/2900 with some of those recent changes but haven't had a chance to yet. I'm leaving for vacation today so I'm going to be off the grid for a week. I've been trying to keep the patch rebased to current master as well as possible, and maybe by the time I get back, some of those other things will have been committed. Taking a quick look at your stacks, I'll take a shot in the dark and ask whether you've got all your devices using the noop scheduler? Presumably this hang is hit-or-miss to reproduce?
In fact it seems to be the deadline scheduler :-( !!!
@edillmann De-cloaking on my vacation for a moment: The one thing I think is a pain in the butt when using partitions (I always use my own GPT partitions) is to set the noop scheduler on the underlying disks. I have a feeling it would be very difficult or hack-ish to get ZFS to do that job automatically.
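For anyone checking this by hand: the active scheduler for each disk is the bracketed name in `/sys/block/<disk>/queue/scheduler` (for example `noop [deadline] cfq`). Below is a minimal sketch that reads those files and reports the active scheduler per device. The function names `active_scheduler` and `list_schedulers` are made up for illustration and are not part of ZFS or any library.

```python
import re
from pathlib import Path


def active_scheduler(sysfs_text):
    """Return the scheduler name shown in [brackets], or None if none is marked."""
    match = re.search(r"\[([^\]]+)\]", sysfs_text)
    return match.group(1) if match else None


def list_schedulers(sys_block=Path("/sys/block")):
    """Map each block device name to its active scheduler.

    Returns an empty dict when the sysfs directory is missing
    (for example when run on a non-Linux system).
    """
    result = {}
    if not sys_block.is_dir():
        return result
    for dev in sorted(sys_block.iterdir()):
        sched_file = dev / "queue" / "scheduler"
        if sched_file.is_file():
            result[dev.name] = active_scheduler(sched_file.read_text())
    return result
```

Calling `list_schedulers()` on the affected machine shows at a glance which disks are not on noop. Changing the scheduler is done by writing the name to the same file as root; this sketch only reads.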
@edillmann Does this bug also happen without pulling these changes? Since there are multiple commits involved, perhaps you can use git bisect to locate the specific commit causing this issue.
@casualfish Yes, in fact this bug first appeared on the latest master (without patches applied).
@edillmann The interaction between vdevs and the Linux kernel block layer is in the file vdev_disk.c. You don't need to set elevator=noop manually, since it's the default scheduler when opening a vdev (disk type): https://github.com/zfsonlinux/zfs/blob/master/module/zfs/vdev_disk.c#L312. According to your kernel traces, it looks like a deadlock has happened. Could you describe your workload and the exact steps to trigger this bug?
Out of curiosity, are you running a pre-emptible kernel?
@atonkyra No, for now I'm using the "No Forced Preemption (Server)" preemption model.
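If you are not sure which preemption model a kernel was built with, the kernel config tells you: `CONFIG_PREEMPT_NONE` is "No Forced Preemption (Server)", `CONFIG_PREEMPT_VOLUNTARY` is "Voluntary Kernel Preemption (Desktop)", and `CONFIG_PREEMPT` is "Preemptible Kernel (Low-Latency Desktop)". A rough sketch that picks the model out of a config file's text follows. The helper names are hypothetical, and the `/boot/config-<release>` location is an assumption that holds on many distributions but not all.

```python
import platform
from pathlib import Path

# Kconfig symbol -> menu label for the preemption model choice.
PREEMPTION_MODELS = {
    "CONFIG_PREEMPT_NONE": "No Forced Preemption (Server)",
    "CONFIG_PREEMPT_VOLUNTARY": "Voluntary Kernel Preemption (Desktop)",
    "CONFIG_PREEMPT": "Preemptible Kernel (Low-Latency Desktop)",
}


def preemption_model(config_text):
    """Return the label of the preemption model enabled in a kernel config, or None."""
    for line in config_text.splitlines():
        line = line.strip()
        # Skip comments such as "# CONFIG_PREEMPT is not set" and blank lines.
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if value == "y" and key in PREEMPTION_MODELS:
            return PREEMPTION_MODELS[key]
    return None


def running_kernel_model():
    """Look up the model for the running kernel (assumed config path, may not exist)."""
    config = Path("/boot") / f"config-{platform.release()}"
    if not config.is_file():
        return None
    return preemption_model(config.read_text())
```

The match is on the exact symbol name, so related options like `CONFIG_PREEMPT_COUNT` are ignored.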
@edillmann I think trying it out might be worth a shot.
@atonkyra I will give it a try.
@edillmann Okay, please report back with any results you get :)
Any updates, @edillmann?
Hi, for now (5 days), changing the preemption model to Voluntary Kernel Preemption did the trick, but for how long, who knows. Let's wait several more days.
19 days of uptime now, so I'm closing this one.
@edillmann Thanks for the update.
Hi,
I pulled the following pull requests on top of current master:
#1610
#1612
#1614
#1496
ZFS builds fine, but after some time (several hours) ZFS gets stuck (no I/O), and I get the following kernel traces:
http://pastebin.com/jv3qMVv9
ps|grep zfs shows only a zfs snapshot command.
The kernel is 3.9.2.