3525 persistent l2arc. #2672
For details, see: https://www.illumos.org/issues/3525
v2: Change two KM_SLEEP to KM_PUSHPAGE.
v3: Change one more KM_SLEEP.
v4: Fix log buffer alignment in l2arc_dev_log_commit.
v5: Fix style.
v6: l2arc vdev can go away, remove ASSERT in l2arc_spa_rebuild_start.
Close #925
Ported-by: Yuxuan Shui <yshuiv7@gmail.com>
Is the failed test a deadlock during module loading? The log is a little confusing. |
@yshui The buildbot results can sometimes be a little vague when there's a failure. In these cases the best place to look is usually the dmesg log. This is always run as the last build step, so you effectively get the console log from the system. In this case we seem to have hit an unexpected memory allocation failure which prevented the modules from loading. That's almost certainly unrelated to this patch. Do you know what the status is for this change on the Illumos side? |
@behlendorf The status on the issue tracker is 'Feedback', although that has been the same for nearly a year now... |
Works fine here. After a reboot I usually have to manually mount the zpool containing my /home, and after the zpool was imported the state of the L2ARC had been preserved. Thanks a lot! |
This seems to have some issues with #2484 as evidenced by http://pastebin.com/1xuCjyi9. |
FYI: after every few boots the l2arc gets corrupted ("faulted") and the cache needs to be removed and re-added. I haven't figured out the pattern yet. This didn't happen before, so some new changes from upstream master or some of the additionally added patches seem to break it. Patches used before: #2351 #2672 #2753 #2484 ( #2484 now replaced by #2129 ). Patches added recently: #2784 #2786. It could also be that the order in which the partitions are unmounted and/or closed during shutdown has changed (not very probable, but this has already led to minor checksum issues on a mirrored zpool in the past). |
I have this happen as well. |
It seems that merely booting Windows, without logging in or doing any administration, is enough to corrupt the partition (!) (from time to time I dual-boot into Windows 8.1), so Windows is to blame in my case :/ The l2arc here sits on a LUKS-encrypted partition on the SSD. I'll watch for other causes of this ... |
@kernelOfTruth Can you post the content of /proc/spl/kstat/zfs/arcstats when zfs failed to rebuild l2arc? |
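For anyone collecting the requested data, here is a minimal sketch of how the L2ARC-related counters could be pulled out of that file. It assumes the usual kstat layout (two header lines, then `name type data` rows) and that the rows of interest are the ones prefixed `l2_`; the helper name `parse_l2_stats` is made up for this illustration and is not part of ZFS.

```python
from pathlib import Path

# Path named in the comment above; only present when the zfs module is loaded.
ARCSTATS_PATH = Path("/proc/spl/kstat/zfs/arcstats")


def parse_l2_stats(text):
    """Return {name: value} for every arcstats row whose name starts with 'l2_'.

    Assumption: data rows have exactly three whitespace-separated columns
    (name, type, value). Header rows and anything else are skipped.
    """
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3 or not fields[0].startswith("l2_"):
            continue
        try:
            stats[fields[0]] = int(fields[2])
        except ValueError:
            # Not a numeric value; ignore rather than guess.
            continue
    return stats


if __name__ == "__main__":
    if ARCSTATS_PATH.exists():
        for name, value in sorted(parse_l2_stats(ARCSTATS_PATH.read_text()).items()):
            print(f"{name:32s} {value}")
    else:
        print(f"{ARCSTATS_PATH} not found (is the zfs module loaded?)")
```

Running it right after a failed rebuild and pasting the output would give the same information as posting the whole file, just limited to the L2ARC rows.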
@kernelOfTruth --- zfs.make.vmalloc.log 2014-10-17 13:36:41.487493228 +0200 |
@yshui Sure, will do, but currently it seems to work fine. I'm glad it does, since this SSD won't survive that huge an amount of write/read cycles (it's already at 1/3 of what's allowed, so I'm concerned about it having to be wiped whenever the l2arc fails/faults). It seems I can trigger it with the following (will have to try after the next reboot):
will observe more ... thanks |
@yshui Yes, I decided to skip #2672 for now, being more interested in getting ZoL to work with 32-bit. I guess depending on which patchset gets mainlined first, the other one will have to be adjusted. |
@algragon What's your specific use case for 32-bit anyway? Just curious. |
@maci0 I got a little fileserver at home with an Atom D525, which only has 32-bit, and I would like to convert the mdadm-raid1/ext4 to zfs. When I tried the mainline 0.6.3 module for ubuntu 14.04, I got lots of vmalloc-related stalls/hangs, so I'm testing a patched version of ZoL. |
I would like to try this out. So I cloned and made a local branch from 0.6.3, then fetched and merged yshui:illumos-3525 into this branch, and built with the following commands:
./configure --prefix=/usr --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --mandir=/usr/share/man --infodir=/usr/share/info --datadir=/usr/share --sysconfdir=/etc --localstatedir=/var/lib --libdir=/usr/lib64 --disable-silent-rules --disable-dependency-tracking --docdir=/usr/share/doc/zfs-0.6.3 --enable-shared --disable-static --bindir=/bin --sbindir=/sbin --with-config=user --with-linux= --with-linux-obj= --with-udevdir=/lib/udev --with-blkid --disable-debug
L2ARC does not seem to be saved between reboots, not even when it exceeds 350 MB. Do I have to re-create my pool? Removing the cache device and re-adding it did not solve this issue. |
@josla972 just go with (zfs) master and/or 0.6.3 and apply the following patch to it: both with
and
l2arc's state is preserved |
The result of what I did is the same code-wise. zpool export will not work very well since it's my root system we are talking about. I am unsure whether systemd does an export on reboot, but I think so, since the zfs services are enabled. How do you verify that preservation works? Looking at zpool iostat -v, I notice that the allocated amount of MiB is reset to 43.9M (probably the files needed for boot are cached) after a reboot: Then, after playing around a bit and observing that the allocated cache has increased to, say, 350M, I do a reboot and notice that it is still reset to 43.9M. |
before a reboot - it e.g. is at:
after a reboot (zfs mount -a)
and/or grows. When the l2arc fails and I have to remove & re-add it, it's usually at
or something similar @josla972 what you're currently seeing - that's the behavior I had before applying this patch |
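The check described in the last two comments (note the cache device's allocated size before a reboot, then compare it with the value after the pool is imported again) can be sketched as a small helper. This is only an illustration under stated assumptions: sizes are taken to be strings like `43.9M` or `350M` with a single binary-unit suffix as printed in the comments above, and the 50% threshold as well as the names `parse_size` and `cache_survived` are invented for this example, not taken from ZFS.

```python
# Binary multipliers for the single-letter suffixes assumed in the size strings.
UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Convert a size string such as '43.9M' into a number of bytes (float)."""
    text = text.strip()
    suffix = text[-1].upper() if text and text[-1].isalpha() else ""
    number = text[:-1] if suffix else text
    return float(number) * UNITS[suffix]


def cache_survived(before, after, tolerance=0.5):
    """Return True if the post-reboot allocation is at least `tolerance`
    times the pre-reboot allocation.

    The default tolerance is an arbitrary assumption: a rebuilt cache is
    expected to come back near its old size, while a cold cache restarts
    from a small value.
    """
    return parse_size(after) >= tolerance * parse_size(before)


if __name__ == "__main__":
    # Values quoted in the discussion: 350M before the reboot, 43.9M after.
    print(cache_survived("350M", "43.9M"))
```

With the figures reported above (350M before, 43.9M after) the helper reports that the cache did not survive, which matches the behavior seen without the patch applied.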
Since you seem to be a Gentoo user just like me, I could try to reproduce the way you installed the patch. I fear that the paths are not correct when I build with make inside a development directory instead of letting emerge handle it all. Did you let emerge handle it (overlay/edit ebuild etc.) or did you just manually apply the patch and build and install zfs from a separate dir? |
I could have let portage handle it but the other patches I use have rejects - so generally doing it manually directly via portage-tree and /var/tmp/portage:
(in another terminal window)
not sure if it's really necessary - but in case scripts change - also do the same for sys-fs/zfs |
Thank you. I got it working now with the zfs-kmod-9999! I failed to apply the patch to 0.6.3 though. I hope this is stable enough. |
illumos 3525 is now in code review. |
Closing. We'll pick up this change once it's merged in to illumos. |
@kpande: do you have a branch up with the rebase somewhere? Seems a bit less than trivial at first glance :) |
I put it into which compiles and runs - however, there are two calls that needed to be changed into abd* calls that I was not sure of. If someone with more abd knowledge than I could check it over... |
I wanted to experiment with this, so I rebased it on current master. It compiles and runs, no errors. Also no errors/warnings in dmesg. |
@behlendorf What's the upstream status of l2arc right now? Is there any news you can share? |
Big thank you for doing this! Having a cold L2ARC after kernel upgrades/reboots has been the biggest pain point for my setup for a while, so your work is greatly appreciated. Is there a bug/feature bounty tracker for ZOL? I'd like to post a bounty to prioritize getting this feature integrated, if it will help. |
Although my rebase runs with no apparent problems, it doesn't make the L2ARC persistent. |
I think I am making slow progress on this. Pushed new commits in: Further testing would be greatly appreciated. |
@yshui You are in luck: Illumos has been cordially trashed as upstream, so you should now be able to finish this :) |
You probably need to free this. |
Each abd_get_from_buf() call allocates memory that is not freed by zio_wait() later on. There are several in this patch: 2 in blk_commit(), 1 in dev_hdr_read() and dev_hdr_update(), 1 in blk_read() and 1 in blk_prefetch(). In each of these functions the abd_t has to be freed properly. This is not trivial to do, as it requires a proper zio structure with a callback function. And then of course, encryption would have to be implemented... |
I just pushed a new commit addressing those issues, in the quickest way. No error messages anymore. |
I implemented encryption, seems to be working fine. Just pushed the new commit. I would appreciate testing! |
@gamanakis If you are the one actually developing it, you can make your own PR here. Great work btw, how much do you think you have left? |
Will happen soon, I am working on cancelling the rebuild in l2arc_remove_vdev. |
@gamanakis Awesome, love your dev attitude! :) |
New pull request submitted: |
Superseded by #9582 |
This commit makes the L2ARC persistent across reboots. We implement a light-weight persistent L2ARC metadata structure that allows L2ARC contents to be recovered after a reboot. This significantly eases the impact a reboot has on read performance on systems with large caches.
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Saso Kiselkov <skiselkov@gmail.com>
Co-authored-by: Jorgen Lundman <lundman@lundman.net>
Co-authored-by: George Amanakis <gamanakis@gmail.com>
Ported-by: Yuxuan Shui <yshuiv7@gmail.com>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #925
Closes #1823
Closes #2672
Closes #3744
Closes #9582
@Ornias1993 Apologies to anyone upset, was hoping it wouldn't be a nuisance. I had posted in a few tickets where "Bounty" had been discussed in a positive light ( @tyco ), thinking there would be interest in participating in the crowd funding issue I linked. |