compression=gzip on FreeBSD leaks memory #10225
Please try with the latest version of the ports (202004150x). We were not initializing the ARC free target in earlier versions.
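For reference, a quick way to check the value on a running FreeBSD box (a minimal sketch; the sysctl name vfs.zfs.arc_free_target is my assumption based on the comment above):

```sh
# Show the ARC free target (a page count) that newer port builds initialize;
# if it reads 0, the loaded module may be one of the affected earlier builds.
sysctl vfs.zfs.arc_free_target
```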
Your "how to reproduce" steps imply that the problem is fragmentation driving up memory use. The most plausible route for that is loaded metaslabs, which consume memory from the zfs_btree_leaf_cache.

Am I reading correctly that you have 45 separate storage pools? That should work, but it is not a use case that has received much scrutiny. Maybe you're hitting some unknown suboptimal behavior due to having so many storage pools?
Sorry, I hadn't noticed this came out. Trying it, thanks! I will report back with the findings.
Hm, sorry, after that long thread I made a mental note to watch for the related work, but I haven't checked the outputs since. Well, I'm not even sure the original conclusion still stands.
Yes, that's what I have. I have redundancy between the hosts, so I don't need local redundancy, but I would like to use ZFS features.
No, I suppose I didn't pay enough attention after "old version of the port eats all memory" to notice that the ARC is relatively small. Good to update anyway though! :)
That's fine, but I think you'd be much better off with a single zpool. You can still have no redundancy, as you do now, e.g.:
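A minimal sketch of what that could look like (the pool name and device names are placeholders, not taken from this machine):

```sh
# One pool striped across the data disks -- still no redundancy,
# but a single allocation space instead of one pool per disk.
zpool create -o ashift=12 tank da0 da1 da2 da3
```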
I have 44-60 disks in a machine. Rebuilding 43-59 times the amount of data needed because one disk died seems somewhat excessive. :)
Upgrading to openzfs version 2020041502 (commit a7929f3) caused no change. vmstat output now:
Any ideas on what would be useful to debug this?
I see an important change here: the system no longer goes to swap.
Yes, because with FreeBSD's in-tree ZFS I got reboots, and I also expected that with the openzfs port, so I've configured a swap and a dump device.
OK. After another look on
I would think it's clearly ZFS-related, and @ahrens' explanation in the linked thread seems to be the problem. If I rewrite the zpools and fragmentation decreases, this kind of problem disappears. Here's another memory graph from a machine which is exactly the same as this one, but got its zpools rewritten: it has 69G wired, but 42G of that is ARC, which is fine. On the problematic machine, the wired memory grows even while importing the zpools, and it's even worse after a crash (so it fits the explanation around ZIL playback as well). Anyway, here's the output:
I'm sorry, I wanted to say
Oh, I didn't understand netstat, should've corrected...
So indeed it looks like ZFS:
No, it's not ARC. It really looks like what's been described, with the metaslabs.
@bra-fsn How did you conclude that it's caused by metaslabs? The most common cause for metaslab memory usage is zfs_btree_leaf_cache, which stores the in-memory version of loaded metaslabs' spacemaps. If I'm reading correctly, that is using 5GB of RAM in your latest comment, which is considerable, but much less than the 158GB of "solaris" or 154GB in the "32768" cache.

I'm not super familiar with the FreeBSD diagnostics here, but it sounds like something is doing a lot of kmem_alloc(32K). It's definitely possible that this is related to ZFS and to metaslabs, but I don't have a guess as to what that would be, specifically. Maybe you could use dtrace to see what stacks are doing these allocations most often? (Note: it looks like allocations of size > 16384 and <= 32768 will use this cache.)
It wasn't me, it was you :) Of course that was before the AVL->btree change, but it made perfect sense. Everything you wrote there turned out to be correct: the memory usage is proportional to the level of fragmentation (well, at least I could successfully and drastically reduce it by rewriting the pools), at least with the in-tree ZFS version, which has the AVL stuff.
Could you please help with that?
The important difference there is that in that case, the space was directly attributable to the range_seg_cache, which was the precursor to the zfs_btree_leaf_cache. In your case, the zfs_btree_leaf_cache is only using 5GB of RAM, so the high memory usage isn't caused by loading the spacemaps into memory.

As for the dtrace script, you want something that triggers on kmem_alloc when the size is > 16384 and <= 32768, and you probably want to aggregate on the calling stacks.
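Something along these lines could be a starting point (a sketch only; the probe name, argument index, and predicate are my assumptions about FreeBSD's kernel malloc path, not the exact script that was suggested):

```sh
# Count kernel stacks for allocations that fall into the 32768 bucket
# (sizes > 16384 and <= 32768); Ctrl-C prints the aggregated stacks.
dtrace -n 'fbt::malloc:entry
    /arg0 > 16384 && arg0 <= 32768/
    { @[stack()] = count(); }'
```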
@pcd1193182 Understood, I'm just saying the effect is very similar. I've restarted the machine and let it run for a while.
The attached file has the dtrace output. I'm not sure how useful this will be, though.
I see a bunch of stacks like this:
I think the bug is:
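A reconstructed sketch of what the suspect shim likely looked like, based on the fix described further down (not the verbatim tree source; the header path and function details are assumptions):

```c
#include <sys/zlib.h>	/* assumed header for the in-kernel zlib interface */

/*
 * Sketch of the suspect FreeBSD SPL wrapper: the "End" function forwarded
 * to inflateInit() instead of inflateEnd(), so every gzip decompression
 * allocated a fresh inflate state and never freed the old one.
 */
static int
zlib_inflateEnd(z_stream *stream)
{
	return (inflateInit(stream));	/* leak: should be inflateEnd(stream) */
}
```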
I am guessing that should be calling something like inflateEnd() instead.
(incidentally, I noticed that the no-op functions zlib_workspace_alloc() and zlib_workspace_free() could be removed)
I've also looked at those, but this is so basic and used everywhere that I couldn't believe it was the cause. Why is this FreeBSD-related?
I've brought up the zlib workspace parts with @mattmacy before. It's code we may want to implement in the future, so he left the stubs in place for now. I'll have a look at the leaky bits. Thanks for helping troubleshoot this!
Fixes openzfs#10225

zlib_inflateEnd was accidentally a wrapper for inflateInit instead of inflateEnd, and hilarity ensues. Fix the typo so we free memory instead of allocating more.

Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Thanks a lot guys, building the new module and trying it out!
Memory usage is constant after the change; I think this is solved with #10252. Thanks, and sorry for sidetracking the topic with the metaslabs-related problem (also good to see it's solved!).
This bug still exists; it will be closed by PR #10252.
zlib_inflateEnd was accidentally a wrapper for inflateInit instead of inflateEnd, and hilarity ensues. Fix the typo so we free memory instead of allocating more.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10225
Closes #10252
System information
Describe the problem you're observing
Detailed description is here: https://openzfs.topicbox.com/groups/developer/T10533b84f9e1cfc5
I think the optimization work is now done/merged (#9181) and my openzfs version contains it.
I can confirm that it makes the machine more stable: it can now survive for around 1.5 days with 192G of RAM.
Memory usage is like this:
top shows ATM:
vmstat -z output:
@ahrens, @pcd1193182 do you have any further ideas on how to improve this situation?
I've already started rewriting the pools with ashift=12, but it takes ages to complete...
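For anyone following along, a rough sketch of one way to rewrite a pool with the new ashift (pool and device names are placeholders; this is not necessarily the exact procedure used here):

```sh
# Snapshot the old pool, create a new one with ashift=12, and replicate.
zfs snapshot -r oldpool@migrate
zpool create -o ashift=12 newpool da10 da11
zfs send -R oldpool@migrate | zfs receive -u -F newpool
```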
Describe how to reproduce the problem
See the above mailing list thread. Basically:
(some disks are rewritten on this machine)