High Memory Usage when Using ZNC Clientbuffer Module #309

Closed
andymandias opened this issue Apr 1, 2024 · 12 comments

@andymandias
Collaborator

When using Halloy connected to ZNC with the Clientbuffer module, memory usage grows continually during use. When otherwise running Halloy I see memory usage start around 100MB and grow up to around 150MB, but when connected to ZNC with the Clientbuffer module memory grows unbounded. In around an hour memory is often up to 250MB and the UI slows down, and memory use will continue to grow until the application stalls and must be force quit. The largest memory use I've seen so far is 3.5GB, but usually I don't let it get that high.

It doesn't matter whether I use the new nickname format or not (i.e. using an @ in my username to identify the client is not necessary to produce the behavior). It seems to progress faster when two clients are connected to the ZNC bouncer, but it still happens when only one client is connected. It happens on Linux (both x86_64 and aarch64, though it progresses faster on the latter). I tried reproducing it under DHAT (valgrind), but annoyingly it does not seem to reproduce no matter how long I wait.

One thing I have noticed is that the server buffer gets a lot of RPL_WHOREPLY messages in it. I think this is somewhat expected when two (or more) clients are connected, since last_who will not (to my knowledge) reflect the WHO requests from another client (i.e. all the WHO polls sent from another client will show up in the server buffer). For some reason this also happens when only one client is connected. Since the server buffer history is capped at 10000 messages (as far as I know) I don't expect it to be causing the memory leak, but it's the only different behavior I've been able to spot so far.

@tarkah
Member

tarkah commented Apr 1, 2024

Can you try running it with heaptrack? A debug build might make the stack traces more meaningful, or you can compile in release mode with debug info enabled.

You're right that we cap how many messages we store in memory and how many messages we render to the scrollable. I'll try to think if there's anywhere we allow memory to grow unbounded.
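For reference, one way to capture such a trace might look roughly like this. This is only a sketch: it assumes heaptrack is installed, the halloy repository is checked out locally, and the exact output file name will differ per run.

# build in release mode with debug info so heaptrack's stack traces resolve to source lines
# (equivalent to setting debug = true under [profile.release] in Cargo.toml)
CARGO_PROFILE_RELEASE_DEBUG=true cargo build --release
# record allocations until Halloy exits; heaptrack writes a heaptrack.halloy.<pid>.* data file
heaptrack ./target/release/halloy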

@andymandias
Collaborator Author

andymandias commented Apr 1, 2024

Thank you for the suggestion. heaptrack crashes on aarch64 when trying to run it against anything, so I set it aside when I thought the leak was only happening on aarch64 and promptly forgot about it completely. I've started a test run on x86_64 and will report back with what I find (after memory usage grows sufficiently).

The issue with running under valgrind is unfortunately not that the stack traces don't make sense; the traces I got do make sense, and they're why I tried looking at the history (which was the largest non-graphical allocation reported). The issue is that memory usage does not seem to grow. valgrind changes the memory usage profile, so maybe I'm just not fully understanding how valgrind operates, but it starts out at about 400MB of memory used and did not significantly change after ~8 hours of use. Everything runs slower through valgrind, so it's possible I just needed to wait even longer, but normally memory would have grown past 400MB in that time span.

Edit: heaptrack looks like it's successfully capturing the memory growth, so it will probably be usable instead of valgrind. I primarily mention the valgrind particulars on the off chance they provide a hint as to what is going on.
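For context, the DHAT runs mentioned above would have been something along these lines (a sketch only; the binary path is illustrative):

# profile heap usage with valgrind's DHAT tool; output is written to dhat.out.<pid>
valgrind --tool=dhat ./target/debug/halloy
# the resulting file can then be inspected with the dh_view.html viewer shipped with valgrind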

@andymandias
Collaborator Author

Ran a cargo build executable built from main under heaptrack, waited for memory usage to grow to ~200MB, then quit Halloy normally. heaptrack is reporting memory leaks, but the biggest one is alloc::alloc::alloc, clocking in at 18.2MB. Following it up the backtrace just leads to iced's load_font, so I'm guessing it's not a real leak, just the font not being explicitly unloaded.

There are apparently ~88MB of Vulkan allocations at peak memory usage (where heaptrack reports the total memory at peak as 149MB). I guess that might be a likely candidate, but I don't know if there's any useful information in the related backtrace:

<unresolved function> in ?? (libvulkan_lvp.so)
<unresolved function> in ?? (libvulkan_lvp.so)
<unresolved function> in ?? (libvulkan.so.1)
<unresolved function> in ?? (libvulkan.so.1)
<unresolved function> in ?? (libvulkan.so.1)
vkCreateDevice in ?? (libvulkan.so.1)
ash::instance::Instance::create_device in instance.rs:359 (halloy)
wgpu_hal::vulkan::adapter::<impl wgpu_hal::Adapter<wgpu_hal::vulkan::Api> for wgpu_hal::vulkan::Adapter>::open in adapter.rs:1628 (halloy)
wgpu_core::instance::Adapter<A>::create_device_and_queue in instance.rs:371 (halloy)
wgpu_core::instance::<impl wgpu_core::global::Global<G>>::adapter_request_device in instance.rs:1084 (halloy)
<wgpu::backend::wgpu_core::ContextWgpuCore as wgpu::context::Context>::adapter_request_device in wgpu_core.rs:587 (halloy)
<T as wgpu::context::DynContext>::adapter_request_device in context.rs:2019 (halloy)
wgpu::Adapter::request_device in lib.rs:2119 (halloy)
iced_wgpu::window::compositor::Compositor::request::{{closure}} in compositor.rs:122 (halloy)
iced_wgpu::window::compositor::new::{{closure}} in compositor.rs:168 (halloy)
iced_renderer::compositor::Candidate::build::{{closure}} in compositor.rs:260 (halloy)
<iced_renderer::compositor::Compositor as iced_graphics::compositor::Compositor>::new::{{closure}} in compositor.rs:37 (halloy)
iced_winit::application::run::{{closure}} in application.rs:222 (halloy)
futures_executor::local_pool::block_on::{{closure}} in local_pool.rs:317 (halloy)
futures_executor::local_pool::run_executor::{{closure}} in local_pool.rs:90 (halloy)
std::thread::local::LocalKey<T>::try_with in local.rs:270 (halloy)
std::thread::local::LocalKey<T>::with in local.rs:246 (halloy)
futures_executor::local_pool::run_executor in local_pool.rs:86 (halloy)
futures_executor::local_pool::block_on in local_pool.rs:317 (halloy)
iced::application::Application::run in application.rs:230 (halloy)
halloy::main in main.rs:67 (halloy)
core::ops::function::FnOnce::call_once in function.rs:250 (halloy)
std::sys_common::backtrace::__rust_begin_short_backtrace in backtrace.rs:155 (halloy)
std::rt::lang_start::{{closure}} in rt.rs:166 (halloy)
core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once in function.rs:284 (halloy)
std::panicking::try::do_call in panicking.rs:552 (halloy)
std::panicking::try in panicking.rs:516 (halloy)
std::panic::catch_unwind in panic.rs:142 (halloy)
std::rt::lang_start_internal::{{closure}} in rt.rs:148 (halloy)
std::panicking::try::do_call in panicking.rs:552 (halloy)
std::panicking::try in panicking.rs:516 (halloy)
std::panic::catch_unwind in panic.rs:142 (halloy)
std::rt::lang_start_internal in rt.rs:148 (halloy)
std::rt::lang_start in rt.rs:165 (halloy)
main in ?? (halloy)

@andymandias
Collaborator Author

Ran heaptrack again without connecting to ZNC, in order to utilize heaptrack's comparison feature. The main difference appears to be ~40MB of allocations attributed to Vulkan, so maybe there's something to that.

@tarkah
Member

tarkah commented Apr 3, 2024

We really need to recreate the 3.5GB memory usage and/or the UI slowdown described above while running under heaptrack, to hopefully see what's going on.

It's worth noting that Halloy used iced commit 3d915d3cb30e5d08829aa2928676a53c505a601e, which predates the upstream change iced-rs/iced#2357.

Can you please test w/ #317, as this updates iced to include that fix and also reduces some unnecessary text allocations?
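(For anyone following along, a PR branch like #317 can typically be checked out and built locally along these lines. This is just a sketch; it assumes the GitHub remote is named origin, and the local branch name is arbitrary.)

# fetch the PR head from GitHub into a local branch and build it
git fetch origin pull/317/head:pr-317
git checkout pr-317
cargo build --release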

@andymandias
Collaborator Author

andymandias commented Apr 3, 2024

Unfortunately memory usage on x86_64 grows significantly slower than on aarch64. And the x86_64 machine has 4x as much memory as the aarch64 machine, so it doesn't slow down as quickly (it was running at 1GB without the stalls that I usually see at 400MB on aarch64). Running under heaptrack also seems to slow down the memory growth rate (Edit: on second thought, this might just be that the growth rate varies for reasons I don't know yet and heaptrack has been unluckily timed with slow-growth periods). I'm starting a long-running heaptrack process on my x86_64 machine now, but it will probably be a few days before I'll have anything to report.

When I get some time I'll spend some effort trying to get heaptrack working on aarch64, where memory growth has been faster paced.

Running #317 on aarch64 now, will report back after giving some time for memory use to grow (or not).

@andymandias
Collaborator Author

#317 still results in high memory usage on aarch64, unfortunately. To put some numbers on the growth rate: memory grew to ~450MB twice since my last post (once extra because I initially ran the wrong executable 😓), which was sufficient to get occasional stalls. In the same time, x86_64 has grown to ~260MB of memory usage (not running #317, just main) with no stalls manifesting. The growth rate might be a bit slower with #317, but not so much slower that I can say for certain.

@hecrj
Contributor

hecrj commented Apr 4, 2024

@andymandias So you are experiencing high CPU usage as well, I gather? Leaked memory should not generally have an impact on performance, unless you are running out.

If there is high CPU usage, then we know it's most likely not a memory leak; something is trying to perform a lot of work.

Could you share a bit more about your hardware? Just to discard any potential driver / OS issues.

@andymandias
Collaborator Author

andymandias commented Apr 4, 2024

@hecrj I did not pay close attention to the CPU usage, since this machine is frequently RAM starved and oomd is not always proactive enough (sometimes resulting in the DE locking up for tens of minutes as it slowly resolves the issue). Looking at it now, while memory usage is high the CPU usage remains relatively low (around 100% of one core on average) while it is stalled. The stall takes tens of seconds, after which I would typically either quit or kill Halloy to avoid potentially locking up the DE. Continuing to run it now, performance appears to be fine after the long stall has resolved. It's as if it's waiting on a blocking operation (reading from swap?), and once that completes everything appears to run fine until the next stall occurs. I haven't been keeping rigorous track of the stalls, but I think they only happen after I've been away from the workspace with Halloy for a while.

Quickly grabbing potentially relevant specs from neofetch for the aarch64 machine (which grows in memory use fairly quickly):

OS: Fedora Linux Asahi Remix 39 (Thirty Nine) aarch64 
Host: Apple MacBook Pro (16-inch, M1 Max, 2021) 
Kernel: 6.6.3-414.asahi.fc39.aarch64+16k 
Resolution: 3456x2234 
DE: GNOME 45.5 
WM: Mutter 
CPU: (10) @ 2.064GHz 
Memory: 28011MiB / 31570MiB 

And for the x86_64 machine (which grows in memory usage as well, but growth is significantly slower):

OS: Ubuntu 20.04.6 LTS x86_64 
Host: XPS 8950 
Kernel: 5.15.0-92-generic 
Resolution: 3456x2234 
WM: Mutter 
CPU: 12th Gen Intel i9-12900K (24) @ 5.100GHz 
GPU: NVIDIA 01:00.0 NVIDIA Corporation Device 2504 
Memory: 28241MiB / 128533MiB

If there's anything missing there I'll be happy to provide it.

@andymandias
Collaborator Author

Some progress on getting heaptrack working on aarch64: heaptrack runs successfully for some programs now, but unfortunately Halloy is not one of them. There are some more threads to pull on there, so I will continue trying to get it running as time permits.

Running Halloy on x86_64 under heaptrack, memory use grew to ~800MB (i.e. ~940MB resident, 140MB shared) as of yesterday. I quit Halloy at that point since the heaptrack output file was over 200GB. Here are some of the results:

[Five screenshots of heaptrack results, captured 2024-04-11, attached]

I'm just picking out what seems like it might be useful from the reports, but I'm pretty new to heaptrack. Let me know if there's any further info that would be helpful to collect.
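(As an aside, with data files this large it may be easier to summarize them outside the GUI. A rough sketch, assuming a recorded file named heaptrack.halloy.<pid>.zst; exact options and the file extension vary by heaptrack version:)

# open the recording in the GUI analyzer (falls back to the text-mode printer if no GUI is available)
heaptrack --analyze heaptrack.halloy.<pid>.zst
# or dump a plain-text summary that is easier to share in an issue
heaptrack_print heaptrack.halloy.<pid>.zst > heaptrack-summary.txt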

@andymandias
Collaborator Author

A short update to report results from using ICED_BACKEND=tiny-skia. This slows down the memory growth significantly on aarch64: where leaving Halloy running overnight would often result in System Monitor reporting 1.5GB+ memory use, with the tiny-skia backend the growth appears to wind up around 250MB. Unfortunately memory use does not stop there and continues to grow. The largest I've seen with the tiny-skia backend is only ~500MB, but I have been restarting Halloy quite often recently as I was working on a couple of PRs.
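(For reference, the backend switch above is just an environment variable that iced reads at startup to pick a renderer; a minimal sketch, with the binary path being illustrative:)

# force iced's tiny-skia (software) renderer instead of the wgpu/Vulkan backend for this run
ICED_BACKEND=tiny-skia ./halloy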

@andymandias
Collaborator Author

Going to close this now that #340 is merged. Thanks again for the extensive, detailed work from everyone to resolve this!
