Question about the external fragmentation in a long running process #632
Comments
Thanks @romange for the detailed feedback (and apologies for the late reply).
So, for your measurements it is best to distinguish between the unused but untouched part (the difference between the capacity and the total reserved space) and the unused but touched part (the blocks in the free lists). I suspect the graph of the "unused but untouched" part should show much less unused memory? (let me know :-) )
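To make the distinction concrete, here is a minimal sketch of the two quantities. The `page_counts_t` struct and both helper names are hypothetical and are not part of mimalloc's API; the sketch only assumes that each page tracks its reserved blocks, its capacity (blocks touched so far), and its used blocks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-page counters, all in units of blocks.
   This is an illustration only, not mimalloc's own page struct. */
typedef struct {
  size_t block_size;  /* bytes per block */
  size_t reserved;    /* blocks the page could hold in total */
  size_t capacity;    /* blocks that have been touched at least once */
  size_t used;        /* blocks currently allocated */
} page_counts_t;

/* Unused and never touched: reserved space beyond the capacity. */
static inline size_t untouched_bytes(const page_counts_t* p) {
  assert(p->capacity <= p->reserved);
  return (p->reserved - p->capacity) * p->block_size;
}

/* Unused but touched: freed blocks that sit in the free lists. */
static inline size_t free_list_bytes(const page_counts_t* p) {
  assert(p->used <= p->capacity);
  return (p->capacity - p->used) * p->block_size;
}
```

Summing the two helpers separately over all pages would give the two graphs suggested above.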
Hi @daanx, thanks for following up on my questions. Meanwhile, I've been studying the domain myself. As you said in #645, any improvements on the allocator side will be incremental (i.e. they reduce the probability or the severity of fragmentation), because there is nothing we can do with an under-utilized, non-empty page: the allocator cannot move its contents somewhere else. Therefore, it seems that the only solution (if someone really needs to solve the external fragmentation issue) would be app-aware defragmentation. This is what I described in dragonflydb/dragonfly#448
Redis, btw, implemented something similar with jemalloc. They call it "active-defrag". They introduced a function that tells how much a page is utilized (redis/redis@e8099ca) and use it to move memory around. We are taking a similar approach with mimalloc. This is the function that I believe will help us determine whether we need to realloc a ptr that belongs to an underutilized page: https://github.com/romange/helio/pull/27/files#diff-a9d9d3912eb3a1eeced1a0726da5a12f80f764b999f683c40d96a45c8c31884b
What do you think about it?
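A rough sketch of the app-side decision described here, using plain `malloc`/`free` so it stays self-contained. The `page_info_t` struct, `page_is_underutilized`, `defrag_ptr`, and the threshold are all hypothetical names for illustration; they are not the function from the linked PR, nor the Redis/jemalloc API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical utilization info for the page that holds a pointer.
   A real implementation would query the allocator for this. */
typedef struct {
  size_t used_blocks;
  size_t capacity_blocks;
} page_info_t;

/* True when the used fraction of the page is below the threshold. */
static inline bool page_is_underutilized(page_info_t pg, double threshold) {
  if (pg.capacity_blocks == 0) return false;
  return (double)pg.used_blocks / (double)pg.capacity_blocks < threshold;
}

/* Copy the object to a fresh allocation if its page is underutilized.
   Returns the new pointer, or the original one if nothing was moved. */
static inline void* defrag_ptr(void* ptr, size_t size, page_info_t pg,
                               double threshold) {
  if (ptr == NULL || !page_is_underutilized(pg, threshold)) return ptr;
  void* fresh = malloc(size);
  if (fresh == NULL) return ptr;  /* keep the old block on failure */
  memcpy(fresh, ptr, size);
  free(ptr);
  return fresh;
}
```

The caller has to update every reference to the moved object, which is why this can only be done by the application. For the move to help, the allocator must also place the new allocation on a different, fuller page, which plain `malloc` does not guarantee.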
Ah I see -- I agree that an allocator cannot generally "solve" the fragmentation problem as we cannot move objects. I understand now that you would like to give the application itself the opportunity to reallocate objects that are in a fragmented page.
Very interesting to work on this
@daanx We ended up patching mimalloc with the following patch: https://github.com/romange/helio/blob/master/patches/mimalloc-v2.0.5.patch It seems that it's doing its job. We reduce the
@daanx I am hitting the same issue as the original poster, and I tried to use an extra
Hello Daan,
The project I am working on uses mimalloc underneath.
A user reported the following problem:
Dragonfly reports 80GiB used memory (tracked by aggregating `mi_usable_size`) but mi_malloc stats show 90GiB committed. He also says that the backend adds roughly 0.5% committed memory each day while `used` does not change. Assuming that my "used" accounting is correct, it means the backend does not have leakage (otherwise "used" would grow as well). Dragonfly has 16 threads.
mimalloc stats output:
[...] (`mi_heap_visit_blocks(.., false, ...)`) from a single heap - from a single thread out of 16. That heap has `5.9 GiB` of committed memory and 5GiB used. When looking at the unused memory I see lots of areas that have unused blocks. See the chart below. Why does it happen? Specifically, can mi_malloc create a new area of block size 80 if there is another one with unused memory?
"Unused" in this context is `committed - used`.
I am also attaching the detailed table of areas from which the chart was created. The first column in the table is the number of occurrences of the triple `<reserved, committed, used>`.
sorted-malloc.txt
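For reference, a minimal sketch of how the "unused" total can be derived from such a table. The `area_row_t` layout and the function name are assumptions for illustration; the sketch only assumes each row carries an occurrence count plus the triple, with all sizes in bytes.

```c
#include <assert.h>
#include <stddef.h>

/* One table row: how many areas share the same triple.
   Hypothetical layout, with all sizes assumed to be in bytes. */
typedef struct {
  size_t count;      /* number of occurrences of the triple */
  size_t reserved;
  size_t committed;
  size_t used;
} area_row_t;

/* Total unused bytes, where unused = committed - used per area. */
static inline size_t total_unused(const area_row_t* rows, size_t n) {
  size_t total = 0;
  for (size_t i = 0; i < n; i++) {
    assert(rows[i].used <= rows[i].committed);
    total += rows[i].count * (rows[i].committed - rows[i].used);
  }
  return total;
}
```

Grouping the same sum by block size would show which size classes hold most of the committed but unused memory.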