LLP: mmap labels instead of loading them in memory #117
Conversation
LLP on SWH's graph now needs more than 3TB, which makes it crash in this step when there is anything else running on our 4TB machine that needs significant memory, like a previous version of the graph.
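For context on what "mmap labels" means in practice, here is a minimal sketch (not the actual webgraph-rs code; the `memmap2` crate, the file layout, and the function names are assumptions): memory-mapping the label file lets the kernel page the 8-byte labels in and out on demand instead of keeping the whole array on the heap.

```rust
use std::fs::File;
use memmap2::Mmap;

/// Map a file of little-endian u64 labels instead of reading it into a Vec.
fn mmap_labels(path: &str) -> std::io::Result<Mmap> {
    let file = File::open(path)?;
    // SAFETY: the file must not be truncated or modified while it is mapped.
    unsafe { Mmap::map(&file) }
}

/// Read the i-th label straight from the mapping; the page is faulted in on demand.
fn label_at(labels: &Mmap, i: usize) -> u64 {
    let bytes: [u8; 8] = labels[i * 8..i * 8 + 8].try_into().unwrap();
    u64::from_le_bytes(bytes)
}
```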
Force-pushed from 997fd89 to 90fe626
At that point in the code, we allocate the per-node arrays. On the other hand, we are not dropping the graph, which would be the obvious thing to do, and would help if the graph is not memory-mapped (I know it is in the CLI tool, though). So I pushed a small commit that makes the argument a graph or a reference to a graph, and in the CLI we pass the graph. Just to understand whether the memory usage is sensible, how many nodes are we talking about here?
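A sketch of the "graph or a reference to a graph" argument described above, with hypothetical names (the real trait bounds in webgraph-rs differ): making the function generic over `Borrow` lets the CLI hand over ownership, so the graph can be dropped inside the function, while library callers can still pass a reference.

```rust
use std::borrow::Borrow;

struct Graph; // stand-in for the real graph type

fn layered_label_propagation<G: Borrow<Graph>>(graph: G) {
    {
        let g: &Graph = graph.borrow();
        let _ = g; // ... run LLP using `g` ...
    }
    // If the caller passed the graph by value, dropping it here frees its memory
    // before the rest of the computation continues; for a reference this is a no-op.
    drop(graph);
}

fn main() {
    let graph = Graph;
    layered_label_propagation(&graph); // library use: the graph survives the call
    layered_label_propagation(graph);  // CLI use: ownership is handed over
}
```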
Incidentally, with more knowledge of Rust, I'm unhappy with the way we save arrays.
Quoting @progval from elsewhere:
So it's 400GB per array (gulp). Still, 4 arrays are 1.6 TB. Unless Rust is doing something weird, there's nothing else in memory at that point. Where does the 3TB come from?
I don't know. LLP logged this:
and then the machine's memory use kept growing, up to 2.91TB, at which point LLP crashed and total memory use fell to 20GB, as you can see here: https://grafana.softwareheritage.org/goto/qRAeEdSNz?orgId=1
I'm trying to understand where the memory occupancy comes from. The algorithm currently uses 33 bytes per node, so that does account for roughly 1.5 TB. There's a missing half terabyte. Unfortunately I don't remember whether […].

The other suspicious thing is that the label store (which accounts for 16 of these bytes) should be automatically dropped before the "Elapsed: 1d 5h 48m 11s" line. So if this is happening, the 2.1 TB looks weird. If this is not happening, there's something going on. I tried to add a manual drop (on macOS) and the occupancy did not change.

So my guess is that the memory includes the graph. How large is the graph + ef + dcf? Do you set the thread stack size?
8MB × ~250 threads is only about 2GB, so that's not it.
Mmmhhh. No. That would mean 2.5 TB, not 2.1 TB. Another weird thing is the following:
So, in theory, between the first and the last line the label store stops being used, and NLL should deallocate it. But the memory is the same.
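For reference, Rust deallocates a value when it is dropped, either explicitly or at the end of its scope, not at its last use (non-lexical lifetimes only relax borrow checking). A minimal sketch of releasing a hypothetical label store early:

```rust
fn llp_phase(num_nodes: usize) {
    // Stand-in for the real label store (~16 bytes per node in the figures above).
    let label_store = vec![0u64; num_nodes * 2];
    // ... iterations that read and update `label_store` ...

    // Without this, the store is only freed when it goes out of scope at the end
    // of the function, even if it is never used again after this point.
    drop(label_store);

    // ... final phase (e.g. building the permutation) that no longer needs the store ...
}
```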
Sorry, my bad, it's 25 bytes per node. I don't know how I computed 8 ⨉ 3 = 32, which of course is not true. So the arrays are 1.25 TB, the graph 800 GB, and the rest of the structures 150 GB. I think we have our 2.11 TB occupancy. With the additional array, we'll get to 1.65 TB. So the thing is crashing at 1.65 TB; the rest is memory-mapped. However, if the label store is not dropped, we have an additional 800 GB, which explains the 2.91 TB occupancy. So the next question is: why has it not been dropped?
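A back-of-the-envelope check of these figures, assuming roughly 50 billion nodes (implied by the 400GB-per-array figure earlier in the thread):

```rust
fn main() {
    let nodes = 50e9_f64; // assumed node count: ~400 GB per array of 8-byte entries
    let tb = 1e12_f64;

    println!("per-node arrays (25 B/node): {:.2} TB", nodes * 25.0 / tb); // ~1.25 TB
    println!("label store     (16 B/node): {:.2} TB", nodes * 16.0 / tb); // ~0.80 TB

    // ~1.25 TB of arrays + ~0.8 TB of graph + ~0.15 TB of other structures give the
    // ~2.1 TB steady-state figure; an undropped label store adds another ~0.8 TB,
    // which is close to the 2.91 TB peak observed before the crash.
}
```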
Note: memory usage does not go down even with an explicit drop after the main loop.
Could it be the memory allocator keeping it in its pool?
Well, then the promise of Rust is quite bogus. It is a behavior comparable to that of a garbage collector. It could be a delay in the detection by the monitoring. I'll do more tests during the week; these two days I have my last 8 hours of teaching. But, definitely, this thing shouldn't crash because it uses too much memory, unless other processes are using > 2.5 TB of memory.
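One way to test the allocator-pool hypothesis (a Linux-only, purely illustrative experiment, not something from this PR): allocate and touch a large buffer, drop it, and watch whether the process RSS actually goes down.

```rust
use std::fs;

/// Resident set size of the current process, in kB (Linux only).
fn rss_kb() -> u64 {
    fs::read_to_string("/proc/self/status")
        .unwrap()
        .lines()
        .find(|l| l.starts_with("VmRSS:"))
        .and_then(|l| l.split_whitespace().nth(1))
        .and_then(|v| v.parse().ok())
        .unwrap_or(0)
}

fn main() {
    println!("before:     {} kB", rss_kb());
    let v = vec![1u8; 1 << 30]; // 1 GiB, filled so the pages are actually resident
    println!("allocated:  {} kB ({} bytes)", rss_kb(), v.len());
    drop(v);
    // Whether this goes back down depends on whether the allocator returns the
    // pages to the OS or keeps them in its own pool.
    println!("after drop: {} kB", rss_kb());
}
```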
Not exactly: a memory allocator does not need to be aware of references between objects.

Also, I just freed 400GB on our machine (by removing an old graph from tmpfs), and LLP passed this time. Logs of the end:
Graph of memory usage: https://grafana.softwareheritage.org/goto/fSxfx5IHz?orgId=1
I did not try to benchmark this yet.