Compress link_ids lists #48279
This reduces the size of our precompile cache files, using run-length encoding (RLE) to represent the module of external linkages. Most linkages seem to be against the sysimg itself, and RLE allows long stretches of such linkages to be encoded compactly. Closes #48218
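As a rough illustration of why run-length encoding helps here, the following is a minimal sketch in Python, not the serializer's actual format. It assumes the link ids are a flat list of small integers where 0 stands for the sysimg; the function names `rle_encode` and `rle_decode` are hypothetical.

```python
from itertools import groupby


def rle_encode(link_ids):
    """Collapse each run of equal ids into one (id, run_length) pair."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(link_ids)]


def rle_decode(runs):
    """Expand (id, run_length) pairs back into the flat list of ids."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out


# Hypothetical data: most linkages point at image 0 (the sysimg).
link_ids = [0, 0, 0, 0, 0, 3, 3, 0, 0, 7]
runs = rle_encode(link_ids)
print(runs)  # [(0, 5), (3, 2), (0, 2), (7, 1)]
assert rle_decode(runs) == link_ids
```

Ten entries shrink to four pairs in this toy case; the longer the stretches of sysimg linkages, the larger the saving.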
Since most are 0 currently, should we make that a special RefTag of its own, so that this structure is very rarely needed at all? We could give it a sized reftag structure of:
So that on 64-bit, we use some of the spare bits to encode ids for up to 2 million packages and sizes up to 1TB each, without needing the side table. And on 32-bit, we only have enough spare bits to do this meaningfully for image 0, so we still need to support the side table, but hopefully with much less content.
Just to check that I understand, you're proposing to encode the module/buildid and offset in the same 64-bit field, right? Do you have a specific proposal for how we represent the buildid in 29 bits? With respect to a constant list of dependency-buildids, perhaps? (Essentially the same role of And also to check, you're proposing to add this as a second type of external linkage to
An array of modules is already an argument to the deserializer, so it could be an index into that array. I gave it 21 bits in the encoding above for the index.
OK, good. And the second part? A new tag, right?
Yep
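The packed reference discussed in this exchange can be sketched as follows. This is only an illustration under stated assumptions: the thread gives 21 bits for the index into the module array and mentions sizes up to 1TB (40 bits of offset), but the 3-bit tag width, the field order, and the names `pack_ref`, `unpack_ref`, and `EXTERNAL_TAG` are hypothetical and are not taken from the actual implementation.

```python
# Assumed 64-bit layout (hypothetical): [ tag:3 | image index:21 | offset:40 ]
TAG_BITS = 3      # assumption: enough for 8 distinct reftags
INDEX_BITS = 21   # from the thread: index into the array of modules (~2 million)
OFFSET_BITS = 40  # from the thread: offsets within an image of up to 1 TB

EXTERNAL_TAG = 0b111  # hypothetical tag value for "external image" references


def pack_ref(tag, image_index, offset):
    """Pack the three fields into a single 64-bit unsigned integer."""
    if not 0 <= tag < (1 << TAG_BITS):
        raise ValueError("tag does not fit in TAG_BITS")
    if not 0 <= image_index < (1 << INDEX_BITS):
        raise ValueError("image index does not fit in INDEX_BITS")
    if not 0 <= offset < (1 << OFFSET_BITS):
        raise ValueError("offset does not fit in OFFSET_BITS")
    return (tag << (INDEX_BITS + OFFSET_BITS)) | (image_index << OFFSET_BITS) | offset


def unpack_ref(word):
    """Recover (tag, image_index, offset) from a packed 64-bit word."""
    offset = word & ((1 << OFFSET_BITS) - 1)
    image_index = (word >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = word >> (INDEX_BITS + OFFSET_BITS)
    return tag, image_index, offset


word = pack_ref(EXTERNAL_TAG, 42, 0x1234)
assert word < (1 << 64)
assert unpack_ref(word) == (EXTERNAL_TAG, 42, 0x1234)
```

With a layout like this, a reference carries its own image index, so no side table lookup is needed on 64-bit; a 32-bit word has too few spare bits for the same trick, which is why the side table would remain there.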
I have a mostly-working implementation of your proposal in the
So I'm back to thinking this is the better approach. Thoughts? The figure below displays the total size of the
We can recover more reftags whenever we want, by making them slightly more complicated. Only 4 of them are fixed now; the other 4 could be combined into 1.
OK. What do you think about the approach in
This reduces the size of our precompile cache files, using run-length encoding (RLE) to represent the module of external linkages. Most linkages seem to be against the sysimg itself, and RLE allows long stretches of such linkages to be encoded compactly.
Closes #48218
With a fairly minimal default environment (42 packages including dependencies), here were the sizes of .julia/compiled/v1.10 in bytes:
for a savings of about 6%.