remove extra word from object header (better approach to alignment) #10898
Comments
If this plays into the C interop of structures, then there's an exception for x86 Linux: doubles are only 4-byte aligned by default on that platform and architecture. If you need to reliably match native alignment,
the C-interop story here would be based on the semantics of malloc, which are as Jeff described above.
i disagree that this is a priority or a v0.4 target. v0.4.x seems more reasonable to me
This has to be fixed immediately. You can't just add an extra word to every object and then shrug and say we don't have time to fix it.
It's particularly egregious that boxed Float64s and Int64s (etc.) are now 50% bigger for no reason, because they don't end up 16-byte aligned anyway, and they don't need to be. This is a major performance regression that affects a large amount of code, to fix a relatively narrow issue.
So all those ABI considerations are still pretty vague for me. Given a 64-bit arch, and the fact that type tags and data have to be contiguous, we are in fact forced to waste 8 bytes per 16-byte object to satisfy those constraints, right?
(By 16-byte objects I mean 16 bytes without the tag.)
Yes, we would still waste 8 bytes for some objects, but that's a lot better than wasting 8 bytes for all objects. It would be fine with me to fix this as narrowly as possible, and only ensure alignment for jmp_buf where it is needed. We can leave vector type alignment for another day.
this probably should've been fixed before #10579 (comment) was merged at all
So for example, on my linux/x64 box, a jmp_buf is 200 bytes, which brings a jl_task_t to 320 bytes (including tag). 320 is both a multiple of 16 and (conveniently) an available pool size, so even if we apparently don't need it on this ABI, we are actually guaranteed to have task->ctx be 16-byte aligned (since its offset inside the struct is 80 bytes).
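The arithmetic in that comment can be checked with a small C sketch. This is only an illustration: `field_aligned_in_all_cells` is a hypothetical helper, not part of Julia's runtime, and it assumes the field offset is measured from the start of each pool cell and that the pool's first cell starts on a 16-byte boundary. How the tag fits into that base is not spelled out in the comment, so treat it as an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from Julia's sources): returns 1 if a field at
 * `field_offset` inside each fixed-size cell lands on an `align`-byte
 * boundary for the first `ncells` cells of a pool, and 0 otherwise.
 * Assumption: `field_offset` is measured from the start of the cell.
 */
static int field_aligned_in_all_cells(uintptr_t pool_base, size_t cell_size,
                                      size_t field_offset, size_t align,
                                      size_t ncells)
{
    for (size_t i = 0; i < ncells; i++) {
        /* Address of the field inside the i-th cell. */
        uintptr_t field = pool_base + i * cell_size + field_offset;
        if (field % align != 0)
            return 0;
    }
    return 1;
}
```

With a 16-byte-aligned pool base, a 320-byte cell and an 80-byte offset, every cell passes the check, because both 320 and 80 are multiples of 16. A cell size that is not a multiple of 16 (328, say) breaks alignment from the second cell on.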
the compiler will not make that easy, since it is already guaranteeing that jmp_buf is at a 16-byte offset from the start of the struct
Oh. Now I get it. So with a careful packing pragma and manual padding this should be doable, right?
I would also agree with @JeffBezanson that this should be fixed for 0.4.0, not some 0.4.x.
as it turns out, this was required by one of the dsp tests. i'm just waiting for travis to greenlight the i686 code to merge this.
…her than every object (fix JuliaLang#10898)
I've been having some crazy ideas about alignment and allocation in Julia... may be all wet...
what is a box? why does your example not seem to have the data in the box? modern allocators have generally found that it is more efficient to segregate allocations by size. this wastes some space on odd-sized allocations, but is generally much faster at allocation and at walking the pool (since every cell in a pool has the same size). it also saves the byte that each allocation would otherwise need to store the size of the subsequent data field. if you've hit the memory allocator, you've already missed the fast-path of staying entirely in registers / on the stack, at the cost of a couple of extra function calls and data copies.
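To make the size-segregation point concrete, here is a minimal C sketch of a size-class lookup. The class table `demo_classes` and both function names are made up for illustration; they are not Julia's actual pool sizes or allocator API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical size classes, in bytes. Not Julia's real pool sizes. */
static const size_t demo_classes[] = {8, 16, 32, 48, 64, 96, 128};
#define DEMO_NCLASSES (sizeof(demo_classes) / sizeof(demo_classes[0]))

/* Index of the smallest class that fits `nbytes`, or -1 if none does. */
static int size_class_index(size_t nbytes)
{
    for (size_t i = 0; i < DEMO_NCLASSES; i++) {
        if (nbytes <= demo_classes[i])
            return (int)i;
    }
    return -1; /* too large for the pools; would go to a separate path */
}

/* Bytes lost to rounding when `nbytes` is served from its size class. */
static size_t wasted_bytes(size_t nbytes)
{
    int idx = size_class_index(nbytes);
    assert(idx >= 0); /* caller must only pass pool-sized requests */
    return demo_classes[idx] - nbytes;
}
```

Because every cell in a class has the same size, the allocator never needs a per-object size field, and walking a pool is a fixed-stride loop. The price is the rounding loss: with this table, a 24-byte request is served from the 32-byte class and wastes 8 bytes.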
OK, I'm still learning about Julia's allocator, but it looked like for many things, strings at least, there were 8 bytes of "box"ing, and then either a value or an 8-byte pointer. With my idea for intertwined pools, depending on how they are allocated, the pool still looks like it is a constant size; it simply has an offset to the next cell that is larger than the element size. Staying in registers/stack is great for the objects you are currently working on, yes... but if you've got lots of 128-bit or 256-bit or even 512-bit fields, you really want those to be optimally aligned.
Currently every object has an extra word (added by 0d8cec3) to make the data area 16-byte aligned. This is not ok. The extra word should be removed, and alignment should instead be done by offsetting the first object in a page, and putting extra space between objects as necessary.
I believe the alignment rules should be (sizes not including tag):
cc @vtjnash @carnaval
Clarification: this issue asks that we fix the win64 issue without adding a word to every object. Implementing the above alignment scheme can be done if convenient, but is not itself required to close this.
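A minimal C sketch of the layout described above, under stated assumptions: a one-word tag immediately precedes the data, the page base is itself 16-byte aligned, and, as a simplification, every object's data area is aligned to 16 bytes regardless of its size. The names `first_tag_offset`, `object_stride` and `data_address` are hypothetical and do not come from Julia's gc sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TAG_SIZE   sizeof(void *)   /* one-word type tag before the data */
#define DATA_ALIGN 16u              /* desired alignment of the data area */

/* Round n up to the next multiple of a (a must be a power of two). */
static size_t round_up(size_t n, size_t a)
{
    return (n + a - 1) & ~(a - 1);
}

/*
 * Offset of the first tag in a page, chosen so that the data following
 * the tag is DATA_ALIGN-aligned. Assumes the page base is aligned.
 */
static size_t first_tag_offset(void)
{
    return round_up(TAG_SIZE, DATA_ALIGN) - TAG_SIZE;
}

/* Distance between consecutive tags for objects holding data_size bytes. */
static size_t object_stride(size_t data_size)
{
    return round_up(TAG_SIZE + data_size, DATA_ALIGN);
}

/* Address of the data area of the i-th object in the page. */
static uintptr_t data_address(uintptr_t page_base, size_t data_size, size_t i)
{
    return page_base + first_tag_offset()
         + i * object_stride(data_size) + TAG_SIZE;
}
```

On a target with 8-byte pointers, the first tag sits 8 bytes into the page, an 8-byte payload gets a 16-byte stride with no padding, and a 16-byte payload gets a 32-byte stride with 8 bytes of padding. That matches the point made in the comments: some objects still lose a word, but not every object.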