Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Respect alignments above alignof(Void*) inside union values #14279

Merged

Conversation

HertzDevil
Copy link
Contributor

@HertzDevil HertzDevil commented Jan 30, 2024

Fixes the codegen issue in #14277 (comment).

If alignof(Int128) == 16 (due to #13609 this is not true for x86-64 targets before LLVM 18), An Int32 | Int128 will now place its type ID and data at offsets 0 and 16 respectively, including on AArch64, where alignof(Int128) == 16 has always been the case. This also means that if, say, a 64-byte-aligned value becomes available in Crystal (e.g. for AVX-512), then a union of that value with anything else requires at least 128 bytes of storage, with 60 bytes of padding after the type ID.

This is an ABI breaking change. It might be worth mentioning this even though Crystal makes no guarantees about ABI stability and most users won't be affected.

This doesn't cover the interpreter yet.

@HertzDevil
Copy link
Contributor Author

HertzDevil commented Jan 30, 2024

This patch is currently breaking upcasts like (1 || 1_i64).as(Int32 | Int64 | Int128) and downcasts like (1 || 1_i64 || 1_i128).as(Int32 | Int64), which are used in both the DWARF reader and the Crystal::System.print_error specs. The following is how a union-to-union upcast used to be done, using Crystal::CodeGenVisitor#assign_distinct_union_types:

x = 1 || 1_i64
y = x.as(Int32 | Int64 | Int128)
%x = alloca %"(Int32 | Int64)", align 8
%y = alloca %"(Int128 | Int32 | Int64)", align 8
%1 = alloca %"(Int128 | Int32 | Int64)", align 8
; ...
%5 = load %"(Int32 | Int64)", ptr %x, align 8
store %"(Int32 | Int64)" %5, ptr %1, align 8
%6 = load %"(Int128 | Int32 | Int64)", ptr %1, align 8
store %"(Int128 | Int32 | Int64)" %6, ptr %y, align 8
%7 = load %"(Int128 | Int32 | Int64)", ptr %1, align 8

In short, x and y had identical leading binary representations, and y = pointerof(x).as((Int32 | Int64 | Int128)*).value would have equally worked, since the 8 trailing bytes in y's representation have no effect. (#unsafe_as will not work since it triggers a dynamic dispatch.) Now that y's value offset is shifted, merely changing the alignments of the larger unions isn't going to work anymore.

Downcasts fail for a similar reason. It seems sidecasts such as (1 || 1_i64).as(Int32 | Int128) aren't affected.

@HertzDevil HertzDevil marked this pull request as ready for review January 30, 2024 21:33
Copy link
Member

@straight-shoota straight-shoota left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't dive into the details but as long as it works... looks good 👍
CI is failing on aarc64 though.

@HertzDevil HertzDevil marked this pull request as draft January 31, 2024 17:54
@HertzDevil
Copy link
Contributor Author

HertzDevil commented Feb 1, 2024

Given:

a = uninitialized Int32 | Int64 | Int128
b = 1.as(Int32 | Int64)
a = b

Master produces:

%4 = load %"(Int32 | Int64)", ptr %b, align 8
store %"(Int32 | Int64)" %4, ptr %a, align 8

This PR currently produces:

%4 = getelementptr inbounds %"(Int32 | Int64)", ptr %b, i32 0, i32 0
%5 = load i32, ptr %4, align 4
%6 = getelementptr inbounds %"(Int32 | Int64)", ptr %b, i32 0, i32 1
%7 = getelementptr inbounds %"(Int128 | Int32 | Int64)", ptr %a, i32 0, i32 0
store i32 %5, ptr %7, align 4
%8 = getelementptr inbounds %"(Int128 | Int32 | Int64)", ptr %a, i32 0, i32 1
%9 = getelementptr inbounds %"(Int32 | Int64)", ptr %b, i32 0, i32 1
%10 = load [1 x i64], ptr %9, align 8
store [1 x i64] %10, ptr %8, align 8

That last store doesn't look right; it probably needs to be align 16, but then there is no easy way to cast the [1 x i64] into an appropriate target type. Instead I am now trying the below:

; ...
call void @llvm.memcpy.p0.p0.i64(ptr align 16 %8, ptr align 8 %9, i64 8, i1 false)

@HertzDevil HertzDevil marked this pull request as ready for review February 1, 2024 13:04
@straight-shoota straight-shoota added this to the 1.12.0 milestone Feb 1, 2024
@straight-shoota straight-shoota merged commit b655c0f into crystal-lang:master Feb 2, 2024
57 checks passed
@HertzDevil HertzDevil deleted the bug/overaligned-union branch February 2, 2024 15:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants