WIP: make `reinterpret` work on structs #32660

andyferris · 2019-07-23T12:27:14Z

Currently, reinterpret works on primitive types through the bitcast intrinsic, and as an (inner!) constructor for ReinterpretArray. In both the older array element-type casting and the newer ReinterpretArray implementation, it's possible to reinterpret arrays whose elements are isbits structs as arrays of other isbits structs, which makes the array version more generic than the non-array version.
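For reference, the two existing behaviours described above can be seen in a short session. This is a minimal sketch using only the documented reinterpret methods; the variable names are illustrative, and a round trip is used for the array case so the check does not depend on byte order.

```julia
# Scalar reinterpret between primitive types of the same size (a bitcast).
bits = reinterpret(UInt32, 1.0f0)       # 0x3f800000, the IEEE 754 encoding of 1.0f0

# Array reinterpret: the element type may be an isbits aggregate such as a tuple.
pairs = [(Int32(1), Int32(2)), (Int32(3), Int32(4))]
words = reinterpret(Int64, pairs)               # view of the same memory as Int64
back = reinterpret(Tuple{Int32,Int32}, words)   # view it as tuples again

back == pairs                           # true
```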

This PR is my attempt to copy the way Keno implemented getindex on ReinterpretArray to make reinterpret also work on structs. Semantically, we copy the value to a Ref, copy the bits to a Ref of a different type, and load that. Fortunately, there is enough codegen optimization that, as in the ReinterpretArray case, the overhead is (mostly) removed from the resulting LLVM/native code.

This is a work in progress, but I would appreciate early feedback from knowledgeable people on the approach taken in the new reinterpret method.

Work around bootstrap issues so we don't overwrite the basic reinterpret method.
Add some unit tests
News/documentation

chethega · 2019-07-24T12:41:15Z

base/reinterpretarray.jl

+    if isprimitivetype(Out) && isprimitivetype(In)
+        return bitcast(Out, x)
+    elseif struct_subpadding(Out, In)
+        in = Ref{In}(x)


Couldn't we simply

GC.@preserve in begin return unsafe_load(unsafe_convert(Ptr{Out}, in)) end

andyferris · 2019-07-24

Yes, that's probably better.

BioTurboNick · 2021-11-11T04:20:31Z

base/reinterpretarray.jl

+        ptr_in = unsafe_convert(Ptr{In}, in)
+        out = Ref{Out}()
+        ptr_out = unsafe_convert(Ptr{Out}, out)
+        GC.@preserve in out begin


Shouldn't in and out be preserved before getting the pointer?

vtjnash · 2021-11-11T18:10:56Z

Since struct layout is not generally defined (other than for C-compatible isbits types), we've previously felt that reinterpret should not be exposing that detail to the user.

andyferris · 2021-11-11T22:04:34Z

In that case, is an isbitstype check on In and Out sufficient to ensure correctness?

BioTurboNick · 2021-11-11T22:17:13Z

~~isbitstype types can still have padding, I believe that just refers to the presence or not of references.~~ Are some Julia isbits types not C-compatible?

I guess you'd need to check that both types have the same padding? Though I see you have a note about not being able to check the input's padding, can you say more about that?

andyferris · 2021-11-11T22:52:40Z

Sorry, yes I remember the issue now. We need to check that any input padding bits are also output padding bits. Maybe even vice-versa, just in case there is some UB there. I’m not sure if there’s any function lying around that can do that or to help see the padding bits?

BioTurboNick · 2021-11-12T00:19:58Z

I think amending padding in reinterpretarray.jl to this works:

import Base.padding
import Base.Padding
function padding(T)
    padding = Padding[]
    last_end::Int = 0
    for i = 1:fieldcount(T)
        offset = fieldoffset(T, i)
        fT = fieldtype(T, i)
        if offset != last_end
            push!(padding, Padding(offset, offset-last_end))
        end
        last_end = offset + sizeof(fT)
    end
    if 0 < last_end < sizeof(T)
        push!(padding, Padding(last_end + 1, sizeof(T) - last_end))
    end
    padding
end

struct WithPadding1
    z::Bool
    y::Int16
    x::Int8
end

padding(WithPadding1)
#=
2-element Vector{Padding}:
 Padding(2, 1)
 Padding(6, 1)
=#

The only consumer of padding is CyclePadding, which may add another Padding entry. I'm not certain in which situations the minimum alignment would be greater than the size of the type, so I couldn't test whether anything cares about contiguous Padding entries at the end. The reinterpretarray tests pass, at least.
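The same idea can be tried without touching Base internals. The following is only a sketch: padding_ranges and PadDemo are hypothetical names, the result uses 0-based (start, length) pairs rather than Base's Padding convention, and it assumes a flat type whose fields are all isbits (padding inside nested structs is not reported).

```julia
# Hypothetical helper: list the padding bytes of a flat isbits type `T`
# as (0-based start offset, length in bytes) pairs.
function padding_ranges(::Type{T}) where {T}
    ranges = Tuple{Int,Int}[]
    fieldcount(T) == 0 && return ranges   # primitive types have no field padding
    last_end = 0                          # first byte after the previous field
    for i in 1:fieldcount(T)
        offset = Int(fieldoffset(T, i))
        if offset > last_end              # gap before this field
            push!(ranges, (last_end, offset - last_end))
        end
        last_end = offset + sizeof(fieldtype(T, i))
    end
    if last_end < sizeof(T)               # trailing padding
        push!(ranges, (last_end, sizeof(T) - last_end))
    end
    return ranges
end

struct PadDemo
    z::Bool
    y::Int16
    x::Int8
end

padding_ranges(PadDemo)   # [(1, 1), (5, 1)]: one byte after z, one trailing byte
```

The two entries correspond to the two Padding entries in the output above, just expressed as the start of each gap instead of the offset of the next valid byte.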

BioTurboNick · 2021-11-12T01:18:57Z

I think this works?

@inline function reinterpret(::Type{Out}, x::In) where {Out, In}
    isbitstype(Out) || error("reinterpret target type must be isbits")
    isbitstype(In) || error("reinterpret source type must be isbits")
    sizeof(Out) == sizeof(In) || error("types must be the same size; got $(sizeof(Out)), $(sizeof(In))")
    isprimitivetype(Out) && isprimitivetype(In) && return bitcast(Out, x)
    struct_subpadding(Out, In) || throw(PaddingError(Out, In))
    in = Ref{In}(x)
    out = Ref{Out}()
    GC.@preserve in out begin
        ptr_in = unsafe_convert(Ptr{In}, in)
        ptr_out = unsafe_convert(Ptr{Out}, out)
        _memcpy!(ptr_out, ptr_in, sizeof(Out))
    end
    return out[]
end

There'd still be the Bool trap representation issue from #43035.

I don't think unsafe_load can work as suggested in the code comments, as the unsafe_convert can only turn a ref to type T into a pointer of type T.
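For comparison, here is a sketch of what a load-based variant could look like if the pointer is retyped explicitly after unsafe_convert. load_as is a hypothetical helper, not code from this PR; it performs no padding check, and whether the compiler handles it as well as the memcpy version is not something this sketch establishes.

```julia
# Hypothetical helper: reinterpret `x` by loading through a retyped pointer.
# No padding check is performed.
function load_as(::Type{Out}, x::In) where {Out, In}
    (isbitstype(Out) && isbitstype(In)) ||
        throw(ArgumentError("both types must be isbits"))
    sizeof(Out) == sizeof(In) ||
        throw(ArgumentError("types must have the same size"))
    src = Ref{In}(x)
    GC.@preserve src begin
        p_in = Base.unsafe_convert(Ptr{In}, src)   # a Ref{In} only converts to Ptr{In}
        p_out = Ptr{Out}(p_in)                     # retype the address explicitly
        return unsafe_load(p_out)
    end
end

t = (Int32(1), Int16(1), Int8(1), Int8(1))   # 8 bytes, no padding
n = load_as(Int64, t)
load_as(typeof(t), n) === t                  # true
```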

This PR actually handles what I was trying to do in #43035 but better. You can even do:

reinterpret(Int64, (Int32(1), Int16(1), Int8(1), Int8(1)))

and just get the Int64 directly.

BioTurboNick · 2021-11-12T01:40:31Z

I suppose one issue might be that reinterpret for arrays currently doesn't consider end-padding when deciding if a reinterpretation is valid. Is this a bug?

reinterpret(Int32, [(Int32(1), Int16(1), Int8(1))]) # works
reinterpret(Int32, [(Int32(1), Int8(1), Int16(1))]) # PaddingError

Changing the padding function as shown above would be a breaking change for reinterpret arrays if any code is currently relying on trailing padding being ignored. I suppose that's related to #41071? Though being able to convert an array to pure bytes is a useful operation...

It could be that reinterpreting to something with less padding should be always allowed (as the current PR mentions in a comment), but it would still be useful to have a potential round-trip available. Perhaps it could check that the bytes that would go into the padding are all zeroed?
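The difference between the two array examples above comes down to where the padding byte sits. A small sketch using only fieldoffset, fieldtype and sizeof makes that visible; layout is a hypothetical helper name, and the values in the comments assume the usual alignment rules on common 64-bit platforms.

```julia
# Hypothetical helper: (0-based offset, size in bytes) for every field of T.
layout(T) = [(Int(fieldoffset(T, i)), sizeof(fieldtype(T, i))) for i in 1:fieldcount(T)]

A = Tuple{Int32, Int16, Int8}   # the element type that "works"
B = Tuple{Int32, Int8, Int16}   # the element type that raises PaddingError

layout(A), sizeof(A)   # ([(0, 4), (4, 2), (6, 1)], 8): byte 7 is trailing padding
layout(B), sizeof(B)   # ([(0, 4), (4, 1), (6, 2)], 8): byte 5 is interior padding
```

Both types carry 7 meaningful bytes in 8; only the interior gap in B is currently taken into account.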

tkf · 2021-11-12T04:45:13Z

I believe we need an unsafe (narrow-contract API), lower-level, and less magic version of reinterpret that can be used for arbitrary pointer-free types. We need to allow structs of Union fields and padded fields. This is required for supporting rich pointer-free immutable types in GPU and lock-free data structures. The idea is to support something like unsafe_bitcast(NTuple{4,Int32}, Some{Union{Float64,Missing}}(0)) to get an opaque sequence of bytes that can be used with low-level atomics (e.g., torn relaxed store / load) and GPU APIs (e.g., shfl). The only thing we need in such a use case is a round trip within a single process.

For such a use case, we need very predictable behavior and need to avoid the "magic behavior" of reinterpret(T, x), where T may act on elements when x is an array (or tuple, as was suggested in #43035). For example, it's reasonable to try to type-pun a static array.

Here's a sketch of the API

Base.unsafe_bitcast(T::Type, x::S) -> y::T
Interpret the bit representation of x of type S as the bit representation of type T.

The unsafe_ prefix reflects that:

The invariants imposed by the constructor of T are not enforced for the value y.

If S fields are padded, values of some fields of T may be undefined.

The bit representation y may not be usable across Julia processes. Invoking unsafe_bitcast(S, y) when y is produced in a different Julia process is undefined behavior unless S and its subcomponents do not contain Union fields.

The roundtrip is guaranteed to be ===-identical:
unsafe_bitcast(typeof(x), unsafe_bitcast(T, x)) === x
The target type T and the source type S both must be concrete and have identical size. Both T and S must not contain boxed Julia values. They can contain Union fields.

Examples
julia> x = Some{Union{UInt64,Nothing}}(0);

julia> y = Base.unsafe_bitcast(UInt128, x);

julia> Base.unsafe_bitcast(typeof(x), y) === x
true

Implementation is trivial since I'm not suggesting to detect padding inconsistency:

function unsafe_bitcast(::Type{T}, x::S) where {T,S}
    isconcretetype(T) || throw(ArgumentError("output type $T is not concrete"))
    datatype_pointerfree(T) ||
        throw(ArgumentError("output type $T may contain a boxed object"))
    datatype_pointerfree(S) ||
        throw(ArgumentError("input type $S may contain a boxed object"))
    sizeof(T) == sizeof(S) || throw(ArgumentError("different input and output sizes"))
    output = Ref{T}()
    input = Ref{S}(x)
    GC.@preserve output input begin
        po = Ptr{UInt8}(pointer_from_objref(output))
        pi = Ptr{UInt8}(pointer_from_objref(input))
        _memcpy!(po, pi, sizeof(S))
    end
    return output[]
end

Seelengrab · 2021-11-12T10:44:12Z

> We need to allow structs of Union fields and padded fields. This is required for supporting rich pointer-free immutable types in GPU and lock-free data structures.

Could you expand on that a little? I'm not familiar with GPU programming and I don't immediately see why Union fields follow from that. Also, if only one of the union fields is allowed to be active at one time (it's not a C union, where memory is aliased, right?), then how/why does reinterpreting/bitcasting that make sense?

BioTurboNick · 2021-11-12T16:20:37Z

If unsafe_bitcast exists for exact preservation, which does seem like a useful function, what if reinterpret skipped over padding, instead of needing them to be compatible?

Should reinterpret be able to take the 56 meaningful bits in (Int32(1), Int16(1), Int8(1)) and create a Tuple{Int32, Int8, Int16}?

tkf · 2021-11-13T02:56:40Z

> > We need to allow structs of Union fields and padded fields. This is required for supporting rich pointer-free immutable types in GPU and lock-free data structures.
>
> Could you expand on that a little? I'm not familiar with GPU programming and I don't immediately see why Union fields follow from that.

The emphasis is on "supporting rich pointer-free immutable types", not on GPU. What I wanted to say was that, if I need to use objects containing Union values on GPU, I need something like unsafe_bitcast. If you want concrete (but messy ATM) code, there's https://github.com/JuliaFolds/FoldsCUDA.jl/blob/master/src/unionvalues.jl. (But my code is likely to contain some undefined behavior. Trying to do it the "right way" points me to the memcpy-based type punning as I suggested above.)

> Also, if only one of the union fields is allowed to be active at one time (it's not a C union, where memory is aliased, right?), then how/why does reinterpreting/bitcasting that make sense?

Julia's Union is more like C++'s variant than union when stored somewhere. So, it makes sense to bundle the active type tag and the value of the active type in a single chunk of opaque bytes.

> what if reinterpret skipped over padding, instead of needing them to be compatible?

Do you mean to pack bytes when the in-memory representation contains padding? I can't think of a use for this personally. But if it has some use somewhere, I can imagine that having it in Base makes sense, since it's not clear how to access all this data layout information without touching the internals.

BioTurboNick · 2021-11-13T03:29:56Z

> > what if reinterpret skipped over padding, instead of needing them to be compatible?
>
> Do you mean to pack bytes when the in-memory representation contains padding? I can't think of a use for this personally. But if it has some use somewhere, I can imagine that having it in Base makes sense, since it's not clear how to access all this data layout information without touching the internals.

Yeah. There were a couple people who raised a desire to convert tuples of various isbitstypes to tuples of other isbitstypes, which is how I got interested in this PR.

tkf · 2021-11-13T06:45:23Z

OK, since it sounds like the use case of reinterpret somewhat extends beyond what I had in mind, I opened #43065 to focus discussion on the API I proposed.

BioTurboNick · 2021-11-13T21:29:21Z

I believe I have a solution that implements my more general proposal.

Is it possible to do a pull request to a pull request?

https://github.com/BioTurboNick/julia/tree/reinterpret-all

What I've added:
If padding is mismatched, then it breaks up the copying operations by field, using only the non-padding bytes.
padding function now traverses padding of nested isbitstypes. e.g. Tuple{Tuple{Int8, Int16}, Int8} and gains an entry for trailing padding.

Simple reinterpretations are still ~1 ns; packed-to-padded and vice versa ~14+ ns; mismatched padded-to-padded is 40+ ns, all without allocations.

A remaining issue is Bool trap representations; however, the current reinterpret allows these trap representations via bitcast, so it seems like that should be a separate issue, if that needs to be hardened.

The rationale I'm working with is that the high-level representation of data should be agnostic to the implementation details of the type. So there shouldn't be any logical barrier to converting a bitstype tuple or struct with fields that add up to 64 bits to any other tuple or struct with fields that add up to 64 bits, for example.
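To illustrate what "using only the non-padding bytes" means, here is a rough sketch that gathers the meaningful bytes of a value field by field. packed_bytes is a hypothetical helper and not the code on the linked branch; it assumes a flat isbits struct or tuple (padding inside nested fields would be copied along) and allocates a Vector, so it says nothing about the timings quoted above.

```julia
# Hypothetical helper: collect the bytes of each field of `x`, skipping the
# padding between and after fields. Assumes a flat isbits struct or tuple.
function packed_bytes(x::T) where {T}
    isbitstype(T) || throw(ArgumentError("expected an isbits value"))
    bytes = UInt8[]
    src = Ref{T}(x)
    GC.@preserve src begin
        p = Ptr{UInt8}(Base.unsafe_convert(Ptr{T}, src))
        for i in 1:fieldcount(T)
            offset = Int(fieldoffset(T, i))
            for b in 1:sizeof(fieldtype(T, i))
                # unsafe_load is 1-based, so offset + b addresses byte offset + b - 1
                push!(bytes, unsafe_load(p, offset + b))
            end
        end
    end
    return bytes
end

a = (Int32(1), Int16(1), Int8(1))   # sizeof is 8, one trailing padding byte
b = (Int32(1), Int8(1), Int16(1))   # sizeof is 8, one interior padding byte
length(packed_bytes(a)), length(packed_bytes(b))   # (7, 7)
```

Both tuples yield 7 bytes (56 bits), which is the quantity a padding-skipping reinterpret would have to match between source and target.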

Seelengrab · 2021-11-13T22:37:04Z

> A remaining issue is Bool trap representations; however, the current reinterpret allows these trap representations via bitcast, so it seems like that should be a separate issue, if that needs to be hardened.

Since "trap" values are not exclusive to booleans (as the proposed docstring by @tkf mentions, no constructor guarantees are observed), booleans could serve as a good example in the documentation to illustrate why people still have to check whether their desired bitcast is valid for their use case or not. I still don't believe it should check anything/error here, since we have convert for that behavior (well, or we change booleans to care about more than just the LSB, though I'm fairly certain that's more controversial than having the behavior as-is).

BioTurboNick · 2021-11-14T00:53:32Z

> > A remaining issue is Bool trap representations; however, the current reinterpret allows these trap representations via bitcast, so it seems like that should be a separate issue, if that needs to be hardened.
>
> Since "trap" values are not exclusive to booleans (as the proposed docstring by @tkf mentions, no constructor guarantees are observed), booleans could serve as a good example in the documentation to illustrate why people still have to check whether their desired bitcast is valid for their use case or not. I still don't believe it should check anything/error here, since we have convert for that behavior (well, or we change booleans to care about more than just the LSB, though I'm fairly certain that's more controversial than having the behavior as-is).

Gotcha. Looks like the bool issue is best left for #34909

jenkspt · 2021-11-28T21:57:38Z

Not sure if this is still useful, but #43035 was asking for invalid floats from reinterpret. Here are some examples:

julia> typemax(Int64)
9223372036854775807
julia> reinterpret(Float64, typemax(Int64))
NaN
julia> reinterpret(Int64, NaN64)
9221120237041090560
julia> reinterpret(Float64, typemax(Int64)-1)
NaN
julia> reinterpret(Float64, typemax(Int64)-100)
NaN

BioTurboNick · 2021-11-28T22:02:13Z

> Not sure if this is still useful, but #43035 was asking for invalid floats from reinterpret. Here are some examples:
>
> julia> typemax(Int64)
> 9223372036854775807
> julia> reinterpret(Float64, typemax(Int64))
> NaN
> julia> reinterpret(Int64, NaN64)
> 9221120237041090560
> julia> reinterpret(Float64, typemax(Int64)-1)
> NaN
> julia> reinterpret(Float64, typemax(Int64)-100)
> NaN

Thanks, but by invalid, we mean something that would have undefined behavior because the bit representation was not considered valid. NaNs are all valid bit patterns for Float64 and are processed as such.

BioTurboNick · 2021-11-28T22:04:56Z

That's in contrast with Bools, which are only considered valid (in Julia/LLVM) if just the first bit is set or all bits are zero. Any other combination of values can lead to unexpected behavior because the compiler assumes the other bits in a Bool are all 0s.
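A short sketch of the contrast, restricted to operations that are well defined: it only inspects the two valid Bool patterns and does not construct an invalid Bool.

```julia
# Any Int64 bit pattern is a valid Float64; this one happens to be a NaN.
x = reinterpret(Float64, typemax(Int64))
isnan(x)                     # true

# A Bool occupies one byte, but only two of the 256 patterns are valid.
sizeof(Bool)                 # 1
reinterpret(UInt8, false)    # 0x00
reinterpret(UInt8, true)     # 0x01
```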

BioTurboNick · 2022-06-17T14:08:21Z

> > > A remaining issue is Bool trap representations; however, the current reinterpret allows these trap representations via bitcast, so it seems like that should be a separate issue, if that needs to be hardened.
> >
> > Since "trap" values are not exclusive to booleans (as the proposed docstring by @tkf mentions, no constructor guarantees are observed), booleans could serve as a good example in the documentation to illustrate why people still have to check whether their desired bitcast is valid for their use case or not. I still don't believe it should check anything/error here, since we have convert for that behavior (well, or we change booleans to care about more than just the LSB, though I'm fairly certain that's more controversial than having the behavior as-is).
>
> Gotcha. Looks like the bool issue is best left for #34909

Looks like that bool issue is about to be fixed by #45689?

Seelengrab · 2022-06-17T15:27:36Z

> Looks like that bool issue is about to be fixed by #45689?

It's only booleans though that are affected by this - other kinds of constructor constraints still don't get checked. In effect, the fix just loses us a convenient example with a builtin datatype 🤷

BioTurboNick · 2022-06-17T21:38:05Z

> > Looks like that bool issue is about to be fixed by #45689?
>
> It's only booleans though that are affected by this - other kinds of constructor constraints still don't get checked. In effect, the fix just loses us a convenient example with a builtin datatype 🤷

I think it's clear that the compiler considers the high 7 bits to be padding and is free to assume they're 0s, so arguably it's more in line with padding concerns than invalid bit patterns within valid bits. This creates undefined behavior that works sometimes and not in others.

That is:

struct Foo
    x::Int
    function Foo(x::Int)
        # Parentheses are needed: && binds tighter than ||.
        (x < 0 || x > 1) && throw(ArgumentError("x must be 0 or 1"))
        new(x)
    end
end

This struct considers any other values to be invalid Foos, and it's possible to make an invalid Foo by reinterpreting 4 as a Foo, but the effect of doing that will be defined.

Can we construct an example where behavior would be undefined, and not just wrong?
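The "wrong but defined" case can be reproduced with the array method, which is available regardless of whether the scalar struct method from this PR exists. This is a sketch; ZeroOrOne is a hypothetical stand-in for the Foo struct above.

```julia
# Hypothetical stand-in for Foo: the inner constructor only admits 0 or 1.
struct ZeroOrOne
    x::Int
    function ZeroOrOne(x::Int)
        (x < 0 || x > 1) && throw(ArgumentError("x must be 0 or 1"))
        new(x)
    end
end

# Reinterpreting raw Ints bypasses the constructor entirely.
bad = reinterpret(ZeroOrOne, [4])[1]
bad.x    # 4: violates the type's invariant, but reading the field is well defined
```

The invariant is broken, yet every operation on bad.x is an ordinary Int operation, which is what distinguishes this case from an out-of-range Bool.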

WIP: make reinterpret work on structs

cc96c9b

andyferris requested a review from Keno July 23, 2019 12:27

andyferris mentioned this pull request Jul 23, 2019

reinterpret SVector as Vector{SVector} JuliaArrays/StaticArrays.jl#634

Open

chethega reviewed Jul 24, 2019

BioTurboNick reviewed Nov 11, 2021

BioTurboNick mentioned this pull request Nov 11, 2021

Reinterpret code for tuples of bitstypes #43035

Closed

tkf mentioned this pull request Nov 13, 2021

RFC: unsafe_bitcast #43065

Closed

BioTurboNick mentioned this pull request Jun 17, 2022

[Add-on to] WIP: make reinterpret work on structs #45723

Closed

BioTurboNick mentioned this pull request Oct 9, 2022

Round-trip reintrepretation of all bits types #47116

Merged

Seelengrab mentioned this pull request Jan 5, 2023

Negative-length arrays are empty #48133

Closed

vtjnash closed this Jun 22, 2023

vtjnash deleted the ajf/reinterpret branch June 22, 2023 15:06
