Proposal: initial `@bitCast` semantics (packed + vector + array)
#19755
Labels: accepted, backend-c, backend-self-hosted, breaking, frontend, proposal
This is a partial resurrection of #10547 with an initially reduced scope and taking into account the packed struct changes since then.
The status quo implementations of `@bitSizeOf` and `@bitCast` are inconsistent across different types and fairly unimplementable across the various backends. According to the original rejected proposal, `@bitCast` now has the semantics of loading from a `@ptrCast`ed pointer; however, this is plainly not true for status quo due to #17802 changing a `@ptrCast` to a `@bitCast` with the explicit goal of fixing undefined behavior (related to load/store sizes). I'm also not convinced that this would be a usable definition anyway, since even a simple value like `var x: u20 = 0xABCDE;` may be represented in memory in many different ways, depending on the target and backend:

- `DE BC XA` (current behavior of little-endian targets with the llvm backend)
- `EX CD AB`
- `DE BC XA XX` (current behavior of little-endian targets with the c backend, and on the x86_64 backend)
- `EX CD AB XX`
- `XX DE BC XA`
- `XX EX CD AB`
- `AB CD EX`
- `XA BC DE` (current behavior of big-endian targets with the llvm backend)
- `AB CD EX XX`
- `XA BC DE XX`
- `XX AB CD EX`
- `XX XA BC DE` (current behavior of big-endian targets with the c backend)

However, I think an intrinsic like `@bitCast` should be defined in a way that does not invoke this complexity, whereas it seems perfectly reasonable and necessary to define pointer casting in terms of the target- and backend-specific memory layout. Additionally, since pointer casting is already legal, adding an intrinsic defined precisely in terms of it would not add any functionality to the language. It could be argued that two things having the same semantics also violates "only one obvious way to do things".
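To make this concrete, here is a small status-quo sketch (not part of the proposal) that observes the backing bytes of a `u20` through plain pointer reinterpretation; which of the layouts listed above it sees depends entirely on the target and backend, which is exactly why "load through a `@ptrCast`ed pointer" is not a portable definition:

```zig
const std = @import("std");

test "the bytes backing a u20 are target- and backend-specific" {
    const x: u20 = 0xABCDE;
    // Reinterpreting the storage as bytes (the rejected definition of
    // @bitCast) exposes the in-memory layout, padding bits included.
    const bytes = std.mem.asBytes(&x);
    // No assertion on the contents: which of the layouts listed above
    // you observe depends on endianness and backend. Only the storage
    // size is portable.
    try std.testing.expect(bytes.len == @sizeOf(u20));
}
```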
This means that `@bitCast` actually needs a specific definition (such as in a language spec 🙄), but since it currently doesn't, it has different semantics for different types and is implemented inconsistently across the compiler. By defining `@bitCast` in a target- and backend-agnostic way, this operation becomes "safer" in some sense than `@ptrCast`, since you don't have to worry about it behaving differently on a big-endian target, for example. I believe this leads to a clear delineation of use cases that makes `@bitCast` worth having in the language as a separate concept.

The main motivation for resurrecting this proposal, and an argument that was not explored in the original proposal, is the effect of `@bitCast` on vectors. With vectors rightly not having a well-defined memory layout (given the wide variety of vector semantics across architectures), we lose the ability to convert between differently packed vectors, or even just between `@Vector(8, bool)`, `@Vector(8, u1)`, `@Vector(8, i1)`, `u8`, and `i8`. While `@bitCast` could be defined elementwise on vectors, and it's possible to convert from `bool` with `@select` and to `bool` with comparisons, that doesn't solve the use case of converting a vector to an integer.

I am going to start off with the reasonable assumptions that
`@bitSizeOf` should work for all types that are allowed for `@bitCast`, and that `@as(To, @bitCast(@as(From, from)))` requires that `@bitSizeOf(To) == @bitSizeOf(From)` and performs a copy of that number of bits. The open question is what types should be allowed and how the order of these bits is defined for each of those types. I propose starting off with a limited, fairly uncontroversial set, leaving more complicated cases for a future proposal, in order to unblock progress on the backends more quickly.

The proposed types to be allowed initially, along with the value that `@bitSizeOf` would return (these are the types currently allowed as a `packed struct` field):

- `void`: `0` bits
- `bool`: `1` bit
- `uN`: `N` bits
- `iN`: `N` bits
- `fN`: `N` bits
- `*T`, `?*T`, `[*]T`, `?[*]T`, `[*c]T`, `usize`, `isize`, for runtime-allowed `T`: `@bitSizeOf(usize)` bits (note that this is not allowed as the type of a `@bitCast` in favor of `@ptrFromInt`, `@intFromPtr`, and `@ptrCast`)
- `enum (T)`: `@bitSizeOf(T)` bits (note that this is not allowed as the type of a `@bitCast` in favor of `@enumFromInt` and `@intFromEnum`)
- `packed struct (T)`: `@bitSizeOf(T)` bits
- `packed union`: `comptime size: { var size = 0; for (@typeInfo(U).Union.fields) |field| size = @max(size, @bitSizeOf(field.type)); break :size size; }` (note that Proposal: don't allow unused bits in packed unions #19754 (comment) will vastly simplify this to just `@bitSizeOf(T)` as in the previous case)
- `[N]T`, for runtime-allowed `T`: `N * @bitSizeOf(T)` bits
- `@Vector(N, T)`, for runtime-allowed `T`: `N * @bitSizeOf(T)` bits (note that this is currently a packable type, but I don't think it should be, given that arrays aren't allowed)

If you number bits from lsb to msb, starting at the first field of a packed struct or the first element of an array or vector, then for two types `@bitCast` would copy each numbered bit of one type to the same-numbered bit of the other type. This matches the way `packed struct` orders bits and is meant to be consistent with that.

Types to consider for future proposals:
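As a concrete illustration of the bit-numbering rule above, here is a minimal sketch using status-quo `packed struct` semantics, which already order bits this way (first field in the least significant bits of the backing integer):

```zig
const std = @import("std");

// Hypothetical example type: `lo` occupies bits 0..3, `hi` bits 4..7.
const Nibbles = packed struct(u8) { lo: u4, hi: u4 };

test "bit numbering matches packed struct field order" {
    const n: Nibbles = .{ .lo = 0xE, .hi = 0xA };
    // Bit 0 of Nibbles is the lsb of its first field, so the backing
    // integer is 0xAE: lo = 0xE in the low nibble, hi = 0xA in the high.
    const x: u8 = @bitCast(n);
    try std.testing.expectEqual(@as(u8, 0xAE), x);
}
```

The proposal extends this same numbering to arrays and vectors, with the first element playing the role of the first field.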
Related:
- `@bitCast` allowed to cast to/from vectors of pointers #18936
- `packed` semantics #19660