Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect layout in extern struct #16633

Closed
menduz opened this issue Jul 31, 2023 · 4 comments
Closed

Incorrect layout in extern struct #16633

menduz opened this issue Jul 31, 2023 · 4 comments
Labels
bug Observed behavior contradicts documented or intended behavior

Comments

@menduz
Copy link

menduz commented Jul 31, 2023

Zig Version

0.11.0-dev.4308+417b92f08

Steps to Reproduce and Observed Behavior

Proof of concept and reproduction code: https://zig.godbolt.org/z/34vx4MGbe

Based on my understanding of the semantics of extern struct, I'd expect to have a C compatible memory layout. Paddings are being added as it were no extern.

Context

// The C api I'm integrating (Steamworks) provides a pointer to a data 
// structure and a type id, based on the type ID, we can derive an extern type `T`.

// To create a copy of the value T on the stack, my initial approach was to use a pointer
// and copy the value
var value: T = @as(*T, @ptrCast(@alignCast(slice))).*; // ❌

// Instead, using the type information, I can get the compiler to read the data field by field
var value: T = std.mem.zeroes(T);
const struct_info = @typeInfo(T).Struct;
var start: usize = 0;

inline for (struct_info.fields) |field| {
    if (!field.is_comptime) {
        const end = start + @sizeOf(field.type);
        if (end > slice.len) @panic("overflow");

        // copy the slice of memory corresponding to the field
        @memcpy(std.mem.asBytes(&@field(ret, field.name)), slice[start..end]);
        
        start = end;
    }
}

Which partially solves the problem of reading the data, but it doesn't solve the interoperability. More importantly, the reader is not recursive, making nested structs not work either.

Here is a C memory dump of a struct 0100000052b0554e00008601, the PoC code outputs the following:


pub const LobbyCreated_t = extern struct {
    m_eResult: c_int = 0,
    m_ulSteamIDLobby: u64 = 0,
};

Will create a type example.LobbyCreated_t using the bytes 0100000052b0554e00008601:
  ret[m_eResult] = slice[0..4 (4)] = 01000000 offset_in_c=0 offset_in_zig=0
  ret[m_ulSteamIDLobby] = slice[4..12 (8)] = 52b0554e00008601 offset_in_c=4 offset_in_zig=8
                                 there is a 4 byte difference ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  ret = example.LobbyCreated_t{ .m_eResult = 1, .m_ulSteamIDLobby = 109775242231394386 }
memory layout(zig): [c_int ]........[u64           ]
 memory dump (zig): 010000000000000052b0554e00008601
   memory dump (c): 01000000        52b0554e00008601
                            ^^^^^^^^ the zig code is adding 4 bytes of padding to the `extern struct`
pub const LobbyEnter_t = extern struct {
    m_ulSteamIDLobby: u64 = 0,
    m_rgfChatPermissions: u32 = 0,
    m_bLocked: bool = false,
    m_EChatRoomEnterResponse: u32 = 0,
};

Will create a type example.LobbyEnter_t using the bytes 52b0554e000086010000000000f5452f01000000:
  ret[m_ulSteamIDLobby] = slice[0..8 (8)] = 52b0554e00008601 offset_in_c=0 offset_in_zig=0
  ret[m_rgfChatPermissions] = slice[8..12 (4)] = 00000000 offset_in_c=8 offset_in_zig=8
  ret[m_bLocked] = slice[12..13 (1)] = 00 offset_in_c=12 offset_in_zig=12
  ret[m_EChatRoomEnterResponse] = slice[13..17 (4)] = f5452f01 offset_in_c=13 offset_in_zig=16
                                                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  ret = example.LobbyEnter_t{ .m_ulSteamIDLobby = 109775242231394386, .m_rgfChatPermissions = 0, .m_bLocked = false, .m_EChatRoomEnterResponse = 19875317 }
memory layout(zig): [u64           ][u32   ][]......[u32   ]........
 memory dump (zig): 52b0554e000086010000000000000000f5452f0100000000
   memory dump (c): 52b0554e000086010000000000      f5452f01000000
                                              ^^^^^^ three extra bytes

Expected Behavior

I'd expect to have compatible memory layouts between C abi and extern struct

@menduz menduz added the bug Observed behavior contradicts documented or intended behavior label Jul 31, 2023
@rohlem
Copy link
Contributor

rohlem commented Jul 31, 2023

Zig gives each type a "default"/"native"/"natural" alignment per target configuration. (C compilers do the same.)
For u64 (on the target chosen in the linked Godbolt example) that seems to be 8 (bytes), which is why 4 bytes of padding are inserted after the (4-byte) c_int field in LobbyCreated_t for example.
You can override a field's alignment using align(n) syntax for struct fields: field_name: type = default_value align(alignment_bytes).
(This is only mentioned in the test_aligned_struct_fields.zig example under packed struct in the language reference, but available for any struct and union field.)

It's possible that Zig doesn't mimic your system C compiler with these alignments (making it C-ABI incompatible), which would be a bug.
However, for LobbyEnter_t, the u32 is actually on align(1), which I find unlikely to be the "default"/"native"/"natural" alignment of u32 chosen by your compiler.

Therefore I think it's more likely that the structs are actually defined differently in C, with something like #pragma pack.
In this case the fields are actually under-aligned (below their "default"/"native"/"natural" alignment) to be byte-aligned (which would be align(1)).
If you define all of the fields to be align(1), you'll find all the layouts matching up your current expectations. (For now there's no shorter way to write this - you have to mark every field individually.)
(save for LobbyDataUpdate_t, which C seems to have padded to include 3 garbage bytes - or maybe your C dumping code was inexact there).

@menduz
Copy link
Author

menduz commented Jul 31, 2023

@rohlem many many thanks. I was omitting the code was packed with #pragma pack.

I can confirm this is not a bug. Sadly, some alignments are platform dependent as well, and some structs have their own alignment. A sad afternoon waits for me.

The extra bytes are emitted by the library. I assume the structs are not up to date with the internal data structures.

@menduz menduz closed this as completed Jul 31, 2023
@menduz
Copy link
Author

menduz commented Aug 9, 2023

For future reference, after very long and expensive days of trying too hard, I couldn't find/brute force an elegant way to derive the same alignment of the C code from Zig.

My stop-loss insurance was to create a file with lots of std::fprintf(stdout, alignof(...)) and run it once per platform. Yielding the following

https://github.com/menduz/zig-steamworks/blob/158780e84088b742894da9a4b0e778a12cb470e6/steamworks/align-info-windows.json#L102-L118

https://github.com/menduz/zig-steamworks/blob/158780e84088b742894da9a4b0e778a12cb470e6/steamworks/align-info-linux.json#L102-L118

Then I used the differences in the alignment to explicitly set the alignment for each field for each struct of the library https://github.com/menduz/zig-steamworks/blob/158780e84088b742894da9a4b0e778a12cb470e6/src/main.zig#L963

@rohlem
Copy link
Contributor

rohlem commented Aug 9, 2023

@menduz Interesting finds. If @cImport + @cInclude / zig translate-c works on those headers, then that should result in ABI-compatible types (which may differ per target platform / configuration, so provide -target= when using translate-c).
If those types still have different layout / alignment, then that would be worth a continuation or follow-up bug report.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Observed behavior contradicts documented or intended behavior
Projects
None yet
Development

No branches or pull requests

2 participants