Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

is_always_lock_free Returns Incorrect Value For 128 Bit Struct On ARM64 #75081

Open
insertinterestingnamehere opened this issue Dec 11, 2023 · 4 comments
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema"

Comments

@insertinterestingnamehere

With clang 17.1 on ARM64 I'm seeing std::atomic<T>::is_always_lock_free evaluate to true even though the member function std::atomic<T>::is_lock_free() evaluates to false for the same type.

Here's a minimal example:

#include <atomic>
#include <cstdint>
#include <iostream>

struct bigint_equiv {
  uint64_t first, second;
};

int main() {
  std::atomic<bigint_equiv> a;
  std::cout << a.is_lock_free() << " " << decltype(a)::is_always_lock_free << std::endl;
}

Possibly related: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70814 where the gcc devs discuss why libatomic doesn't allow 128 bit atomics to be lock-free on ARM.

When I compile the same code with gcc 13.2 is_always_lock_free is false which is strange since I've confirmed that the same libstdc++ is being included and linked. This also appears to be what happens on godbolt too though. With gcc the assembly (as best I can tell) shows a 0 being written out (https://godbolt.org/z/T7493K8xP) while with clang the assembly shows a 1 (https://godbolt.org/z/Gb35jMfjr).

@insertinterestingnamehere
Copy link
Author

It looks like libstdc++ falls back to the compiler built-in __atomic_always_lock_free, so maybe the bug is in clang's implementation of that built-in function. Alternatively, if clang is actually using 128 bit atomics on ARM, std::atomic<T>::is_lock_free() may just be misreporting that. I'm not sure.

@EugeneZelenko EugeneZelenko added clang:frontend Language frontend issues, e.g. anything involving "Sema" and removed new issue labels Dec 11, 2023
@llvmbot
Copy link
Member

llvmbot commented Dec 11, 2023

@llvm/issue-subscribers-clang-frontend

Author: Ian Henriksen (insertinterestingnamehere)

With clang 17.1 on ARM64 I'm seeing `std::atomic<T>::is_always_lock_free` evaluate to `true` even though the member function `std::atomic<T>::is_lock_free()` evaluates to `false` for the same type.

Here's a minimal example:

#include &lt;atomic&gt;
#include &lt;cstdint&gt;
#include &lt;iostream&gt;

struct bigint_equiv {
  uint64_t first, second;
};

int main() {
  std::atomic&lt;bigint_equiv&gt; a;
  std::cout &lt;&lt; a.is_lock_free() &lt;&lt; " " &lt;&lt; decltype(a)::is_always_lock_free &lt;&lt; std::endl;
}

Possibly related: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70814 where the gcc devs discuss why libatomic doesn't allow 128 bit atomics to be lock-free on ARM.

When I compile the same code with gcc 13.2 is_always_lock_free is false which is strange since I've confirmed that the same libstdc++ is being included and linked. This also appears to be what happens on godbolt too though. With gcc the assembly (as best I can tell) shows a 0 being written out (https://godbolt.org/z/T7493K8xP) while with clang the assembly shows a 1 (https://godbolt.org/z/Gb35jMfjr).

@jyknight
Copy link
Member

There's a whole chain of issues here:

  1. Clang does indeed emit native 128-bit atomic instructions on ARM64, despite that it turns a load into a compare-and-swap.
  2. You're using Clang compiler with GCC's libatomic runtime library, I presume. That is intended to work properly, but, here it does not, because the latter does not use 128-bit atomics on ARM64 (and is arguably correct in not doing so).
  3. libstdc++'s std::atomic's implementation of is_lock_free() calls the compiler builtin __atomic_is_lock_free, which should be evaluated at compile-time when the compiler knows the answer. But, libstdc++ uses return __atomic_is_lock_free(sizeof(_M_i), reinterpret_cast<void *>(-_S_alignment));, and Clang doesn't handle known integer values, and thus ends up making a runtime call into libatomic (which should only ever return true in more situations than the compiler, but, due to the above, does not).

So, firstly, we ought to add code to handle constant integer values of a pointer in __atomic_is_lock_free -- in which case we we'd evaluate it to true in the compiler, instead of making a runtime call. That fixes the surface symptom -- is_lock_free() will start returning true here.

But there's still an incompatibility, because the underlying assumption is that if the compiler ever uses inline "lock-free" atomics for a particular pointer/size, libatomic must also do so. Without that guarantee, libatomic may use a spinlock, which is not actually atomic with regards to native atomic instructions. So, secondly, we ought to stop emitting inline 128-bit atomics inline unless LSE2 is available (LSE providees actual atomic 128-bit load/store instructions). But, when we do this, we probably need to put it behind an option, and continue to use it in libatomic, or else Clang will be incompatible with older versions of itself.

@Logikable FYI.

@insertinterestingnamehere
Copy link
Author

Interesting! Thanks for clarifying what's going on.

I also just tested this on x86_64 and, surprisingly, this issue isn't unique to ARM. Compiling with -march=x86-64-v2 or really any similarly recent revision of x86-64 shows the same result, but the original -march=x86-64 does not.

jyknight added a commit to jyknight/llvm-project that referenced this issue Jul 17, 2024
…is_lock_free`.

The second argument passed to these builtins is used to validate
whether the object's alignment is sufficient for atomic operations of
the given size.

Currently, the builtins can be folded at compile time only when the
argument is 0/nullptr, or if the _type_ of the pointer guarantees
appropriate alignment.

This change allows the compiler to also evaluate non-null constant
pointers, which enables callers to check a specified alignment,
instead of only the type or an exact object. E.g.:
 `__atomic_is_lock_free(sizeof(T), (void*)4)`
can be potentially evaluated to true at compile time, instead of
generating a libcall. This is also supported by GCC, and used by
libstdc++.

Related to (but doesn't fix) issue llvm#75081.
jyknight added a commit to jyknight/llvm-project that referenced this issue Jul 17, 2024
…is_lock_free`.

The second argument passed to these builtins is used to validate
whether the object's alignment is sufficient for atomic operations of
the given size.

Currently, the builtins can be folded at compile time only when the
argument is 0/nullptr, or if the _type_ of the pointer guarantees
appropriate alignment.

This change allows the compiler to also evaluate non-null constant
pointers, which enables callers to check a specified alignment,
instead of only the type or an exact object. E.g.:
 `__atomic_is_lock_free(sizeof(T), (void*)4)`
can be potentially evaluated to true at compile time, instead of
generating a libcall. This is also supported by GCC, and used by
libstdc++.

Related to (but doesn't fix) issue llvm#75081.
jyknight added a commit that referenced this issue Jul 22, 2024
…is_lock_free`. (#99340)

The second argument passed to these builtins is used to validate whether
the object's alignment is sufficient for atomic operations of the given
size.

Currently, the builtins can be folded at compile time only when the
argument is 0/nullptr, or if the _type_ of the pointer guarantees
appropriate alignment.

This change allows the compiler to also evaluate non-null constant
pointers, which enables callers to check a specified alignment, instead
of only the type or an exact object. E.g.:
 `__atomic_is_lock_free(sizeof(T), (void*)4)`
can be potentially evaluated to true at compile time, instead of
generating a libcall. This is also supported by GCC, and used by
libstdc++, and is also useful for libc++'s atomic_ref.

Also helps with (but doesn't fix) issue #75081.

This also fixes a crash bug, when the second argument was a non-pointer
implicitly convertible to a pointer (such as an array, or a function).
sgundapa pushed a commit to sgundapa/upstream_effort that referenced this issue Jul 23, 2024
…is_lock_free`. (llvm#99340)

The second argument passed to these builtins is used to validate whether
the object's alignment is sufficient for atomic operations of the given
size.

Currently, the builtins can be folded at compile time only when the
argument is 0/nullptr, or if the _type_ of the pointer guarantees
appropriate alignment.

This change allows the compiler to also evaluate non-null constant
pointers, which enables callers to check a specified alignment, instead
of only the type or an exact object. E.g.:
 `__atomic_is_lock_free(sizeof(T), (void*)4)`
can be potentially evaluated to true at compile time, instead of
generating a libcall. This is also supported by GCC, and used by
libstdc++, and is also useful for libc++'s atomic_ref.

Also helps with (but doesn't fix) issue llvm#75081.

This also fixes a crash bug, when the second argument was a non-pointer
implicitly convertible to a pointer (such as an array, or a function).
yuxuanchen1997 pushed a commit that referenced this issue Jul 25, 2024
…is_lock_free`. (#99340)

The second argument passed to these builtins is used to validate whether
the object's alignment is sufficient for atomic operations of the given
size.

Currently, the builtins can be folded at compile time only when the
argument is 0/nullptr, or if the _type_ of the pointer guarantees
appropriate alignment.

This change allows the compiler to also evaluate non-null constant
pointers, which enables callers to check a specified alignment, instead
of only the type or an exact object. E.g.:
 `__atomic_is_lock_free(sizeof(T), (void*)4)`
can be potentially evaluated to true at compile time, instead of
generating a libcall. This is also supported by GCC, and used by
libstdc++, and is also useful for libc++'s atomic_ref.

Also helps with (but doesn't fix) issue #75081.

This also fixes a crash bug, when the second argument was a non-pointer
implicitly convertible to a pointer (such as an array, or a function).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema"
Projects
None yet
Development

No branches or pull requests

4 participants