[AutoBump] Merge with 770393bb (Jun 17) (79) #343

Merged
merged 102 commits into from
Sep 16, 2024

Conversation

mgehre-amd
Collaborator

No description provided.

KB9 and others added 30 commits June 14, 2024 16:09
…3432)

This prevents an assertion being triggered by the cast to FloatType.

Fixes llvm#92064
llvm#95619)

This reverts commit eca988a. The
underlying libc issue was fixed by PR#95576.

The original PR is llvm#95436, which adds printf, putchar and vprintf to the
baremetal entrypoints.
Fixes llvm#93711 .
This patch implements the ``fdopen`` function. Given that ``fdopen`` 
internally calls ``fcntl``, the implementation of ``fcntl`` has been
moved to the ``__support/OSUtil``, where it serves as an internal public
function.
This change fixes the PowerPC lit tests that are failing due to the
recent change to hoist constant-sized allocas at flang codegen. Three of
these changed lit tests are entirely rewritten to use variables instead
of numbered LLVM IR.
…uncf (llvm#95346)

Add a `fastMathAttr` to `arith::extf` and `arith::truncf`. If these two
ops are inserted by promotion passes (like legalize-to-f32 /
emulate-unsupported-floats), they will be labeled with
`FastMathFlags::contract`, denoting that they can then be eliminated by
the canonicalizer.

This elimination can improve performance, but may introduce some
numerical differences.
This brings the unmodified SipHash reference implementation:
  https://github.com/veorq/SipHash
which has been very graciously licensed under our llvm license
(Apache-2.0 WITH LLVM-exception) by Jean-Philippe Aumasson.

SipHash is a lightweight hash function optimized for speed on short
messages. We use it as part of the AArch64 ptrauth ABI (in arm64e and
ELF PAuth) to generate discriminators based on language identifiers and
mangled names.

This commit brings the unmodified reference implementation and tests
as of f26d35e, specifically siphash.c and vectors.h, as SipHash.cpp and
SipHashTest.cpp.

Next, we will integrate it properly into libSupport, with a wrapping API
suited for the ptrauth use-case.
Start building it as part of the library, with some minor
tweaks compared to the reference implementation:
- clang-format to match libSupport
- remove tracing support
- add file header
- templatize cROUNDS/dROUNDS, as well as 8B/16B result length
- replace assert with static_assert
- use LLVM_FALLTHROUGH

This also exports interfaces for SipHash-2-4-64/-128, and
tests them using the reference test vectors.
This finally wraps the now-lightly-modified SipHash C reference
implementation, for the main interface we need (16-bit ptrauth
discriminators).

The exact algorithm is the little-endian interpretation of the
non-doubled (i.e. 64-bit) result of applying a SipHash-2-4 using the
constant seed `b5d4c9eb79104a796fec8b1b428781d4` (big-endian), with the
result reduced by modulo to the range of non-zero discriminators (i.e.
`(rawHash % 65535) + 1`).
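
The final reduction step described above can be sketched in isolation. This is a minimal illustration only: `reduceToDiscriminator` is a hypothetical name, not the interface added to libSupport, and the 64-bit input is assumed to be the SipHash-2-4 result already computed elsewhere.

```cpp
#include <cstdint>

// Hypothetical helper (not the libSupport API): maps a raw 64-bit hash to a
// 16-bit discriminator in the range [1, 65535], so zero is never produced.
constexpr uint16_t reduceToDiscriminator(uint64_t rawHash) {
  return static_cast<uint16_t>((rawHash % 65535) + 1);
}

// The smallest and largest remainders map to the ends of the non-zero range.
static_assert(reduceToDiscriminator(0) == 1, "remainder 0 maps to 1");
static_assert(reduceToDiscriminator(65534) == 65535, "remainder 65534 maps to 65535");
// 65535 is a multiple of the modulus, so it wraps back to 1.
static_assert(reduceToDiscriminator(65535) == 1, "wraps around");
```

Taking the remainder modulo 65535 (rather than 65536) leaves exactly 65535 possible remainders, which the `+ 1` shifts onto the non-zero 16-bit values.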

By "stable" we mean that the result of this hash algorithm will be the same
across different compiler versions and target platforms.

The 16-bit hashes are used extensively for the AArch64 ptrauth ABI,
because AArch64 can efficiently load a 16-bit immediate into the high
bits of a register without disturbing the remainder of the value, which
serves as a nice blend operation.
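
As a rough model of that blend, the following sketch overwrites only the top 16 bits of a 64-bit value and leaves the low 48 bits untouched. The function name and the choice of bit positions 48 to 63 are assumptions made for illustration; the commit message itself only says "the high bits".

```cpp
#include <cstdint>

// Hypothetical model of the blend: place a 16-bit discriminator in bits
// 48..63 of a 64-bit value while preserving bits 0..47.
constexpr uint64_t blendDiscriminator(uint64_t value, uint16_t discriminator) {
  return (value & 0x0000FFFFFFFFFFFFull) |
         (static_cast<uint64_t>(discriminator) << 48);
}

// Only the top 16 bits change; the low 48 bits pass through unchanged.
static_assert(blendDiscriminator(0x0000123456789ABCull, 0xBEEF) ==
                  0xBEEF123456789ABCull,
              "discriminator lands in the high 16 bits");
```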

16 bits is also sufficiently compact to not inflate a loader relocation.
We disallow zero to guarantee a different discriminator from the places
in the ABI that use a constant zero.

Co-authored-by: John McCall <rjmccall@apple.com>
To implement SaveCore for ELF binaries we need to populate some
additional fields in the prpsinfo struct. Those fields are the nice
value of the process whose core is to be taken, as well as a boolean flag
indicating whether or not that process is a zombie. This commit adds
those, along with tests to ensure that the values are consistent with
expectations.
Most of the InlineDescriptor fields were unused for global variables.
But more importantly, we need to differentiate between global variables
that are uninitialized because they didn't have an initializer when we
originally created them, and ones that are uninitialized because they
DID have an initializer, but evaluating it failed.
…eferencing pointer to pointers (llvm#95298)

This is a different implementation to llvm#94100, which has been reverted.

When -fdebug-info-for-profiling is specified, for any Load expression, if
the pointer operand is not a declared variable, clang will emit debug
info describing the type of the pointer operand (which can be an
intermediate expression).
In GNU ld, -r forces -Bstatic and has precedence over -Bdynamic: -lfoo
probes libfoo.a but not libfoo.so, even if -Bdynamic is in effect. Our
behavior currently matches gold and probes libfoo.so. Since we don't
have a strong opinion on the exact behavior, let's just follow GNU ld and
also unify the reason we report the "attempted static link of dynamic
object" error.

Close llvm#94958
…m#95394)

Unlike the existing fadd cases, choose to ignore the requirement for
amdgpu-unsafe-fp-atomics in case of fine-grained memory access. This
is to minimize migration pain to the new atomic control metadata. This
should not break any users, as the atomic intrinsics are still
directly consumed, and clang does not yet produce vector FP atomicrmw.
ARMISD::SUBS is a duplicate of ARMISD::SUBC.
The node was introduced in 5745b6a. This patch replaces SUBS with SUBC
and reverts changes in *.td files.
Rewrite divideCeil, divideNearest, divideFloorSigned, and
divideCeilSigned to never overflow.
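
To illustrate the kind of overflow at stake, here is a minimal sketch for the unsigned case. Both function names are hypothetical and neither is claimed to be the code in this commit; the "naive" version is the common `(n + d - 1) / d` idiom, and the second is one overflow-free formulation among several.

```cpp
#include <cstdint>

// Common idiom: the addition wraps around when n is close to UINT64_MAX.
constexpr uint64_t divideCeilNaive(uint64_t n, uint64_t d) {
  return (n + d - 1) / d;
}

// One overflow-free formulation (an assumption, not necessarily the one the
// commit uses): divide first, then round up if there was a remainder.
// The denominator d must be non-zero.
constexpr uint64_t divideCeilNoOverflow(uint64_t n, uint64_t d) {
  return n / d + (n % d != 0 ? 1 : 0);
}

static_assert(divideCeilNoOverflow(7, 2) == 4, "rounds up");
static_assert(divideCeilNoOverflow(8, 2) == 4, "exact division");
// Near the top of the range the naive idiom wraps to 0, the other does not.
static_assert(divideCeilNaive(UINT64_MAX, 2) == 0, "wrapped result");
static_assert(divideCeilNoOverflow(UINT64_MAX, 2) == 0x8000000000000000ull,
              "correct result");
```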
Fix the test case from llvm#95298: another recently submitted patch removed
the llvm.dbg intrinsics, so the test case is updated accordingly.
Multiple static instances of this utility function have been found in
different GlobalISel files.
Unify them by adding a single instance in utils.cpp.
…llvm#95578)

It matches the legalization of buffer loads similar to the SelectionDAG.
xgupta and others added 25 commits June 17, 2024 09:56
The dead code is caught by PVS studio analyzer -
https://pvs-studio.com/en/blog/posts/cpp/1126/, fragment N12.

Warning message -
V523 The 'then' statement is equivalent to the 'else' statement.
Options.cpp 1212
…lvm#95601)

This patch adds folds for the cases where both operands are the same or
where it can be established that the first operand is less than, equal
to, or greater than the second operand.
close: llvm#94737
alive2: https://alive2.llvm.org/ce/z/WF_7mX

In this patch, we combine `(X + Y) / 2` into `(X & Y)` only when both X
and Y are less than or equal to 1.
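
The range restriction can be checked exhaustively with a small sketch. The helper below is hypothetical and uses plain unsigned arithmetic as a stand-in for the IR-level operation (the alive2 link above is the authoritative proof); it needs C++14 or later for the loops in a `constexpr` function.

```cpp
// Hypothetical check: does (x + y) / 2 == (x & y) hold for every pair of
// unsigned values x, y in [0, bound]?
constexpr bool foldHoldsUpTo(unsigned bound) {
  for (unsigned x = 0; x <= bound; ++x)
    for (unsigned y = 0; y <= bound; ++y)
      if ((x + y) / 2 != (x & y))
        return false;
  return true;
}

// Holds when both operands are at most 1...
static_assert(foldHoldsUpTo(1), "fold is valid for operands in {0, 1}");
// ...but already fails at 2: (0 + 2) / 2 == 1 while (0 & 2) == 0.
static_assert(!foldHoldsUpTo(2), "fold is invalid once 2 is allowed");
```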
…llvm#95521)

Sinking currently only supports instructions that have zero or one uses.
Extend this to handle instructions with any number of uses, as long as
all uses are consistent (i.e. the "same" for all sinking candidates).

After llvm#94462 this is basically just a matter of looping over all uses
instead of checking the first one only.
…based on' (llvm#95650)

As discussed in
https://discourse.llvm.org/t/getelementptr-inbounds-inbounds-of-which-allocation/79024,
we need the pointer to be inbounds of *the* allocated object the pointer
is based on, not just any allocated object.
…lvm#95558)

Expand all constant expressions that use fat pointers upfront, so that
the rewriting logic only has to deal with instructions and not the
constant expression variants as well.

My primary motivation is to remove the creation of illegal constant
expressions (mul and shl) from this pass, but this also cuts down quite
a bit on the amount of duplicate logic.
This MIR test case is added to observe how VGPR lanes are consumed for
SGPR spills during the si-lower-sgpr-spills pass of the AMDGPU pass
pipeline. In this pass, stack slots are mapped to available VGPR lanes
for spilling, which removes the need for those stack slots.

Currently, each new SGPR spill goes into a new VGPR lane, because each is
mapped from the distinct stack slot assigned to it during the SGPR
allocation pass. The added test case shows this clearly.
For RISC-V, it's always 0 and I don't see any reason we will
change it in the future.
For single-index GEPs the source and result element types are the
same, but using the source type is semantically more correct.
llvm#91871)

This PR adds initial support for the `scmp`/`ucmp` 3-way comparison
intrinsics in the SelectionDAG. Some of the expansions/lowerings
are not optimal yet.
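
For readers unfamiliar with 3-way comparison, the semantics can be sketched in scalar C++ as below: the result is -1, 0, or 1 depending on whether the first operand is less than, equal to, or greater than the second, with the operands interpreted as signed or unsigned respectively. The function names and the 32-bit operand width are chosen for illustration, and the `(a > b) - (a < b)` form is just one well-known expansion, not necessarily the one this PR emits.

```cpp
#include <cstdint>

// Signed 3-way comparison: -1 if a < b, 0 if a == b, 1 if a > b.
constexpr int scmpSketch(int32_t a, int32_t b) {
  return (a > b) - (a < b);
}

// Unsigned 3-way comparison with the same result convention.
constexpr int ucmpSketch(uint32_t a, uint32_t b) {
  return (a > b) - (a < b);
}

// The same bit pattern compares differently under the two interpretations.
static_assert(scmpSketch(-1, 1) == -1, "signed: -1 is less than 1");
static_assert(ucmpSketch(0xFFFFFFFFu, 1u) == 1, "unsigned: all-ones is greater");
static_assert(scmpSketch(3, 3) == 0, "equal operands give 0");
```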
…lvm#95531)

This produces better/more canonical codegen than the generic LLVM
lowering, which is a pattern the backend currently does not recognize.
See: llvm#81840.
Only gfx908 was tested, and the returning versions weren't tested.
Follow up on llvm#95087 to fix incorrect usage instances of
divideCeilSigned.
This represents the enum type that can be assigned to a field using the
`<enum>` element in the target XML.

https://sourceware.org/gdb/current/onlinedocs/gdb.html/Enum-Target-Types.html

Each enumerator has:
* A non-empty name
* A value that is within the range of the field it's applied to
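
The two requirements above could be checked along the following lines. This is a sketch under stated assumptions: `EnumeratorSketch` and `isValidForField` are hypothetical names, not the `Enumerator` class this commit adds to RegisterFlags, and the field is assumed to be described by inclusive least and most significant bit positions.

```cpp
#include <cstdint>

// Hypothetical stand-in for one enumerator: a name and a numeric value.
struct EnumeratorSketch {
  const char *name;
  uint64_t value;
};

// True if the name is non-empty and the value fits in a field that occupies
// bits [lsb, msb] inclusive. Fields of 64 bits or more accept any value.
constexpr bool isValidForField(EnumeratorSketch e, unsigned lsb, unsigned msb) {
  return e.name != nullptr && e.name[0] != '\0' && msb >= lsb &&
         ((msb - lsb + 1) >= 64 ||
          e.value <= ((uint64_t{1} << (msb - lsb + 1)) - 1));
}

// A 2-bit field holds values 0..3, so 3 fits and 4 does not.
static_assert(isValidForField(EnumeratorSketch{"max", 3}, 0, 1), "fits");
static_assert(!isValidForField(EnumeratorSketch{"big", 4}, 0, 1), "too large");
// An empty name is rejected regardless of the value.
static_assert(!isValidForField(EnumeratorSketch{"", 0}, 0, 1), "empty name");
```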

The XML includes a "size" but we don't need that for anything and it's a
pain to verify so I've left it out of our internal structures. When
emitting XML we'll set size to the size of the register using the enum.

An Enumerator class is added to RegisterFlags and hooked up to the
existing ToXML so lldb-server can use it to emit enums as well.

As enums are elements on the same level as flags, when emitting XML
we'll do so via the registers. Before emitting a flags element we look
at all the fields and see what enums they reference. Then print all of
those if we haven't already done so.

Functions are added to dump enum information for `register info` to use
to show the enum information.
This PR adds debug support for fixed size character type. The character
type gets translated to DIStringType.

As I have noted in comments, DIStringType currently does not have a
way to represent the underlying character type of the string. This
restricts our ability to represent wide strings. As an example, this is
how the debugger shows two different types of string. Note that non-ASCII
characters work fine with the default kind of string.

  character(kind=4, len=5) :: str1
  character(len=16) :: str2
  str1 = 'hello'
  str2 = 'π = 3.14'

(gdb) p str1
$1 = 'h\000\000\000e\000\000\000l\000\000\000l\000\000\000o\000\000\000'

(gdb) p str2
$2 = 'π = 3.14       '
This doesn't need any work to be done in SROA itself, but rather in
functions that it uses. Specifically:
* DIExpression::createFragmentExpression is made to understand
DW_OP_LLVM_extract_bits
* valueCoversEntireFragment is made to check the active bits instead of
the fragment size, so that it handles extract_bits correctly
Co-authored-by: Louis Dionne <ldionne.2@gmail.com>
…bv(A, 0), 0) (llvm#95242)

There is an existing combine to remove the need for extract_subv that
requires matching vector types (all fixed or all scalable).

The combine doesn't need this restriction and so I've changed it to use
ValueType's "knownBits??" interface that supports mixed vector types. In
doing so we also need extra guards to prevent invalid operations (e.g.
extracting a scalable vector from a fixed length vector).
Fix passing temporary string object as argument to the StringRef
constructor in "parseRegister" function, because it causes errors in the
test "llvm/test/MC/Xtensa/Core/processor-control.s".
This should simplify handling of resulting value by the callers.
…94346)

The `llvm.invariant.start` intrinsic is already overloaded to work with
memory objects in any address space. We simply instantiate the intrinsic
with the appropriate pointer type.

Fixes llvm#94345.

Co-authored-by: Vito Kortbeek <kortbeek@synopsys.com>
@mgehre-amd mgehre-amd changed the title [AutoBump] Merge with 770393bb (Jun 17) (1) [AutoBump] Merge with 770393bb (Jun 17) (79) Sep 12, 2024
Base automatically changed from bump_to_c091dd48 to feature/fused-ops September 16, 2024 10:59
@mgehre-amd mgehre-amd merged commit 7e79490 into feature/fused-ops Sep 16, 2024
10 checks passed
@mgehre-amd mgehre-amd deleted the bump_to_770393bb branch September 16, 2024 10:59