[AutoBump] Merge with 770393bb (Jun 17) (79) #343

mgehre-amd · 2024-09-12T15:23:42Z

No description provided.

Fixes llvm#93711 . This patch implements the ``fdopen`` function. Given that ``fdopen`` internally calls ``fcntl``, the implementation of ``fcntl`` has been moved to the ``__support/OSUtil``, where it serves as an internal public function.

This change fixes the PowerPC lit tests that are failing due to the recent change to hoist constant-sized allocas at flang codegen. Three of these changed lit tests are entirely rewritten to use variables instead of numbered LLVM IR.

…uncf (llvm#95346) Add a `fastMathAttr` to `arith::extf` and `arith::truncf`. If these two ops are inserted by promotion passes (such as legalize-to-f32 or emulate-unsupported-floats), they are labeled with `FastMathFlags::contract`, denoting that the canonicalizer may then eliminate them. The elimination can help improve performance, though it may introduce some numerical differences.

This brings the unmodified SipHash reference implementation: https://github.com/veorq/SipHash which has been very graciously licensed under our llvm license (Apache-2.0 WITH LLVM-exception) by Jean-Philippe Aumasson. SipHash is a lightweight hash function optimized for speed on short messages. We use it as part of the AArch64 ptrauth ABI (in arm64e and ELF PAuth) to generate discriminators based on language identifiers and mangled names. This commit brings the unmodified reference implementation and tests as of f26d35e, specifically siphash.c and vectors.h, as SipHash.cpp and SipHashTest.cpp. Next, we will integrate it properly into libSupport, with a wrapping API suited for the ptrauth use-case.

Start building it as part of the library, with some minor tweaks compared to the reference implementation:
- clang-format to match libSupport
- remove tracing support
- add file header
- templatize cROUNDS/dROUNDS, as well as 8B/16B result length
- replace assert with static_assert
- use LLVM_FALLTHROUGH

This also exports interfaces for SipHash-2-4-64/-128, and tests them using the reference test vectors.

This finally wraps the now-lightly-modified SipHash C reference implementation, for the main interface we need (16-bit ptrauth discriminators).

The exact algorithm is the little-endian interpretation of the non-doubled (i.e. 64-bit) result of applying a SipHash-2-4 using the constant seed `b5d4c9eb79104a796fec8b1b428781d4` (big-endian), with the result reduced by modulo to the range of non-zero discriminators (i.e. `(rawHash % 65535) + 1`).

By "stable" we mean that the result of this hash algorithm will be the same across different compiler versions and target platforms.

The 16-bit hashes are used extensively for the AArch64 ptrauth ABI, because AArch64 can efficiently load a 16-bit immediate into the high bits of a register without disturbing the remainder of the value, which serves as a nice blend operation. 16 bits is also sufficiently compact to not inflate a loader relocation. We disallow zero to guarantee a different discriminator from the places in the ABI that use a constant zero.

Co-authored-by: John McCall <rjmccall@apple.com>
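The final reduction step in this commit message, `(rawHash % 65535) + 1`, can be sketched in isolation. This is only an illustration that assumes a 64-bit raw hash has already been computed; `reduce_to_discriminator` is a hypothetical name rather than the LLVM interface, and the SipHash computation itself is not reproduced.

```python
def reduce_to_discriminator(raw_hash: int) -> int:
    """Map a 64-bit raw hash to a non-zero 16-bit discriminator.

    Hypothetical helper: it only mirrors the `(rawHash % 65535) + 1`
    formula quoted in the commit message.
    """
    if not 0 <= raw_hash < 2**64:
        raise ValueError("raw_hash must fit in 64 bits")
    # The modulo yields 0..65534; adding one shifts the range to 1..65535,
    # so the constant-zero discriminator can never be produced.
    return (raw_hash % 65535) + 1
```

Every input lands in the range 1..65535, so the result always fits in 16 bits and is never zero.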

To implement SaveCore for ELF binaries, we need to populate some additional fields in the prpsinfo struct: the nice value of the process whose core is to be taken, and a boolean flag indicating whether that process is a zombie. This commit adds those fields, along with tests to ensure that the values are consistent with expectations.

Most of the InlineDescriptor fields were unused for global variables. But more importantly, we need to differentiate between global variables that are uninitialized because they didn't have an initializer when we originally created them, and ones that are uninitialized because they DID have an initializer, but evaluating it failed.

…eferencing pointer to pointers (llvm#95298) This is a different implementation from llvm#94100, which has been reverted. When -fdebug-info-for-profiling is specified, for any Load expression whose pointer operand is not a declared variable, clang will emit debug info describing the type of the pointer operand (which can be an intermediate expression).

In GNU ld, -r forces -Bstatic and has precedence over -Bdynamic: -lfoo probes libfoo.a but not libfoo.so, even if -Bdynamic is in effect. Our behavior currently matches gold and probes libfoo.so. Since we don't have a strong opinion on the exact behavior, let's just follow GNU ld and also unify the reason we report the "attempted static link of dynamic object" error. Close llvm#94958

…m#95394) Unlike the existing fadd cases, choose to ignore the requirement for amdgpu-unsafe-fp-atomics in case of fine-grained memory access. This is to minimize migration pain to the new atomic control metadata. This should not break any users, as the atomic intrinsics are still directly consumed, and clang does not yet produce vector FP atomicrmw.

…llvm#95521) Sinking currently only supports instructions that have zero or one uses. Extend this to handle instructions with any number of uses, as long as all uses are consistent (i.e. the "same" for all sinking candidates). After llvm#94462 this is basically just a matter of looping over all uses instead of checking the first one only.

…based on' (llvm#95650) As discussed in https://discourse.llvm.org/t/getelementptr-inbounds-inbounds-of-which-allocation/79024, we need the pointer to be inbounds of *the* allocated object the pointer is based on, not just any allocated object.

…lvm#95558) Expand all constant expressions that use fat pointers upfront, so that the rewriting logic only has to deal with instructions and not the constant expression variants as well. My primary motivation is to remove the creation of illegal constant expressions (mul and shl) from this pass, but this also cuts down quite a bit on the amount of duplicate logic.

This MIR test case is added to show how VGPR lanes are consumed for SGPR spills during the si-lower-sgpr-spills pass of the AMDGPU pass pipeline. In this pass, stack slots are mapped to available VGPR lanes for spilling, which removes the need for the stack slots. Currently, each new SGPR spill goes into a new VGPR lane, because each spill is mapped from the distinct stack slot assigned to it during the SGPR allocation pass. The added test case shows this clearly.

This represents the enum type that can be assigned to a field using the `<enum>` element in the target XML. https://sourceware.org/gdb/current/onlinedocs/gdb.html/Enum-Target-Types.html

Each enumerator has:
* A non-empty name
* A value that is within the range of the field it's applied to

The XML includes a "size" but we don't need that for anything and it's a pain to verify so I've left it out of our internal structures. When emitting XML we'll set size to the size of the register using the enum.

An Enumerator class is added to RegisterFlags and hooked up to the existing ToXML so lldb-server can use it to emit enums as well.

As enums are elements on the same level as flags, when emitting XML we'll do so via the registers. Before emitting a flags element we look at all the fields and see what enums they reference. Then print all of those if we haven't already done so.

Functions are added to dump enum information for `register info` to use to show the enum information.
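The two enumerator rules and the "size comes from the register" choice described in this commit message can be sketched as follows. This is not the lldb code: `build_enum_element` and its parameters are hypothetical, and the `enum`/`evalue` element and attribute names are taken from the GDB target description format as I understand it from the page linked in the commit message.

```python
import xml.etree.ElementTree as ET


def build_enum_element(enum_id, enumerators, field_bits, register_bytes):
    """Build an <enum> element from (name, value) pairs.

    Hypothetical helper for illustration only. It enforces the two rules
    from the commit message: names are non-empty and values fit the field.
    """
    # The size attribute is set from the register that uses the enum.
    elem = ET.Element("enum", {"id": enum_id, "size": str(register_bytes)})
    for name, value in enumerators:
        if not name:
            raise ValueError("enumerator name must be non-empty")
        if not 0 <= value < (1 << field_bits):
            raise ValueError(f"value {value} does not fit in {field_bits} bits")
        ET.SubElement(elem, "evalue", {"name": name, "value": str(value)})
    return elem


# A 1-bit field in a 4-byte register can only hold the values 0 and 1.
mode_enum = build_enum_element("mode_enum", [("off", 0), ("on", 1)], 1, 4)
xml_text = ET.tostring(mode_enum, encoding="unicode")
```

Passing a value of 2 for the same 1-bit field raises `ValueError`, which corresponds to the "within the range of the field" requirement.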

This PR adds debug support for the fixed size character type. The character type gets translated to DIStringType. As I have noticed in comments, DIStringType currently has no way to represent the underlying character type of the string. This restricts our ability to represent wide strings. As an example, this is how the debugger shows two different types of string. Note that non-ASCII characters work fine with the default kind string.

    character(kind=4, len=5) :: str1
    character(len=16) :: str2
    str1 = 'hello'
    str2 = 'π = 3.14'

    (gdb) p str1
    $1 = 'h\000\000\000e\000\000\000l\000\000\000l\000\000\000o\000\000\000'
    (gdb) p str2
    $2 = 'π = 3.14 '

This doesn't need any work to be done in SROA itself, but rather in functions that it uses. Specifically:
* DIExpression::createFragmentExpression is made to understand DW_OP_LLVM_extract_bits
* valueCoversEntireFragment is made to check the active bits instead of the fragment size, so that it handles extract_bits correctly

…bv(A, 0), 0) (llvm#95242) There is an existing combine to remove the need for extract_subv that requires matching vector types (all fixed or all scalable). The combine doesn't need this restriction and so I've changed it to use ValueType's "knownBits??" interface that supports mixed vector types. In doing so we also need extra guards to prevent invalid operations (e.g. extracting a scalable vector from a fixed length vector).

…94346) The `llvm.invariant.start` intrinsic is already overloaded to work with memory objects in any address space. We simply instantiate the intrinsic with the appropriate pointer type. Fixes llvm#94345. Co-authored-by: Vito Kortbeek <kortbeek@synopsys.com>

KB9 and others added 30 commits June 14, 2024 16:09

[mlir][tosa] Only match rfft2d of floats in linalg conversion (llvm#9…

93ffe17

…3432) This prevents an assertion being triggered by the cast to FloatType. Fixes llvm#92064

Reapply "[libc] printf, putchar and vprintf in bareemetal entrypoints… (

98b117e

llvm#95619) This reverts commit eca988a. The underlying libc issue was fixed by PR#95576. The original PR is llvm#95436 , which adds printf, putchar and vprintf in bareemetal entrypoints

[AMDGPU] Fix lit failure (llvm#95620)

84e9401

Lit test in llvm@7091dd2 was not updated for llvm@e7e90dd

[gn build] Port cfbed2c

2693811

[llvm][AArch64] Support -mcpu=apple-m4 (llvm#95478)

2b33591

[llvm][AArch64] Rearrange Apple CPUs by generation, not product class…

a0cef2b

…. NFC (llvm#95579)

[libc][__support][bit] Switch popcount to Brian Kernighan’s Algorithm (…

6f5dfbd

…llvm#95625)
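The commit title names Brian Kernighan's algorithm but the truncated entry gives no further detail. A generic sketch of that bit-counting loop is shown here for reference; it is not the libc implementation, and `popcount_kernighan` is a hypothetical name.

```python
def popcount_kernighan(value: int) -> int:
    """Count set bits by repeatedly clearing the lowest set bit."""
    if value < 0:
        raise ValueError("value must be non-negative")
    count = 0
    while value:
        # value - 1 flips the lowest set bit and every bit below it,
        # so the AND clears exactly that one set bit.
        value &= value - 1
        count += 1
    return count
```

The loop runs once per set bit rather than once per bit position, which is the usual reason for choosing this form.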

[mlir][scf]: Copy old attributes of old ForOp in replaceWithAdditiona…

d7e4813

…lYields (llvm#95502)

[clang-format] Don't over-indent comment below unbraced body (llvm#95354

cddb9ce

) Fixes llvm#45002.

[clang][Interp] Fix calling lambdas with explicit instance pointers...

7b6447a

...via a function pointer.

[clang][Interp][test] Move explicit object parameter test to cxx23.cpp

bb3091a

It needs C++2b.

[llvm] Use llvm::unique (NFC) (llvm#95628)

7c6d0d2

AMDGPU: Legalize atomicrmw fadd for v2f16/v2bf16 for local memory (ll…

0a9a5f9

…vm#95393) Make this legal for gfx940 and gfx12

[ARM] Remove duplicate custom SDag node (NFCI) (llvm#93419)

23c1b48

ARMISD::SUBS is a duplicate of ARMISD::SUBC. The node was introduced in 5745b6a. This patch replaces SUBS with SUBC and reverts changes in *.td files.

MathExtras: rewrite some methods to never overflow (llvm#95556)

bfd95a0

Rewrite divideCeil, divideNearest, divideFloorSigned, and divideCeilSigned to never overflow.
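To illustrate the kind of overflow this commit is about, the sketch below emulates 64-bit unsigned wraparound in Python and compares the common `(n + d - 1) / d` idiom with a form that never adds to the numerator. Both functions are hypothetical illustrations; the overflow-free formula shown is one well-known option and is not claimed to be the one LLVM uses.

```python
MASK64 = (1 << 64) - 1  # emulate uint64_t wraparound


def divide_ceil_naive(numerator: int, denominator: int) -> int:
    """Ceiling division via (n + d - 1) / d; the addition can wrap."""
    return ((numerator + denominator - 1) & MASK64) // denominator


def divide_ceil_safe(numerator: int, denominator: int) -> int:
    """Ceiling division that never forms a value larger than the numerator."""
    quotient, remainder = divmod(numerator, denominator)
    return quotient + (1 if remainder else 0)


largest = MASK64
wrapped = divide_ceil_naive(largest, 2)   # the sum wraps to 0 before dividing
correct = divide_ceil_safe(largest, 2)    # 2**63
```

For small inputs the two forms agree; they diverge only when `n + d - 1` no longer fits in 64 bits.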

[clang][Interp] Fix checking null pointers for initialization

17712f5

[Debug Info] Fix debug info ptr to ptr test (llvm#95637)

47f8b85

Fix the test case from llvm#95298: another recently submitted patch removed the llvm.dbg intrinsics, so the test case is updated accordingly.

[GISel] Unify multiple instances of getTypeForLLT (NFC) (llvm#95577)

27bebc1

Multiple static instances of this utility function were found in different GlobalISel files. Unify them by adding a single instance in utils.cpp.

[AMDGPU][GISel] Use datalayout alignment for buffer-load legalization (…

5e9fcb9

…llvm#95578) This matches how SelectionDAG legalizes buffer loads.

xgupta and others added 25 commits June 17, 2024 09:56

[LLDB] Remove dead code (NFC) (llvm#95713)

e4e350e

The dead code is caught by PVS studio analyzer - https://pvs-studio.com/en/blog/posts/cpp/1126/, fragment N12. Warning message - V523 The 'then' statement is equivalent to the 'else' statement. Options.cpp 1212

[InstSimplify] Implement simple folds for ucmp/scmp intrinsics (l…

b7b3d17

…lvm#95601) This patch adds folds for the cases where both operands are the same or where it can be established that the first operand is less than, equal to, or greater than the second operand.
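As a reference for what these folds rely on, here is a small model of the three-way comparison results: -1 when the first operand is smaller, 0 when equal, 1 when greater. The helper names and the default 8-bit width are assumptions made for the example, not LLVM code.

```python
def to_signed(value: int, bits: int) -> int:
    """Reinterpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def ucmp(a: int, b: int, bits: int = 8) -> int:
    """Unsigned three-way comparison on `bits`-wide values."""
    mask = (1 << bits) - 1
    a, b = a & mask, b & mask
    return (a > b) - (a < b)


def scmp(a: int, b: int, bits: int = 8) -> int:
    """Signed three-way comparison on `bits`-wide values."""
    a, b = to_signed(a, bits), to_signed(b, bits)
    return (a > b) - (a < b)
```

With identical operands both functions return 0, which is the first fold mentioned in the commit message. The bit pattern `0xFF` compares greater than 1 under `ucmp` but less than 1 under `scmp`, because it is -1 when read as a signed 8-bit value.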

[InstCombine] simplify average of lsb (llvm#95684)

1d4e857

close: llvm#94737 alive2: https://alive2.llvm.org/ce/z/WF_7mX In this patch, we combine `(X + Y) / 2` into `(X & Y)` only when both X and Y are less than or equal to 1.
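The precondition can be checked exhaustively, since there are only four operand pairs with both values at most 1. The sketch below is an independent check in Python, not the InstCombine code; `average` is a hypothetical helper using unsigned floor division.

```python
from itertools import product


def average(x: int, y: int) -> int:
    """Floor of (x + y) / 2 for non-negative integers."""
    return (x + y) // 2


# All pairs with X <= 1 and Y <= 1: the average equals the bitwise AND.
holds_for_lsb = all(average(x, y) == (x & y) for x, y in product((0, 1), repeat=2))

# Outside that range the rewrite is not valid in general.
outside_range = (average(2, 0), 2 & 0)
```

For example, with X = 2 and Y = 0 the average is 1 while `X & Y` is 0, which is why the fold is restricted to operands no greater than 1.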

[clang][AArch64] Add validation for Global Register Variable. (llvm#9…

5fe7f73

…4271) Fixes: llvm#76426

[RISCV] Remove getOffsetOfLocalArea() (llvm#93765)

94a6b9c

For RISC-V, it's always 0 and I don't see any reason we will change it in the future.

[InstCombine] Prefer source over result element type (NFC)

9a86d0a

For single-index GEPs the source and result element types are the same, but using the source type is semantically more correct.

[SelectionDAG] Add support for the 3-way comparison intrinsics [US]CMP (

995835f

llvm#91871) This PR adds initial support for the `scmp`/`ucmp` 3-way comparison intrinsics in the SelectionDAG. Some of the expansions/lowerings are not optimal yet.

[mlir][ArmSVE] Lower predicate-sized vector.create_masks to whilelt (l…

657ec73

…lvm#95531) This produces better/more canonical codegen than the generic LLVM lowering, which is a pattern the backend currently does not recognize. See: llvm#81840.

AMDGPU: Cleanup struct buffer atomic fadd intrinsic tests

87aed82

Only gfx908 was tested, and the returning versions weren't tested.

mlir: fix incorrect usages of divideCeilSigned (llvm#95680)

e843f02

Follow up on llvm#95087 to fix incorrect usage instances of divideCeilSigned.

[libc++] Mark more types as trivially relocatable (llvm#89724)

1ba8ed0

Co-authored-by: Louis Dionne <ldionne.2@gmail.com>

[Xtensa] Fix register asm parsing. (llvm#95551)

52d87de

Fix passing temporary string object as argument to the StringRef constructor in "parseRegister" function, because it causes errors in the test "llvm/test/MC/Xtensa/Core/processor-control.s".

[clang][CodeGen] Return RValue from EmitVAArg (llvm#94635)

6d973b4

This should simplify handling of resulting value by the callers.

[MachineLICM] Correctly Apply Register Masks (llvm#95746)

770393b

Fix regression introduced in d4b8b72

[AutoBump] Merge with fixes of 93ffe17 (Jun 15)

ab22c35

[AutoBump] Merge with 770393b (Jun 17)

7e79490

mgehre-amd changed the title ~~[AutoBump] Merge with 770393bb (Jun 17) (1)~~ [AutoBump] Merge with 770393bb (Jun 17) (79) Sep 12, 2024

cferry-AMD approved these changes Sep 13, 2024

Base automatically changed from bump_to_c091dd48 to feature/fused-ops September 16, 2024 10:59

mgehre-amd merged commit 7e79490 into feature/fused-ops Sep 16, 2024
10 checks passed

mgehre-amd deleted the bump_to_770393bb branch September 16, 2024 10:59
