[AutoBump] Merge with a853d799 (25) #283

cferry-AMD · 2024-08-19T09:33:36Z

No description provided.

This patch fixes: clang/lib/CodeGen/CGExpr.cpp:5607:11: error: variable 'Result' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]

The readme only states the goal and has links to further information, e.g., our meetings. --------- Co-authored-by: Shilei Tian <i@tianshilei.me>

Have to compare actual type size to pick up proper cast operation opcode.

Straightforward computation of `A − FLOOR (A / P) * P` should produce NaN, when P is infinity. The -menable-no-infs lowering can still use the relaxed operations sequence.

…lvm#87518) Reverts llvm#75481 Breaks multiple bots, see llvm#75481

This paper did not add any normative changes for us to check conformance against. It added a note describing a potential behavioral difference between compile-time and runtime evaluation of negative floating-point values in the presence of rounding modes.

…GlobalISel/legalizer/rvv/legalize-xor.mir

This was accidentally removed in https://reviews.llvm.org/D137799#4657404 / https://reviews.llvm.org/D137799#C3933303OL44, and downstream projects are forced to add it back. For example, https://git.savannah.gnu.org/cgit/guix.git/commit/?id=4e26331a5ee87928a16888c36d51e270f0f10f90 Fix this, by re-adding it. Co-authored-by: MarcoFalke <*~=`'#}+{/-|&$^_@721217.xyz>

…r reordering. If the node has cmp instructions with 3 or more different but swappable predicates, we need to keep the same kind of main/alternate opcodes to avoid incorrect detection of opcodes after reordering. Reordering changes the order, and we may erroneously consider swappable opcodes as non-compatible/alternate, which may lead to a later compiler crash. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: llvm#87267

This addition catches common cases of malformed `tosa.reshape` ops. This prevents the `--tosa-to-tensor` pass from asserting when fed invalid operations, as these will be caught ahead of time by the verifier. Closes llvm#87396

Justifications: - LWG3950: Done in llvm#66206 - LWG3975: Wording changes only - LWG4011: Wording changes only - LWG4030: Wording changes only - LWG4043: Wording changes only - LWG3036 and P2875R4: We implemented neither, but the latter reverts the former, so now we implement both without doing anything!

`DefaultTimingManager::clear()` uses `out` to initialize `TimerImpl`, but the `out` is `nullptr` by default. This means if `DefaultTimingManager::setOutput()` is never called, `DefaultTimingManager` destructor may generate SIGSEGV.

…r available_externally functions (llvm#87279) This is to fix an assertion error. Apparently, `pseudo_probe_desc` could still be available for import functions, and its checksum mismatch state can differ from the import function's `profile-checksum-mismatch` attr. This happens when an unstable IR or ODR violation issue occurs: the definitions of the same function across different translation units can differ and result in different checksums. During link-time deduplication, the internal function definition (which the checksum in the desc is computed from) is substituted by the `available_externally` definition, which causes the inconsistency. Hence, we fix it by always checking the state for the new `available_externally` definition, which is saved in the function attribute.

…ields"" (llvm#87529) Reverts llvm#87518 Revert is not needed as the regression was fixed with 1189e87. I assumed the crash and warning are different issues, but according to https://lab.llvm.org/buildbot/#/builders/240/builds/26629 fixing warning resolves the crash.

…lvm#87467) By generic intrinsics this means things like dup, ext, zip and bsl that can always be executed with integer s16 operations and do not require fullfp16. This makes them always available, and brings them in line with GCC. https://godbolt.org/z/azs8eMv54 The relevant test cases have been moved into their own files, to allow them to be tested with armv8-a and armv8.2-a+fp16.

Previously, all call probe IDs came after the block IDs. This change mixes call probes and block probes by ordering them lexically (by line number). For example:

```
main():
BB1
if(...)
  BB2
  foo(..);
else
  BB3
  bar(...);
BB4
```

Before, the profile is

```
main
 1: ..
 2: ..
 3: ...
 4: ...
 5: foo ...
 6: bar ...
```

Now the new order is

```
main
 1: ..
 2: ..
 3: foo ...
 4: ...
 5: bar ...
 6: ...
```

This can potentially make it more tolerant of profile mismatch, whether from a stale profile or a frontend change. For example, previously, if we added one block, even as the last one, all the call probes were shifted and mismatched. Moreover, this makes better use of call-anchor-based stale profile matching: blocks are matched based on the closest anchor, so more anchors are available for the matching, which reduces the mismatch scope.
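The effect of the two numbering schemes can be sketched in a few lines of Python. This is only an illustration of the ordering described above, not the actual pseudo-probe implementation; `assign_probe_ids`, `MAIN`, and the `(kind, name)` tuples are hypothetical names invented for the example.

```python
def assign_probe_ids(items, interleave_calls):
    """Assign 1-based probe ids to (kind, name) items given in lexical order.

    kind is "block" or "call". With interleave_calls=False, all blocks are
    numbered first and calls afterwards (the old scheme). With
    interleave_calls=True, ids follow lexical order (the new scheme).
    """
    if interleave_calls:
        ordered = list(items)
    else:
        blocks = [it for it in items if it[0] == "block"]
        calls = [it for it in items if it[0] == "call"]
        ordered = blocks + calls
    return {name: i for i, (_, name) in enumerate(ordered, start=1)}


def call_ids(items, interleave_calls):
    """Return only the ids of call probes."""
    ids = assign_probe_ids(items, interleave_calls)
    return {name: ids[name] for kind, name in items if kind == "call"}


# Hypothetical encoding of the main() example, in lexical order.
MAIN = [
    ("block", "BB1"),
    ("block", "BB2"),
    ("call", "foo"),
    ("block", "BB3"),
    ("call", "bar"),
    ("block", "BB4"),
]

# The same function after one more block is appended at the end.
MAIN_PLUS_BLOCK = MAIN + [("block", "BB5")]

old_before = call_ids(MAIN, interleave_calls=False)
old_after = call_ids(MAIN_PLUS_BLOCK, interleave_calls=False)
new_before = call_ids(MAIN, interleave_calls=True)
new_after = call_ids(MAIN_PLUS_BLOCK, interleave_calls=True)
```

In this sketch, appending `BB5` shifts every call probe id under the old scheme, while the ids of `foo` and `bar` stay put under the lexical scheme.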

Implemented long-standing TODO to support commutative intrinsics. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: llvm#86316

Add zext nneg tests and check we don't fold casts with different src types

…vm#87537) We should consistently use PseudoInstr instead of Mnemonic to name SIMCInstr, even though they may be the same in most cases

Allows src1 of VOP3-encoded VOPC to be an SGPR or inline immediate on GFX1150Plus. The w32 and w64 _e64_dpp assembler-only real instructions were unused, and were erroneously constructed in a way that broke parsing of the new instructions. They are removed. This patch is a follow-up to PR llvm#87382

@clayborg

The previous diff (and its subsequent fix) were reverted as the tests didn't work properly on the AArch64 & ARM LLDB buildbots. I made a couple more minor changes to tests (from @clayborg's feedback) and disabled them for non Linux-x86(_64) builds, as I don't have the ability to do anything about an ARM64 Linux failure. If I had to guess, I'd say the toolchain on the buildbots isn't respecting the `-Wl,--build-id` flag. Maybe, one day, when I have a Linux AArch64 system I'll dig into it. From the reverted PR: I've migrated the tests in my llvm#79181 from shell to API (at @JDevlieghere's suggestion) and addressed a couple of issues that were exposed during testing. The tests first test the "normal" situation (no DebugInfoD involvement, just normal debug files sitting around), then the "no debug info" situation (to make sure the test is seeing failure properly), then validate that when DebugInfoD returns the symbols, things work properly. This is duplicated for DWP/split-dwarf scenarios. --------- Co-authored-by: Kevin Frei <freik@meta.com>

…m#87492)

…ructions. Compiler can improve analysis for operands of UIToFP/SIToFP instructions and operands of ICmp instruction. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: llvm#85966

…Cmp instructions." This reverts commit 899855d to fix the issue reported in https://lab.llvm.org/buildbot/#/builders/165/builds/51659.

Some TUs apparently end up with an ambiguity between `::llvm::detail` and `support::detail`, so we close that gap at the source.

Adding OffTType to the Macro lists of fcntl.h and stdio.h in libc/spec/posix.td, as mentioned here: llvm#87266

…ructions. Compiler can improve analysis for operands of UIToFP/SIToFP instructions and operands of ICmp instruction. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: llvm#85966

…improve costs for testing Improves SSE vs AVX test results for llvm#87510

…#87543)

…hange_{weak,strong}` (llvm#87135) Spotted this minor mistake in the tests as I was looking into testing more thoroughly `atomic_ref`. The two argument overloads are tested just above. The names of the lambda clearly indicates that the intent was to test the one argument overload.

…for bitfields""" (llvm#87562) Reverts llvm#87529 Reverts llvm#87518 https://lab.llvm.org/buildbot/#/builders/37/builds/33262 is still broken

… and G_ICMP for scalable vector types This patch legalizes G_ZEXT, G_SEXT, and G_ANYEXT. If the type is a legal mask type, then the instruction is legalized as an element-wise select, where the condition of the select is the mask-typed source operand, and the true and false values are 1 or -1 (for zero/any-extension and sign extension, respectively) and zero. If the type is a legal integer or vector integer type, then the instruction is marked as legal.

The legalization of the extends may introduce a G_SPLAT_VECTOR, which needs to be legalized in this patch for the extend test cases to pass. A G_SPLAT_VECTOR is legal if the vector type is a legal integer or floating point vector type and the source operand is sXLen type. This is because the SelectionDAG patterns only support sXLen typed ISD::SPLAT_VECTORS, and we'd like to reuse those patterns. A G_SPLAT_VECTOR is custom legalized if it has a legal s1 element vector type and an s1 scalar operand. It is legalized to G_VMSET_VL or G_VMCLR_VL if the splat is all ones or all zeros, respectively. In the case of a non-constant mask splat, we legalize by promoting the scalar value to s8. In order to get the s8 element vector back into an s1 vector, we use a G_ICMP.

In order for the splat vector and extend tests to pass, we also need to legalize G_ICMP in this patch. A G_ICMP is legal if the destination type is a legal bool vector and the LHS and RHS are legal integer vector types.
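The mask-extension rule above can be modeled on plain Python lists. This is a rough sketch of the described semantics only, not the GlobalISel legalizer; `extend_mask` and `splat_mask_non_constant` are hypothetical helpers, and the "not equal to zero" predicate used for the compare is an assumption, since the message only says a G_ICMP is used.

```python
def extend_mask(mask, signed):
    """Model extending a mask vector as an element-wise select.

    The mask is the select condition; the true value is -1 for sign
    extension and 1 for zero/any extension; the false value is 0.
    """
    true_val = -1 if signed else 1
    return [true_val if bit else 0 for bit in mask]


def splat_mask_non_constant(bit, length):
    """Model a non-constant mask splat.

    The scalar is promoted to a wider integer (s8 in the description),
    splatted, and then compared to recover an s1-style vector. Comparing
    against zero with "not equal" is an assumption made for this sketch.
    """
    promoted = [int(bool(bit))] * length  # stand-in for the s8 splat
    return [elem != 0 for elem in promoted]  # stand-in for the G_ICMP


mask = [True, False, True, True]
zext_result = extend_mask(mask, signed=False)
sext_result = extend_mask(mask, signed=True)
splat_true = splat_mask_non_constant(True, 4)
splat_false = splat_mask_non_constant(False, 4)
```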

…alable vector type

… with scalable vector type

This reverts commit 23616c6 because it breaks Fuchsia Clang toolchain builders. https://luci-milo.appspot.com/ui/p/fuchsia/builders/toolchain.ci/clang-linux-x64/b8751656876289840849/overview

…roll patterns (llvm#86005) Updates smmla unrolling patterns to handle vecmat contracts where `dimM=1`. This includes explicit vecmats in the form `<1x8xi8> x <8x8xi8> --> <1x8xi32>`, or implied ones with the leading dim folded: `<8xi8> x <8x8xi8> --> <8xi32>`. Since smmla operates on two `<2x8xi8>` input vectors to produce `<2x2xi32>` accumulators, half of each 2x2 accumulator tile is dummy data not pertinent to the computation, resulting in half throughput.
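A small Python model shows where the dummy half of each tile comes from. It is a sketch under assumptions: `smmla` here is a plain-Python stand-in for the instruction's 2x8 by 2x8 multiply-accumulate, `vecmat_via_smmla` is a hypothetical helper, and the right-hand side is assumed to be stored as rows that are each dotted with the vector.

```python
def smmla(acc, a, b):
    """Model of a 2x2 tile update: acc[i][j] += dot(a[i], b[j]).

    a and b are both 2x8 lists of integers, acc is a 2x2 list.
    """
    return [
        [acc[i][j] + sum(x * y for x, y in zip(a[i], b[j])) for j in range(2)]
        for i in range(2)
    ]


def vecmat_via_smmla(vec, rhs_rows):
    """Compute [dot(vec, row) for row in rhs_rows] using 2x2 tiles.

    The single input vector is padded with a zero row to form the 2x8
    left operand, so the second row of every result tile is dummy data.
    """
    padded = [vec, [0] * len(vec)]
    out, dummy = [], []
    for n in range(0, len(rhs_rows), 2):
        tile = smmla([[0, 0], [0, 0]], padded, rhs_rows[n:n + 2])
        out.extend(tile[0])    # useful half of the tile
        dummy.extend(tile[1])  # unused half of the tile
    return out, dummy


vec = list(range(8))
rhs_rows = [[(r + c) % 5 for c in range(8)] for r in range(8)]
result, wasted = vecmat_via_smmla(vec, rhs_rows)
expected = [sum(v * x for v, x in zip(vec, row)) for row in rhs_rows]
```

Each call produces four accumulator values but only two are kept, which is the half-throughput cost mentioned above.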

…sposes solely on leading unit dims. (llvm#85694) Updates `castAwayContractionLeadingOneDim` to check for leading unit dimensions before inserting `vector.transpose` ops. Currently `castAwayContractionLeadingOneDim` removes all leading unit dims based on the accumulator and transposes any subsequent operands to match the accumulator indexing. This does not take into account whether the transpose is strictly necessary, for instance when given this vector-matrix contract:

```mlir
%result = vector.contract {
  indexing_maps = [affine_map<(d0, d1, d2, d3) -> (d0, d1, d3)>,
                   affine_map<(d0, d1, d2, d3) -> (d0, d2, d3)>,
                   affine_map<(d0, d1, d2, d3) -> (d1, d2)>],
  iterator_types = ["parallel", "parallel", "parallel", "reduction"],
  kind = #vector.kind<add>
} %lhs, %rhs, %acc : vector<1x1x8xi32>, vector<1x8x8xi32> into vector<1x8xi32>
```

Passing this through the `castAwayContractionLeadingOneDim` pattern produces the following:

```mlir
%0 = vector.transpose %arg0, [1, 0, 2] : vector<1x1x8xi32> to vector<1x1x8xi32>
%1 = vector.extract %0[0] : vector<1x8xi32> from vector<1x1x8xi32>
%2 = vector.extract %arg2[0] : vector<8xi32> from vector<1x8xi32>
%3 = vector.contract {
  indexing_maps = [affine_map<(d0, d1, d2) -> (d0, d2)>,
                   affine_map<(d0, d1, d2) -> (d0, d1, d2)>,
                   affine_map<(d0, d1, d2) -> (d1)>],
  iterator_types = ["parallel", "parallel", "reduction"],
  kind = #vector.kind<add>
} %1, %arg1, %2 : vector<1x8xi32>, vector<1x8x8xi32> into vector<8xi32>
%4 = vector.broadcast %3 : vector<8xi32> to vector<1x8xi32>
```

The `vector.transpose` introduced does not affect the underlying data layout (effectively a no-op), but it cannot be folded automatically. This change avoids inserting transposes when only leading unit dimensions are involved. Fixes llvm#85691
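Why such a transpose is a layout no-op can be checked with a short Python sketch. It assumes row-major storage and the convention that result dimension `i` of the transpose is source dimension `perm[i]`; both helper names are hypothetical and this is not the check used in the pattern itself.

```python
from itertools import product


def transposed_linear_order(shape, perm):
    """Return source linear indices in the order a row-major transposed
    tensor would store them. Result dim i is source dim perm[i]."""
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    out_shape = [shape[p] for p in perm]
    order = []
    for out_idx in product(*(range(d) for d in out_shape)):
        order.append(sum(out_idx[i] * strides[perm[i]] for i in range(len(perm))))
    return order


def moves_only_unit_dims(shape, perm):
    """True if the permutation keeps all non-unit dims in their original
    relative order, so the element order in memory cannot change."""
    non_unit = [p for p in perm if shape[p] != 1]
    return non_unit == sorted(non_unit)


# The transpose from the example: [1, 0, 2] on a 1x1x8 vector.
unit_order = transposed_linear_order([1, 1, 8], [1, 0, 2])
# A real transpose for contrast: [1, 0] on a 2x3 vector.
real_order = transposed_linear_order([2, 3], [1, 0])
```

For the 1x1x8 case the element order is unchanged, while the 2x3 case genuinely reorders elements.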

…tedType (llvm#87582) Previously the leading space was added in each string constant. This patch moves the leading space out of the string constants and adds it explicitly instead, to make the code clearer.

Reverts llvm#86812. This commit caused a regression on the x86_64 MacOS buildbot: https://green.lab.llvm.org/job/llvm.org/view/LLDB/job/lldb-cmake/784/

DeclRef to field must be marked as LValue to be consistent with how the field decl will be evaluated. T->desugar() is unnecessary to call ->isArrayType().

This is a followup to llvm#86359 "[lldb] [ObjectFileMachO] LLVM_COV is not mapped into firmware memory (llvm#86359)" where I treat LLVM_COV segments in a Mach-O binary as non-loadable. There is another codepath in `DynamicLoaderStatic::LoadAllImagesAtFileAddresses` which is called to set the load addresses for a Module to the file addresses. It has no logic to detect a segment that is not loaded in virtual memory (ObjectFileMachO::SectionIsLoadable), so it would set the load address for this LLVM_COV segment to the file address and shadow actual code, breaking lldb behavior. This method currently sets the load address for any section that doesn't have a load address set already. This presumes that a Module was added to the Target, some mechanism set the correct load address for SOME segments, and then this method is going to set the other segments to a no-slide value, assuming they were forgotten. ObjectFile base class doesn't, today, vend a SectionIsLoadable method, but we do have ObjectFile::SetLoadAddress and at a higher level, Module::SetLoadAddress, when we're setting the same slide to all segments. That's the behavior we want in this method. If any section has a load address, we don't touch this Module. Otherwise we set all sections to have a load address that is the same as the file address. I also audited the other parts of lldb that are calling SectionList::SectionLoadAddress and looked if they should be more correctly using Module::SetLoadAddress for the entire binary. But in most cases, we have the potential for different slides for different sections so this section-by-section approach must be taken. rdar://125800290
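The all-or-nothing policy described above can be sketched as follows. This is a toy model with hypothetical names, not lldb code: sections are plain dictionaries, and the `loadable` flag is an assumed stand-in for whatever check the object file plugin applies (the message mentions `ObjectFileMachO::SectionIsLoadable`).

```python
def load_module_at_file_addresses(sections):
    """Apply the policy: leave the module alone if any section already has
    a load address; otherwise give every loadable section a load address
    equal to its file address. Returns True if the module was changed.

    Each section is a dict with 'name', 'file_addr', 'load_addr' (None if
    unset) and 'loadable' (an assumption made for this sketch).
    """
    if any(s["load_addr"] is not None for s in sections):
        return False
    for s in sections:
        if s["loadable"]:
            s["load_addr"] = s["file_addr"]
    return True


fresh = [
    {"name": "__TEXT", "file_addr": 0x1000, "load_addr": None, "loadable": True},
    {"name": "__DATA", "file_addr": 0x4000, "load_addr": None, "loadable": True},
    {"name": "__LLVM_COV", "file_addr": 0x0, "load_addr": None, "loadable": False},
]
partly_loaded = [
    {"name": "__TEXT", "file_addr": 0x1000, "load_addr": 0x9000, "loadable": True},
    {"name": "__DATA", "file_addr": 0x4000, "load_addr": None, "loadable": True},
]

changed_fresh = load_module_at_file_addresses(fresh)
changed_partly = load_module_at_file_addresses(partly_loaded)
```

In this model the non-loadable segment never receives a load address, so it cannot shadow code, and a module that already has any load address is left untouched.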

…VRegisterBankInfo::getInstrMapping. This removes the special case for vectors. The default case in the second switch can handle GPR in addition to vectors. We just won't use the static ValueMapping entry.

…7301) Use the return type to measure the LMUL size for latency/throughput cost

…rhs` - When both operands are constant, the matcher runs into an infinite loop as the commutation should be applied only when LHS is a constant and RHS is not. Reviewers: arsenm Reviewed By: arsenm Pull Request: llvm#87426
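The termination issue can be reproduced with a toy rewrite loop in Python. This is a sketch of the general idea only, not the matcher from the patch; operands are modeled as integers (constants) or strings (non-constants), and all function names are hypothetical.

```python
def is_const(x):
    """Toy model: integers are constants, anything else is not."""
    return isinstance(x, int)


def commute_constant_to_rhs(lhs, rhs, require_non_const_rhs=True):
    """Swap operands so a constant ends up on the right-hand side.

    With require_non_const_rhs=True the swap fires only when LHS is a
    constant and RHS is not. With False, it fires whenever LHS is a
    constant, which never settles if both operands are constants.
    """
    if is_const(lhs) and (not require_non_const_rhs or not is_const(rhs)):
        return rhs, lhs, True
    return lhs, rhs, False


def run_to_fixpoint(lhs, rhs, require_non_const_rhs=True, max_steps=100):
    """Apply the rewrite until nothing changes; give up after max_steps."""
    steps = 0
    changed = True
    while changed:
        lhs, rhs, changed = commute_constant_to_rhs(lhs, rhs, require_non_const_rhs)
        if changed:
            steps += 1
            if steps > max_steps:
                raise RuntimeError("rewrite did not reach a fixpoint")
    return lhs, rhs, steps


guarded_mixed = run_to_fixpoint(1, "x")
guarded_consts = run_to_fixpoint(1, 2)

try:
    run_to_fixpoint(1, 2, require_non_const_rhs=False)
    unguarded_terminates = True
except RuntimeError:
    unguarded_terminates = False
```

The guarded version swaps once for a constant LHS with a non-constant RHS and leaves two constants alone; the unguarded version keeps swapping two constants until the step limit trips.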

… is vector. NFC If the type is vector, we can immediately know to use vector mapping. Previously we searched for FP uses, but then replaced it if the type was vector.

kazutakahirata and others added 30 commits April 3, 2024 09:55

[CodeGen] Fix a warning

1189e87

This patch fixes: clang/lib/CodeGen/CGExpr.cpp:5607:11: error: variable 'Result' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]

[Offload][NFC] Add offload subfolder and README (llvm#77154)

33992ea

The readme only states the goal and has links to further information, e.g., our meetings. --------- Co-authored-by: Shilei Tian <i@tianshilei.me>

[SLP]Fix PR87477: fix alternate node cast cost/codegen.

07a5667

Have to compare actual type size to pick up proper cast operation opcode.

[flang] Fixed MODULO(x, inf) to produce NaN. (llvm#86145)

315c88c

Straightforward computation of `A − FLOOR (A / P) * P` should produce NaN, when P is infinity. The -menable-no-infs lowering can still use the relaxed operations sequence.
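The NaN result follows directly from IEEE-754 arithmetic, which a few lines of Python can confirm. This is a sketch of the formula as written above, not the flang lowering; `float_floor` and `modulo_straightforward` are names made up for the example.

```python
import math


def float_floor(x):
    """Floor that stays in floating point and passes NaN/inf through."""
    return float(math.floor(x)) if math.isfinite(x) else x


def modulo_straightforward(a, p):
    """Evaluate A - FLOOR(A / P) * P literally with double precision.

    For finite A and infinite P: A / P is (+/-)0, its floor is 0, and
    0 * inf is NaN, so the whole expression is NaN.
    """
    return a - float_floor(a / p) * p


positive_case = modulo_straightforward(5.0, math.inf)
negative_case = modulo_straightforward(-5.0, math.inf)
finite_case = modulo_straightforward(-5.0, 3.0)
```

With a finite `P` the same expression gives the usual result, for example `modulo_straightforward(-5.0, 3.0)` evaluates to `1.0`.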

Revert "[clang][UBSan] Add implicit conversion check for bitfields" (l…

5822ca5

…lvm#87518) Reverts llvm#75481 Breaks multiple bots, see llvm#75481

[clang] Precommit test for llvm.allow.ubsan.check() (llvm#87435)

6099639

[RISCV][GISEL] Run update_mir_test_checks on llvm/test/CodeGen/RISCV/…

07d3f2a

…GlobalISel/legalizer/rvv/legalize-xor.mir

Updates to 'tosa.reshape' verifier (llvm#87416)

fbcd0c6

This addition catches common cases of malformed `tosa.reshape` ops. This prevents the `--tosa-to-tensor` pass from asserting when fed invalid operations, as these will be caught ahead of time by the verifier. Closes llvm#87396

[SLP]Add support for commutative intrinsics.

d578840

Implemented long-standing TODO to support commutative intrinsics. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: llvm#86316

[VectorCombine][X86] Add additional tests for llvm#87510

b15d27e

Add zext nneg tests and check we don't fold casts with different src types

[AArch64] Add a test for non-temporal masked loads / stores. NFC

52ae02d

AMDGPU: Use PseudoInstr to name SIMCInstr for DSDIR and SOPs, NFC (ll…

7c68a95

…vm#87537) We should consistently use PseudoInstr instead of Mnemonic to name SIMCInstr, even though they may be the same in most cases

[AMDGPU] Add a missing COV6 case to getAMDHSACodeObjectVersion() (llv…

607b4bc

…m#87492)

[SLP]Improve minbitwidth analysis for operands of IToFP and ICmp inst…

899855d

…ructions. Compiler can improve analysis for operands of UIToFP/SIToFP instructions and operands of ICmp instruction. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: llvm#85966

Revert "[SLP]Improve minbitwidth analysis for operands of IToFP and I…

fa2bbea

…Cmp instructions." This reverts commit 899855d to fix the issue reported in https://lab.llvm.org/buildbot/#/builders/165/builds/51659.

fully qualifies use of detail namespace (llvm#87536)

e506dd0

Some TUs apparently end up with an ambiguity between `::llvm::detail` and `support::detail`, so we close that gap at the source.

[libc] Added transitive bindings for OffsetType (llvm#87397)

3ee93f4

Adding OffTType to the Macro lists of fcntl.h and stdio.h in libc/spec/posix.td, as mentioned here: llvm#87266

[SLP]Improve minbitwidth analysis for operands of IToFP and ICmp inst…

42cbceb

…ructions. Compiler can improve analysis for operands of UIToFP/SIToFP instructions and operands of ICmp instruction. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: llvm#85966

[VectorCombine][X86] shuffle-of-casts.ll - adjust zext nneg tests to …

d53b829

…improve costs for testing Improves SSE vs AVX test results for llvm#87510

vzakhari and others added 22 commits April 3, 2024 14:49

[flang][runtime] Enable I/O APIs in F18 runtime offload builds. (llvm…

718638d

…#87543)

Revert "Revert "Revert "[clang][UBSan] Add implicit conversion check …

029e1d7

…for bitfields""" (llvm#87562) Reverts llvm#87529 Reverts llvm#87518 https://lab.llvm.org/buildbot/#/builders/37/builds/33262 is still broken

[RISCV][GISEL] Regbank select for scalable vector G_ICMP

05f673b

[RISCV][GISEL] Instruction selection for G_ICMP

35a9393

[RISCV][GISEL] Regbankselect for G_ZEXT, G_SEXT, and G_ANYEXT with sc…

188ca37

…alable vector type

[RISCV][GISEL] Instruction selection for G_ZEXT, G_SEXT, and G_ANYEXT…

63c925c

… with scalable vector type

Revert "dsymutil: Re-add missing -latomic (llvm#85380)"

be57c90

This reverts commit 23616c6 because it breaks Fuchsia Clang toolchain builders. https://luci-milo.appspot.com/ui/p/fuchsia/builders/toolchain.ci/clang-linux-x64/b8751656876289840849/overview

[Bounds-Safety][NFC] Clean up leading space emission for CountAttribu…

5e3da75

…tedType (llvm#87582) Previously the leading space was added in each string constant. This patch moves the leading space out of the string constants and adds it explicitly instead, to make the code clearer.

Revert "DebugInfoD issues, take 2" (llvm#87583)

20433e9

Reverts llvm#86812. This commit caused a regression on the x86_64 MacOS buildbot: https://green.lab.llvm.org/job/llvm.org/view/LLDB/job/lldb-cmake/784/

[BoundsSafety] Minor fixes on counted_by (llvm#87559)

7508438

DeclRef to field must be marked as LValue to be consistent with how the field decl will be evaluated. T->desugar() is unnecessary to call ->isArrayType().

[mlir][vector] Skip 0D vectors in vector linearization. (llvm#87577)

ef5a710

[RISCV] Remove G_TRUNC/ZEXT/SEXT/ANYEXT from the first switch in RISC…

7e2a1d6

…VRegisterBankInfo::getInstrMapping. This removes the special case for vectors. The default case in the second switch can handle GPR in addition to vectors. We just won't use the static ValueMapping entry.

[RISCV][TTI] Scale the cost of intrinsic stepvector with LMUL (llvm#8…

97523e5

…7301) Use the return type to measure the LMUL size for latency/throughput cost

[clang] Init fields added by llvm#87357

abd05eb

[RISCV][GISel] Don't check for FP uses of IMPLICIT_DEF if the type…

a853d79

… is vector. NFC If the type is vector, we can immediately know to use vector mapping. Previously we searched for FP uses, but then replaced it if the type was vector.

[AutoBump] Merge with a853d79

c510199

cferry-AMD requested review from mgehre-amd and josel-amd August 20, 2024 06:44

mgehre-amd approved these changes Aug 20, 2024


Base automatically changed from bump_to_5b702be1 to feature/fused-ops August 21, 2024 20:23

mgehre-amd merged commit 8b08487 into feature/fused-ops Aug 21, 2024
10 checks passed

mgehre-amd deleted the bump_to_a853d799 branch August 21, 2024 20:24
