forked from llvm/llvm-project
[AutoBump] Merge with a853d799 (25) #283
Merged
This patch fixes: clang/lib/CodeGen/CGExpr.cpp:5607:11: error: variable 'Result' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
The readme only states the goal and has links to further information, e.g., our meetings. --------- Co-authored-by: Shilei Tian <i@tianshilei.me>
Have to compare actual type size to pick up proper cast operation opcode.
Straightforward computation of `A − FLOOR (A / P) * P` should produce NaN, when P is infinity. The -menable-no-infs lowering can still use the relaxed operations sequence.
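To illustrate the arithmetic behind this note, here is a minimal Python sketch that evaluates the formula literally with IEEE-754 doubles. The function name `modulo_straightforward` is hypothetical and is not part of flang; the sketch only shows why an infinite `P` leads to NaN (`A / P` is zero, and zero times infinity is NaN).

```python
import math


def modulo_straightforward(a: float, p: float) -> float:
    """Evaluate A - FLOOR(A / P) * P literally (hypothetical helper)."""
    # For p = +inf and finite a: a / p == 0.0, floor gives 0,
    # 0 * inf is NaN, and a - NaN is NaN.
    return a - math.floor(a / p) * p


finite_case = modulo_straightforward(-7.0, 3.0)         # ordinary finite operands
infinite_case = modulo_straightforward(5.0, math.inf)   # P is infinity
print(finite_case, math.isnan(infinite_case))
```

A relaxed operation sequence that assumes no infinities would not need to preserve this NaN, which is consistent with the note about `-menable-no-infs`.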
…lvm#87518) Reverts llvm#75481 Breaks multiple bots, see llvm#75481
This paper did not add any normative changes for us to check conformance against. It added a note describing a potential behavioral difference between compile-time and runtime evaluation of negative floating-point values in the presence of rounding modes.
…GlobalISel/legalizer/rvv/legalize-xor.mir
This was accidentally removed in https://reviews.llvm.org/D137799#4657404 / https://reviews.llvm.org/D137799#C3933303OL44, and downstream projects are forced to add it back. For example, https://git.savannah.gnu.org/cgit/guix.git/commit/?id=4e26331a5ee87928a16888c36d51e270f0f10f90 Fix this, by re-adding it. Co-authored-by: MarcoFalke <*~=`'#}+{/-|&$^_@721217.xyz>
…r reordering. If the node has cmp instruction with 3 or more different but swappable predicates, need to keep same kind of main/alternate opcodes to avoid incorrect detection of opcodes after reordering. Reordering changes the order and we may erroneously consider swappable opcodes as non-compatible/alternate, which may lead to a later compiler crash. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: llvm#87267
This addition catches common cases of malformed `tosa.reshape` ops. This prevents the `--tosa-to-tensor` pass from asserting when fed invalid operations, as these will be caught ahead of time by the verifier. Closes llvm#87396
Justifications: - LWG3950: Done in llvm#66206 - LWG3975: Wording changes only - LWG4011: Wording changes only - LWG4030: Wording changes only - LWG4043: Wording changes only - LWG3036 and P2875R4: We implemented neither, but the latter reverts the former, so now we implement both without doing anything!
`DefaultTimingManager::clear()` uses `out` to initialize `TimerImpl`, but the `out` is `nullptr` by default. This means if `DefaultTimingManager::setOutput()` is never called, `DefaultTimingManager` destructor may generate SIGSEGV.
…r available_externally functions (llvm#87279) This is to fix an assertion error. Apparently, `pseudo_probe_desc` could still be available for import functions, and its checksum mismatch state can be different from the import function's `profile-checksum-mismatch` attr. This happens when an unstable IR or ODR violation issue occurs: the definitions of the same function across different translation units could be different and result in different checksums. During link time deduplication, the internal function definition (which the checksum in the desc is computed from) is substituted by the `available_externally` definition, which causes the inconsistency. Hence, we fix it by always checking the state for the new `available_externally` definition, which is saved in the function attribute.
…ields"" (llvm#87529) Reverts llvm#87518 Revert is not needed as the regression was fixed with 1189e87. I assumed the crash and warning are different issues, but according to https://lab.llvm.org/buildbot/#/builders/240/builds/26629 fixing warning resolves the crash.
…lvm#87467) By generic intrinsics this means things like dup, ext, zip and bsl that can always be executed with integer s16 operations and do not require fullfp16. This makes them always available, and brings them in line with GCC. https://godbolt.org/z/azs8eMv54 The relevant test cases have been moved into their own files, to allow them to be tested with armv8-a and armv8.2-a+fp16.
Previously, all the call probe IDs came after the block probe IDs. This change mixes the call probes and block probes by ordering them lexically (by line number). For example:
```
main():
BB1
if(...)
  BB2 foo(..);
else
  BB3 bar(...);
BB4
```
Before, the profile is
```
main
 1: ..
 2: ..
 3: ...
 4: ...
 5: foo ...
 6: bar ...
```
Now the new order is
```
main
 1: ..
 2: ..
 3: foo ...
 4: ...
 5: bar ...
 6: ...
```
This can potentially make it more tolerant of profile mismatch, either from a stale profile or a frontend change. E.g., before, if we add one block, even if the block is the last one, all the call probes are shifted and mismatched. Moreover, this makes better use of call-anchor based stale profile matching: blocks are matched based on the closest anchor, so more anchors are used for the matching, which reduces the mismatch scope.
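The effect of the two numbering schemes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function `assign_probe_ids`, the tuple layout, and the line numbers given to each block and call are hypothetical and are not the actual pseudo-probe implementation.

```python
def assign_probe_ids(items, lexical):
    """Assign 1-based probe IDs to (line, kind, name) items.

    kind is "block" or "call". With lexical=False, all blocks are numbered
    first and calls afterwards (old scheme); with lexical=True, everything
    is numbered in line order (new scheme).
    """
    if lexical:
        ordered = sorted(items, key=lambda it: it[0])
    else:
        # False sorts before True, so blocks come first, each group by line.
        ordered = sorted(items, key=lambda it: (it[1] != "block", it[0]))
    return {name: idx for idx, (_, _, name) in enumerate(ordered, start=1)}


# Hypothetical line numbers for the main() example above.
main_items = [
    (1, "block", "BB1"),
    (2, "block", "BB2"),
    (3, "call", "foo"),
    (4, "block", "BB3"),
    (5, "call", "bar"),
    (6, "block", "BB4"),
]
# The same function after a new trailing block is added.
main_items_plus_block = main_items + [(7, "block", "BB5")]

old_before = assign_probe_ids(main_items, lexical=False)
old_after = assign_probe_ids(main_items_plus_block, lexical=False)
new_before = assign_probe_ids(main_items, lexical=True)
new_after = assign_probe_ids(main_items_plus_block, lexical=True)

print(old_before["foo"], old_after["foo"])  # call ID shifts in the old scheme
print(new_before["foo"], new_after["foo"])  # call ID is stable in the new scheme
```

In the old scheme, appending a block renumbers every call probe; in the lexical scheme, only probes after the insertion point move.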
Implemented long-standing TODO to support commutative intrinsics. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: llvm#86316
Add zext nneg tests and check we don't fold casts with different src types
…vm#87537) We should consistently use PseudoInstr instead of Mnemonic to name SIMCInstr, even though they may be the same in most cases
Allows src1 of VOP3 encoded VOPC to be an SGPR or inline immediate on GFX1150Plus The w32 and w64 _e64_dpp assembler only real instructions were unused, and erroneously constructed in a way that bugged parsing of the new instructions. They are removed. This patch is a follow up to PR llvm#87382
The previous diff (and its subsequent fix) were reverted as the tests didn't work properly on the AArch64 & ARM LLDB buildbots. I made a couple more minor changes to tests (from @clayborg's feedback) and disabled them for non Linux-x86(_64) builds, as I don't have the ability to do anything about an ARM64 Linux failure. If I had to guess, I'd say the toolchain on the buildbots isn't respecting the `-Wl,--build-id` flag. Maybe, one day, when I have a Linux AArch64 system I'll dig into it. From the reverted PR: I've migrated the tests in my llvm#79181 from shell to API (at @JDevlieghere's suggestion) and addressed a couple issues that were exposed during testing. The tests first test the "normal" situation (no DebugInfoD involvement, just normal debug files sitting around), then the "no debug info" situation (to make sure the test is seeing failure properly), then it tests to validate that when DebugInfoD returns the symbols, things work properly. This is duplicated for DWP/split-dwarf scenarios. --------- Co-authored-by: Kevin Frei <freik@meta.com>
…ructions. Compiler can improve analysis for operands of UIToFP/SIToFP instructions and operands of ICmp instruction. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: llvm#85966
…Cmp instructions." This reverts commit 899855d to fix the issue reported in https://lab.llvm.org/buildbot/#/builders/165/builds/51659.
Some TUs apparently end up with an ambiguity between `::llvm::detail` and `support::detail`, so we close that gap at the source.
Adding OffTType to fcntl.h and stdio.h 's Macro lists in libc/spec/posix.td as mentioned here: llvm#87266
…ructions. Compiler can improve analysis for operands of UIToFP/SIToFP instructions and operands of ICmp instruction. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: llvm#85966
…improve costs for testing Improves SSE vs AVX test results for llvm#87510
…hange_{weak,strong}` (llvm#87135) Spotted this minor mistake in the tests as I was looking into testing more thoroughly `atomic_ref`. The two argument overloads are tested just above. The names of the lambda clearly indicates that the intent was to test the one argument overload.
…for bitfields""" (llvm#87562) Reverts llvm#87529 Reverts llvm#87518 https://lab.llvm.org/buildbot/#/builders/37/builds/33262 is still broken
… and G_ICMP for scalable vector types This patch legalizes G_ZEXT, G_SEXT, and G_ANYEXT. If the type is a legal mask type, then the instruction is legalized as the element-wise select, where the condition on the select is the mask typed source operand, and the true and false values are 1 or -1 (for zero/any-extension and sign extension) and zero. If the type is a legal integer or vector integer type, then the instruction is marked as legal. The legalization of the extends may introduce a G_SPLAT_VECTOR, which needs to be legalized in this patch for the extend test cases to pass. A G_SPLAT_VECTOR is legal if the vector type is a legal integer or floating point vector type and the source operand is sXLen type. This is because the SelectionDAG patterns only support sXLen typed ISD::SPLAT_VECTORS, and we'd like to reuse those patterns. A G_SPLAT_VECTOR is custom legalized if it has a legal s1 element vector type and s1 scalar operand. It is legalized to G_VMSET_VL or G_VMCLR_VL if the splat is all ones or all zeros respectively. In the case of a non-constant mask splat, we legalize by promoting the scalar value to s8. In order to get the s8 element vector back into s1 vector, we use a G_ICMP. In order for the splat vector and extend tests to pass, we also need to legalize G_ICMP in this patch. A G_ICMP is legal if the destination type is a legal bool vector and the LHS and RHS are legal integer vector types.
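The mask-extension rule described above can be modelled on plain Python lists. This is a rough sketch, not the RISC-V legalizer: `extend_mask` and `splat_mask` are hypothetical names, and the choice of a "not equal to zero" comparison for turning the promoted s8 splat back into a mask is an assumption, since the message only says that a G_ICMP is used.

```python
def extend_mask(mask, signed):
    """Model an extend of a mask vector as an element-wise select.

    True lanes become 1 (zero/any-extension) or -1 (sign extension);
    false lanes become 0.
    """
    true_value = -1 if signed else 1
    return [true_value if lane else 0 for lane in mask]


def splat_mask(scalar_bit, num_lanes):
    """Model a non-constant mask splat via promotion to a wider element.

    The scalar is widened (standing in for s8), splatted, and then each
    lane is compared against zero to recover a one-bit mask. The "!= 0"
    predicate is an assumption made for this sketch.
    """
    promoted = scalar_bit & 1
    wide_vector = [promoted] * num_lanes
    return [lane != 0 for lane in wide_vector]


zext_lanes = extend_mask([True, False, True], signed=False)
sext_lanes = extend_mask([True, False, True], signed=True)
mask_lanes = splat_mask(1, 4)
print(zext_lanes, sext_lanes, mask_lanes)
```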
…alable vector type
… with scalable vector type
This reverts commit 23616c6 because it breaks Fuchsia Clang toolchain builders. https://luci-milo.appspot.com/ui/p/fuchsia/builders/toolchain.ci/clang-linux-x64/b8751656876289840849/overview
…roll patterns (llvm#86005) Updates smmla unrolling patterns to handle vecmat contracts where `dimM=1`. This includes explicit vecmats in the form: `<1x8xi8> x <8x8xi8> --> <1x8xi32>` or implied with the leading dim folded: `<8xi8> x <8x8xi8> --> <8xi32>` Since the smmla operates on two `<2x8xi8>` input vectors to produce `<2x2xi32>` accumulators, half of each 2x2 accumulator tile is dummy data not pertinent to the computation, resulting in half throughput.
…sposes solely on leading unit dims. (llvm#85694) Updates `castAwayContractionLeadingOneDim` to check for leading unit dimensions before inserting `vector.transpose` ops. Currently `castAwayContractionLeadingOneDim` removes all leading unit dims based on the accumulator and transposes any subsequent operands to match the accumulator indexing. This does not take into account whether the transpose is strictly necessary, for instance when given this vector-matrix contract:
```mlir
%result = vector.contract {indexing_maps = [affine_map<(d0, d1, d2, d3) -> (d0, d1, d3)>, affine_map<(d0, d1, d2, d3) -> (d0, d2, d3)>, affine_map<(d0, d1, d2, d3) -> (d1, d2)>], iterator_types = ["parallel", "parallel", "parallel", "reduction"], kind = #vector.kind<add>} %lhs, %rhs, %acc : vector<1x1x8xi32>, vector<1x8x8xi32> into vector<1x8xi32>
```
Passing this through the `castAwayContractionLeadingOneDim` pattern produces the following:
```mlir
%0 = vector.transpose %arg0, [1, 0, 2] : vector<1x1x8xi32> to vector<1x1x8xi32>
%1 = vector.extract %0[0] : vector<1x8xi32> from vector<1x1x8xi32>
%2 = vector.extract %arg2[0] : vector<8xi32> from vector<1x8xi32>
%3 = vector.contract {indexing_maps = [affine_map<(d0, d1, d2) -> (d0, d2)>, affine_map<(d0, d1, d2) -> (d0, d1, d2)>, affine_map<(d0, d1, d2) -> (d1)>], iterator_types = ["parallel", "parallel", "reduction"], kind = #vector.kind<add>} %1, %arg1, %2 : vector<1x8xi32>, vector<1x8x8xi32> into vector<8xi32>
%4 = vector.broadcast %3 : vector<8xi32> to vector<1x8xi32>
```
The `vector.transpose` introduced does not affect the underlying data layout (effectively a no-op), but it cannot be folded automatically. This change avoids inserting transposes when only leading unit dimensions are involved. Fixes llvm#85691
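Why such a transpose is a no-op can be checked with a small Python model of row-major layouts. This is a sketch only: both helper names are hypothetical, and the predicate is a more general test (a permutation that keeps all non-unit dimensions in their relative order) than the leading-unit-dim check the patch describes.

```python
from itertools import product


def transpose_is_noop(shape, perm):
    """True if perm keeps every non-unit dimension in its relative order."""
    non_unit_before = [d for d in range(len(shape)) if shape[d] != 1]
    non_unit_after = [d for d in perm if shape[d] != 1]
    return non_unit_before == non_unit_after


def linear_order_after_transpose(shape, perm):
    """List source row-major offsets in the order the transposed result stores them."""
    strides = [1] * len(shape)
    for d in range(len(shape) - 2, -1, -1):
        strides[d] = strides[d + 1] * shape[d + 1]
    out_shape = [shape[p] for p in perm]
    order = []
    for out_idx in product(*(range(n) for n in out_shape)):
        src_idx = [0] * len(shape)
        for i, p in enumerate(perm):
            src_idx[p] = out_idx[i]  # result dim i reads source dim perm[i]
        order.append(sum(k * s for k, s in zip(src_idx, strides)))
    return order


# The transpose from the example: vector<1x1x8xi32> with permutation [1, 0, 2].
unit_dims_only = linear_order_after_transpose((1, 1, 8), [1, 0, 2])
# A real transpose for contrast: a 2x8 vector with permutation [1, 0].
real_transpose = linear_order_after_transpose((2, 8), [1, 0])
print(unit_dims_only == list(range(8)), real_transpose[:4])
```

When the predicate holds, the element order in memory is unchanged, which matches the observation that the inserted `vector.transpose` does not affect the data layout.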
…tedType (llvm#87582) Previously the leading space was added in each string constant. This patch moves the leading space out of the string constants and adds it explicitly instead, to make the code clearer.
Reverts llvm#86812. This commit caused a regression on the x86_64 MacOS buildbot: https://green.lab.llvm.org/job/llvm.org/view/LLDB/job/lldb-cmake/784/
DeclRef to field must be marked as LValue to be consistent with how the field decl will be evaluated. T->desugar() is unnecessary to call ->isArrayType().
This is a followup to llvm#86359 "[lldb] [ObjectFileMachO] LLVM_COV is not mapped into firmware memory (llvm#86359)" where I treat LLVM_COV segments in a Mach-O binary as non-loadable. There is another codepath in `DynamicLoaderStatic::LoadAllImagesAtFileAddresses` which is called to set the load addresses for a Module to the file addresses. It has no logic to detect a segment that is not loaded in virtual memory (ObjectFileMachO::SectionIsLoadable), so it would set the load address for this LLVM_COV segment to the file address and shadow actual code, breaking lldb behavior. This method currently sets the load address for any section that doesn't have a load address set already. This presumes that a Module was added to the Target, some mechanism set the correct load address for SOME segments, and then this method is going to set the other segments to a no-slide value, assuming they were forgotten. ObjectFile base class doesn't, today, vend a SectionIsLoadable method, but we do have ObjectFile::SetLoadAddress and at a higher level, Module::SetLoadAddress, when we're setting the same slide to all segments. That's the behavior we want in this method. If any section has a load address, we don't touch this Module. Otherwise we set all sections to have a load address that is the same as the file address. I also audited the other parts of lldb that are calling SectionList::SectionLoadAddress and checked whether they should instead be using Module::SetLoadAddress for the entire binary. But in most cases, we have the potential for different slides for different sections so this section-by-section approach must be taken. rdar://125800290
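The all-or-nothing policy described above can be sketched as follows. This is a simplified Python model, not lldb code: the function name, the dictionary-based section records, and the `loadable` flag are hypothetical, and skipping non-loadable sections inside the module-wide step is an assumption about what a module-level "set load address" does.

```python
def load_module_at_file_addresses(sections, load_addresses):
    """Apply a zero slide to a module only if none of its sections is loaded yet.

    sections: list of dicts with "name", "file_addr" and "loadable" keys.
    load_addresses: dict mapping section name to an already-set load address.
    Returns True if the module was changed.
    """
    # If any section already has a load address, leave the whole module alone.
    if any(s["name"] in load_addresses for s in sections):
        return False
    for s in sections:
        # Assumption: non-loadable sections (such as LLVM_COV) are skipped.
        if s["loadable"]:
            load_addresses[s["name"]] = s["file_addr"]
    return True


module_sections = [
    {"name": "__TEXT", "file_addr": 0x1000, "loadable": True},
    {"name": "LLVM_COV", "file_addr": 0x1000, "loadable": False},
]

fresh_target = {}
changed_fresh = load_module_at_file_addresses(module_sections, fresh_target)

already_loaded = {"__TEXT": 0x5000}
changed_loaded = load_module_at_file_addresses(module_sections, already_loaded)
print(changed_fresh, fresh_target, changed_loaded, already_loaded)
```

With this policy the non-loadable segment never receives a load address, so it cannot shadow code that shares its file address.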
…VRegisterBankInfo::getInstrMapping. This removes the special case for vectors. The default case in the second switch can handle GPR in addition to vectors. We just won't use the static ValueMapping entry.
…7301) Use the return type to measure the LMUL size for latency/throughput cost
…rhs` - When both operands are constant, the matcher runs into an infinite loop as the commutation should be applied only when LHS is a constant and RHS is not. Reviewers: arsenm Reviewed By: arsenm Pull Request: llvm#87426
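The looping behavior is easy to reproduce with a toy rewrite driver. The Python sketch below is illustrative only: integers stand in for constants and strings for non-constant values, and all names (`is_const`, `rewrite_until_stable`, the two rules) are hypothetical rather than taken from the combiner.

```python
def is_const(value):
    """Treat ints as constants and anything else as a non-constant value."""
    return isinstance(value, int)


def swap_if_lhs_const(lhs, rhs):
    """Unguarded rule: fires whenever the LHS is a constant."""
    return is_const(lhs)


def swap_if_only_lhs_const(lhs, rhs):
    """Guarded rule: fires only when the LHS is constant and the RHS is not."""
    return is_const(lhs) and not is_const(rhs)


def rewrite_until_stable(lhs, rhs, should_swap, limit=10):
    """Apply the commutation rule until it stops firing or the limit is hit."""
    swaps = 0
    while should_swap(lhs, rhs) and swaps < limit:
        lhs, rhs = rhs, lhs
        swaps += 1
    return lhs, rhs, swaps


# Both operands constant: the unguarded rule never settles.
unguarded = rewrite_until_stable(1, 2, swap_if_lhs_const)
guarded = rewrite_until_stable(1, 2, swap_if_only_lhs_const)
# Constant on the left only: one swap moves it to the right.
canonical = rewrite_until_stable(1, "x", swap_if_only_lhs_const)
print(unguarded[2], guarded[2], canonical)
```

The iteration limit exists only so that the unguarded case terminates in this demonstration.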
… is vector. NFC If the type is vector, we can immediately know to use vector mapping. Previously we searched for FP uses, but then replaced it if the type was vector.
mgehre-amd
approved these changes
Aug 20, 2024
No description provided.