forked from llvm/llvm-project
[AutoBump] Merge with 267de854 (May 22) (50) #309
Merged
…y implicit class member access expressions (llvm#92318)

According to [expr.prim.id.general] p2:
> If an _id-expression_ `E` denotes a non-static non-type member of some class `C` at a point where the current class is `X` and
> - `E` is potentially evaluated or `C` is `X` or a base class of `X`, and
> - `E` is not the _id-expression_ of a class member access expression, and
> - if `E` is a _qualified-id_, `E` is not the un-parenthesized operand of the unary `&` operator,
>
> the _id-expression_ is transformed into a class member access expression using `(*this)` as the object expression.

Consider the following:
```
struct A
{
    void f0();

    template<typename T>
    void f1();
};

template<typename T>
struct B : T
{
    auto g0() -> decltype(T::f0()); // ok

    auto g1() -> decltype(T::template f1<int>()); // error: call to non-static member function without an object argument
};

template struct B<A>;
```

Clang incorrectly rejects the call to `f1` in the _trailing-return-type_ of `g1`. Furthermore, the following snippet results in a crash during codegen:
```
struct A
{
    void f();
};

template<typename T>
struct B : T
{
    template<typename U>
    static void g();

    template<>
    void g<int>()
    {
        return T::f(); // crash here
    }
};

template struct B<A>;
```

This happens because we unconditionally build a `CXXDependentScopeMemberExpr` (with an implicit object expression) for `T::f` when parsing the template definition, even though we don't know whether `g` is an implicit object member function yet. This patch fixes these issues by instead building `DependentScopeDeclRefExpr`s for such expressions, and only transforming them into implicit class member access expressions during instantiation.
Since we implemented the MS "unqualified lookup into dependent bases" extension by building an implicit class member access (and relying on the first component name of the _nested-name-specifier_ to be looked up in the context of the object expression during instantiation), we instead prepend a fake _nested-name-specifier_ that refers to the injected-class-name of the enclosing class. This patch also refactors `Sema::BuildQualifiedDeclarationNameExpr` and `Sema::BuildQualifiedTemplateIdExpr`, streamlining their implementation and removing any redundant checks.
…m#92742) Previously, `report_fatal_error` was used to report that something went wrong in the backend, but this is confusing because `report_fatal_error` basically means that something unexpected happened and the backend crashed. So, turn this "crash" into a proper error report. After this patch, clang can diagnose it: bpf-crash.c:4:30: error: Invalid usage of the XADD return value 4 | u32 next_event_id() { return __sync_fetch_and_add(&GLOBAL_EVENT_ID, 1); } | ^ 1 error generated.
I still don't see why we need to select to different Real instructions on different targets, but at least this is less verbose.
This amends 702a2b6 to hopefully get the test passing for Windows again.
Related to the poor performance of MCAssembler based constant folding (see `bool MCExpr::evaluateAsAbsolute(int64_t &Res, const MCAssembler *Asm) const` and `AttemptToFoldSymbolOffsetDifference`), commit 9500a5d (llvm#91082) caused a -O0 -g compile time regression. 9500a5d special-cased .eh_frame FDE emitting. This patch adds a special case for .debug_* emitting as well to mitigate the rest of the regression. The MCAssembler based constant folding strategy should be improved to remove the two special cases.
This allows use at other places, in particular an updated version of llvm#92307.
…se class (llvm#92597)

Consider the following:
```
template<typename T>
struct A
{
    struct B : A { };
};
```

According to [class.derived.general] p2:
> [...] A _class-or-decltype_ shall denote a (possibly cv-qualified) class type that is not an incompletely defined class; any cv-qualifiers are ignored. [...]

Although GCC and EDG reject this, Clang accepts it. This is incorrect, as `A` is incomplete within its own definition (outside of a complete-class context). This patch correctly diagnoses instances where the current instantiation is used as a base class before it is complete.

Conversely, Clang erroneously rejects the following:
```
template<typename T>
struct A
{
    struct B;

    struct C : B { };

    struct B : C { }; // error: circular inheritance between 'C' and 'A::B'
};
```

Though it may seem like no valid specialization of this template can be instantiated, an explicit specialization of either member class for an implicitly instantiated specialization of `A` would permit the definition of the other member class to be instantiated, e.g.:
```
template<>
struct A<int>::B { };

A<int>::C c; // ok
```

So this patch also does away with this error. This means that circular inheritance is diagnosed during instantiation of the definition as a consequence of requiring the base class type to be complete (matching the behavior of GCC and EDG).
…2739) Removes two XFAILed tests; the other tests are marked UNSUPPORTED only on Windows.
…ction template explicit specializations after C++14 (llvm#92449)

Clang incorrectly accepts the following when using C++14 or later:
```
struct A {
  template<typename T>
  void f() const;

  template<>
  constexpr void f<int>();
};
```

Non-static member functions declared `constexpr` are only implicitly `const` in C++11. This patch makes clang reject the explicit specialization of `f` in language modes after C++11.
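The language rule behind this commit can be sketched with a small, self-contained example. `Counter` and `countTo` are hypothetical names chosen for illustration and do not come from the patch. Assuming a C++14 or later language mode, a `constexpr` non-static member function is not implicitly `const`, so it may modify the object:

```cpp
#include <cassert>

// Hypothetical type for illustration only; not taken from the patch.
struct Counter {
  int n = 0;

  // C++14 and later: constexpr does not imply const, so this member
  // function is allowed to modify *this.
  constexpr void increment() { ++n; }

  // To be callable on a const object, const must be written explicitly.
  constexpr int value() const { return n; }
};

// Uses the non-const constexpr member during constant evaluation.
constexpr int countTo(int limit) {
  Counter c{};
  for (int i = 0; i < limit; ++i)
    c.increment();
  return c.value();
}

static_assert(countTo(3) == 3, "increment() mutates the object at compile time");
```

In C++11 mode `increment` would be implicitly `const` and the `++n` would be rejected, which is why a `const` primary template and a non-`const` `constexpr` specialization only match in C++11.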
Doh! CMake cache scripts don't have generator variables set yet, so the script can't depend on the generator variables. Instead I've added a variable that a user can specify to enable the distribution settings.
It's really great that we have the same information duplicated in TargetLibraryInfo and RuntimeLibcalls which both assume everything by default. Should fix issue reported after llvm#92287
Fixes error in GlobalISel CTLZ lowering caused by [llvm#88512](llvm#88512). --------- Co-authored-by: Leon Clark <leoclark@amd.com>
…ewritePattern (llvm#91987) * Implements `TransferWritePermutationLowering`, `TransferReadPermutationLowering` and `TransferWriteNonPermutationLowering` as a MaskableOpRewritePattern. Allowing to exit gracefully when such use of a xferOp is inside a `vector::MaskOp` * Updates MaskableOpRewritePattern to handle MemRefs and buffer semantics providing empty `Value()` as a return value for `matchAndRewriteMaskableOp` now represents successful rewriting without value to replace the original op. Split of llvm#90835
…92619) The current definition is a bit fuzzy... replace it with something that's somewhat rigorous. For functions, the definition is pretty narrow; as a consequence of language-level non-determinism, it's impossible to tell whether two functions are equivalent, so just embrace the non-determinism. For constants, we're pretty strict; otherwise you end up concluding constants can actually change value, which is bad for alias analysis. I think the C++ standard doesn't allow any non-deterministic operations in constants, so we should be okay there? Poison is per-byte to allow some ambiguity in the way padding is defined.
Similar to llvm#92613, but for types. Co-authored-by: Jeremy Kun <j2kun@users.noreply.github.com>
…92595) We need to insert a constrained canonicalize. Depends on llvm#92594
The ops supported are: `add`, `sub`, `xor`, `or`, `umax`, `uadd.sat` Proofs: https://alive2.llvm.org/ce/z/8ZMSRg The `add` case actually comes up in SPECInt, the rest are here mostly for completeness. Closes llvm#88579
…2738) This avoids the following build time warning, when building with the latest nightly Clang: warning: cast from 'FARPROC' (aka 'int (*)() __attribute__((stdcall))') to 'GetSystemTimeAsFileTimePtr' (aka 'void (*)(_FILETIME *) __attribute__((stdcall))') converts to incompatible function type [-Wcast-function-type-mismatch] This warning seems to have appeared since Clang commit 999d4f8, which restructured. The GetProcAddress function returns a `FARPROC` type, which is `int (WINAPI *)()`. Directly casting this to another function pointer type triggers this warning, but casting to a `void*` in between avoids this issue. (On Unix-like platforms, dlsym returns a `void*`, which doesn't exhibit this casting problem.)
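The cast pattern described above can be sketched portably. Everything here is a hypothetical stand-in: `FileTime`, `FarProc`, `GetTimePtr`, `fakeGetSystemTime` and `lookupProc` only imitate the Windows `FILETIME`, `FARPROC` and `GetProcAddress`, so that the sketch builds without `<windows.h>`. It assumes a platform where function pointers convert to and from `void *`, which is conditionally supported in C++ but holds on mainstream compilers:

```cpp
#include <cassert>

// Hypothetical stand-ins for the Windows types named in the commit message.
struct FileTime {
  unsigned low;
  unsigned high;
};
using FarProc = int (*)();               // plays the role of FARPROC
using GetTimePtr = void (*)(FileTime *); // the function's real type

// The function that the lookup is supposed to find.
void fakeGetSystemTime(FileTime *out) {
  out->low = 1;
  out->high = 2;
}

// Stand-in for GetProcAddress: hands the function back under a generic type.
FarProc lookupProc() {
  return reinterpret_cast<FarProc>(
      reinterpret_cast<void *>(&fakeGetSystemTime));
}

// Convert the generic pointer back to its real type. The intermediate
// void* step is what avoids a direct cast between two incompatible
// function pointer types.
GetTimePtr resolveGetTime() {
  FarProc raw = lookupProc();
  return reinterpret_cast<GetTimePtr>(reinterpret_cast<void *>(raw));
}
```

Calling `resolveGetTime()(&ft)` on a `FileTime ft` then invokes the function through a pointer of its original type, which is the only type through which the call is well defined.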
…ps (llvm#90814) Implement folding and rewrite logic to eliminate no-op tensor and memref operations. This handles two specific cases:
1. tensor.insert_slice operations where the size of the inserted slice is known to be 0.
2. memref.copy operations where either the source or target memrefs are known to be empty.

Co-authored-by: Spenser Bauman <sabauma@fastmail>
In building AddrSpaceQualType (llvm#90048), there is a bug in removeAddrSpaceQualType() for arrays. Arrays are weird because qualifiers on the element type also count as qualifiers on the type, so getSingleStepDesugaredType() can't remove the sugar on arrays. This results in an infinite loop in removeAddrSpaceQualType. To fix the issue, we use ASTContext::getUnqualifiedArrayType instead, which strips the qualifier off the element type, then reconstruct the array type.
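The remark that qualifiers on an array's element type also count as qualifiers on the array type can be checked with standard type traits. This sketch uses `const` as an analogy, since address-space qualifiers are a Clang extension; `ConstArray` and the two helper functions are hypothetical names, not part of the patch:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical alias for illustration: an array whose *element* type is const.
using ConstArray = const int[3];

// The qualifier written on the element type is reported on the array type too.
constexpr bool arrayIsConst() { return std::is_const<ConstArray>::value; }

// Stripping the qualifier from the array type strips it from the element
// type and yields the plain array type.
constexpr bool strippedMatchesPlainArray() {
  return std::is_same<std::remove_const<ConstArray>::type, int[3]>::value;
}

static_assert(arrayIsConst(), "const int[3] counts as a const type");
static_assert(strippedMatchesPlainArray(), "remove_const yields int[3]");
```

This is the same shape as the fix described above: the qualifier has to be removed from the element type and the array type rebuilt, because there is no separate layer of sugar on the array to peel off.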
…92796) This would consistently fail for me locally, to the point where I could not run ninja libc-unit-tests without ninja libc_setjmp_unittests failing. Turns out that since I enabled -ftrivial-auto-var-init=pattern in commit 1d5c16d ("[libc] default enable -ftrivial-auto-var-init=pattern (llvm#78776)") this has been a problem. Our x86_64 setjmp definition disabled -Wuninitialized, so we wound up clobbering these registers and instead backing up 0xAAAAAAAAAAAAAAAA rather than the actual register value. The implementation should be rewritten entirely. I've proposed three different ways to do so (linked below). Until we decide which way to go, at least disable this hardening feature for this function for now so that the unit tests go back to green. Link: llvm#87837 Link: llvm#88054 Link: llvm#88157 Fixes: llvm#91164
These are untested and unsupported platforms. The pattern used makes sense for platform specific error numbers, but these are platforms we do not support. Excise this code. Link: llvm#91150
) This patch changes uses of llvm::function_ref for std::function when storing the callback inside of a class. The LLVM Programmer's manual mentions that llvm::function_ref is not safe to store as it contains pointers to external memory that are not guaranteed to exist in the future when it is stored. This causes issues when setting callbacks inside of a class that manages MCA state. Passing a lambda directly to the set callback functions will end up causing UB/segfaults when the lambda is called as some external memory is now invalid. This is easy to work around (create a separate std::function, pass that into the function setting the callback), but isn't ideal.
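The lifetime issue described above can be sketched without any LLVM headers. `Pipeline` and `runWithTemporaryLambda` are hypothetical names for illustration and are not the MCA classes touched by the patch. The sketch shows only the safe pattern: a `std::function` member owns a copy of the callable, so passing a temporary lambda straight to the setter is fine.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical class for illustration; not the MCA class from the patch.
class Pipeline {
  // std::function stores its own copy of the callable (captures included),
  // so the callback stays valid after the caller's lambda is destroyed.
  std::function<int(int)> Callback;

public:
  void setCallback(std::function<int(int)> CB) { Callback = std::move(CB); }

  // Invokes the stored callback, or returns the input if none was set.
  int run(int X) const { return Callback ? Callback(X) : X; }
};

int runWithTemporaryLambda(int X) {
  Pipeline P;
  int Offset = 10;
  // The lambda object is a temporary that dies at the end of this statement.
  // A stored non-owning reference such as llvm::function_ref would dangle
  // from here on; the std::function member keeps its own copy instead.
  P.setCallback([Offset](int V) { return V + Offset; });
  return P.run(X);
}
```

The trade-off is that `std::function` may allocate and copies the callable, which is why a non-owning reference type remains attractive for parameters that are only used during the call.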
Currently only linalg.copy is recognized when trying to specialize linalg.generics back to named ops. This diff enables recognition of more generic-to-named-op cases, e.g. linalg.fill and elementwise unary/binary ops.
Use it for 2 places in LegalizeIntegerTypes that created a VP_AND.
This reverts commit 89e1f77. llvm#88270 (comment) llvm#88270 (comment) Main concerns from @nikic are the interaction between the 'IndVars' and 'LoopDeletion' passes, increasing build times and adding extra complexity.
This change adds bindings for `mlirDenseElementsAttrGet` which accepts a list of MLIR attributes and constructs a DenseElementsAttr. This allows for creating `DenseElementsAttr`s of types not natively supported by Python (e.g. BF16) without requiring other dependencies (e.g. `numpy` + `ml-dtypes`).
... back into range of the array.
…ands before folding to AVG Pulled out of llvm#92096 - ensure we have completed a topological simplification of the SRA/SRL shift operands before we try to combine to an AVG node, as it's difficult to later simplify through AVG nodes.
Look through SExt with a precondition that the operand is signed positive. https://alive2.llvm.org/ce/z/zvVVHj
…mandedOp` (llvm#92753) In `TargetLowering::ShrinkDemandedOp`, types of lhs and rhs may differ before legalization. In the original case, `VT` is `i64` and `SmallVT` is `i32`, but the type of rhs is `i8`. Then invalid truncate nodes will be created. See the description of ISD::SHL for further information: > After legalization, the type of the shift amount is known to be TLI.getShiftAmountTy(). Before legalization, the shift amount can be any type, but care must be taken to ensure it is large enough. https://github.com/llvm/llvm-project/blob/605ae4e93be8976095c7eedf5c08bfdb9ff71257/llvm/include/llvm/CodeGen/ISDOpcodes.h#L691-L712 This patch stops handling ISD::SHL in `TargetLowering::ShrinkDemandedOp` and duplicates the logic in `TargetLowering::SimplifyDemandedBits`. Additionally, it adds some additional checks like `isNarrowingProfitable` and `isTypeDesirableForOp` to improve the codegen on AArch64. Fixes llvm#92720.
Emit diagnostic messages for invalid modifiers in "reduction" clause. Fixes llvm#92397
…aintenence (llvm#92976) I had some trouble understanding why `removeReady` removed nodes from the Pending queue, since my intuition told me that the Pending queue did not represent a node that was ready. I took a deeper look and found that pickOnlyNode and pickNodeFromQueue only picked nodes from the Available queue too. I found that we need to remove nodes from the Available and Pending queues that correspond to the opposite direction from the one we ended up choosing (IsTopNode vs !IsTopNode). It took me a little longer than I would have liked to understand this fact, so I figured that I would add a comment in the code that makes it clear for future readers.
…vm#91459) OpenMP loop transformation did not work on a for-loop using an iterator or on range-based for-loops. The first reason is that it combined the iterator's type for generated loops with the type of `NumIterations` as generated for any `OMPLoopBasedDirective`, which is an integer. Fixed by basing all generated loop variables on `NumIterations`. Second, C++11 range-based for-loops include syntactic sugar that needs to be executed before the loop. This additional code is now added to the construct's Pre-Init lists. Third, C++20 added an initializer statement to range-based for-loops, which is also added to the pre-init statement. PreInits used to be a `DeclStmt`, which made it difficult to add arbitrary statements from `CXXRangeForStmt`'s syntactic sugar, especially the for-loop's init statement, which does not need to be a declaration. Change it to be a general `Stmt` that can be a `CompoundStmt` to hold arbitrary Stmts, including DeclStmts. This also avoids the `PointerUnion` workaround used by `checkTransformableLoopNest`. End-to-end tests are added to verify the expected number and order of loop execution and evaluations of expressions (such as iterator dereference). The order and number of evaluations of expressions in canonical loops is explicitly undefined by OpenMP but checked here for clarification and for changes to be noticed.
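The "syntactic sugar that needs to be executed before the loop" can be made concrete by writing a range-based for-loop next to a hand-expanded equivalent. This is a rough sketch of the expansion the C++ standard describes, not the code Clang generates; `sumRangeFor`, `sumDesugared` and the local names `Range`, `Begin` and `End` are hypothetical, and the OpenMP directives are left out so that it builds without `-fopenmp`:

```cpp
#include <cassert>
#include <vector>

// The loop as a user would write it.
int sumRangeFor(const std::vector<int> &Values) {
  int Sum = 0;
  for (int V : Values)
    Sum += V;
  return Sum;
}

// Roughly what the range-based loop above stands for.
int sumDesugared(const std::vector<int> &Values) {
  int Sum = 0;
  // These three declarations run exactly once, before the loop itself.
  // They are the kind of statements that have to be placed ahead of a
  // transformed loop.
  auto &&Range = Values;
  auto Begin = Range.begin();
  auto End = Range.end();
  for (; Begin != End; ++Begin) {
    int V = *Begin; // the loop variable is re-initialized every iteration
    Sum += V;
  }
  return Sum;
}
```

Both functions return the same result for any input; the second form makes visible that the iteration count depends on `Begin` and `End`, which only exist once the leading declarations have run.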
This function will return nullptr instead of returning a constant expression now, so be sure to handle that. Fixes llvm#93017.
Redefines the amd_kernel_code_t struct with MCExprs for members that would be derived from SIProgramInfo MCExpr members.
This commit eliminates a redundant matcher subexpression from the implementation of the "sizeof-pointer-to-aggregate" part of the clang-tidy check `bugprone-sizeof-expression`. I'm fairly certain that anything that was previously matched by the deleted matcher `StructAddrOfExpr` is also covered by the more general `PointerToStructExpr` (which remains in the same `anyOf`). This commit is made to "prepare the ground" for a followup change that would merge the functionality of the Clang Static Analyzer checker `alpha.core.SizeofPtr` into this clang-tidy check.
I believe these were forgotten when copying the clang in llvm#86816. This was flagged because the CHECK lines for CHECK-LD-ANY* had no associated RUN line. See llvm#92387 (comment)
This resolves an older FIXME comment.
…m#92548) This patch overrides the clearsSuperRegisters method defined in MCInstrAnalysis to identify register writes that clear the upper portion of all super-registers on AArch64 architecture. On AArch64, a write to a general-purpose register of 32-bit data size is defined to use the lower 32-bits of the register and zero extend the upper 32-bits. Similarly, SIMD and FP instructions operating on scalar data only access the lower bits of the SIMD&FP register. The unused upper bits are cleared to zero on a write. This also applies to SIMD vector registers when the element size in bits multiplied by the number of lanes is lower than 128. The upper 64 bits of the vector register are cleared to zero on a write.
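The 32-bit general-purpose register rule can be modelled in a few lines. This is a simplified model of the architectural behaviour described above, not code from the patch; `writeW` is a hypothetical helper that treats a 64-bit X register as a `uint64_t`:

```cpp
#include <cassert>
#include <cstdint>

// Models a write to a 32-bit W register. The new value lands in the low
// 32 bits of the enclosing 64-bit X register and the upper 32 bits become
// zero, whatever the register held before.
uint64_t writeW(uint64_t OldX, uint32_t Value) {
  (void)OldX; // the previous contents do not survive the write
  return static_cast<uint64_t>(Value);
}
```

For example, `writeW(0xFFFFFFFFFFFFFFFF, 0x12345678)` yields `0x0000000012345678`: no stale upper bits remain, which is the property `clearsSuperRegisters` reports.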
cferry-AMD approved these changes Aug 26, 2024
An error occurred while trying to automatically change base from bump_to_de483ad5 to feature/fused-ops September 4, 2024 05:02
No description provided.