Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AutoBump] Merge with ac1f2de7 (33) #293

Merged
merged 26 commits into from
Aug 22, 2024
Merged

Conversation

cferry-AMD
Copy link
Collaborator

No description provided.

Hardcode84 and others added 26 commits April 16, 2024 12:39
Instead of iterating over potential induction var uses looking for
suitable `arith.addi`, try to trace it back from yield argument.
This patch updates the definition of `omp.taskloop` to enforce the
restrictions of a wrapper operation.
We don't need to consider the offset here anymore since we now
have proper integral pointers.
llvm#88475)

…return only Load since other output is chain.

Added testcase that showed mismatched expected arity when Load and chain
were returned as separate items after
003b58f
…lvm#86963)

This patch performs several cleanups with the main purpose of
normalizing the code patterns used to trigger codegen for MLIR OpenMP
operations and making the processing of clauses and constructs
independent. The following changes are made:

- Clean up unused `directive` argument to
`ClauseProcessor::processMap()`.
- Move general helper functions in OpenMP.cpp to the appropriate section
of the file.
- Create `gen<OpName>Clauses()` functions containing the clause
processing code specific for the associated OpenMP construct.
- Update `gen<OpName>Op()` functions to call the corresponding
`gen<OpName>Clauses()` function.
- Sort calls to `ClauseProcessor::process<ClauseName>()` alphabetically,
to avoid inadvertently relying on some arbitrary order. Update some
tests that broke due to the order change.
- Normalize `genOMP()` functions so they all delegate the generation of
MLIR to `gen<OpName>Op()` functions following the same pattern.
- Only process `nowait` clause on `TARGET` constructs if not compiling
for the target device.

A later patch can move the calls to `gen<OpName>Clauses()` out of
`gen<OpName>Op()` functions and passing completed clause structures
instead, in preparation to supporting composite constructs. That will
make it possible to reuse clause processing for a given leaf construct
when appearing alone or in a combined or composite construct, while
controlling where the associated code is produced.
Check if the non-null function pointer is even valid before calling
the function.
In LoongArch psABI v2.30, the R_LARCH_ALIGN requires symbol index to
support the third parameter of alignment directive. Create symbol for
each section is redundant because they have section symbol which can
also be used as symbol index. So use section symbol directly for
R_LARCH_ALIGN.
…#88455)

Currently neither the SPIR nor the SPIRV targets specify the AS for
globals in their datalayout strings. This is problematic because
CodeGen/LLVM will default to AS0 in this case, which produces Globals
that end up in the private address space for e.g. OCL, HIPSPV or SYCL.
This patch addresses it by completing the datalayout string.
…m#86312)

Fixes llvm#76609
This patch does:
- relax the phis constraint in `CanRedirectPredsOfEmptyBBToSucc`
- guarantee the BB has multiple different predecessors to redirect, so
that we can handle the case without phis in BB. Without this change and
phi constraint, we may redirect the CommonPred.

The motivation is consistent with JumpThreading. We always want the
branch to jump more direct to the destination, without passing the
middle block. In this way, we can expose more other optimization
opportunities.

An obivous example proposed by @dtcxzyw is like:
```llvm
define i32 @test(...) {
entry:
   br i1 %c, label %do.end, label %if.then

if.then:                                          ; preds = %entry
   %call2 = call i32 @dummy()
   %tobool3.not = icmp eq i32 %call2, 0
   br i1 %tobool3.not, label %do.end, label %return

do.end:                                           ; preds = %entry, %if.then
   br label %return

return:                                           ; preds = %if.then, %do.end
   %retval.0 = phi i32 [ 0, %do.end ], [ %call2, %if.then ]
   ret i32 %retval.0
}
```
`entry` can directly jump to return, without passing `do.end`, and then
the if-else pattern can be simplified further:
```llvm
define i32 @test(...) {
entry:
   br i1 %c, label %return, label %if.then

if.then:                                          ; preds = %entry
   %call2 = call i32 @dummy()
   br label %return

return:                                           ; preds = %if.then
   %retval.0 = phi i32 [ 0, %entry ], [ %call2, %if.then ]
   ret i32 %retval.0
}
```
For G_LOAD and G_STORE we want this information during regbankselect.
Today we treat load dest as integer and insert converts.

---------

Co-authored-by: Evgenii Kudriashov <evgenii.kudriashov@intel.com>
)"

This reverts commit f4960da.
Includes a fix for the MLIR test case.
Fixes:
/buildbot/worker/arc-folder/llvm-project/clang/lib/AST/Interp/Disasm.cpp:143:25: warning: cast from type 'const clang::interp::Block*' to type 'void*' casts away qualifiers [-Wcast-qual]
/buildbot/worker/arc-folder/llvm-project/clang/lib/AST/Interp/Disasm.cpp:271:23: warning: cast from type 'const clang::interp::Block*' to type 'void*' casts away qualifiers [-Wcast-qual]
…_subvector(y,c2-c1) (llvm#87925) (REAPPLIED)

If the extract_subvector is cheap, attempt to extract directly from an inserted subvector

Reapplied with a check to ensure we only attempt this for fixed vectors
…8222)

This is necessary to ensure that functions declared in different
translation units whose parameter types only differ in top-level
cv-qualification generate the same USR.

For example:
```
// A.cpp
void f(const int x); // c:@f@f#1I#

// B.cpp
void f(int x);       // c:@f@f#I#
``` 
With this patch, the USR for both functions will be
`c:@f@f#I#`.
…lvm#88762)

This adds a simple rewrite/legalization to decompose constant splats
larger than a single ArmSME tile into multiple SME virtual tile sized
splats. E.g. a constant splat to `vector<[8]x[8]xi32>` would decompose
into four `vector<[4]x[4]xi32>` splats.
…8733)

At the moment there is no support for vector.shuffle for scalable
vectors - various hooks/helpers related to `vector.shuffle` simply
ignore the scalable flags (e.g. ` ShuffleOp::inferReturnTypes`).

This is unlikely to change any time soon (vector shuffles are known to
be tricky for scalable vectors), hence this patch restricts
`vector.shuffle` to fixed width vectors.
__builtin_is_aligned
__builtin_is_align_up
__builtin_is_align_down
…lvm#87736)

Bug fix: Handle RVV return type in calling convention correctly.
Return values are handled in a same way as function arguments.
One thing to mention is that if a type can be broken down into
homogeneous
vector types, e.g. {<vscale x 4 x i32>, {<vscale x 4 x i32>, <vscale x 4
x i32>}},
it is considered as a vector tuple type and need to be handled by tuple
type rule.
…#88689)

Co-authored-by: Frederik Harwath <fharwath@amd.com>
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
Base automatically changed from bump_to_61717c1a to feature/fused-ops August 21, 2024 20:25
An error occurred while trying to automatically change base from bump_to_61717c1a to feature/fused-ops August 21, 2024 20:25
@mgehre-amd mgehre-amd merged commit 8acc0d4 into feature/fused-ops Aug 22, 2024
5 checks passed
@mgehre-amd mgehre-amd deleted the bump_to_ac1f2de7 branch August 22, 2024 06:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.