Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[build] Update IMEX version #270

Merged
merged 5 commits into from
Aug 23, 2024
Merged

[build] Update IMEX version #270

merged 5 commits into from
Aug 23, 2024

Conversation

Menooker
Copy link

@Menooker Menooker commented Aug 21, 2024

The IMEX upstream has rebased its LLVM version to a recent commit (early August). In this PR, we update the LLVM commit for IMEX build and the IMEX commit as well. The IMEX commit points to the newest main-branch commit of the repo https://github.com/intel/mlir-extensions
For now, We don't need self-hosted IMEX fork. We still need a patch on IMEX at Menooker/mlir-extensions#1 from @dchigarev . Currently still using a forked IMEX, but the patch seems a bug fix and might be able to be pushed upstream.

@@ -83,6 +83,12 @@ struct TilesArray {
SmallVector<SmallVector<Value>> tileMatrix;
};

static xegpu::TensorDescType getTensorDescType(llvm::ArrayRef<int64_t> shape,
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The IMEX has originally a single function like:

TensorDescType get(ArrayRef<int64_t> shape, Type type, int aaa = 1, bool bbb = true);

And they added another similar overload function:

TensorDescType get(ArrayRef<int64_t> shape, Type type, int aaa = 1, SomeOtherType bbb = {});

This makes the default arguments unusable, xegpu::TensorDescType::get(shape, elementType) will end up with ambitious overloading error.

@Menooker Menooker requested a review from leshikus August 21, 2024 03:30
@Menooker
Copy link
Author

@leshikus @WangJialei-A It seems that updating llvm-imex commit hash will not trigger re-build of LLVM in CI? Would you please fix that?

@Menooker Menooker linked an issue Aug 21, 2024 that may be closed by this pull request
@dchigarev
Copy link
Contributor

@leshikus @WangJialei-A It seems that updating llvm-imex commit hash will not trigger re-build of LLVM in CI? Would you please fix that?

I think you just simply need to add cmake/llvm-version-imex.txt here:

  push:
    paths:
      - cmake/llvm-version.txt
+     - cmake/llvm-version-imex.txt
      - .github/workflows/build-llvm.yml

scripts/compile.sh Outdated Show resolved Hide resolved
Menooker and others added 2 commits August 21, 2024 17:04
Co-authored-by: Dmitry Chigarev <dmitry.chigarev@intel.com>
@Menooker
Copy link
Author

@leshikus @WangJialei-A It seems that updating llvm-imex commit hash will not trigger re-build of LLVM in CI? Would you please fix that?

I think you just simply need to add cmake/llvm-version-imex.txt here:

  push:
    paths:
      - cmake/llvm-version.txt
+     - cmake/llvm-version-imex.txt
      - .github/workflows/build-llvm.yml

Thanks for your suggestions, @dchigarev !

@Menooker
Copy link
Author

Current status: Test failed on f16_matmul_32x32.mlir
On my local test env, I got out of resource when calling clFinish to wait for the GPU queue, and this might indicate that GPU hangs after running the kernel. In the CI env, the execution succeeds with wrong results (all zero).

@dchigarev
Copy link
Contributor

Current status: Test failed on f16_matmul_32x32.mlir On my local test env, I got out of resource when calling clFinish to wait for the GPU queue, and this might indicate that GPU hangs after running the kernel. In the CI env, the execution succeeds with wrong results (all zero).

There was a patch in the dev branch of IMEX that fixes that (Menooker/mlir-extensions#1), it seems that it wasn't propagated to the new main branch

@Menooker
Copy link
Author

Current status: Test failed on f16_matmul_32x32.mlir On my local test env, I got out of resource when calling clFinish to wait for the GPU queue, and this might indicate that GPU hangs after running the kernel. In the CI env, the execution succeeds with wrong results (all zero).

There was a patch in the dev branch of IMEX that fixes that (Menooker/mlir-extensions#1), it seems that it wasn't propagated to the new main branch

Many thanks, @dchigarev ! I have rebased that to the current HEAD of main branch of IMEX and it works. Are you interested in submitting your PR to IMEX upstream?

@dchigarev
Copy link
Contributor

Are you interested in submitting your PR to IMEX upstream?

Yes, will do

@kurapov-peter kurapov-peter merged commit 4f771d5 into main Aug 23, 2024
5 checks passed
@kurapov-peter kurapov-peter deleted the yijie/rebase_imex branch August 23, 2024 14:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update IMEX's llvm version
3 participants