
mlir_tblgen is broken for cross compile #1094

Closed · powderluv opened this issue Jul 21, 2022 · 36 comments

@powderluv (Collaborator) commented Jul 21, 2022

Upstream MLIR has some quirks when exporting MLIR_TBLGEN into the CMake PARENT_SCOPE: https://github.com/llvm/llvm-project/blob/07b749800c5cd4105d49ab46be5f0a2079dd709a/mlir/CMakeLists.txt#L151-L156

Some of the tools, like mlir-linalg-ods-yaml-gen, do the right thing for cross compilation with https://github.com/llvm/llvm-project/blob/27945f9282030136cb8b043b91b229ea2758c9ed/mlir/tools/mlir-linalg-ods-gen/CMakeLists.txt#L23-L35

We have some hacks on top of the "directly include tools/ in the top level mlir/CMakeLists.txt" hack here:

# In-tree build with LLVM_EXTERNAL_PROJECTS=torch-mlir
# FIXME: This should really be inherited from the LLVM tree. In particular,
# it's going to change when cross-compiling.
set(MLIR_TABLEGEN_EXE mlir-tblgen)
if (TORCH_MLIR_ENABLE_MHLO)
  set(MLIR_PDLL_TABLEGEN_EXE mlir-pdll)
endif()

This causes the arm64 cross compile for Apple M1 to fail on the macOS GitHub Actions runner, e.g. https://github.com/llvm/torch-mlir/runs/7441128393?check_suite_focus=true

cd /Users/runner/work/torch-mlir/torch-mlir/build && /Users/runner/work/torch-mlir/torch-mlir/build/bin/mlir-tblgen -gen-pass-decls -DTORCH_MLIR_ENABLE_MHLO -I /Users/runner/work/torch-mlir/torch-mlir/include/torch-mlir/Conversion -I/Users/runner/work/torch-mlir/torch-mlir/build/include -I/Users/runner/work/torch-mlir/torch-mlir/externals/llvm-project/llvm/include -I/Users/runner/work/torch-mlir/torch-mlir/externals/mlir-hlo/include -I/Users/runner/work/torch-mlir/torch-mlir/build/tools/torch-mlir/mlir-hlo/include -I/Users/runner/work/torch-mlir/torch-mlir/externals/llvm-project/llvm/../mlir/include -I/Users/runner/work/torch-mlir/torch-mlir/build/tools/mlir/include -I/Users/runner/work/torch-mlir/torch-mlir/include -I/Users/runner/work/torch-mlir/torch-mlir/build/tools/torch-mlir/include /Users/runner/work/torch-mlir/torch-mlir/include/torch-mlir/Conversion/Passes.td --write-if-changed -o tools/torch-mlir/include/torch-mlir/Conversion/Passes.h.inc -d tools/torch-mlir/include/torch-mlir/Conversion/Passes.h.inc.d
/bin/sh: /Users/runner/work/torch-mlir/torch-mlir/build/bin/mlir-tblgen: Bad CPU type in executable 
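
For illustration, here is a rough sketch of how that hack could be made cross-compile aware. This is not actual torch-mlir code; the CMAKE_CROSSCOMPILING guard, LLVM_BINARY_DIR, and the NATIVE/bin layout are assumptions that would need checking against the LLVM build:

# Hypothetical sketch only: prefer whatever the LLVM/MLIR build already
# exported and fall back to the in-tree target names for native builds.
if(NOT CMAKE_CROSSCOMPILING)
  set(MLIR_TABLEGEN_EXE mlir-tblgen)
  if(TORCH_MLIR_ENABLE_MHLO)
    set(MLIR_PDLL_TABLEGEN_EXE mlir-pdll)
  endif()
elseif(NOT MLIR_TABLEGEN_EXE)
  # Assumption: with -DLLVM_USE_HOST_TOOLS=ON the host copies end up under
  # <build>/NATIVE/bin; the exact layout may differ between LLVM revisions.
  set(MLIR_TABLEGEN_EXE "${LLVM_BINARY_DIR}/NATIVE/bin/mlir-tblgen")
endif()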
@jpienaar (Member) commented:

And locally on an M1 this should repro with

mkdir build
  cd build
  cmake $GITHUB_WORKSPACE/externals/llvm-project/llvm -GNinja \
    -DCMAKE_BUILD_TYPE=Release \
    -DCMAKE_LINKER=lld \
    -DCMAKE_C_COMPILER_LAUNCHER=ccache -DCMAKE_CXX_COMPILER_LAUNCHER=ccache \
    -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ \
    -DPython3_EXECUTABLE=$(which python) \
    -DLLVM_ENABLE_ASSERTIONS=ON \
    -DLLVM_ENABLE_PROJECTS=mlir \
    -DLLVM_EXTERNAL_PROJECTS="torch-mlir;torch-mlir-dialects" \
    -DLLVM_EXTERNAL_TORCH_MLIR_SOURCE_DIR="$GITHUB_WORKSPACE" \
    -DLLVM_EXTERNAL_TORCH_MLIR_DIALECTS_SOURCE_DIR="${GITHUB_WORKSPACE}/external/llvm-external-projects/torch-mlir-dialects" \
    -DMLIR_ENABLE_BINDINGS_PYTHON=ON \
    -DTORCH_MLIR_USE_INSTALLED_PYTORCH=OFF \
    -DCMAKE_OSX_ARCHITECTURES=arm64 \
    -DMACOSX_DEPLOYMENT_TARGET=12.0 \
    -DLLVM_TARGETS_TO_BUILD=AArch64 \
    -DLLVM_USE_HOST_TOOLS=ON \
    -DLLVM_TARGETS_TO_BUILD=host
  ninja

?

@powderluv (Collaborator, Author) commented:

On an M1 you would want to build x86_64 then :D Let me try on my M1 and post a command.

@jpienaar (Member) commented:

Thanks!

@powderluv (Collaborator, Author) commented:

If you don't have Rosetta you can do:

(mlir_venv) anush@MacBook-Pro build % cmake externals/llvm-project/llvm -B build -GNinja \                                                
    -DCMAKE_BUILD_TYPE=Release \
    -DCMAKE_LINKER=lld \
    -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ \
    -DPython3_EXECUTABLE=$(which python) \
    -DLLVM_ENABLE_ASSERTIONS=ON \
    -DLLVM_ENABLE_PROJECTS=mlir \
    -DLLVM_EXTERNAL_PROJECTS="torch-mlir;torch-mlir-dialects" \
    -DLLVM_EXTERNAL_TORCH_MLIR_SOURCE_DIR=`pwd` \
    -DLLVM_EXTERNAL_TORCH_MLIR_DIALECTS_SOURCE_DIR=`pwd`/external/llvm-external-projects/torch-mlir-dialects \
    -DMLIR_ENABLE_BINDINGS_PYTHON=ON \
    -DTORCH_MLIR_USE_INSTALLED_PYTORCH=OFF \
    -DCMAKE_OSX_ARCHITECTURES=x86_64 \
    -DMACOSX_DEPLOYMENT_TARGET=12.0 \
    -DLLVM_TARGETS_TO_BUILD=X86 \
    -DLLVM_USE_HOST_TOOLS=ON

Disabling Rosetta 2 can be complicated, I think (from https://developer.apple.com/forums/thread/669486):

1. Obtain a list of files/directories and LaunchAgents with: pkgutil --files com.apple.pkg.RosettaUpdateAuto
2. Save them in a way that you can access them in recovery.
3. Boot into recovery.
4. Open a terminal in recovery (btw, to load recovery on M1 Macs you long-press the power button instead of holding CMD+R).
5. Run csrutil disable and confirm (temporarily disables SIP).
6. Reboot.
7. Delete the files listed in step 1 (in my case it was enough to delete /Library/Apple/usr/share/rosetta and /Library/Apple/usr/libexec with all their contents).
8. Reboot back into the recovery terminal.
9. Run csrutil enable and confirm.

@powderluv (Collaborator, Author) commented:

Looks like the native tools are being built:

(mlir_venv) anush@MacBook-Pro build % file NATIVE/bin/mlir-tblgen 
NATIVE/bin/mlir-tblgen: Mach-O 64-bit executable arm64
(mlir_venv) anush@MacBook-Pro build % file bin/mlir-tblgen 
bin/mlir-tblgen: Mach-O 64-bit executable x86_64

So we may just need to point at the right binary when -DLLVM_USE_HOST_TOOLS=ON is set.

@powderluv (Collaborator, Author) commented:

@jpienaar I made an easier reproducer for you on Ubuntu / Linux.

Install ARM cross-compile toolchain:

sudo apt-get install gcc-arm-linux-gnueabihf g++-arm-linux-gnueabihf

Build torch-mlir with:

ubuntu:~/github/torch-mlir$ cmake -GNinja -Bbuild   -DCMAKE_BUILD_TYPE=Release   -DCMAKE_C_COMPILER=arm-linux-gnueabihf-gcc   -DCMAKE_CXX_COMPILER=arm-linux-gnueabihf-g++   -DPython3_FIND_VIRTUALENV=ONLY   -DLLVM_ENABLE_PROJECTS=mlir   -DLLVM_EXTERNAL_PROJECTS="torch-mlir;torch-mlir-dialects"   -DLLVM_EXTERNAL_TORCH_MLIR_SOURCE_DIR=`pwd`   -DLLVM_EXTERNAL_TORCH_MLIR_DIALECTS_SOURCE_DIR=`pwd`/externals/llvm-external-projects/torch-mlir-dialects   -DMLIR_ENABLE_BINDINGS_PYTHON=OFF -DTORCH_MLIR_USE_INSTALLED_PYTORCH=OFF  -DLLVM_TARGETS_TO_BUILD=AArch64 -DLLVM_USE_HOST_TOOLS=ON externals/llvm-project/llvm && cmake --build build --target check-torch-mlir-all

It will fail with:

[464/2371] Building passes.h.inc...
FAILED: tools/torch-mlir/mlir-hlo/include/mlir-hlo/Dialect/gml_st/transforms/passes.h.inc /home/anush/github/torch-mlir/build/tools/torch-mlir/mlir-hlo/include/mlir-hlo/Dialect/gml_st/transforms/passes.h.inc 
cd /home/anush/github/torch-mlir/build && /home/anush/github/torch-mlir/build/bin/mlir-tblgen -gen-pass-decls -name GmlSt -I /home/anush/github/torch-mlir/externals/mlir-hlo/include/mlir-hlo/Dialect/gml_st/transforms -I/home/anush/github/torch-mlir/build/include -I/home/anush/github/torch-mlir/externals/llvm-project/llvm/include -I/home/anush/github/torch-mlir/externals/llvm-project/llvm/../mlir/include -I/home/anush/github/torch-mlir/build/tools/mlir/include -I/home/anush/github/torch-mlir/externals/llvm-project/llvm/../mlir/include -I/home/anush/github/torch-mlir/build/tools/mlir/include -I/home/anush/github/torch-mlir/externals/mlir-hlo/include -I/home/anush/github/torch-mlir/build/tools/torch-mlir/mlir-hlo/include -I/home/anush/github/torch-mlir/build/tools/torch-mlir/mlir-hlo /home/anush/github/torch-mlir/externals/mlir-hlo/include/mlir-hlo/Dialect/gml_st/transforms/passes.td --write-if-changed -o tools/torch-mlir/mlir-hlo/include/mlir-hlo/Dialect/gml_st/transforms/passes.h.inc -d tools/torch-mlir/mlir-hlo/include/mlir-hlo/Dialect/gml_st/transforms/passes.h.inc.d
/bin/sh: 1: /home/anush/github/torch-mlir/build/bin/mlir-tblgen: Exec format error
ninja: build stopped: subcommand failed.

You can see that the NATIVE binary is not built:

ubuntu:~/github/torch-mlir$ find build  -name mlir-tblgen  | xargs file
build/tools/mlir/tools/mlir-tblgen:        directory
build/bin/mlir-tblgen:                     ELF 32-bit LSB pie executable, ARM, EABI5 version 1 (GNU/Linux), dynamically linked, interpreter /lib/ld-linux-armhf.so.3, BuildID[sha1]=d15ff4888c68e3084c3e26e9c1961595954bb549, for GNU/Linux 3.2.0, not stripped
build/NATIVE/tools/mlir/tools/mlir-tblgen: directory

@jpienaar (Member) commented:

Great, I'm in a PC meeting today but will try checking soon. I looked a bit yesterday and did see that add_tablegen considers cross compilation, so it seems likely the problem is in the "nesting", where a different setting is needed.

@marbre (Member) commented Jul 22, 2022

Seems related to what is described in https://llvm.org/docs/HowToCrossCompileLLVM.html:

The TableGen options are required to compile it with the host compiler, so you’ll need to compile LLVM (or at least llvm-tblgen) to your host platform before you start.

When cross-compiling IREE for bare-metal Arm, we need a two-stage approach as well: first compile the host tools, then compile for the target platform.

@powderluv (Collaborator, Author) commented:

@marbre yeah, but it should be automatic for LLVM projects with -DLLVM_USE_HOST_TOOLS=ON. I think if we fix the "directly include tools/ in the top level mlir/CMakeLists.txt" hack it should automatically work.

@marbre (Member) commented Jul 26, 2022

Build torch-mlir with:

ubuntu:~/github/torch-mlir$ cmake -GNinja -Bbuild   -DCMAKE_BUILD_TYPE=Release   -DCMAKE_C_COMPILER=arm-linux-gnueabihf-gcc   -DCMAKE_CXX_COMPILER=arm-linux-gnueabihf-g++   -DPython3_FIND_VIRTUALENV=ONLY   -DLLVM_ENABLE_PROJECTS=mlir   -DLLVM_EXTERNAL_PROJECTS="torch-mlir;torch-mlir-dialects"   -DLLVM_EXTERNAL_TORCH_MLIR_SOURCE_DIR=`pwd`   -DLLVM_EXTERNAL_TORCH_MLIR_DIALECTS_SOURCE_DIR=`pwd`/externals/llvm-external-projects/torch-mlir-dialects   -DMLIR_ENABLE_BINDINGS_PYTHON=OFF -DTORCH_MLIR_USE_INSTALLED_PYTORCH=OFF  -DLLVM_TARGETS_TO_BUILD=AArch64 -DLLVM_USE_HOST_TOOLS=ON externals/llvm-project/llvm && cmake --build build --target check-torch-mlir-all

I am quite certain that this one has to fail, since you're passing -DCMAKE_C_COMPILER=arm-linux-gnueabihf-gcc -DCMAKE_CXX_COMPILER=arm-linux-gnueabihf-g++.

@marbre yeah, but it should be automatic for LLVM projects with -DLLVM_USE_HOST_TOOLS=ON.

I would need to take a closer look at what is behind this. I might have some time this evening.

I think if we fix the "directly include tools/ in the top level mlir/CMakeLists.txt" hack it should automatically work.

Can you elaborate on this? Do you suggest shuffling https://github.com/llvm/llvm-project/blob/28e665fa054d62d4e2c777774cc83dea533dfe6e/mlir/CMakeLists.txt#L154 around?
Edit: I think you mean a fix similar to https://github.com/llvm/llvm-project/blob/27945f9282030136cb8b043b91b229ea2758c9ed/mlir/tools/mlir-linalg-ods-gen/CMakeLists.txt#L23-L35 for mlir-tblgen.
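
For context, a rough sketch of the pattern that mlir-linalg-ods-gen's CMakeLists uses, transposed onto mlir-tblgen. This is paraphrased rather than the verbatim upstream file; the exact build_native_tool() arguments and the *_EXE/*_TARGET variable and target names should be checked against the linked source:

if(NOT CMAKE_CROSSCOMPILING)
  set(MLIR_TABLEGEN_EXE mlir-tblgen PARENT_SCOPE)
  set(MLIR_TABLEGEN_TARGET mlir-tblgen PARENT_SCOPE)
else()
  # Build a host copy of the generator and export its path so that tablegen
  # rules run the native binary instead of the cross-compiled one.
  build_native_tool(mlir-tblgen MLIR_TABLEGEN_NATIVE_EXE DEPENDS mlir-tblgen)
  add_custom_target(mlir-tblgen-host DEPENDS ${MLIR_TABLEGEN_NATIVE_EXE})
  set(MLIR_TABLEGEN_EXE ${MLIR_TABLEGEN_NATIVE_EXE} PARENT_SCOPE)
  set(MLIR_TABLEGEN_TARGET mlir-tblgen-host PARENT_SCOPE)
endif()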

@powderluv (Collaborator, Author) commented:

Yes, the latter.

I think you mean a fix similar to https://github.com/llvm/llvm-project/blob/27945f9282030136cb8b043b91b229ea2758c9ed/mlir/tools/mlir-linalg-ods-gen/CMakeLists.txt#L23-L35 for mlir-tblgen.

@jpienaar (Member) commented:

https://reviews.llvm.org/D130350 relevant here.

@powderluv (Collaborator, Author) commented:

I don't think that fixes the issue. With the ARM reproducer above you still get:

FAILED: include/llvm/IR/IntrinsicsR600.h /home/anush/github/torch-mlir/build/include/llvm/IR/IntrinsicsR600.h 
cd /home/anush/github/torch-mlir/build && /home/anush/github/torch-mlir/build/NATIVE/bin/llvm-tblgen -gen-intrinsic-enums -intrinsic-prefix=r600 -I /home/anush/github/torch-mlir/externals/llvm-project/llvm/include/llvm/IR -I/home/anush/github/torch-mlir/build/include -I/home/anush/github/torch-mlir/externals/llvm-project/llvm/include /home/anush/github/torch-mlir/externals/llvm-project/llvm/include/llvm/IR/Intrinsics.td --write-if-changed -o include/llvm/IR/IntrinsicsR600.h -d include/llvm/IR/IntrinsicsR600.h.d
/bin/sh: 1: /home/anush/github/torch-mlir/build/NATIVE/bin/llvm-tblgen: Exec format error
ninja: build stopped: subcommand failed.
1 anush@nod-shared-a100-ubuntu:~/github/torch-mlir$ file /home/anush/github/torch-mlir/build/NATIVE/bin/llvm-tblgen
/home/anush/github/torch-mlir/build/NATIVE/bin/llvm-tblgen: ELF 32-bit LSB pie executable, ARM, EABI5 version 1 (GNU/Linux), dynamically linked, interpreter /lib/ld-linux-armhf.so.3, BuildID[sha1]=eddee09609986fae38bfca74602ec8def8646241, for GNU/Linux 3.2.0, not stripped

@marbre (Member) commented Jul 28, 2022

I don't think that fixes the issue. With the ARM reproducer above you still get:

Do you still pass -DCMAKE_C_COMPILER=arm-linux-gnueabihf-gcc -DCMAKE_CXX_COMPILER=arm-linux-gnueabihf-g++?

Anyway, I can offer to take over and look into the issue in more detail next week (OOO tomorrow). I already dug into the cross-compiling mechanism LLVM relies on and into add_tablegen().

@powderluv (Collaborator, Author) commented:

Yes, I did pass that and still no luck.

@powderluv (Collaborator, Author) commented:

Thank you for offering to take over.

@jpienaar (Member) commented:

Indeed, thanks!

marbre self-assigned this Jul 28, 2022
@marbre (Member) commented Jul 28, 2022

Yes, I did pass that and still no luck.

Well, one thing is that arm-none-linux-gnueabihf-g++ won't be able to produce the native binaries for x86. It only supports Arm targets. So you will need to use a compiler that has target support for Arm + x86 (e.g. a Clang build with multiple targets enabled) or you'll need a multi-stage compilation.

Looking into solving this issue next week :)

@powderluv (Collaborator, Author) commented:

With the host-tools flag set, LLVM builds the native tools; mlir-linalg-ods-yaml-gen handles this well.

@powderluv (Collaborator, Author) commented:

@marbre any luck with this issue?

@u99127 (Contributor) commented Aug 4, 2022

arm-linux-gnueabihf-gcc is a cross compiler that produces AArch32 code suitable for executing in the AArch32 ISA state under a Linux environment with glibc. Is that what you are looking for here?

Ramana

@powderluv (Collaborator, Author) commented:

I think the arm-linux setup was just an easy repro case for folks to try out without requiring a macOS install. In all cases MLIR_TABLEGEN doesn't seem to respect -DLLVM_USE_HOST_TOOLS=ON the way mlir-linalg-ods-yaml-gen does. That requires us to build the host tools once (~3500+ files) and then use the host tools to cross compile (another ~3500 files).

A fix was attempted with https://reviews.llvm.org/D130350 but that doesn't fix it.

@u99127 (Contributor) commented Aug 4, 2022

Ah I see - sorry about the noise.

Ramana

@marbre (Member) commented Aug 5, 2022

@marbre any luck with this issue ?

I am still on it.

In all cases MLIR_TABLEGEN doesn't seem to respect -DLLVM_USE_HOST_TOOLS=ON like mlir-linalg-ods-yaml-gen does.

@powderluv What behavior exactly do you expect when setting LLVM_USE_HOST_TOOLS to ON? Also mlir-linalg-ods-yaml-gen calls build_native_tool and therefore should build the native tool, so I am not sure I fully understand what you think the expected behavior should look like.

@powderluv (Collaborator, Author) commented Aug 5, 2022

When -DLLVM_USE_HOST_TOOLS=ON is set, any required host tools should be built with build_native_tool(), and the cross compile here (#1094 (comment)) should just work and not fail. The mlir-linalg-ods-yaml-gen tool is built correctly and works as expected, using the host tool during the cross compile. However, mlir-tblgen sets up the NATIVE tools correctly, but somehow it is not built correctly.

So, the expected outcome of

ubuntu:~/github/torch-mlir$ cmake -GNinja -Bbuild   -DCMAKE_BUILD_TYPE=Release   -DCMAKE_C_COMPILER=arm-linux-gnueabihf-gcc   -DCMAKE_CXX_COMPILER=arm-linux-gnueabihf-g++   -DPython3_FIND_VIRTUALENV=ONLY   -DLLVM_ENABLE_PROJECTS=mlir   -DLLVM_EXTERNAL_PROJECTS="torch-mlir;torch-mlir-dialects"   -DLLVM_EXTERNAL_TORCH_MLIR_SOURCE_DIR=`pwd`   -DLLVM_EXTERNAL_TORCH_MLIR_DIALECTS_SOURCE_DIR=`pwd`/externals/llvm-external-projects/torch-mlir-dialects   -DMLIR_ENABLE_BINDINGS_PYTHON=OFF -DTORCH_MLIR_USE_INSTALLED_PYTORCH=OFF  -DLLVM_TARGETS_TO_BUILD=AArch64 -DLLVM_USE_HOST_TOOLS=ON externals/llvm-project/llvm && cmake --build build --target check-torch-mlir-all

should be that it passes and builds for ARM, but instead it fails trying to run the ARM mlir-tblgen on x86_64.

@powderluv (Collaborator, Author) commented:

OK, I did some digging into it:

-DLLVM_USE_HOST_TOOLS=ON does the right thing for OSX and builds the NATIVE tools. It doesn't do the right thing on the linux recreate I posted above. I think that is because it expects a toolchain cmake file like: https://github.com/llvm/llvm-project/blob/main/llvm/cmake/platforms/Android.cmake. We can revisit the linux_x86_64 --> linux_arm64 later and test it with the correct toolchain.cmake.
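
For reference, a minimal sketch of what such a toolchain file could look like for a linux_x86_64 --> linux_arm64 cross compile (illustrative only; the compiler names and find-root settings are assumptions and depend on the installed toolchain):

# hypothetical aarch64-linux.cmake, passed via -DCMAKE_TOOLCHAIN_FILE=...
set(CMAKE_SYSTEM_NAME Linux)
set(CMAKE_SYSTEM_PROCESSOR aarch64)
set(CMAKE_C_COMPILER aarch64-linux-gnu-gcc)
set(CMAKE_CXX_COMPILER aarch64-linux-gnu-g++)
# Search the target environment for libraries/headers, but keep host programs.
set(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)
set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)
set(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)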

On OSX (the original issue reported here): I think it just comes down to MLIR_TABLEGEN being exposed to downstream projects. Setting the CACHE will help if we set it for all tools (like PDLL etc). I will give it a try and if it is easy will post a PR.

@jpienaar (Member) commented Aug 7, 2022

There is also a patch under review that sets the install directories of these more correctly; it could be unrelated, but it seems many folks are hitting related pains here. Setting pdll in the cache SGTM.

@powderluv (Collaborator, Author) commented:

OK, more debug info:

When we run cmake, we set the correct values:

...
-- Setting MLIR_TABLEGEN_EXE to /Users/anush/github/torch-mlir/build/NATIVE/bin/mlir-tblgen                                  
-- Setting MLIR_TABLEGEN_TARGET to MLIR-tablegen-host                                                                        
-- Setting MLIR_PDLL_TABLEGEN_EXE to /Users/anush/github/torch-mlir/build/NATIVE/bin/mlir-pdll   

and then, when we are building the NATIVE tools, they get set again to:

-- Setting MLIR_TABLEGEN_EXE to mlir-tblgen                                                                                                                                                                                                               
-- Setting MLIR_TABLEGEN_TARGET to mlir-tblgen                                                                                                                                                                                                            
-- Setting MLIR_PDLL_TABLEGEN_EXE to mlir-pdll          

@stephenneuendorffer (Contributor) commented:

I've seen things like this happen with toolchain files, where information discovered by the outer build doesn't get passed to the inner build, including the toolchain file itself. As a result, the inner build can discover a different set of information than the outer build. This is an area where I think CMake recursion is very subtle and it's easy to shoot yourself in the foot. I've tended to lean toward making all information used by a sub-build explicit, but this can be challenging when there is a lot of information to be passed.
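
As a generic illustration of making sub-build inputs explicit (this is not how LLVM's NATIVE build is actually wired up; the target name and the HOST_* variables are made up for the example):

include(ExternalProject)
# Forward the host toolchain and any settings discovered by the outer build
# explicitly, so the inner (native) build cannot silently re-discover
# different values.
ExternalProject_Add(host-tools
  SOURCE_DIR ${CMAKE_SOURCE_DIR}
  CMAKE_ARGS
    -DCMAKE_C_COMPILER=${HOST_C_COMPILER}
    -DCMAKE_CXX_COMPILER=${HOST_CXX_COMPILER}
    -DCMAKE_BUILD_TYPE=Release
  INSTALL_COMMAND ""
)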

@marbre (Member) commented Aug 8, 2022

So, the expected outcome of

ubuntu:~/github/torch-mlir$ cmake -GNinja -Bbuild   -DCMAKE_BUILD_TYPE=Release   -DCMAKE_C_COMPILER=arm-linux-gnueabihf-gcc   -DCMAKE_CXX_COMPILER=arm-linux-gnueabihf-g++   -DPython3_FIND_VIRTUALENV=ONLY   -DLLVM_ENABLE_PROJECTS=mlir   -DLLVM_EXTERNAL_PROJECTS="torch-mlir;torch-mlir-dialects"   -DLLVM_EXTERNAL_TORCH_MLIR_SOURCE_DIR=`pwd`   -DLLVM_EXTERNAL_TORCH_MLIR_DIALECTS_SOURCE_DIR=`pwd`/externals/llvm-external-projects/torch-mlir-dialects   -DMLIR_ENABLE_BINDINGS_PYTHON=OFF -DTORCH_MLIR_USE_INSTALLED_PYTORCH=OFF  -DLLVM_TARGETS_TO_BUILD=AArch64 -DLLVM_USE_HOST_TOOLS=ON externals/llvm-project/llvm && cmake --build build --target check-torch-mlir-all

should be that it passes and builds for ARM, but instead it fails trying to run the ARM mlir-tblgen on x86_64.

  • Thanks for clarifying. First of all, I don't know if the target check-torch-mlir-all can ever pass when cross-compiling (at least not the way it is right now). The necessary torch-mlir-opt is cross-compiled for the target (from a quick look, a native version won't be built) and thus the tests cannot be executed. Anyway, we should be able to cross-compile those tools for the target :)

-DLLVM_USE_HOST_TOOLS=ON does the right thing for OSX and builds the NATIVE tools. It doesn't do the right thing on the linux recreate I posted above.

I think that is because it expects a toolchain cmake file like: https://github.com/llvm/llvm-project/blob/main/llvm/cmake/platforms/Android.cmake. We can revisit the linux_x86_64 --> linux_arm64 later and test it with the correct toolchain.cmake.

  • You don't need a toolchain file to cross-compile. You can pass all the necessary flags via the command line or pass them via a script. I think the failure is rather related to wrong build args passed via CMake, especially for Linux to Linux cross-compiling (with Apple clang this might be different due to flags like MACOSX_DEPLOYMENT_TARGET). It is really tricky to get this done right and the Linux reproducer definitely misses CMAKE_SYSTEM_NAME to do a correct cross-compile. As promised, I am looking further into this.

and then when we are building the NATIVE tools it gets set again to

-- Setting MLIR_TABLEGEN_EXE to mlir-tblgen                                                                                                                                                                                                               
-- Setting MLIR_TABLEGEN_TARGET to mlir-tblgen                                                                                                                                                                                                            
-- Setting MLIR_PDLL_TABLEGEN_EXE to mlir-pdll          

@powderluv (Collaborator, Author) commented:

-DLLVM_USE_HOST_TOOLS=ON does the right thing for OSX and builds the NATIVE tools. It doesn't do the right thing on the linux recreate I posted above.

So on my x86_64 macOS, building with -DCMAKE_OSX_ARCHITECTURES=arm64, if -DLLVM_USE_HOST_TOOLS=ON is not set it doesn't attempt build_native_tool(), so I have to set it explicitly. Maybe that is the root cause?

  • Further, I think it does the correct thing with D130350 applied. I played around with mlir-emitc and had success getting a native mlir-tblgen for x86_64 Linux and a cross-compiled one for Arm Linux.
    Unfortunately, I messed up my build script and cannot share exactly what I did last Friday. Trying to reproduce, and I will afterwards test with torch-mlir.

D130350 is a good start, but ideally we want to avoid pushing these into the cache, which has unintended consequences if we recompile / change flags etc. Ideally we keep exporting into PARENT_SCOPE until the top level has the variables we care about. That said, add_tablegen() itself is pushing stuff into the CACHE.

I think that is because it expects a toolchain cmake file like: https://github.com/llvm/llvm-project/blob/main/llvm/cmake/platforms/Android.cmake. We can revisit the linux_x86_64 --> linux_arm64 later and test it with the correct toolchain.cmake.

  • You don't need a toolchain file to cross-compile. You can pass all the necessary flags via the command line or pass them via a script. I think the failure is rather related to wrong build args passed via CMake, especially for Linux to Linux cross-compiling (with Apple clang this might be different due to flags like MACOSX_DEPLOYMENT_TARGET). It is really tricky to get this done right and the Linux reproducer definitely misses CMAKE_SYSTEM_NAME to do a correct cross-compile. As promised, I am looking further into this.

OK, thank you.

and then when we are building the NATIVE tools it gets set again to

-- Setting MLIR_TABLEGEN_EXE to mlir-tblgen                                                                                                                                                                                                               
-- Setting MLIR_TABLEGEN_TARGET to mlir-tblgen                                                                                                                                                                                                            
-- Setting MLIR_PDLL_TABLEGEN_EXE to mlir-pdll          

Yes, I removed it locally. I will send a PR for this now that D130350 has landed.

@ashay FYI, since you updated add_tablegen() recently.

@marbre (Member) commented Aug 9, 2022

So on my x86_64 macOS, building with -DCMAKE_OSX_ARCHITECTURES=arm64, if -DLLVM_USE_HOST_TOOLS=ON is not set it doesn't attempt build_native_tool(), so I have to set it explicitly. Maybe that is the root cause?

Honestly, IDK and I currently don't have a Mac available to test :/

D130350 is a good start, but ideally we want to avoid pushing these into the cache, which has unintended consequences if we recompile / change flags etc. Ideally we keep exporting into PARENT_SCOPE until the top level has the variables we care about. That said, add_tablegen() itself is pushing stuff into the CACHE.

Yeah, I am not really sure if we should go with a cache variable or with explicitly exporting to parent scopes. And yes, you're right, add_tablegen sets a cache variable for ${project}_TABLEGEN (here). However, it also pushes to the parent scope...
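
As a generic illustration of the two mechanisms being discussed (not code from LLVM; the path is a placeholder):

# Cache variable: stored in CMakeCache.txt, persists across configure runs,
# and is visible everywhere unless shadowed by a normal variable.
set(MLIR_TABLEGEN_EXE "/path/to/NATIVE/bin/mlir-tblgen" CACHE STRING "native mlir-tblgen")

# Parent-scope export: only hops one directory level up, so every intermediate
# CMakeLists.txt has to re-export it for the top level to see the value.
set(MLIR_TABLEGEN_EXE "${MLIR_TABLEGEN_EXE}" PARENT_SCOPE)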

Anyway, I was able to cross-compile torch-mlir-opt for Arm. This requires the patches #1196, #1197 and tensorflow/tensorflow#57054 (I locally modified the mlir-hlo submodule to make the build pass).

@powderluv (Collaborator, Author) commented:

What is your local command to cross-compile?

@powderluv (Collaborator, Author) commented:

Magically, I am able to cross-compile on the GHA macos-12 runner (#1204) but not on my iMac Pro. If this works on GHA, we can close this issue or make it low priority.

@marbre (Member) commented Aug 10, 2022

Let me try to get a working cross compile from x86_64 Linux to Arm Linux with Clang (I have used GCC so far). But I agree, we can then deprioritize.

@powderluv (Collaborator, Author) commented:

This is now fixed.
