AKG(Auto Kernel Generator) is an optimizer for operators in Deep Learning Networks. It provides the ability to automatically fuse ops with specific patterns. AKG works with LuoJiaNET-GraphKernel to improve the performance of networks running on different hardware backends.
AKG composes with three basic optimization module, normalization, auto schedule and backend optimization.
-
normalization. In order to solve the limitation in expression ability of polyhedral(which can only process static linear programs), the computation IR needs to be normalized first. The mainly optimization of normalization module includes auto-inline, loop fusing, common subexpression elimination and so on.
-
auto schedule. Base on polyhedral technology, the auto schedule module mainly have auto-vectorization, auto-tiling, thread/block mapping, dependency analysis and memory promotion.
-
backend optimization. The backend optimization module mainly consists of TensorCore acceleration, double buffer optimization, storage flatten optimization and inject sync optimization.
At present, Ascend910
, NVIDIA V100/A100
and CPU
are supported. More Backends are on the list.
See LuoJiaNET README.md for details.
We suggest you build and run akg together with LuoJiaNET. And we also provide a way to run case in standalone mode for convenience sake. Refer to LuoJiaNET Installation for more information about compilation dependencies.
-
Build on Ascend910
git-lfs needs to be installed before cloning the source codes.
git clone https://gitee.com/luojianet/akg.git cd akg bash build.sh -e ascend -j8
-
Build on GPU
git clone https://gitee.com/luojianet/akg.git cd akg bash build.sh -e gpu -j8
-
Build on CPU
git clone https://gitee.com/luojianet/akg.git cd akg bash build.sh -e cpu -j8
- Set Environment
-
Ascend910
cd tests source ./test_env.sh
-
NVIDIA V100/A100
cd tests source ./test_env.sh gpu
-
CPU V100/A100
cd tests source ./test_env.sh cpu
- Run test
- Use script:
cd tests/st
python run.py -e gpu -o add -l level0 # run add operator on GPU
Detailed instructions see:python run.py -h
-
Use specific case:
- Ascend910
cd tests/st/ops/ pytest -s test_abs.py -m "level0 and platform_x86_ascend_training" # run level0 testcases on Ascend
- NVIDIA V100/A100
cd tests/st/ops/ pytest -s test_abs.py -m "level0 and platform_x86_gpu_training" # run level0 testcases on GPU
- CPU
cd tests/st/ops/ pytest -s test_abs.py -m "level0 and platform_x86_cpu" # run level0 testcases on CPU
See Wiki.
Welcome contributions. See LuoJiaNET Contributor Wiki for more details.
The release notes, see our RELEASE.