Copyright (c) 2019 Micron Technology, Inc. All Rights Reserved. This source code contains confidential information and trade secrets of Micron Technology, Inc. Use, disclosure, or reproduction is prohibited without the prior express written permission of Micron Technology, Inc.
This folder contains an example implementation of an MDLA backend for PyTorch.
The content here follows the PyTorch JIT compiler tutorial and the torch tvm example.
Other useful tutorial links are:
Download pybind11 from here and put the pybind11 folder in this folder.
Install pytorch from source
Use the release tag v1.4.0
Build and install it using develop option as mentioned here
python3 setup.py develop --prefix=~/.local
You will need libmicrondla installed.
Add api.h, thvector.h and thnets.h into the src folder.
Create a build folder and build torchMDLA using build.sh:
mkdir build
./build.sh
test.py contains a simple test that runs a convolution using torchMDLA.
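As a rough sketch of the shape such a test takes (the torchMDLA module name and the fact that importing it registers the fusion pass are assumptions, so that line is commented out and the snippet runs with plain PyTorch):

```python
import torch
import torch.nn as nn

# import torchMDLA  # hypothetical: importing the extension would register the mdla fusion pass

class SimpleConv(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv(x)

model = SimpleConv().eval()
x = torch.randn(1, 3, 32, 32)

# Trace the model; with torchMDLA loaded, the fusion pass would rewrite
# supported nodes into an mdla::CompilationGroup subgraph.
traced = torch.jit.trace(model, x)

# The traced module must match the eager result.
print(torch.allclose(traced(x), model(x), atol=1e-6))
```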
The idea is to:
- create a PyTorch model in Python
- have TorchScript trace/script this model; tracing/scripting runs the model and records a graph of its operations
- add a pass that labels parts of this graph as a custom operation, to be run differently
- define how to run this newly labeled subgraph
The source code is in the src folder.
src
├── compiler.cpp
├── compiler.h
├── fusion_pass.cpp
├── fusion_pass.h
└── register.cpp
register.cpp does the main work: it registers the custom pass, registers the custom operation, and creates the Python module.
fusion_pass.h and fusion_pass.cpp implement the custom pass that groups supported operations together
into a subgraph, i.e. a single custom operation. Operations are identified by a Symbol (an interned string) such as aten::conv2d.
The Symbol we gave to our custom operation is mdla::CompilationGroup.
When torch.jit.trace or graph_for is called from Python, this pass is invoked and runs over the entire graph.
compiler.h and compiler.cpp implement the custom operation. They provide a run function that is called
whenever the custom operation is encountered in the graph. The run function receives a subgraph containing the operations (Nodes)
of the custom operation mdla::CompilationGroup. It also receives the input tensors through a Stack,
and the output of the subgraph is returned through this same Stack.
In the run function, compilation and execution of mdla::CompilationGroup are implemented using the microndla API functions.
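In Python terms, the Stack calling convention amounts to popping the inputs off a stack and pushing the outputs back on. This is a sketch of the contract only (the real run function is C++ and executes the compiled subgraph through the microndla API; here a plain convolution stands in for the subgraph):

```python
import torch

def run_compilation_group(stack, num_inputs):
    """Illustrates the Stack contract: pop the inputs, execute the
    subgraph, and push the outputs back onto the same stack."""
    inputs = [stack.pop() for _ in range(num_inputs)]
    inputs.reverse()  # values come off the stack in reverse order
    # The real implementation would run the compiled MDLA subgraph here;
    # a conv stands in for the subgraph in this sketch.
    out = torch.conv2d(inputs[0], inputs[1])
    stack.append(out)
    return stack

x = torch.randn(1, 3, 8, 8)
w = torch.randn(4, 3, 3, 3)
stack = [x, w]
run_compilation_group(stack, 2)
print(stack[0].shape)
```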
The pybind11 folder contains the pybind11 code used to create a Python module from C++ code.