An `onnx` runtime written in pure C99 with zero dependencies, focused on small embedded devices. Run inference on your machine learning models no matter which framework you trained them with and no matter the device you run them on.
01 Introduction • 02 Code Overview • 03 Testing • 04 Contributing • 05 Requirements
This repo contains a pure C99 runtime to run inference on `onnx` models. You can train your model with your favourite framework (TensorFlow, Keras, scikit-learn, you name it!) and, once trained, export it to a `.onnx` file that will be used to run inference. This makes the library totally framework agnostic: no matter how you train your model, this repo will run it through the common interface that `onnx` provides. This runtime is designed for embedded devices, which have few resources and might not be able to compile newer C++ versions, so the idea is to keep the dependencies to a minimum, or even zero. No GPUs or fancy processor architectures, just pure single-threaded C99 code, compatible with almost any embedded device. Let's allow our IoT devices to run inference on the edge, without sacrificing the tools that the big AI players in the industry provide.
Note that this project is at a very early stage, so it is not even close to being production ready. Developers are needed, so feel free to get in touch or contribute with a pull request. See Help Needed and doc for more information about how to contribute. So far we can run inference on the MNIST model to recognise handwritten digits.
Other C/C++ related projects
Project | Framework | Language | Size |
---|---|---|---|
onnxruntime | ONNX | x | x |
darknet | ? | x | x |
uTensor | TensorFlow | x | x |
nnom | Keras | x | x |
ELL | ELL | x | x |
TF Lite | TF Lite | x | x |
Check the `Makefile` inside `test`; it compiles the code and runs a bunch of test cases for the implemented operators plus the MNIST digit recognition model. Compiling the library into a static library is not done yet.
Note that this example won't work as it is. Some more work is needed.
```c
int main()
{
    /* Open the onnx model you want to use */
    Onnx__ModelProto *model = openOnnxFile("model.onnx");

    /* Populate and alloc memory for your inputs array */
    Onnx__TensorProto **inputs;

    /* Define the number of inputs you have set */
    int numOfInputs = 1;

    /* Run inference on the model with your inputs */
    Onnx__TensorProto **output = inference(model, inputs, numOfInputs);

    /* In output you will find an array of tensors with the outputs of each node */

    /* Free all resources */
    return 0;
}
```
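The snippet above leaves `inputs` unpopulated. Below is a minimal, hypothetical sketch of building it by hand for a single MNIST-style float tensor; the header name `onnx.pb-c.h`, the `onnx__tensor_proto__init()` call, the field names and the `"Input3"` tensor name are assumptions (standard protobuf-c naming plus the public MNIST model) and may differ in your setup.

```c
#include <stdlib.h>
#include "onnx.pb-c.h"   /* assumed name of the protobuf-c generated header */

/* Build an inputs array holding one float tensor shaped for MNIST (1x1x28x28). */
static Onnx__TensorProto **buildMnistInput(void)
{
    static int64_t dims[4] = {1, 1, 28, 28};
    static float pixels[1 * 1 * 28 * 28];      /* fill with your image data      */

    Onnx__TensorProto *t = malloc(sizeof(*t));
    onnx__tensor_proto__init(t);               /* standard protobuf-c init call  */
    t->name = "Input3";                        /* input name of the MNIST model  */
    t->n_dims = 4;
    t->dims = dims;
    t->data_type = ONNX__TENSOR_PROTO__DATA_TYPE__FLOAT;  /* assumed enum name   */
    t->n_float_data = sizeof(pixels) / sizeof(pixels[0]);
    t->float_data = pixels;

    Onnx__TensorProto **inputs = malloc(sizeof(*inputs));
    inputs[0] = t;
    return inputs;
}
```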
A simple command line interface is provided so you can easily use it from your terminal. Note that it is still at a very early stage.

Just compile it:
```
make build_cli
```
And use it. The first parameter is the model and the second is the input in `.pb` format. In the future it might support other input formats, such as images.
```
./connxr test/mnist/model.onnx test/mnist/test_data_set_0/input_0.pb
```
Note that so far only a few small models (on the order of MB) have been tested. If you try a huge model of hundreds of MB, or GB, weird things might happen. TODO
- MNIST: https://github.com/onnx/models/tree/master/vision/classification/mnist
- tiny YOLO v2: https://github.com/onnx/models/tree/master/vision/object_detection_segmentation/tiny_yolov2
- tiny YOLO v3: https://github.com/onnx/models/tree/master/vision/object_detection_segmentation/tiny_yolov3 TODO!
- https://lutzroeder.github.io/netron/ TODO
- Quantized MNIST. TODO. Using ONNX MNIST as baseline and quantizing it. Work ongoing
- Add super resolution model. TODO https://github.com/onnx/models/tree/master/vision/super_resolution/sub_pixel_cnn_2016
- Very few basic operators are implemented, so a model that contains a not-yet-implemented operator will fail. See them inside the `operators` folder.
- The only end-to-end tested model so far is the MNIST one, for handwritten digit recognition.
- Each operator can work with many data types (double, float, int16, int32), but only a few of them are implemented.
- `has_raw_data` is not supported. A `TensorProto` is assumed to store its data in one of the typed arrays (int, float, ...) and not in `raw_data`. See the sketch after this list.
- So far memory management is a mess, so you will find a memory leak for sure.
- There are some hardcodings here and there.
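As a hedged illustration of the `raw_data` limitation above, here is a minimal sketch of a guard an application could add before calling `inference()`. The header name `onnx.pb-c.h` and the exact generated field names are assumptions based on the usual protobuf-c output and may differ in this repo.

```c
#include <stdbool.h>
#include <stdio.h>
#include "onnx.pb-c.h"   /* assumed name of the protobuf-c generated header */

/* Returns false for tensors that keep their payload in raw_data, which this
 * runtime does not read yet; only the typed arrays (float_data, int32_data,
 * int64_data, ...) are used. */
static bool tensor_is_supported(const Onnx__TensorProto *t)
{
    if (t->has_raw_data) {
        fprintf(stderr, "tensor '%s' stores its data in raw_data, not supported yet\n",
                t->name ? t->name : "(unnamed)");
        return false;
    }
    return true;
}
```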
- Integrate onnx backend testing
- Implement all operators contained in MNIST model
- Run end to end tests for MNIST model
- Implement a significant number of onnx operators, the most common ones first
- Compile and deploy a model (e.g. MNIST) on a real embedded device
- Set up a nice CI with Azure or GitHub Actions
- Run profiling on the operators
- Migrate to nanopb to reduce the size of the pb files
- Run memory check and leak detection (Valgrind?)
- Add more tests beyond the onnx backend ones, which are not sufficient
- Create a nice Makefile and compile the library as a static library to be linked against
- Try different compilers
- Enable extra gcc options (-pedantic, -Wall, etc.)
- Implement some "Int" operators and fixed-point arithmetic (see the sketch after this list)
- Create and run a quantized variation of the MNIST model
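The fixed-point item above is sketched below purely as an illustration (nothing like it exists in the repo yet): a saturating Q15 multiply is the kind of primitive that integer operator implementations could be built on.

```c
#include <stdint.h>
#include <stdio.h>

/* Saturating Q15 multiply: 1 sign bit, 15 fractional bits. */
static int16_t q15_mul(int16_t a, int16_t b)
{
    int32_t p = (int32_t)a * (int32_t)b;   /* Q30 intermediate          */
    p = (p + (1 << 14)) >> 15;             /* round and rescale to Q15  */
    if (p > INT16_MAX) p = INT16_MAX;      /* saturate instead of wrap  */
    if (p < INT16_MIN) p = INT16_MIN;
    return (int16_t)p;
}

int main(void)
{
    /* 0.5 * 0.25 = 0.125, i.e. 16384 * 8192 -> 4096 in Q15 */
    printf("%d\n", q15_mul(16384, 8192));
    return 0;
}
```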
This project is not associated in any way with ONNX; it is not an official solution nor officially supported by ONNX. It is just an application built on top of the `.onnx` format that aims to help people who want to run inference on devices not supported by the official runtimes. Use at your own risk.
TODO