This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[MXNET-533] MXNet-ONNX export #11213

Merged
merged 83 commits into from
Jun 25, 2018
Merged

Conversation

Roshrini
Copy link
Member

@Roshrini Roshrini commented Jun 8, 2018

Description

This PR adds MXNet-to-ONNX exporter APIs for exporting trained MXNet models to ONNX protobuf, so that those models can be imported into other frameworks for inference.

Test framework:
Currently, we import ONNX models into MXNet, export them back to ONNX, and import them into MXNet again to verify that inference results match.

Working models:

  • Alexnet, Densenet, resnet50, squeezenet, vgg16, vgg19, inception_v1, inception_v2, Googlenet, caffenet, R-CNN

@spidydev @anirudhacharya @piiswrong @sandeep-krishnamurthy @nswamy @anirudh2290

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • Feature1, tests, (and when applicable, API doc)
  • Feature2, tests, (and when applicable, API doc)

@Roshrini Roshrini requested a review from szha as a code owner June 8, 2018 23:05
@szha szha requested review from piiswrong and zhreshold and removed request for szha June 11, 2018 18:11
op = str(node["op"])
if op not in MXNetGraph.registry_:
raise AttributeError("No conversion function registered for op type %s yet." % op)
convert_fun = MXNetGraph.registry_[op]
Member

change the name to convert_func


Member Author

fixed
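
The lookup in the snippet above relies on a decorator-based registry. A minimal, self-contained sketch of that pattern (the class body and example op name here are simplified illustrations, not the PR's exact code):

```python
class MXNetGraph:
    """Minimal sketch of the converter registry (simplified)."""
    registry_ = {}

    @staticmethod
    def register(op_name):
        """Decorator that maps an MXNet op name to its conversion function."""
        def wrapper(func):
            MXNetGraph.registry_[op_name] = func
            return func
        return wrapper

# hypothetical converter, for illustration only
@MXNetGraph.register("Activation")
def convert_activation(node, **kwargs):
    return "Activation converted"

# lookup mirrors the reviewed snippet
op = "Activation"
if op not in MXNetGraph.registry_:
    raise AttributeError("No conversion function registered for op type %s yet." % op)
convert_func = MXNetGraph.registry_[op]
```

Registering converters by name keeps the graph walker generic: adding a new operator means adding one decorated function, with no change to the traversal code.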

Contributor

@sandeep-krishnamurthy sandeep-krishnamurthy left a comment

Thanks for the awesome work @Roshrini @spidydev @anirudhacharya

Some comments below.

import mxnet as mx

def load_module(json_path, params_path, input_shape):
"""Loads the MXNet model file, retrieves symbol and parameters and returns.
Contributor

nit: and returns MXNet symbol and params (weights).

Member Author

Done

import logging
import mxnet as mx

def load_module(json_path, params_path, input_shape):
Contributor

nit: json_path is too generic a name for this function and will be hard to maintain later. Can we be more specific? sym_filepath, params_filepath, or something like that?

Member Author

Done

Model weights including both arg and aux params.
"""
if not (os.path.isfile(json_path) and os.path.isfile(params_path)):
raise ValueError("Provide valid path to the json and params file")
Contributor

nit: It is always useful to have specific Error/Warnings message on what is wrong and why.

Member Author

Done
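
A sketch of the more specific validation the reviewer asked for, raising a distinct error per missing file (the helper name `validate_paths` is hypothetical, not from the PR):

```python
import os

def validate_paths(sym_filepath, params_filepath):
    """Raise a specific error per missing file instead of one generic message."""
    if not os.path.isfile(sym_filepath):
        raise ValueError("Symbol file does not exist: %s" % sym_filepath)
    if not os.path.isfile(params_filepath):
        raise ValueError("Params file does not exist: %s" % params_filepath)
```

Naming the exact missing file in the message tells the user immediately which of the two paths to fix.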

raise ValueError("Provide valid path to the json and params file")
else:
try:
model_name = json_path.rsplit('.', 1)[0].rsplit('-', 1)[0]
Contributor

nit: I understand this logic reads symbol and epochs from sym.json file. But, please add code comment for this logic for future bug fixes.

Member Author

Done

model_name = json_path.rsplit('.', 1)[0].rsplit('-', 1)[0]
num_epochs = int(params_path.rsplit('.', 1)[0].rsplit('-', 1)[1])
except IndexError:
logging.info("Model and params name should be in format: "
Contributor

Is epoch necessary? Only for retraining the loaded model?
As a standard, a saved model need not have an epoch number; it is probably only necessary for checkpoint models, though MXNet mandates it today. But if we introduce a new API to save models without epochs attached, do we have any issue here?

Member Author

Keeping the epoch at 0 if it is not provided with the model name
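
The filename-parsing logic under discussion can be sketched as follows, with the epoch defaulting to 0 as described (the helper name `parse_checkpoint_paths` is hypothetical):

```python
def parse_checkpoint_paths(sym_filepath, params_filepath):
    """Derive the checkpoint prefix and epoch from MXNet file names.

    MXNet checkpoints follow "<prefix>-symbol.json" / "<prefix>-<epoch>.params".
    """
    # "resnet-symbol.json" -> "resnet-symbol" -> "resnet"
    model_name = sym_filepath.rsplit('.', 1)[0].rsplit('-', 1)[0]
    try:
        # "resnet-0007.params" -> "resnet-0007" -> 7
        num_epochs = int(params_filepath.rsplit('.', 1)[0].rsplit('-', 1)[1])
    except (IndexError, ValueError):
        num_epochs = 0  # default when no epoch suffix is present
    return model_name, num_epochs
```

With this fallback, a params file saved without an epoch suffix still loads instead of raising.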

name=name,
epsilon=eps,
momentum=momentum,
spatial=1
Contributor

always 1?

Contributor

MXNet doesn't have spatial BatchNorm, so this should actually be set to 0. While importing an ONNX model we ignore this attribute, but it might be an issue when exporting to caffe2 or other frameworks that support spatial BN. Thanks for pointing this out.

# Creating a dictionary here, but if this titlecase pattern
# mxnet_name.title()
act_types = {
"tanh": "Tanh",
Contributor

only tanh and relu supported?

Contributor

Done !!
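
The commit history notes that softsign, sigmoid, and softrelu support was added in response to this comment. A sketch of the resulting mapping; treat the exact table as an assumption rather than the PR's code (softrelu pairs with ONNX Softplus since MXNet's softrelu is log(1 + exp(x))):

```python
# Sketch of the MXNet -> ONNX activation name mapping after the fix.
act_types = {
    "tanh": "Tanh",
    "relu": "Relu",
    "sigmoid": "Sigmoid",
    "softrelu": "Softplus",
    "softsign": "Softsign",
}
```

An explicit dict is used rather than `mxnet_name.title()` because not every pair is a pure titlecase transform (e.g. softrelu).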

onnx_pad_width = [0]*num_pad_values

start_index = 0
end_index = int(num_pad_values/2)
Contributor

nit: floor?

Contributor

MXNet pad values in the pad op (https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.pad) are always a multiple of two. Will add a comment to clarify.
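
The conversion the snippet above sets up can be sketched in full: MXNet interleaves (begin, end) pad pairs per axis, while ONNX expects all begins followed by all ends (the helper name is hypothetical):

```python
def mxnet_pad_to_onnx_pads(mxnet_pad_width):
    """Reorder MXNet's interleaved (begin, end) pad pairs into ONNX layout.

    MXNet: (b0, e0, b1, e1, ...); ONNX: [b0, b1, ..., e0, e1, ...].
    """
    num_pad_values = len(mxnet_pad_width)
    onnx_pad_width = [0] * num_pad_values
    end_index = num_pad_values // 2  # pad values always come in pairs
    for i, val in enumerate(mxnet_pad_width):
        if i % 2 == 0:
            onnx_pad_width[i // 2] = val              # begin values
        else:
            onnx_pad_width[end_index + i // 2] = val  # end values
    return onnx_pad_width
```

Because the pad count is always even, the `// 2` split is exact and the reviewer's floor question is moot.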



@mx_op.register("slice_axis")
def convert_slice_axis(node, **kwargs):
Contributor

just slice operator?

Contributor

The Slice operator will be added later; it is not used by any of the models tested yet :)

@@ -114,7 +114,7 @@ def maximum(attrs, inputs, proto_obj):
for op_input in inputs[2:]:
mxnet_op = symbol.maximum(mxnet_op, op_input)
else:
mxnet_op = inputs[0]
mxnet_op = symbol.maximum(inputs[0], inputs[0])
Contributor

maximum of same element?

Member Author

Yes, ONNX has a case where, if there is only one input, it returns that input itself as the output. MXNet always needs two inputs.
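
The fix in the diff above can be sketched end to end; here Python's built-in `max` stands in for `mxnet.symbol.maximum`, so this is an illustration of the control flow rather than the PR's code:

```python
def convert_maximum(inputs, mx_maximum=max):
    """Fold ONNX's variadic Max into MXNet's two-input maximum.

    `mx_maximum` is a stand-in for mxnet.symbol.maximum in this sketch.
    """
    if len(inputs) > 1:
        mxnet_op = mx_maximum(inputs[0], inputs[1])
        for op_input in inputs[2:]:
            mxnet_op = mx_maximum(mxnet_op, op_input)
    else:
        # ONNX allows a single input (identity); MXNet always needs two
        # operands, so pass the same input twice.
        mxnet_op = mx_maximum(inputs[0], inputs[0])
    return mxnet_op
```

`maximum(x, x) == x`, so the single-input branch reproduces ONNX's identity behavior exactly.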

import logging
import mxnet as mx

def load_module(sym_filepath, params_filepath, input_shape):
Contributor

Question: what is the purpose of this function? Why couldn't it be replaced by a simple:

sym = mx.sym.load(sym_filepath)
params = mx.nd.load(params_filepath)
return sym, params

Member Author

sym.load and nd.load work for getting model and params objects from files, but if the model was trained using an old version of MXNet, they won't upgrade the model, so there is a compatibility issue.
For example, some models have "param" or "attr" instead of "attrs" in the json file.

Contributor

Nice corner case that is hard to think of 👍

model : str or symbol object
Path to the json file or Symbol object
weights : str or symbol object
Path to the params file or Params object. (Including both arg_params and aux_params)
Contributor

is it a dictionary of Parameters or something else?

Contributor

It can be both; changed the description to be more explicit.

from .export_helper import load_module


def export_model(model, weights, input_shape, input_type=np.float32,
Contributor

weights -> params, to be consistent with the rest of the file

Contributor

changed.

return dict([(k.replace("arg:", "").replace("aux:", ""), v.asnumpy())
for k, v in weights_dict.items()])

def create_onnx_graph_proto(self, sym, params, in_shape, in_type, log=False):
Member

verbose=False

Contributor

Done
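
The weights-conversion snippet quoted earlier strips the `arg:`/`aux:` key prefixes. A pure-Python sketch of that step; the real converter also calls `.asnumpy()` on each value, which is omitted here so the sketch has no MXNet dependency:

```python
def strip_param_prefixes(weights_dict):
    """Strip the "arg:"/"aux:" prefixes MXNet writes into saved .params keys."""
    return {k.replace("arg:", "").replace("aux:", ""): v
            for k, v in weights_dict.items()}
```

After this step the keys match the node names in the symbol graph, so parameters can be looked up directly during conversion.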

@@ -18,3 +18,4 @@

from ._import.import_model import import_model, get_model_metadata
Member

this does not make sense to me.
why do you want to put public functions in a private module folder (_import or _export) and include them later?

Member

import is a reserved keyword, so we can't have a folder called import. We can probably rename the two folders to onnx_import and onnx_export and make their member files private, except for the modules that we are exposing to the user.

Contributor

Folder names changed to be public: _import --> onnx2mx, _export --> mx2onnx. Also changed the files in the folders as per their usage.

from .export_helper import load_module


def export_model(model, weights, input_shape, input_type=np.float32,
Member

use verbose=False

Contributor

done

# create module, passing cpu context
ctx = context.cpu()
test_mod = mod.Module(symbol=sym, data_names=data_names, context=ctx, label_names=None)
test_mod.bind(for_training=False, data_shapes=data_shapes, label_shapes=None)
Member

label_shapes may not always be None?

Contributor

True, but the purpose of this function is just to get the shape of the output after a forward pass.

self.output_tensors = []

@staticmethod
def register(op_name):
Member

there is no doc for inputs and outputs throughout the static methods in this class

Contributor

Usually, detailed info is only added for public APIs.

op = node["op"]
name = node["name"]
if log:
print("Converting idx: %d, op: %s, name: %s" % (idx, op, name))
Member

use logging.xx

Contributor

done
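
The switch from `print` to `logging` that this thread settles on can be sketched as follows (the logger name and helper are illustrative assumptions, not the PR's code):

```python
import logging

logger = logging.getLogger("mx2onnx")  # hypothetical logger name

def log_conversion(idx, op, name, verbose=False):
    """Report conversion progress via logging rather than print()."""
    msg = "Converting idx: %d, op: %s, name: %s" % (idx, op, name)
    if verbose:
        logger.info(msg)
    return msg
```

Routing through `logging` lets callers control verbosity and destination globally instead of sprinkling `if log: print(...)` checks.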

if log:
print("Converting idx: %d, op: %s, name: %s" % (idx, op, name))

if op == "null" and name not in params:
Member

Agreed, the logic is a bit confusing here; better to simplify it.



@classmethod
def prepare(cls, model, device='CPU', **kwargs):
Member

why not directly use mx.cpu()? Also, 'CPU' is in capital letters without careful handling.

Contributor

This method is declared by ONNX. Backends using the ONNX test framework derive from the "backend" class and implement these functions.

@@ -0,0 +1,98 @@
# Licensed to the Apache Software Foundation (ASF) under one
Member

It seems to have been added already, but why the name python-pytest?
There's already a folder tests/python/unittest

Contributor

MXNet unit tests use nosetests, but the ONNX backend test framework uses pytest, so to keep them separate we created another folder for the pytest tests.

Member

just remove the python-pytest folder and use onnx; the name is pretty confusing and meaningless for a folder containing only onnx.

Member

in the future, if there is another component built into MXNet that uses pytest instead of nosetests, then what will we do?

The point of naming it python-pytest is to separate these tests from the others, which use the nosetests framework.

And this naming was part of a previous PR #9963 and was suggested by @marcoabreu during the review process.

params.update(arg_params)
params.update(aux_params)

onnx_file = model_path.rsplit('/', 1)[0] + "/exported_"+model_name+".onnx"
Member

using + to concatenate paths is not portable

Contributor

fixed.
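
The portable fix can be sketched with `os.path.join`, which handles separators per platform (the helper name is hypothetical):

```python
import os

def exported_onnx_path(model_path, model_name):
    """Build the output path portably instead of concatenating with '+'."""
    return os.path.join(os.path.dirname(model_path),
                        "exported_" + model_name + ".onnx")
```

Unlike `model_path.rsplit('/', 1)[0] + "/..."`, this works with backslash separators on Windows as well.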

@rajanksin rajanksin force-pushed the onnx_export branch 7 times, most recently from f28076a to 9922a00 Compare June 14, 2018 15:24
Roshrini and others added 15 commits June 14, 2018 11:54
2. Refactored test framework to support ONNX backend tests.
2. Added Operator support:
   - Convolution2D
   - BatchNorm
   - Add
- Add, Sub, Mul, Div, Sum
- sigmoid, relu, pad( constant, edge, reflect), tanh
- enabled corresponding ONNX backend tests.
Added Operators :
Ceil, Floor
MaxPool, AvgPool, GlobalMaxPool, GlobalAvgPool, matmul
ArgMax, ArgMin, maximum, minimum
…dded only for these.

Fixed logic error with convert_string_to_list()
@rajanksin rajanksin force-pushed the onnx_export branch 2 times, most recently from a723644 to 43788cf Compare June 14, 2018 21:55
Changed underline files public or private as per usage

Resolved conflicts with the latest
Added some error checking
@Roshrini
Member Author

@aaronmarkham Can you review docs part of this PR?

@Roshrini
Member Author

@sandeep-krishnamurthy @zhreshold Thank you for reviewing the code. Addressed all the comments now.

Contributor

@sandeep-krishnamurthy sandeep-krishnamurthy left a comment

Thank you @Roshrini @spidydev. Great work! Will be very useful for users in combination with ONNX model zoo.

Will wait for other reviewers approval and doc approval.
@zhreshold @aaronmarkham @ThomasDelteil

@sandeep-krishnamurthy
Contributor

LGTM. Merging the changes.

@sandeep-krishnamurthy sandeep-krishnamurthy merged commit 7d91602 into apache:master Jun 25, 2018
@szha
Member

szha commented Jun 28, 2018

@sandeep-krishnamurthy There are vetos in effect from @zhreshold. Per agreement among committers you are not supposed to merge it.

@szha
Member

szha commented Jun 28, 2018

@zhreshold could you take a look at this change again and see if your concerns are sufficiently addressed?

@zhreshold
Member

Sorry about the late update; there's one minor issue that needs to be addressed.

@zhreshold
Member

Since the conversation is really long, I might have missed some updates, please ping me directly if I am not responsive. Thanks!

@anirudhacharya
Member

@zhreshold also please let me know how to ping you directly; do I do it on the Slack channel?

@szha
Member

szha commented Jun 28, 2018

@zhreshold if this is a small issue to address then let's request a patch from the author. Could you create an issue?

@marcoabreu
Contributor

Since there is so much conversation in this thread, could you please list the open issues?

@sandeep-krishnamurthy
Contributor

@szha - I explicitly pinged all the reviewers and waited for 6 days before merging the PR. I also tried, to the best of my ability, to confirm with contributors that the changes suggested by other reviewers were addressed before merging the PR.

@zhreshold
Member

I opened a new issue regarding my concerns in #11475

@szha
Member

szha commented Jun 28, 2018

@sandeep-krishnamurthy thanks for the efforts. Please respect "request changes" as vetos nonetheless and try and reach @zhreshold, especially given that you sit in the same office. Much appreciated.

XinYao1994 pushed a commit to XinYao1994/incubator-mxnet that referenced this pull request Aug 29, 2018
* Resolve conflicts

* Export module Test Framework

* refactoring export to work with pretrained models

* comments added

* 1. Refactored export module.
2. Refactored test framework to support ONNX backend tests.
2. Added Operator support:
   - Convolution2D
   - BatchNorm
   - Add

* Added Arithmetic operators:
- Add, Sub, Mul, Div, Sum

* Added operator support:
- sigmoid, relu, pad( constant, edge, reflect), tanh
- enabled corresponding ONNX backend tests.

* Enabled ONNX tests: test_conv, test_basic_conv

Added Operators :
Ceil, Floor

* Added support for:
MaxPool, AvgPool, GlobalMaxPool, GlobalAvgPool, matmul

* adding more operators

* Added Operator support:
ArgMax, ArgMin, maximum, minimum

* Enabled more BASIC_MODEL tests

* Added power operator tests

* Added support for reshape. ONNX only supports 0, -1  special values. Added only for these.
Fixed logic error with convert_string_to_list()

* some tests enabled

* enabling squeezenet

* LRN Op support

* mul_scalar modified to take scalar input

* cleaning some code

* Resolving conlicts on rebase

* Resolving rebase conflicts

* id mapping updated for all operators

* save onnx models added, some code cleanup

* enabled more tests

* conv pad calc fixed

* reshape op fix

* Added support for elu, leakyRelu, prelu

* Cleanup
- Removed run_node, not needed anymore.
- Used correct get_metadata api

* valueinfoproto fix, googlenet test added

* Removed redundant code.
- run_node
- Using correct get_metadata_api

* dilation added

* Lint fixes

* lint fixes

* some fixes to make export work with onx1.2.1

* enabled more tests

* mxnet_export_test file added

* duplicate file deleted

* reduce ops added

* some small fixes

* some lint fixes

* Add tests for inception_v1 and inception_v2

* Add CI runs for export module

* docstring added

* lint fixes, pooling attr fix

* fix

* fix global_pool

* CI  run fix

* code cleanup

* lint fix

* some code cleanup

* pad in pooling added

* slicechannel notimplementederror raised

* Added required license comments

* Lint fixes

* lint fix

* lint fix

* lint fix

* lint fix

* Correct license statement

* Adding onnx a runtime dependency

* Fix import module error for string_types

* Making ONNX runtime dependency

* fixing some comments

* addressing some comments

* params rename

* lint fixes

* fixes

* spatial disabled, path fixed

* fixing some comments

* Added support for remaining act_type(softsign, sigmoid, softrelu) in Activation operator

* changing import

* adding some comments

* Add squeeze op

* Refactored logic to handle extra node(output label node) for saved mxnet model
Added comments

* minor fix for squeeze operator.
Also, added error handling

* identity operator added

* scalar ops added

* Renamed onnx support folders to mark it public folders
Changed underline files public or private as per usage

Resolved conflicts with the latest

* Added support L2Normalization op
Added some error checking

* added comments and warning

* added comments and warning

* doc API ref added
@ciyongch ciyongch mentioned this pull request Jun 4, 2020