This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[Numpy] The symbolic export of BatchNorm is wrong #18373

Closed
sxjscience opened this issue May 20, 2020 · 5 comments · Fixed by #18377

@sxjscience (Member):

import mxnet as mx
import json
import pprint
mx.npx.set_np()
net = mx.gluon.nn.BatchNorm(epsilon=2E-5, axis=2)
net.hybridize()
net.initialize()
a = net(mx.np.ones((10, 3, 5, 5)))
net.export('bnorm', 0)
with open('bnorm-symbol.json') as f:
   dat = json.load(f)
   pprint.pprint(dat)

Output:

           {'attrs': {'__profiler_scope__': 'batchnorm0:',
                      'axis': '1',
                      'eps': '1e-05',
                      'fix_gamma': 'False',
                      'momentum': '0.9',
                      'use_global_stats': 'False'},
            'inputs': [[0, 0, 0], [1, 0, 0], [2, 0, 0], [3, 0, 1], [4, 0, 1]],
            'name': 'batchnorm0_fwd',
            'op': 'BatchNorm'}]}

We can see that eps and axis are not stored: the exported symbol records axis='1' and eps='1e-05' instead of the configured axis=2 and epsilon=2e-5.
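As a quick sanity check (a sketch added here, not part of the original report), one can compare the attrs recorded in the exported symbol JSON against the values the layer was constructed with. The helper below is hypothetical and operates on a JSON string shaped like the excerpt above; the enclosing 'nodes' list is assumed from the MXNet symbol JSON layout:

```python
import json

def check_batchnorm_attrs(symbol_json, expected_axis, expected_eps):
    """Return a list of mismatches between the exported BatchNorm attrs
    and the values the layer was constructed with."""
    dat = json.loads(symbol_json)
    problems = []
    for node in dat.get('nodes', []):
        if node.get('op') != 'BatchNorm':
            continue
        attrs = node.get('attrs', {})
        if int(attrs.get('axis', 1)) != expected_axis:
            problems.append("axis: exported %s, expected %s"
                            % (attrs.get('axis'), expected_axis))
        if abs(float(attrs.get('eps', 1e-5)) - expected_eps) > 1e-12:
            problems.append("eps: exported %s, expected %s"
                            % (attrs.get('eps'), expected_eps))
    return problems

# Attrs transcribed from the buggy export shown above; the 'nodes'
# wrapper is assumed, since the excerpt shows only one node.
buggy = json.dumps({'nodes': [{'op': 'BatchNorm',
                               'name': 'batchnorm0_fwd',
                               'attrs': {'axis': '1', 'eps': '1e-05'}}]})

# Reports two mismatches: the layer was built with axis=2, epsilon=2e-5.
print(check_batchnorm_attrs(buggy, expected_axis=2, expected_eps=2e-5))
```

Running such a check after `net.export(...)` would have caught this silently-wrong export.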

@sxjscience (Member, Author):

I find that this issue is not limited to numpy; it also exists in ndarray:

import mxnet as mx
import json
import pprint
#mx.npx.set_np()
net = mx.gluon.nn.BatchNorm(epsilon=2E-5, axis=2)
net.hybridize()
net.initialize()
a = net(mx.nd.ones((10, 3, 5, 5)))
net.export('bnorm', 0)
with open('bnorm-symbol.json') as f:
   dat = json.load(f)
   pprint.pprint(dat)

Output:

           {'attrs': {'__profiler_scope__': 'batchnorm0:',
                      'axis': '1',
                      'eps': '1e-05',
                      'fix_gamma': 'False',
                      'momentum': '0.9',
                      'use_global_stats': 'False'},
            'inputs': [[0, 0, 0], [1, 0, 0], [2, 0, 0], [3, 0, 1], [4, 0, 1]],
            'name': 'batchnorm0_fwd',
            'op': 'BatchNorm'}]}

@wkcn (Member) commented Jun 5, 2020:

Hi @sxjscience, is it possible to delete the pre-built pip packages impacted by this issue?

BatchNorm is used almost everywhere, and this bug does not raise any exception. Users may install a previous version of MXNet containing this bug and find that their accuracy drops.

@sxjscience (Member, Author):

@wkcn Yes, this is a disaster for the users. However, deleting the pre-built pip packages is also not a good option, because some users do not use BatchNorm. We will need to ensure that the official 1.7 release does not contain this bug.

@szha (Member) commented Jun 5, 2020:

cc @ciyongch

@ciyongch (Contributor) commented Jun 5, 2020:

Hi @szha, v1.7.x doesn't include PR #17679 (it's a new feature added after the code freeze), so this issue does not exist on that branch. For the v1.x branch, the fix has already been cherry-picked.
I just checked the latest commit of both the v1.7.x and v1.x branches with the above reproducer, and both work well. So no action is needed for this case.

           {'attrs': {'axis': '2',
                      'eps': '2e-05',
                      'fix_gamma': 'False',
                      'momentum': '0.9',
                      'use_global_stats': 'False'},
            'inputs': [[0, 0, 0], [1, 0, 0], [2, 0, 0], [3, 0, 1], [4, 0, 1]],
            'name': 'batchnorm0_fwd',
            'op': 'BatchNorm'}]}
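The fixed export above can also be checked mechanically. A minimal sketch (the attrs are transcribed from the excerpt above; the enclosing 'nodes' wrapper is an assumption about the full symbol JSON):

```python
import json

# Attrs transcribed from the fixed export shown above; the enclosing
# {'nodes': [...]} wrapper is assumed from the MXNet symbol JSON layout.
fixed = json.dumps({'nodes': [{'op': 'BatchNorm',
                               'name': 'batchnorm0_fwd',
                               'attrs': {'axis': '2',
                                         'eps': '2e-05',
                                         'fix_gamma': 'False',
                                         'momentum': '0.9',
                                         'use_global_stats': 'False'}}]})

node = json.loads(fixed)['nodes'][0]
assert int(node['attrs']['axis']) == 2       # matches axis=2 passed to BatchNorm
assert float(node['attrs']['eps']) == 2e-05  # matches epsilon=2E-5
print('export attrs match the configured layer')
```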
