This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[MXNET-507] Set dtype=int32 for ret_indices in ordering ops #11134

Closed · wants to merge 18 commits

Conversation

sxjscience
Member

@sxjscience sxjscience commented Jun 4, 2018

Description

There are two problems in the ordering operators, i.e., topk, sort, and argsort:

  1. Only real_t is supported.

  2. The indices are stored as real_t. This causes errors in the backward pass, where gradients are passed to the wrong locations.

For example, we could not run the following code in the previous version:

import mxnet as mx
import numpy as np
import mxnet.ndarray as nd

ctx = mx.cpu()

a = mx.nd.arange(54686454, ctx=ctx, dtype=np.int32)
a.attach_grad()

k = 10
with mx.autograd.record():
    b = mx.nd.topk(a, k=k, ret_typ='value')
b.backward(mx.nd.ones((k,), ctx=ctx, dtype=np.int32))
a_grad = a.grad.asnumpy()
for i in range(-1, -k - 1, -1):
    assert a_grad[i] == 1
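For background on why float-typed indices are problematic (an illustration added here, not code from the PR): float32 carries a 24-bit significand, so consecutive integer indices above 2**24 round to the same float value, and an index stored that way can point at the wrong element.

```python
import numpy as np

# float32 cannot distinguish consecutive integers above 2**24,
# so a float-typed index may collapse onto a neighboring position.
assert np.float32(2**24) == np.float32(2**24 + 1)
# int32 keeps them distinct.
assert np.int32(2**24) != np.int32(2**24 + 1)
```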

I propose to fix this bug by changing the dtype of the indices to int32. This makes the code backward incompatible; however, it only breaks some rare usages. Normally we directly output the indices or use them to slice a tensor, and this change does not break those common usages. (Update: I have since adopted the other solution mentioned below.)

Another solution is to support an additional dtype flag for these operators (as suggested in #11031). The problem with this solution is that we cannot avoid a nested macro, which is extremely slow to compile, so I initially did not solve it this way:

MACRO_SWITCH(dtype, DType, {
   MACRO_SWITCH(idtype, IDType, {
      ...
   });
});

This PR also fixes the issue reported in #10085 that ordering ops do not support kNullOp.

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on the test set, and a reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • Support arbitrary dtype (excluding float16) for the ordering operators, with tests
  • For topk and argsort, add a dtype option to specify the type of the output indices, with tests
  • Fix the bug that ordering ops do not support kNullOp, with tests
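To illustrate the intended common usage (a NumPy sketch with made-up values, not code from this PR): integer-typed indices, analogous to what topk/argsort produce with the new dtype option, can be used directly to gather from the original tensor.

```python
import numpy as np

a = np.array([5.0, 30.0, 10.0, 20.0])
# Indices with an integer dtype, analogous to the output of
# argsort/topk when the new dtype='int32' option is used.
idx = np.argsort(a).astype(np.int32)
sorted_a = a[idx]  # gather with integer indices works directly
```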

Comments

  • Originally this change was backward incompatible. I considered that acceptable because setting the dtype of the indices to int32 does not break the common usage where the indices are used to slice a tensor. With the added dtype flag, the change is now backward compatible.

@sxjscience sxjscience changed the title [MXNET-507] Set dtype of index to be int32 for ordering ops [MXNET-507] Set dtype=int32 for ret_indices in ordering ops Jun 4, 2018
@sxjscience
Member Author

After discussing offline with Eric, I'll add an additional dtype flag to make sure that the OP is backward compatible.

@szha
Member

szha commented Jun 5, 2018

Feel free to remove "breaking" label when you're ready.

@asitstands
Contributor

I think it would be helpful to add a check for the ranges of the floating-point types. The recent addition of dtype to multinomial implemented such a check. https://github.com/apache/incubator-mxnet/blob/3eada3b32aeab5c8cdf7d507bcc3a986c9e5b91f/src/operator/random/sample_multinomial_op.h#L73-L76

@sxjscience
Member Author

sxjscience commented Jun 11, 2018 via email

@szha szha removed the Breaking label Jun 12, 2018
@sxjscience sxjscience requested review from anirudh2290 and removed request for anirudh2290 June 19, 2018 02:45
@sxjscience
Member Author

Is it okay to merge this in?

DMLC_DECLARE_FIELD(dtype)
.add_enum("uint8", mshadow::kUint8)
.add_enum("int32", mshadow::kInt32)
.add_enum("float16", mshadow::kFloat16)
Member

should we remove this option given that it won't be supported?

Member Author

OK, I'll remove it.

Member Author

@szha Actually, the dtype of indices can be float16. There is no need to remove the option.

@thomelane
Contributor

@sxjscience any chance you could add a small update to the documentation saying that the output is sorted? Thanks!

4 participants