[Discussion] 1.7.0 Roadmap #16864

Open

pengzhao-intel opened this issue Nov 20, 2019 · 27 comments

@pengzhao-intel
Contributor

pengzhao-intel commented Nov 20, 2019

Copied the content below from the 1.6 roadmap PR #15589 :). Thanks @szha.

Let's start a discussion here about the roadmap towards 1.7.0. We are looking for:

  • New features that are useful to your research and development.
  • Improvements and patches to existing features.

If you have any item that you'd like to propose to have in the roadmap, please do:

  1. Create (or locate an existing) issue/pull request for the item, and note the issue/pull request number.
  2. Comment in this issue with:
     1. the above issue number,
     2. one sentence on what the item is about and why it's useful to you.
  3. Indicate whether you'd be willing to help out on the item.
  4. Share the ETA if you're driving the item and have a guesstimate of when it will be done.

Feel free to include items from past roadmap discussions that didn't make it in and that you still wish to see in this release.
cc @apache/mxnet-committers

@pengzhao-intel
Contributor Author

pengzhao-intel commented Nov 20, 2019

For MKLDNN backend

@atiqsayyed

For the Scala 2.12 release:

  • Prebuilt Scala package for 2.12 #16438
  • With the growing Scala community, the current version on most systems is 2.13, while the existing MXNet release only covers 2.11. It would be very helpful to have releases for the latest 2.12 and 2.13 versions.

I needed this feature, so I built the 2.12 version locally. I want to help with this issue, but I need some guidance about the current build package and how to go ahead with it.

I'm more than willing to contribute to this specific story

@leezu
Contributor

leezu commented Nov 20, 2019

@atiqsayyed this may be feasible for the MXNet 1.6 release if you are willing to work on it. You can comment in #16438 and ping yzhliu, nswamy, or pllarroy for guidance (they are listed as code owners: https://github.com/apache/incubator-mxnet/blob/61c8bafdcfee129e4f7a491438a2402e6762ddd9/CODEOWNERS#L16)

@cjolivier01
Member

XLA or MLIR graph support. Basically, generate an XLA-compiler-consumable network graph protobuf, similar to PyTorch's approach. This actually isn't a huge undertaking and would add a lot of value to MXNet imho, making it usable for other custom hardware, since supporting MXNet would be only an incremental step for vendors already on XLA-compatible platforms such as PyTorch and TensorFlow.
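To make the scope concrete, here is a minimal sketch of the graph walk such an exporter would start from, using only the existing Symbol.tojson() API. The toy network is an illustrative assumption, and actually lowering these nodes to XLA HLO or MLIR is the real work, which is not shown:

import json
import mxnet as mx

# Toy network purely for illustration.
data = mx.sym.Variable("data")
net = mx.sym.FullyConnected(data, num_hidden=128, name="fc1")
net = mx.sym.Activation(net, act_type="relu", name="relu1")

# Symbol.tojson() already exposes the graph in a framework-neutral form:
# a list of nodes, each with an op type, a name, and input edges. An
# XLA/MLIR exporter would map each node onto ops in the target IR.
graph = json.loads(net.tojson())
for node in graph["nodes"]:
    inputs = [graph["nodes"][i[0]]["name"] for i in node["inputs"]]
    print(node["op"], node["name"], "<-", inputs)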

@pengzhao-intel
Contributor Author

> XLA or MLIR graph support. Basically, generate an XLA-compiler-consumable network graph protobuf, similar to PyTorch's approach. This actually isn't a huge undertaking and would add a lot of value to MXNet imho, making it usable for other custom hardware, since supporting MXNet would be only an incremental step for vendors already on XLA-compatible platforms such as PyTorch and TensorFlow.

Really good suggestion! Our team is working on XLA/TVM support for MXNet, but it's still at an early stage and I am not sure we can catch the 1.7 release.

I will update our progress in the community :)

@cjolivier01
Member

> > XLA or MLIR graph support. Basically, generate an XLA-compiler-consumable network graph protobuf, similar to PyTorch's approach. This actually isn't a huge undertaking and would add a lot of value to MXNet imho, making it usable for other custom hardware, since supporting MXNet would be only an incremental step for vendors already on XLA-compatible platforms such as PyTorch and TensorFlow.
>
> Really good suggestion! Our team is working on XLA/TVM support for MXNet, but it's still at an early stage and I am not sure we can catch the 1.7 release.
>
> I will update our progress in the community :)

What sort of ETA were you thinking for XLA support? 1.7.1? :D

@pengzhao-intel
Contributor Author

> What sort of ETA were you thinking for XLA support? 1.7.1? :D

Most likely, we will start from TVM first and then extend to XLA.

I am not sure about the timeline for 1.7. Maybe some experimental features will go into 1.7, with most of the features landing in 1.8 (or later).

@ptrendx
Member

ptrendx commented Nov 21, 2019

XLA is effectively dead at this point, so I'm not sure why we would want to invest in that. MLIR is not really ready for prime time. Of all the compiler technologies (which I agree are important), TVM seems to be the best and most mature option (with additional points for being an Apache project as well, and for the multiple community members already working on integrating it into MXNet).

@cjolivier01
Member

> XLA is effectively dead at this point, so I'm not sure why we would want to invest in that. MLIR is not really ready for prime time. Of all the compiler technologies (which I agree are important), TVM seems to be the best and most mature option (with additional points for being an Apache project as well, and for the multiple community members already working on integrating it into MXNet).

You're certainly entitled to your opinion on XLA/MLIR and TVM, but the fact of the matter is that XLA is more widely supported than TVM (by two leading frameworks, TF and PyTorch, for instance). If adoption of MXNet is what you're looking for, then adopting technologies that let hardware vendors support the maximum number of platforms for the least amount of investment is the best route, imho. So far, the approach MXNet has been taking (opting for proprietary technologies) has arguably not been entirely successful.

@szha
Member

szha commented Nov 22, 2019

@cjolivier01 @pengzhao-intel @ptrendx would you mind opening a feature request issue as suggested in the initial post? The roadmap issue is usually for tracking purposes, and having other discussions inside makes it harder to track the features to add.

@cjolivier01
Member

> @cjolivier01 @pengzhao-intel @ptrendx would you mind opening a feature request issue as suggested in the initial post? The roadmap issue is usually for tracking purposes, and having other discussions inside makes it harder to track the features to add.

“Let's start a discussion here about the roadmap towards 1.7.0. We are looking for:

New features that are useful to your research and development.”

This is what's in the description of this page. I don't think we've strayed from that.

@szha
Member

szha commented Nov 22, 2019

I was referring to the instructions just below the lines you quoted:

> If you have any item that you'd like to propose to have in the roadmap, please do:
>
>   1. Create (or locate an existing) issue/pull request for the item, and note the issue/pull request number.
>   2. Comment in this issue with:
>      1. the above issue number,
>      2. one sentence on what the item is about and why it's useful to you.

@guoquan

guoquan commented Nov 26, 2019

Let's have it then: #16916. I would (personally) focus it on requesting support for XLA devices.
It would be helpful in that it enables access to the evil TPU.

@wkcn
Member

wkcn commented Dec 6, 2019

Existing Feature:

Propose:

@mikeobr

mikeobr commented Dec 6, 2019

A feature that would be very useful for production deployments of MXNet models is the ability to cache or save autotune results. There are several feature requests already around this topic, such as #16173 and #10567.

Our production servers host multiple instances of networks. Autotuning currently has these issues (a workaround sketch follows this list):

  1. Cold start times when new versions start receiving calls.
  2. Instability if autotuning is triggered simultaneously across several networks.
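For reference, a common stopgap today (not the caching feature requested above) is to disable cuDNN convolution autotuning entirely via MXNet's MXNET_CUDNN_AUTOTUNE_DEFAULT environment variable, trading some peak throughput for predictable cold starts; a minimal sketch:

import os

# Must be set before mxnet is imported: 0 disables the cuDNN autotune
# sweep, so new model instances skip the tuning phase (and its cold-start
# cost and cross-network contention) entirely.
os.environ["MXNET_CUDNN_AUTOTUNE_DEFAULT"] = "0"

import mxnet as mx  # imported after the env var takes effect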

@apeforest
Contributor

A simple script to build from source using cmake: #17180

@ChaiBapchya
Contributor

ChaiBapchya commented Feb 12, 2020

Proposal: Exposing the OpPerf utility in MXNet's pip package.

@TaoLv As discussed in one of the OpPerf PRs (#17500), let's make OpPerf available to users by adding it to the MXNet binary.
This will enhance the usability of the tool.

Brief description: OpPerf

OpPerf is a tool for benchmarking the execution of MXNet operators. It returns performance stats for an operator (specifically memory consumption, forward time, and backward time, where applicable).

Currently, the OpPerf utility can be tried by cloning the mxnet repo, setting PYTHONPATH to the cloned repo, and running it in one of three ways:

1. Benchmark All Ops

python incubator-mxnet/benchmark/opperf/opperf.py --output-format md --output-file mxnet_operator_benchmark_results.md

This runs OpPerf on all MXNet operators (those whose inputs are defined in OpPerf's default_params file).

Sample output : https://gist.github.com/ChaiBapchya/7ec49647bb2ae8549e00d703e99371af

2. Benchmark category-specific ops

from benchmark.opperf.nd_operations.binary_operators import run_mx_binary_broadcast_operators_benchmarks

# Run all Binary Broadcast operations benchmarks with default input values
print(run_mx_binary_broadcast_operators_benchmarks())

3. Benchmark individual ops

import mxnet as mx
from mxnet import nd

from benchmark.opperf.utils.benchmark_utils import run_performance_test

# Benchmark nd.add forward and backward on CPU with 1024x1024 inputs,
# using 10 warmup iterations and 25 measured runs.
add_res = run_performance_test(nd.add, run_backward=True, dtype='float32', ctx=mx.cpu(),
                               inputs=[{"lhs": (1024, 1024),
                                        "rhs": (1024, 1024)}],
                               warmup=10, runs=25)
print(add_res)

For more details: https://github.com/apache/incubator-mxnet/tree/master/benchmark/opperf

@stu1130
Contributor

stu1130 commented Apr 17, 2020

#17177 solves the locale issue not only for JVM languages but also for Python (see #18079), so I want to include this one in 1.7.
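For context, here is a minimal illustration of the class of bug being fixed (this demo is an assumption of mine, not code from #17177, and it requires the de_DE.UTF-8 locale to be installed): an embedding host such as a JVM can switch the whole process to a comma-decimal locale, after which locale-sensitive number formatting no longer round-trips with dot-decimal parsers.

import locale

# Simulate what an embedding host (e.g. a JVM) may do to the process:
locale.setlocale(locale.LC_NUMERIC, "de_DE.UTF-8")

# Locale-aware formatting now renders 1.5 as "1,5"; native code that
# formats floats this way produces strings that dot-decimal parsers
# (e.g. for serialized shapes or JSON) cannot read back.
print(locale.str(1.5))  # -> "1,5"
print(float("1.5"))     # Python's float() stays locale-independent: 1.5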

@ciyongch
Contributor

A kind reminder: we've postponed the code freeze date to April 25th PST to extend the time window for the pending PRs targeting v1.7.0. Please make sure you have everything you need in the v1.7.x branch now. Thanks!

@ciyongch
Contributor

Hi @ptrendx @roywei, may I know how to decide (or whom I should check with on) whether this release should go out on the Medium blog, as was done for 1.5.0 and 1.6.0? Thanks!

@deepakkumar1984
Contributor

Can this feature #17940 be considered?

Regards,
Deepak

@ciyongch
Contributor

Hi @deepakkumar1984, it's a great feature for MXNet extensions, but 1.7.0 is code-frozen now (which means no more new features will be included in this release), and this feature is still under development. I suggest making it mature and pushing it to the next release (1.8.0 if there's a plan, or perhaps 2.0?). What do you think?
Thanks,
Ciyong

@deepakkumar1984
Contributor

deepakkumar1984 commented Apr 30, 2020 via email

@ciyongch
Contributor

> I was thinking it could get some visibility to help attract dev and test contributions. If it's possible to mention this library somewhere on the MXNet site, it will be very, very helpful.

@deepakkumar1984 that sounds great! An RFC or updates on dev@mxnet.apache.org could be helpful to get more visibility as well as suggestions from the community :)
