From f2e90a22c7190e86441a415b1dbdb47ad3603f30 Mon Sep 17 00:00:00 2001 From: ciyong Date: Thu, 3 Sep 2020 01:15:46 +0800 Subject: [PATCH] Update NEWS, README and website for 1.7.0 (#19047) * update NEWS.md and README.md * update get_started and pip for 1.7.0 * update website for 1.7.0 * update download link for source package * Use archive.apache.org for the previous releases and mirror for the latest release * update description for oneDNN and add native binary installation command * update news --- NEWS.md | 815 ++++++++++++++---- docs/static_site/src/.htaccess | 6 +- .../_includes/get_started/get_started.html | 7 +- .../get_started/linux/python/cpu/pip.md | 60 +- .../get_started/linux/python/gpu/pip.md | 9 +- .../src/_includes/get_started/pip_snippet.md | 2 +- .../src/pages/get_started/download.md | 3 +- 7 files changed, 717 insertions(+), 185 deletions(-) diff --git a/NEWS.md b/NEWS.md index 5e301d1f65a3..0ba22152d4e0 100644 --- a/NEWS.md +++ b/NEWS.md @@ -18,87 +18,128 @@ MXNet Change Log ================ - [MXNet Change Log](#mxnet-change-log) - * [1.6.0](#160) - + [Deprecation of Python 2](#deprecation-of-python-2) - + [New Features](#new-features) - - [NumPy compatible interface and using TVM to generate operators](#numpy-compatible-interface-and-using-tvm-to-generate-operators) - - [Graph optimizations](#graph-optimizations) + - [1.7.0](#170) + - [New features](#new-features) + - [MXNet Extensions: custom operators, partitioning, and graph passes](#mxnet-extensions-custom-operators-partitioning-and-graph-passes) + - [OpPerf utility enabled in the binary distribution](#opperf-utility-enabled-in-the-binary-distribution) + - [MKL-DNN](#mkl-dnn) + - [MKL-DNN as the default CPU backend in binary distribution](#mkl-dnn-as-the-default-cpu-backend-in-binary-distribution) + - [Branding change to DNNL](#branding-change-to-dnnl) + - [Support bfloat16 datatype](#support-bfloat16-datatype) - [New operators](#new-operators) - + [Feature 
improvements](#feature-improvements) - - [Automatic Mixed Precision](#automatic-mixed-precision) - - [Gluon Fit API](#gluon-fit-api) - - [MKLDNN](#mkldnn) + - [Feature improvements](#feature-improvements) + - [Numpy compatible interface(experimental)](#numpy-compatible-interfaceexperimental) - [Large tensor support](#large-tensor-support) + - [MKL-DNN enhancement](#mkl-dnn-enhancement) - [TensorRT integration](#tensorrt-integration) - - [Higher order gradient support](#higher-order-gradient-support) - - [Operator improvements](#operator-improvements) + - [Quantization](#quantization) - [Profiler](#profiler) - - [ONNX import/export](#onnx-importexport) + - [ONNX](#onnx) + - [New models](#new-models) + - [Operator improvements](#operator-improvements) - [Bug fixes](#bug-fixes) - + [Front end API](#front-end-api) + - [Front end API](#front-end-api) - [Gluon](#gluon) - [Symbol](#symbol) - + [Language Bindings](#language-bindings) + - [Language Bindings](#language-bindings) - [Python](#python) - [C/C++](#cc) + - [R](#r) - [Clojure](#clojure) - [Julia](#julia) - [Perl](#perl) - [Scala](#scala) - + [Performance improvements](#performance-improvements) - + [Examples and tutorials](#examples-and-tutorials) - + [Website and documentation](#website-and-documentation) - + [CI/CD](#cicd) - + [Misc](#misc) - * [1.5.1](#151) - + [Bug-fixes](#bug-fixes-1) - * [1.5.0](#150) - + [New Features](#new-features-1) + - [Performance improvements](#performance-improvements) + - [Example and tutorials](#example-and-tutorials) + - [Website and documentation](#website-and-documentation) + - [CI/CD](#cicd) + - [License](#license) + - [Miscellaneous changes](#miscellaneous-changes) + - [1.6.0](#160) + - [Deprecation of Python 2](#deprecation-of-python-2) + - [New features](#new-features-1) + - [NumPy compatible interface and using TVM to generate operators](#numpy-compatible-interface-and-using-tvm-to-generate-operators) + - [Graph optimizations](#graph-optimizations) + - [Pointwise fusion for 
GPU](#pointwise-fusion-for-gpu) + - [Eliminate common subexpressions](#eliminate-common-subexpressions) + - [Default MKLDNN Subgraph fusion](#default-mkldnn-subgraph-fusion) + - [New operators](#new-operators-1) + - [Feature improvements](#feature-improvements-1) + - [Automatic Mixed Precision](#automatic-mixed-precision) + - [Gluon Fit API](#gluon-fit-api) + - [MKLDNN](#mkldnn) + - [Large tensor support](#large-tensor-support-1) + - [TensorRT integration](#tensorrt-integration-1) + - [Higher order gradient support](#higher-order-gradient-support) + - [Operator improvements](#operator-improvements-1) + - [Profiler](#profiler-1) + - [ONNX import/export](#onnx-importexport) + - [Runtime discovery of features](#runtime-discovery-of-features) + - [Bug fixes](#bug-fixes-1) + - [Front end API](#front-end-api-1) + - [Gluon](#gluon-1) + - [Symbol](#symbol-1) + - [Language Bindings](#language-bindings-1) + - [Python](#python-1) + - [C/C++](#cc-1) + - [Clojure](#clojure-1) + - [Julia](#julia-1) + - [Perl](#perl-1) + - [Scala](#scala-1) + - [Performance improvements](#performance-improvements-1) + - [Examples and tutorials](#examples-and-tutorials) + - [Website and documentation](#website-and-documentation-1) + - [CI/CD](#cicd-1) + - [Misc](#misc) + - [1.5.1](#151) + - [Bug-fixes](#bug-fixes-2) + - [1.5.0](#150) + - [New Features](#new-features-2) - [Automatic Mixed Precision(experimental)](#automatic-mixed-precisionexperimental) - [MKL-DNN Reduced precision inference and RNN API support](#mkl-dnn-reduced-precision-inference-and-rnn-api-support) - [Dynamic Shape(experimental)](#dynamic-shapeexperimental) - - [Large Tensor Support](#large-tensor-support-1) + - [Large Tensor Support](#large-tensor-support-2) - [Dependency Update](#dependency-update) - [Gluon Fit API(experimental)](#gluon-fit-apiexperimental) - - [New Operators](#new-operators-1) - + [Feature Improvements](#feature-improvements-1) + - [New Operators](#new-operators-2) + - [Feature 
Improvements](#feature-improvements-2) - [Operators](#operators) - [MKLDNN](#mkldnn-1) - - [ONNX](#onnx) + - [ONNX](#onnx-1) - [TensorRT](#tensorrt) - [FP16 Support](#fp16-support) - - [Deep Graph Library(DGL) support](#deep-graph-library-dgl--support) + - [Deep Graph Library(DGL) support](#deep-graph-librarydgl-support) - [Horovod Integration](#horovod-integration) - [Dynamic Shape](#dynamic-shape) - [Backend Engine](#backend-engine) - - [Large Tensor Support](#large-tensor-support-2) - - [Quantization](#quantization) - - [Profiler](#profiler-1) + - [Large Tensor Support](#large-tensor-support-3) + - [Quantization](#quantization-1) + - [Profiler](#profiler-2) - [CoreML](#coreml) - + [Front End API](#front-end-api-1) - - [Gluon](#gluon-1) - - [Python](#python-1) - + [Language Bindings](#language-bindings-1) - - [Scala](#scala-1) + - [Front End API](#front-end-api-2) + - [Gluon](#gluon-2) + - [Python](#python-2) + - [Language Bindings](#language-bindings-2) + - [Scala](#scala-2) - [Java](#java) - [C++](#c) - - [Clojure](#clojure-1) - - [Julia](#julia-1) - - [Perl:](#perl-1) - - [R](#r) - + [Performance Improvements](#performance-improvements-1) - + [Example and Tutorials](#example-and-tutorials) - + [Website](#website) - + [Documentation](#documentation) - + [Build and Test](#build-and-test) - + [Bug-fixes](#bug-fixes-2) - + [License](#license) - + [Depreciations](#depreciations) - + [Known Issues](#known-issues) - * [1.4.1](#141) - + [Bug-fixes](#bug-fixes-3) - * [1.4.0](#140) - + [New Features](#new-features-2) + - [Clojure](#clojure-2) + - [Julia](#julia-2) + - [Perl:](#perl-2) + - [R](#r-1) + - [Performance Improvements](#performance-improvements-2) + - [Example and Tutorials](#example-and-tutorials-1) + - [Website](#website) + - [Documentation](#documentation) + - [Build and Test](#build-and-test) + - [Bug-fixes](#bug-fixes-3) + - [License](#license-1) + - [Depreciations](#depreciations) + - [Known Issues](#known-issues) + - [1.4.1](#141) + - 
[Bug-fixes](#bug-fixes-4) + - [1.4.0](#140) + - [New Features](#new-features-3) - [Java Inference API](#java-inference-api) - [Julia API](#julia-api) - [Control Flow Operators (experimental)](#control-flow-operators-experimental) @@ -106,32 +147,32 @@ MXNet Change Log - [Subgraph API (experimental)](#subgraph-api-experimental) - [JVM Memory Management](#jvm-memory-management) - [Topology-aware AllReduce (experimental)](#topology-aware-allreduce-experimental) - - [MKLDNN backend: Graph optimization and Quantization (experimental)](#mkldnn-backend--graph-optimization-and-quantization-experimental) - * [Graph Optimization](#graph-optimization) - * [Quantization](#quantization-1) - + [New Operators](#new-operators-2) - + [Feature improvements](#feature-improvements-2) + - [MKLDNN backend: Graph optimization and Quantization (experimental)](#mkldnn-backend-graph-optimization-and-quantization-experimental) + - [Graph Optimization](#graph-optimization) + - [Quantization](#quantization-2) + - [New Operators](#new-operators-3) + - [Feature improvements](#feature-improvements-3) - [Operator](#operator) - [Optimizer](#optimizer) - [Sparse](#sparse) - - [ONNX](#onnx-1) + - [ONNX](#onnx-2) - [MKLDNN](#mkldnn-2) - [Inference](#inference) - [Other](#other) - + [Frontend API updates](#frontend-api-updates) - - [Gluon](#gluon-2) - - [Symbol](#symbol-1) - + [Language API updates](#language-api-updates) + - [Frontend API updates](#frontend-api-updates) + - [Gluon](#gluon-3) + - [Symbol](#symbol-2) + - [Language API updates](#language-api-updates) - [Java](#java-1) - - [R](#r-1) - - [Scala](#scala-2) - - [Clojure](#clojure-2) - - [Perl](#perl-2) - - [Julia](#julia-2) - + [Performance benchmarks and improvements](#performance-benchmarks-and-improvements) - + [Bug fixes](#bug-fixes-4) - + [Licensing updates](#licensing-updates) - + [Improvements](#improvements) + - [R](#r-2) + - [Scala](#scala-3) + - [Clojure](#clojure-3) + - [Perl](#perl-3) + - [Julia](#julia-3) + - [Performance 
benchmarks and improvements](#performance-benchmarks-and-improvements) + - [Bug fixes](#bug-fixes-5) + - [Licensing updates](#licensing-updates) + - [Improvements](#improvements) - [Tutorial](#tutorial) - [Example](#example) - [Documentation](#documentation-1) @@ -140,98 +181,544 @@ MXNet Change Log - [Installation](#installation) - [Build and CI](#build-and-ci) - [3rd party](#3rd-party) - * [TVM:](#tvm) - * [CUDNN:](#cudnn) - * [Horovod:](#horovod) - + [Deprications](#deprications) - + [Other](#other-1) - + [How to build MXNet](#how-to-build-mxnet) - + [List of submodules used by Apache MXNet (Incubating) and when they were updated last](#list-of-submodules-used-by-apache-mxnet--incubating--and-when-they-were-updated-last) - * [1.3.1](#131) - + [Bug fixes](#bug-fixes-5) - + [Documentation fixes](#documentation-fixes) - + [Other Improvements](#other-improvements) - + [Submodule updates](#submodule-updates) - + [Known issues](#known-issues) - * [1.3.0](#130) - + [New Features - Gluon RNN layers are now HybridBlocks](#new-features---gluon-rnn-layers-are-now-hybridblocks) - + [MKL-DNN improvements](#mkl-dnn-improvements) - + [New Features - Gluon Model Zoo Pre-trained Models](#new-features---gluon-model-zoo-pre-trained-models) - + [New Features - Clojure package (experimental)](#new-features---clojure-package-experimental) - + [New Features - Synchronized Cross-GPU Batch Norm (experimental)](#new-features---synchronized-cross-gpu-batch-norm-experimental) - + [New Features - Sparse Tensor Support for Gluon (experimental)](#new-features---sparse-tensor-support-for-gluon-experimental) - + [New Features - Control flow operators (experimental)](#new-features---control-flow-operators-experimental) - + [New Features - Scala API Improvements (experimental)](#new-features---scala-api-improvements-experimental) - + [New Features - Rounding GPU Memory Pool for dynamic networks with variable-length inputs and outputs 
(experimental)](#new-features---rounding-gpu-memory-pool-for-dynamic-networks-with-variable-length-inputs-and-outputs-experimental) - + [New Features - Topology-aware AllReduce (experimental)](#new-features---topology-aware-allreduce-experimental) - + [New Features - Export MXNet models to ONNX format (experimental)](#new-features---export-mxnet-models-to-onnx-format-experimental) - + [New Features - TensorRT Runtime Integration (experimental)](#new-features---tensorrt-runtime-integration-experimental) - + [New Examples - Scala](#new-examples---scala) - + [Maintenance - Flaky Tests improvement effort](#maintenance---flaky-tests-improvement-effort) - + [Maintenance - MXNet Model Backwards Compatibility Checker](#maintenance---mxnet-model-backwards-compatibility-checker) - + [Maintenance - Integrated testing for "the Straight Dope"](#maintenance---integrated-testing-for-the-straight-dope) - + [Bug-fixes](#bug-fixes-6) - + [Performance Improvements](#performance-improvements-2) - + [API Changes](#api-changes) - + [Other features](#other-features) - + [Usability Improvements](#usability-improvements) - * [1.2.0](#120) - + [New Features - Added Scala Inference APIs](#new-features---added-scala-inference-apis) - + [New Features - Added a Module to Import ONNX models into MXNet](#new-features---added-a-module-to-import-onnx-models-into-mxnet) - + [New Features - Added Support for Model Quantization with Calibration](#new-features---added-support-for-model-quantization-with-calibration) - + [New Features - MKL-DNN Integration](#new-features---mkl-dnn-integration) - + [New Features - Added Exception Handling Support for Operators](#new-features---added-exception-handling-support-for-operators) - + [New Features - Enhanced FP16 support](#new-features---enhanced-fp16-support) - + [New Features - Added Profiling Enhancements](#new-features---added-profiling-enhancements) - + [Breaking Changes](#breaking-changes) - + [Bug Fixes](#bug-fixes-7) - + [Performance 
Improvements](#performance-improvements-3) - + [API Changes](#api-changes-1) - + [Sparse Support](#sparse-support) - + [Deprecations](#deprecations) - + [Other Features](#other-features) - + [Usability Improvements](#usability-improvements-1) - + [Known Issues](#known-issues-1) - * [1.1.0](#110) - + [Usability Improvements](#usability-improvements-2) - + [Bug-fixes](#bug-fixes-8) - + [New Features](#new-features-3) - + [API Changes](#api-changes-2) - + [Deprecations](#deprecations-1) - + [Performance Improvements](#performance-improvements-4) - + [Known Issues](#known-issues-2) - * [1.0.0](#100) - + [Performance](#performance) - + [New Features - Gradient Compression [Experimental]](#new-features---gradient-compression-experimental) - + [New Features - Support of NVIDIA Collective Communication Library (NCCL) [Experimental]](#new-features---support-of-nvidia-collective-communication-library--nccl--experimental) - + [New Features - Advanced Indexing [General Availability]](#new-features---advanced-indexing-general-availability) - + [New Features - Gluon [General Availability]](#new-features---gluon-general-availability) - + [New Features - ARM / Raspberry Pi support [Experimental]](#new-features---arm---raspberry-pi-support-experimental) - + [New Features - NVIDIA Jetson support [Experimental]](#new-features---nvidia-jetson-support-experimental) - + [New Features - Sparse Tensor Support [General Availability]](#new-features---sparse-tensor-support-general-availability) - + [Bug-fixes](#bug-fixes-9) - + [Doc Updates](#doc-updates) - * [0.12.1](#0121) - + [Bug-fixes](#bug-fixes-10) - * [0.12.0](#0120) - + [Performance](#performance-1) - + [New Features - Gluon](#new-features---gluon) - + [New Features - Autograd](#new-features---autograd) - + [New Features - Sparse Tensor Support](#new-features---sparse-tensor-support) - + [Other New Features](#other-new-features) - + [API Changes](#api-changes-3) - + [Bug-fixes](#bug-fixes-11) - * [0.11.0](#0110) - + [Major 
Features](#major-features) - + [API Changes](#api-changes-4) - + [Performance Improvements](#performance-improvements-5) - + [Bugfixes](#bugfixes) - + [Refactors](#refactors) - * [0.10.0](#0100) - * [0.9.3](#093) - * [v0.8](#v08) - * [v0.7](#v07) - * [v0.5 (initial release)](#v05-initial-release) + - [TVM:](#tvm) + - [CUDNN:](#cudnn) + - [Horovod:](#horovod) + - [Deprications](#deprications) + - [Other](#other-1) + - [How to build MXNet](#how-to-build-mxnet) + - [List of submodules used by Apache MXNet (Incubating) and when they were updated last](#list-of-submodules-used-by-apache-mxnet-incubating-and-when-they-were-updated-last) + - [1.3.1](#131) + - [Bug fixes](#bug-fixes-6) + - [Documentation fixes](#documentation-fixes) + - [Other Improvements](#other-improvements) + - [Submodule updates](#submodule-updates) + - [Known issues](#known-issues-1) + - [1.3.0](#130) + - [New Features - Gluon RNN layers are now HybridBlocks](#new-features---gluon-rnn-layers-are-now-hybridblocks) + - [MKL-DNN improvements](#mkl-dnn-improvements) + - [New Features - Gluon Model Zoo Pre-trained Models](#new-features---gluon-model-zoo-pre-trained-models) + - [New Features - Clojure package (experimental)](#new-features---clojure-package-experimental) + - [New Features - Synchronized Cross-GPU Batch Norm (experimental)](#new-features---synchronized-cross-gpu-batch-norm-experimental) + - [New Features - Sparse Tensor Support for Gluon (experimental)](#new-features---sparse-tensor-support-for-gluon-experimental) + - [New Features - Control flow operators (experimental)](#new-features---control-flow-operators-experimental) + - [New Features - Scala API Improvements (experimental)](#new-features---scala-api-improvements-experimental) + - [New Features - Rounding GPU Memory Pool for dynamic networks with variable-length inputs and outputs (experimental)](#new-features---rounding-gpu-memory-pool-for-dynamic-networks-with-variable-length-inputs-and-outputs-experimental) + - [New Features - 
Topology-aware AllReduce (experimental)](#new-features---topology-aware-allreduce-experimental) + - [New Features - Export MXNet models to ONNX format (experimental)](#new-features---export-mxnet-models-to-onnx-format-experimental) + - [New Features - TensorRT Runtime Integration (experimental)](#new-features---tensorrt-runtime-integration-experimental) + - [New Examples - Scala](#new-examples---scala) + - [Maintenance - Flaky Tests improvement effort](#maintenance---flaky-tests-improvement-effort) + - [Maintenance - MXNet Model Backwards Compatibility Checker](#maintenance---mxnet-model-backwards-compatibility-checker) + - [Maintenance - Integrated testing for "the Straight Dope"](#maintenance---integrated-testing-for-%22the-straight-dope%22) + - [Bug-fixes](#bug-fixes-7) + - [Performance Improvements](#performance-improvements-3) + - [API Changes](#api-changes) + - [Other features](#other-features) + - [Usability Improvements](#usability-improvements) + - [1.2.0](#120) + - [New Features - Added Scala Inference APIs](#new-features---added-scala-inference-apis) + - [New Features - Added a Module to Import ONNX models into MXNet](#new-features---added-a-module-to-import-onnx-models-into-mxnet) + - [New Features - Added Support for Model Quantization with Calibration](#new-features---added-support-for-model-quantization-with-calibration) + - [New Features - MKL-DNN Integration](#new-features---mkl-dnn-integration) + - [New Features - Added Exception Handling Support for Operators](#new-features---added-exception-handling-support-for-operators) + - [New Features - Enhanced FP16 support](#new-features---enhanced-fp16-support) + - [New Features - Added Profiling Enhancements](#new-features---added-profiling-enhancements) + - [Breaking Changes](#breaking-changes) + - [Bug Fixes](#bug-fixes-8) + - [Performance Improvements](#performance-improvements-4) + - [API Changes](#api-changes-1) + - [Sparse Support](#sparse-support) + - [Deprecations](#deprecations) + - [Other 
Features](#other-features-1) + - [Usability Improvements](#usability-improvements-1) + - [Known Issues](#known-issues-2) + - [1.1.0](#110) + - [Usability Improvements](#usability-improvements-2) + - [Bug-fixes](#bug-fixes-9) + - [New Features](#new-features-4) + - [API Changes](#api-changes-2) + - [Deprecations](#deprecations-1) + - [Performance Improvements](#performance-improvements-5) + - [Known Issues](#known-issues-3) + - [1.0.0](#100) + - [Performance](#performance) + - [New Features - Gradient Compression [Experimental]](#new-features---gradient-compression-experimental) + - [New Features - Support of NVIDIA Collective Communication Library (NCCL) [Experimental]](#new-features---support-of-nvidia-collective-communication-library-nccl-experimental) + - [New Features - Advanced Indexing [General Availability]](#new-features---advanced-indexing-general-availability) + - [New Features - Gluon [General Availability]](#new-features---gluon-general-availability) + - [New Features - ARM / Raspberry Pi support [Experimental]](#new-features---arm--raspberry-pi-support-experimental) + - [New Features - NVIDIA Jetson support [Experimental]](#new-features---nvidia-jetson-support-experimental) + - [New Features - Sparse Tensor Support [General Availability]](#new-features---sparse-tensor-support-general-availability) + - [Bug-fixes](#bug-fixes-10) + - [Doc Updates](#doc-updates) + - [0.12.1](#0121) + - [Bug-fixes](#bug-fixes-11) + - [0.12.0](#0120) + - [Performance](#performance-1) + - [New Features - Gluon](#new-features---gluon) + - [New Features - Autograd](#new-features---autograd) + - [New Features - Sparse Tensor Support](#new-features---sparse-tensor-support) + - [Other New Features](#other-new-features) + - [API Changes](#api-changes-3) + - [Bug-fixes](#bug-fixes-12) + - [0.11.0](#0110) + - [Major Features](#major-features) + - [API Changes](#api-changes-4) + - [Performance Improvements](#performance-improvements-6) + - [Bugfixes](#bugfixes) + - 
[Refactors](#refactors) + - [0.10.0](#0100) + - [0.9.3](#093) + - [v0.8](#v08) + - [v0.7](#v07) + - [v0.5 (initial release)](#v05-initial-release) + +## 1.7.0 + +### New features +#### MXNet Extensions: custom operators, partitioning, and graph passes + +Adds support for extending MXNet with custom operators, partitioning strategies, and graph passes. All of these are implemented in a library that is compiled separately from the MXNet codebase and loaded dynamically at runtime into any prebuilt installation of MXNet. + + - fix for number of inputs/outputs for backward custom ops (#17069) + - Enhancements for custom subgraph op (#17194) + - Disable flaky test_custom_op_fork (#17481) + - fix custom op makefile (#17516) + - Update CustomOp doc with changes for GPU support (#17486) + - [WIP] MXNet Extensions enhancements (#17885) (#18128) + - Dynamic subgraph property (#17034) + - Dynamic subgraph property doc (#17585) + - [1.7] Backport MXNet Extension PRs (#17623, #17569, #17762) #18063 (#18069) + +#### OpPerf utility enabled in the binary distribution + - [OpPerf] Add Neural network loss ops (#17482) + - [OpPerf] Fixes the issue when you pass NDArray to run_perf_test (#17508) + - [OpPerf] Fix markdown for native profile and add profile param in function desc (#17494) + - [OpPerf] Add Indexing ops (#16253) + - [OpPerf] Implement remaining random sampling ops (#17502) + - [OpPerf] Implement remaining GEMM ops (#17501) + - [OpPerf] Implement all linalg ops (#17528) + - [OpPerf] Fixed native output ordering, added warmup & runs command line args (#17571) + - [OpPerf] Add norm, cast ops, remaining optimizer ops (#17542) + - [OpPerf] Fixed Python profiler bug (#17642) + +#### MKL-DNN +##### MKL-DNN as the default CPU backend in binary distribution +##### Branding change to DNNL + - Upgrade MKL-DNN dependency to v1.1 (#16823) + +##### Support bfloat16 datatype + - Add bfloat16 floating-point format support based on AMP (#17265) + +#### New operators + - [New Op] Add deformable conv v2
(#16341) + - Add MXNet Ops for fast multihead attention (#16408) + - Support boolean elemwise/broadcast binary add, multiply and true_divide (#16728) + - add gammaln, erf, erfinv (#16811) + - add aligned roi introduced in Detectron2 (#16619) + - Implement atleast_1d/2d/3d (#17099) + - Interleaved MHA for CPU path (#17138) + - Lamb optimizer update (#16715) + - Quantized Embedding (#16691) + - Add gelu fuse ops (#18082) (#18092) + +### Feature improvements +#### Numpy compatible interface(experimental) + - [NumPy] NumPy support for linalg.inv (#16730) + - add numpy op nan_to_num (#16717) + - [Numpy] Add sampling method for bernoulli (#16638) + - Fix numpy-compatible mean output type for integer inputs (#16792) + - [Numpy] Fix collect_params().zero_grad() in gluon numpy interface (#16716) + - [Numpy][Operator] 'where' Implementation in MXNet (#16829) + - [Numpy] Random.normal() with backward (#16330) + - Add OP diag [numpy] (#16786) + - Mixed precision binary op backward (use in) for numpy (#16791) + - add numpy op diagflat [numpy] (#16813) + - add op bitwise_or [numpy] (#16801) + - [Numpy] Implementation npx.{sample}_n (#16876) + - [Numpy] Add NumPy support for np.linalg.det and np.linalg.slogdet (#16800) + - Op Unravel_index PR [Numpy] (#16862) + - [Numpy] Fix imperative basic indexing in numpy (#16902) + - [Numpy] Basic indexing in symbolic interface of DeepNumpy (#16621) + - [Numpy] add op full_like, c++ impl, fix zeros_like, ones_like type inference (#16804) + - [Numpy] Implement numpy operator 'average' (#16720) + - [Bugfix] [Numpy] Add `kAddTo` and kNullOp to Transpose (#16979) + - set rtol = 1e-2 and atol = 1e-4 when dtype == np.float32 in test_numpy_op.py:test_np_linalg_solve (#17025) + - Op_Diagonal [Numpy] (#16989) + - numpy bincount (#16965) + - [numpy] add op bitwise_not (#16947) + - [Numpy ]Modify np.random.shuffle to enable inplace by default (#17133) + - [numpy] fix argsort typo (#17150) + - [numpy] add op round (#17175) + - [numpy]Add op delete
(#17023) + - [numpy] add op flipud, fliplr (#17192) + - [CI] Re-enable testing with numpy 1.18 (#17200) + - [Numpy] Add broadcast_to scalar case (#17233) + - [Numpy] Random.gamma() implemented (#16152) + - [Numpy] add row_stack (=vstack) (#17171) + - [Numpy] Add infra for performing constraint check (#17272) + - porting numpy-compatible hstack to master and add dstack for interoperability (#17030) + - adding asnumpy() to output of gather(implicitly called) to fix gather test in large vector and tensor tests (#17290) + - [numpy] add op random.exponential (#17280) + - [NumPy] Add NumPy support for norm (#17014) + - [numpy]add op random.lognormal (#17415) + - Add numpy random weibull operator (#17505) + - [numpy] Add np.random.pareto and np.random.power (#17517) + - [Numpy] Add sort op (#17393) + - [numpy]implement exponential backward (#17401) + - [Numpy] Where operator scalar version (#17249) + - [numpy] add op matmul (#16990) + - [numpy]add op random.logistic, random.gumbel (#17302) + - [numpy][Do Not Review]add op insert (#16865) + - [numpy] add op random.rayleigh (#17541) + - [numpy] add fallback ops (#17609) + - [numpy] add op pad (#17328) + - [numpy] add op fabs, sometrue, round_ (#17619) + - Add arange_like to npx (#16883) + - try to move shape_array to npx (#16897) + - support np.argsort (#16949) + - np.broadcast_to extension (#17358) + - support bitwise_and (#16861) + - fix np.argmax/argmin output data type (#17476) + - add op random.beta (#17390) + - add op isnan isinf (#17535) + - array_split pr (#17032) + - Mixed data type binary ops (#16699) + - randn implemented (#17141) + - refactor and reduce float types for some functions, also add bitwise_xor (#16827) + - any/all (#17087) + - amax (#17176) + - fix format (#17100) + - add op empty_like, add nan_to_num to dispatch (#17169) + - handle array_like fill_value for np.full; add unit test coverage (#17245) + - add np.amin (#17538) + - add npx.gather_nd (#17477) + - add np.random.chisquare (#17524) + - add 
polyval (#17416) + - add isposinf isneginf isfinite (#17563) + - Support broadcast assign for `npi_boolean_mask_assign_tensor` (#17131) + - Implement Weibull backward (#17590) + - support np.dsplit, fix some error msgs and corner cases for hsplit and vsplit, add interoperability tests for h/v/dsplit (#17478) + - add np.product (#17489) + - Implement np.random.pareto backward (#17607) + - add np.ediff1d (#17624) + - more support for boolean indexing and assign (#18352) + - Fix einsum gradient (#18482) + - [v1.7.x] Backport PRs of numpy features (#18653) + - [v1.7.x] backport mixed type binary ops to v1.7.x (#18649) + - revise activations (#18700) + +#### Large tensor support + - [Large Tensor] Add support to Random Sample & Pdf ops (#17445) + - [Large Tensor] Add LT support for NN optimizers and 1 activation function (#17444) + - [Large Tensor] Fixed SoftmaxActivation op (#17634) + - [Large Tensor] Fixed col2im op (#17622) + - [Large Tensor] Fixed Spatial Transformer op (#17617) + - [Large Tensor] Fix ravel_multi_index op (#17644) + - Sparse int64 Large tensor support (#16898) + - Re-Enabling Large Tensor Nightly on GPU (#16164) + - enabling build stage gpu_int64 to enable large tensor nightly runs (#17546) + - [Large Tensor] Fixed Embedding op (#17599) + +#### MKL-DNN enhancement + - MKLDNN FC : Add error info when mkldnn fc bias dimension is wrong (#16692) + - [MKLDNN] support mkldnn gelu (#16710) + - [MKLDNN] Fix int8 convolution/fc bias overflow (#16734) + - [MKLDNN] use dim_t instead of int in slice/transpose operators (#16737) + - Mkldnn fullyConnect bwd bug fix (#16890) + - Revert Mkldnn fullyConnect bwd bug fix (#16890) (#16907) + - [MKLDNN] Use MKLDNNRun (#16772) + - [MKLDNN] mkldnn RNN operator enhancement (#17075) + - [MKLDNN] enable MaxPooling with full pooling convention (#16860) + - update mkldnn to v1.1.2 (#17165) + - improve mkldnn doc (#17198) + - [MKLDNN] Fix _copyto (#17173) + - [MKLDNN] Support channel wise quantization for FullyConnected 
(#17187) + - fixed seed for mkldnn test (#17386) + - add mkldnn softmax backward (#17170) + - cmake: copy dnnl headers to include/mkldnn (#17647) + - [mkldnn]Mkldnn bn opt backport from master to 1.7x (#18009) + - [v1.x] Update 3rdparty/mkldnn remote URL and pin to v1.3 (#17972) (#18033) + - [v1.x] backport #17900 [MKLDNN] support using any format in pooling backward (#18067) + - Static link MKL-DNN library (#16731) + - Add large tensor nightly tests for MKL-DNN operators (#16184) + - [MKL-DNN] Enable and Optimization for s8 eltwise_add (#16931) + - [MKL-DNN] Enhance Quantization Method (#17161) + - Static Build and CD for mxnet-cu102/mxnet-cu102mkl (#17074) + - MKL-DNN RNN backward path enhancement (#17183) + - cmake: check USE_OPENMP and pass proper MKL-DNN build flags (#17356) + - update mkl to 2020.0 (#17355) + - Enable MKL-DNN by default in pip packages (#16899) + - Enable MKL-DNN FullyConnected backward (#17318) + - Softmax primitive cache and in-place computation (#17152) + - boolean_mask_assign with start_axis (#16886) + - use identity_with_cast (#16913) + - change error tolerance for bf16 bn (#18110) + - [v1.x] Backport #17689 and #17884 to v1.x branch (#18064) + - refactor codes and add an option to skip/check weight's version to reduce overhead (#17707) (#18039) + - [v1.x] Backport #17702 and #17872 to v1.x branch (#18038) + +#### TensorRT integration + - Update TensorRT tutorial to build-from-source. 
(#14860) + - Minor fix, use RAII for TensorRT builder and network object (#17189) + +#### Quantization + - Add silent option to quantization script (#17094) + +#### Profiler + - Implemented final two binary ops, added default params for functionality (#17407) + - Implement remaining nn_activation ops in opperf (#17475) + - Implement all miscellaneous ops (#17511) + - Implement remaining nn_basic ops in opperf (#17456) + +#### ONNX + - Fix memory leak reported by ASAN in NNVM to ONNX conversion (#15516) + - ONNX export: Gather (#15995) + - ONNX export: Slice op - Handle None value for ends (#14942) + +#### New models + - [Model] Implement Neural Collaborative Filtering with MXNet (#16689) + - Further optimization for NCF model (#17148) + - HMM Model (#17120) + +#### Operator improvements + - Faster GPU NMS operator (#16542) + - [MXNET-1421] Added (CuDNN)BatchNorm operator to the list of mirrored operators (#16022) + - dynamic custom operator support (#15921) + - Multi Precision Lamb Update operator (#16885) + - Add im2col and col2im operator (#16502) + - Quantized Elemwise Mul Operator (#17147) + - Enhancements for MXTensor for custom operators (#17204) + - Enabling large tensor support for binary broadcast operators (#16755) + - Fix operators lying about their number of inputs (#17049) + - [WIP] Fallback mechanism for mx.np operators (#16923) + - Dynamic custom operator GPU support (#17270) + - Fix flaky - test_operator_gpu.test_np_insert (#17620) + - MXNet FFI for Operator Imperative Invocation (#17510) + - [MXNET-978] Higher Order Gradient Support `logp1`, `expm1`, `square`. (#15416) + - [MXNET-978] Higher Order Gradient Support `arcsin`, `arccos`. (#15515) + - [MXNET-978] Higher Order Gradient Support `rsqrt`, `rcbrt`. 
(#15476) + - gather_nd: check bound and wrap negative indices (#17208) + - Remove dilation restriction for conv3d (#17491) + - Fix storage type infer of softmax backward (#17576) + - Fix and optimize handling of vectorized memory accesses (#17767) (#18113) + - Cherry-pick of #17995 and #17937 to 1.x branch (#18041) + - No tensor cores for fp32 interleaved attention, remove div by 8 restriction (#17994) (#18085) + - GPU gemms true fp16 (#17466) (#18023) + - Add support for boolean inputs to FusedOp (#16796) + +#### Bug fixes + - [BUG FIX] Always preserve batch dimension in batches returned from dataloader (#16233) + - Fix SliceChannel Type inference (#16748) + - change _generate_op_module_signature get_module_file open with encoding=utf-8,it fix some encode error in Chinese windows system. (#16738) + - Fix rtrue_divide grad (#16769) + - fix inv test flakiness using random matrices generated by SVD (#16782) + - [MXNET-1426] Fix the wrong result of sum, mean, argmin, argmax when inputs contain inf or nan (#16234) + - Fix (#16781) + - fix expand_dims fall back when input's ndim is 0 (#16837) + - [fix] missing input log higher order. 
(#15331) + - Fix IndentationError in setup.py (#16857) + - Fix a few np issues (#16849) + - Fix InferAttr/InferShapeAttr not calling inference for all nodes in a graph (#16836) + - fix for enable model parallelism for non-fp32 data (#16683) + - Fix NDArrayIter iteration bug when last_batch_handle='pad' (#16166) + - Fix crashing on Windows in ObjectPool ~ctor (#16941) + - Fix NDArrayIter cant pad when size is large (#17001) + - fix axis=-1 bug (#17016) + - Fix CUDNN detection for CMake build (#17019) + - Fix omp assert issue (#17039) + - mshadow: fix vector access (#17021) + - [BUGFIX] Fix race condition in kvstore.pushpull (#17007) + - [BUGFIX] Fix trainer param order (#17068) + - [BugFix] fix filter channel calculation in ModulatedDeformableConvV2 (#17070) + - Fix reshape interoperability test (#17155) + - fix norm sparse fallback (#17149) + - fix py27 quantization (#17153) + - fix int8 add ut (#17166) + - Fix and clean up Ubuntu build from source instructions (#17229) + - fix lstm layer with projection save params (#17266) + - Fix rendering of ubuntu_setup.md codeblocks (#17294) + - Fix #17267, add expected and got datatype for concat error msgs (#17271) + - [BUGFIX] fix model zoo parallel download (#17372) + - fix use int8, uint8, int32, int64 (#17188) + - [Fix] Add ctx to the original ndarray and revise the usage of context to ctx (#16819) + - Fix ndarray indexing bug (#16895) + - fix requantize flaky test (#16709) + - Initial checkin (#16856) + - Fix flakey test_ndarray.py:test_reduce (#17312) + - fix flaky test: boolean index and fix bugs (#17222) + - Fix IOT Devices section of Get Started page (#17326) + - add logic for no batch size while getting data arrays from executors (#17772) (#18122) + - Fix reverse shape inference in LayerNorm (#17683) + - fix full and full_like when input is boolean (#17668) + - Fix MBCC inference (#17660) + - Additional fix for vector access. 
(#17230) + - Cherrypick Fix nightly large_vector test caused by incorrect with_seed path (#18178) (#18220) + - [1.7] Pass args fix3 (#18237) + - fixing batch_norm and layer_norm for large tensors (#17805) (#18261) + - [1.7.x] Backport of LSTM and GRU fix (#17898) and RNN op (#17632) (#18316) + - [v1.7.x] backport #18500 - [Bug Fixed] Fix batch norm when grad_req is `add` (#18517) + - Fix the monitor_callback invalid issue during calibration with variable input shapes (#18632) (#18703) + +### Front end API + - Fix the problem in printing feature in c++ API examples : feature_extract (#15686) + - updating MXNet version to 1.6.0 in base.h for C APIs (#16905) + - [API] unified API for custom kvstores (#17010) + - fix parameter names in the estimator api (#17051) + - adding docs for 64bit C APIs of large tensor (#17309) + - Add API docs to INT64 APIs (#16617) + +#### Gluon + - [Quantization] Enhance gluon quantization API (#16695) + - [Gluon] Improve estimator usability and fix logging logic (#16810) + - Fix test_gluon.py:test_sync_batchnorm when number of GPUS > 4 (#16834) + - [Gluon] Update contrib.Estimator LoggingHandler to support logging per batch interval (#16922) + - Include eval_net the validation model in the gluon estimator api (#16957) + - Fix Gluon Estimator nightly test (#17042) + - [MXNET-1431] Multiple channel support in Gluon PReLU (#16262) + - Fix gluon.Trainer regression if no kvstore is used with sparse gradients (#17199) + - refactor gluon.utils.split_data() following np.array_split() (#17123) + - Add RandomApply in gluon's transforms (#17242) + - Partitioning Gluon HybridBlocks (#15969) + - Random rotation (#16794) + - bump up atol for gradient check (#16843) + - Extend estimator.evaluate() to support event handlers (#16971) + - [MXNET-1438] Adding SDML loss function (#17298) + +#### Symbol + - Add unoptimized symbol to executor for sharing (#16798) + - Enforces NDArray type in get_symbol (#16871) + - Fix #17164 symbolblock with BatchNorm inside 
during cast to fp16 (#17212) + - autograd video and image link fixes and removing symbol tutorials (#17227) + - Fix CosineEmbeddingLoss when symbol API is used (#17308) + - Fix Horovod build error due to missing exported symbols (#17348) + - Update symbol.py (#17408) + - update symbol to json (#16948) + +### Language Bindings +#### Python + - Python 2 compatibility fix in base.py + - adding stacktrace in Jenkinsfile_utils.groovy to inspect Python2 failure cause in CI (#17065) + - Fix image display in python autograd tutorial (#17243) + - Fix Python 3 compatibility in example/speech_recognition (#17354) + - Stop testing Python 2 on CI (#15990) + - Docs: Python tutorials doc fixes (#17435) + - pin python dependencies (#17556) + - Python 2 cleanup (#17583) + +#### C/C++ + - Simplify C++ flags (#17413) + +#### R + - fix R docs (#16733) + - [R package] Make R package compilation support opencv 4.0 (#16934) + - Support R-package with cmake build and fix installation instructions (#17228) + - Fix R-package/src/Makevars for OpenCV 4 (#17404) + - Fix typo in Install the MXNet Package for R (#17340) + +#### Clojure + +#### Julia + - [MXNET-1440] julia: porting `current_context` (#17142) + - julia: porting `context.empty_cache` (#17172) + - pin Markdown version to 3.1 in Julia doc build (#17549) + +#### Perl + - [Perl] - ndarray operator overloading enhancements (#16779) + - MXNET-1447 [Perl] Runtime features and large tensor support.
(#17610) + +#### Scala + - Fix scala publish & nvidia-docker cublas issue (#16968) + - Fix publishing scala gpu with cpu instance (#16987) + - swap wget to curl in Scala scripts (#17041) + - [Scala/Java] Remove unnecessary data slicing (#17544) + - quantile_scalar (#17572) + - Fix get_started scala gpu (#17434) + - Fix MBCC & scala publish pipeline (#17643) + - Bump up additional scala 1.x branch to 1.7.0 (#17765) + +### Performance improvements + - Build.py improvement (#16976) + - Improvements to config.cmake (#17639) + - [Done] BilinearResize2D optimized (#16292) + - Speed fused_op compilation by caching ptx and jit-compiled functions (#16783) + - Improve the speed of the pointwise fusion graph pass (#17114) + - broadcast_axis optimization (#17091) + - Optimize AddTakeGrad Tensor Sum (#17906) (#18045) + +### Example and tutorials + - Add CustomOp tutorial doc (#17241) + - Correct the grammar in 1-ndarray tutorial (#17513) + +### Website and documentation + - Website edits (#17050) + - [Website 2.0] Nightly Build for v1.x (#17956) + - [docs] Fix runtime feature detection documentation (#16746) + - Adding user guidelines for using MXNet built with Large Tensor Support (#16894) + - fix typo and doc (#16921) + - large tensor faq doc fix (#16953) + - [DOC] Add a few tips for running horovod (#17235) + - Update NOTICE to fix copyright years (#17330) + - [DOC] Fix tutorial link, and better error msg (#17057) + - doc fix for argmax & argmin (#17604) + +### CI/CD + - support mixed-precision true_divide (#16711) + - Try to fix CI (#16908) + - mixed precision for power (#16859) + - Fix desired precision for test_ndarray.py:test_reduce (#16992) + - [reproducibility] multi_sum_sq review, AtomicAdd removal (#17002) + - fix precision problem in linalg_solve, linalg_tensorinv, linalg_cholesky op test (#16981) + - grouping large array tests based on type and updating nightly CI function (#17305) + - [LICENSE] fix cpp predict license (#17377) + - [CI] Fix static build pipeline
(#17474) + - skipping tests that cannot fit in nightly CI machine corrected imports (#17450) + - Update Windows CI scripts to use syntax compatible with Win 2019 server powershell. (#17526) + - Fix Non-ASCII character in docstring (#17600) + - [CI] Follow redirects when downloading apache-maven-3.3.9-bin.tar.gz (#17608) + - [CI] Upgrade sphinx and autodocsumm (#17594) + - Reduce load on CI due to excessive log flood (#17629) + - Enable users to specify BLAS (#17648) + - [CI] Add AMI id to instance info on builds (#17649) + - [v1.7.x] Backport staggered CI builds (#17999 & #18119) (#18142) + - [v1.7.x] Backport #17177 to 1.7.x (Fix incorrect calculation results when the C locale is set to a locale that uses commas as the decimal separator) (#18147) + - Fix formatting and typos in CD README.md (#16703) + - [CD] dynamic libmxnet pipeline fix + small fixes (#16966) + - [CD] enable s3 publish for nightly builds in cd (#17112) + - [CD] fix CD pipeline (#17259) + - [CD] update publish path (#17453) + - fix CD and remove leftover from #15990 (#17551) + - Fix nightly build (#16773) + - Update pypi_publish.py to disable nightly build upload to Pypi (#17082) + - [v1.7.x] update jetson dockerfile to support CUDA 10.0 (#18339) + - Remove manually created symbolic link to ninja-build (#18437) (#18456) + - Increase staggered build timeout to 180 min (#18568) (#18585) + +### License + - Don't relicense FindCUDAToolkit.cmake (#17334) + - fix license and copyright issues (#17364) + - Update ps-lite LICENSE (#17351) + - remove unused file with license issue (#17371) + - Update LICENSE for fonts (#17365) + - license np_einsum file under bsd (#17367) + - Update Apache License for mshadow (#18109) (#18134) + - Julia: remove downloading of the non-ASF binary build (#18489) (#18502) + - Add missing license header for md files (#18541) + - [v1.7.x]License checker enhancement (#18478) + +### Miscellaneous changes + - Link fixes4 (#16764) + - Refactoring names for mxnet version of nnvm to
avoid conflicting with the original tvm/nnvm. (#15303) + - minor typo fix (#17008) + - Add micro averaging strategy to pearsonr metric (#16878) + - introduce gradient update handler to the base estimator (#16900) + - fix latency calculation and print issue (#17217) + - add inference benchmark script (#16978) + - change the wording and log level to be more in line with the general use (#16626) + - Updated logos. (#16719) + - Pinning rvm version to satisfy Jekyll build (#18016) + - Workaround gnu_tls handshake error on Ubuntu 14.04 Nvidia Docker (#18044) ## 1.6.0 diff --git a/docs/static_site/src/.htaccess b/docs/static_site/src/.htaccess index 679b613b7581..4444d1d6d2fe 100644 --- a/docs/static_site/src/.htaccess +++ b/docs/static_site/src/.htaccess @@ -19,15 +19,15 @@ RewriteOptions AllowNoSlash # Web fonts ExpiresByType application/font-woff "access plus 1 month" - + -# Set default website version to current stable (v1.6) +# Set default website version to current stable (v1.7) RewriteCond %{REQUEST_URI} !^/versions/ RewriteCond %{HTTP_REFERER} !mxnet.apache.org RewriteCond %{HTTP_REFERER} !mxnet.incubator.apache.org RewriteCond %{HTTP_REFERER} !mxnet.cdn.apache.org -RewriteRule ^(.*)$ /versions/1.6/$1 [r=307,L] +RewriteRule ^(.*)$ /versions/1.7/$1 [r=307,L] # Redirect Chinese visitors to Chinese CDN, temporary solution for slow site speed in China RewriteCond %{ENV:GEOIP_COUNTRY_CODE} ^CN$ diff --git a/docs/static_site/src/_includes/get_started/get_started.html b/docs/static_site/src/_includes/get_started/get_started.html index 8123bf29dcc0..865d0f4784a3 100644 --- a/docs/static_site/src/_includes/get_started/get_started.html +++ b/docs/static_site/src/_includes/get_started/get_started.html @@ -1,7 +1,7 @@