From f2e90a22c7190e86441a415b1dbdb47ad3603f30 Mon Sep 17 00:00:00 2001
From: ciyong <ciyong.chen@intel.com>
Date: Thu, 3 Sep 2020 01:15:46 +0800
Subject: [PATCH] Update NEWS, README and website for 1.7.0 (#19047)

* update NEWS.md and README.md

* update get_started and pip for 1.7.0

* update website for 1.7.0

* update download link for source package

* Use archive.apache.org for the previous releases and mirror for the latest release

* update description for oneDNN and add native binary installation command

* update news
---
 NEWS.md                                       | 815 ++++++++++++++----
 docs/static_site/src/.htaccess                |   6 +-
 .../_includes/get_started/get_started.html    |   7 +-
 .../get_started/linux/python/cpu/pip.md       |  60 +-
 .../get_started/linux/python/gpu/pip.md       |   9 +-
 .../src/_includes/get_started/pip_snippet.md  |   2 +-
 .../src/pages/get_started/download.md         |   3 +-
 7 files changed, 717 insertions(+), 185 deletions(-)

diff --git a/NEWS.md b/NEWS.md
index 5e301d1f65a3..0ba22152d4e0 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -18,87 +18,128 @@
 MXNet Change Log
 ================
 - [MXNet Change Log](#mxnet-change-log)
-  * [1.6.0](#160)
-    + [Deprecation of Python 2](#deprecation-of-python-2)
-    + [New Features](#new-features)
-      - [NumPy compatible interface and using TVM to generate operators](#numpy-compatible-interface-and-using-tvm-to-generate-operators)
-      - [Graph optimizations](#graph-optimizations)
+  - [1.7.0](#170)
+    - [New features](#new-features)
+      - [MXNet Extensions: custom operators, partitioning, and graph passes](#mxnet-extensions-custom-operators-partitioning-and-graph-passes)
+      - [OpPerf utility enabled in the binary distribution](#opperf-utility-enabled-in-the-binary-distribution)
+      - [MKL-DNN](#mkl-dnn)
+        - [MKL-DNN as the default CPU backend in binary distribution](#mkl-dnn-as-the-default-cpu-backend-in-binary-distribution)
+        - [Branding change to DNNL](#branding-change-to-dnnl)
+        - [Support bfloat16 datatype](#support-bfloat16-datatype)
       - [New operators](#new-operators)
-    + [Feature improvements](#feature-improvements)
-      - [Automatic Mixed Precision](#automatic-mixed-precision)
-      - [Gluon Fit API](#gluon-fit-api)
-      - [MKLDNN](#mkldnn)
+    - [Feature improvements](#feature-improvements)
+      - [Numpy compatible interface(experimental)](#numpy-compatible-interfaceexperimental)
       - [Large tensor support](#large-tensor-support)
+      - [MKL-DNN enhancement](#mkl-dnn-enhancement)
       - [TensorRT integration](#tensorrt-integration)
-      - [Higher order gradient support](#higher-order-gradient-support)
-      - [Operator improvements](#operator-improvements)
+      - [Quantization](#quantization)
       - [Profiler](#profiler)
-      - [ONNX import/export](#onnx-importexport)
+      - [ONNX](#onnx)
+      - [New models](#new-models)
+      - [Operator improvements](#operator-improvements)
       - [Bug fixes](#bug-fixes)
-    + [Front end API](#front-end-api)
+    - [Front end API](#front-end-api)
       - [Gluon](#gluon)
       - [Symbol](#symbol)
-    + [Language Bindings](#language-bindings)
+    - [Language Bindings](#language-bindings)
       - [Python](#python)
       - [C/C++](#cc)
+      - [R](#r)
       - [Clojure](#clojure)
       - [Julia](#julia)
       - [Perl](#perl)
       - [Scala](#scala)
-    + [Performance improvements](#performance-improvements)
-    + [Examples and tutorials](#examples-and-tutorials)
-    + [Website and documentation](#website-and-documentation)
-    + [CI/CD](#cicd)
-    + [Misc](#misc)
-  * [1.5.1](#151)
-    + [Bug-fixes](#bug-fixes-1)
-  * [1.5.0](#150)
-    + [New Features](#new-features-1)
+    - [Performance improvements](#performance-improvements)
+    - [Example and tutorials](#example-and-tutorials)
+    - [Website and documentation](#website-and-documentation)
+    - [CI/CD](#cicd)
+    - [License](#license)
+    - [Miscellaneous changes](#miscellaneous-changes)
+  - [1.6.0](#160)
+    - [Deprecation of Python 2](#deprecation-of-python-2)
+    - [New features](#new-features-1)
+      - [NumPy compatible interface and using TVM to generate operators](#numpy-compatible-interface-and-using-tvm-to-generate-operators)
+      - [Graph optimizations](#graph-optimizations)
+        - [Pointwise fusion for GPU](#pointwise-fusion-for-gpu)
+        - [Eliminate common subexpressions](#eliminate-common-subexpressions)
+        - [Default MKLDNN Subgraph fusion](#default-mkldnn-subgraph-fusion)
+      - [New operators](#new-operators-1)
+    - [Feature improvements](#feature-improvements-1)
+      - [Automatic Mixed Precision](#automatic-mixed-precision)
+      - [Gluon Fit API](#gluon-fit-api)
+      - [MKLDNN](#mkldnn)
+      - [Large tensor support](#large-tensor-support-1)
+      - [TensorRT integration](#tensorrt-integration-1)
+      - [Higher order gradient support](#higher-order-gradient-support)
+      - [Operator improvements](#operator-improvements-1)
+      - [Profiler](#profiler-1)
+      - [ONNX import/export](#onnx-importexport)
+      - [Runtime discovery of features](#runtime-discovery-of-features)
+      - [Bug fixes](#bug-fixes-1)
+    - [Front end API](#front-end-api-1)
+      - [Gluon](#gluon-1)
+      - [Symbol](#symbol-1)
+    - [Language Bindings](#language-bindings-1)
+      - [Python](#python-1)
+      - [C/C++](#cc-1)
+      - [Clojure](#clojure-1)
+      - [Julia](#julia-1)
+      - [Perl](#perl-1)
+      - [Scala](#scala-1)
+    - [Performance improvements](#performance-improvements-1)
+    - [Examples and tutorials](#examples-and-tutorials)
+    - [Website and documentation](#website-and-documentation-1)
+    - [CI/CD](#cicd-1)
+    - [Misc](#misc)
+  - [1.5.1](#151)
+    - [Bug-fixes](#bug-fixes-2)
+  - [1.5.0](#150)
+    - [New Features](#new-features-2)
       - [Automatic Mixed Precision(experimental)](#automatic-mixed-precisionexperimental)
       - [MKL-DNN Reduced precision inference and RNN API support](#mkl-dnn-reduced-precision-inference-and-rnn-api-support)
       - [Dynamic Shape(experimental)](#dynamic-shapeexperimental)
-      - [Large Tensor Support](#large-tensor-support-1)
+      - [Large Tensor Support](#large-tensor-support-2)
       - [Dependency Update](#dependency-update)
       - [Gluon Fit API(experimental)](#gluon-fit-apiexperimental)
-      - [New Operators](#new-operators-1)
-    + [Feature Improvements](#feature-improvements-1)
+      - [New Operators](#new-operators-2)
+    - [Feature Improvements](#feature-improvements-2)
       - [Operators](#operators)
       - [MKLDNN](#mkldnn-1)
-      - [ONNX](#onnx)
+      - [ONNX](#onnx-1)
       - [TensorRT](#tensorrt)
       - [FP16 Support](#fp16-support)
-      - [Deep Graph Library(DGL) support](#deep-graph-library-dgl--support)
+      - [Deep Graph Library(DGL) support](#deep-graph-librarydgl-support)
       - [Horovod Integration](#horovod-integration)
       - [Dynamic Shape](#dynamic-shape)
       - [Backend Engine](#backend-engine)
-      - [Large Tensor Support](#large-tensor-support-2)
-      - [Quantization](#quantization)
-      - [Profiler](#profiler-1)
+      - [Large Tensor Support](#large-tensor-support-3)
+      - [Quantization](#quantization-1)
+      - [Profiler](#profiler-2)
       - [CoreML](#coreml)
-    + [Front End API](#front-end-api-1)
-      - [Gluon](#gluon-1)
-      - [Python](#python-1)
-    + [Language Bindings](#language-bindings-1)
-      - [Scala](#scala-1)
+    - [Front End API](#front-end-api-2)
+      - [Gluon](#gluon-2)
+      - [Python](#python-2)
+    - [Language Bindings](#language-bindings-2)
+      - [Scala](#scala-2)
       - [Java](#java)
       - [C++](#c)
-      - [Clojure](#clojure-1)
-      - [Julia](#julia-1)
-      - [Perl:](#perl-1)
-      - [R](#r)
-    + [Performance Improvements](#performance-improvements-1)
-    + [Example and Tutorials](#example-and-tutorials)
-    + [Website](#website)
-    + [Documentation](#documentation)
-    + [Build and Test](#build-and-test)
-    + [Bug-fixes](#bug-fixes-2)
-    + [License](#license)
-    + [Depreciations](#depreciations)
-    + [Known Issues](#known-issues)
-  * [1.4.1](#141)
-    + [Bug-fixes](#bug-fixes-3)
-  * [1.4.0](#140)
-    + [New Features](#new-features-2)
+      - [Clojure](#clojure-2)
+      - [Julia](#julia-2)
+      - [Perl:](#perl-2)
+      - [R](#r-1)
+    - [Performance Improvements](#performance-improvements-2)
+    - [Example and Tutorials](#example-and-tutorials-1)
+    - [Website](#website)
+    - [Documentation](#documentation)
+    - [Build and Test](#build-and-test)
+    - [Bug-fixes](#bug-fixes-3)
+    - [License](#license-1)
+    - [Depreciations](#depreciations)
+    - [Known Issues](#known-issues)
+  - [1.4.1](#141)
+    - [Bug-fixes](#bug-fixes-4)
+  - [1.4.0](#140)
+    - [New Features](#new-features-3)
       - [Java Inference API](#java-inference-api)
       - [Julia API](#julia-api)
       - [Control Flow Operators (experimental)](#control-flow-operators-experimental)
@@ -106,32 +147,32 @@ MXNet Change Log
       - [Subgraph API (experimental)](#subgraph-api-experimental)
       - [JVM Memory Management](#jvm-memory-management)
       - [Topology-aware AllReduce (experimental)](#topology-aware-allreduce-experimental)
-      - [MKLDNN backend: Graph optimization and Quantization (experimental)](#mkldnn-backend--graph-optimization-and-quantization-experimental)
-        * [Graph Optimization](#graph-optimization)
-        * [Quantization](#quantization-1)
-    + [New Operators](#new-operators-2)
-    + [Feature improvements](#feature-improvements-2)
+      - [MKLDNN backend: Graph optimization and Quantization (experimental)](#mkldnn-backend-graph-optimization-and-quantization-experimental)
+        - [Graph Optimization](#graph-optimization)
+        - [Quantization](#quantization-2)
+    - [New Operators](#new-operators-3)
+    - [Feature improvements](#feature-improvements-3)
       - [Operator](#operator)
       - [Optimizer](#optimizer)
       - [Sparse](#sparse)
-      - [ONNX](#onnx-1)
+      - [ONNX](#onnx-2)
       - [MKLDNN](#mkldnn-2)
       - [Inference](#inference)
       - [Other](#other)
-    + [Frontend API updates](#frontend-api-updates)
-      - [Gluon](#gluon-2)
-      - [Symbol](#symbol-1)
-    + [Language API updates](#language-api-updates)
+    - [Frontend API updates](#frontend-api-updates)
+      - [Gluon](#gluon-3)
+      - [Symbol](#symbol-2)
+    - [Language API updates](#language-api-updates)
       - [Java](#java-1)
-      - [R](#r-1)
-      - [Scala](#scala-2)
-      - [Clojure](#clojure-2)
-      - [Perl](#perl-2)
-      - [Julia](#julia-2)
-    + [Performance benchmarks and improvements](#performance-benchmarks-and-improvements)
-    + [Bug fixes](#bug-fixes-4)
-    + [Licensing updates](#licensing-updates)
-    + [Improvements](#improvements)
+      - [R](#r-2)
+      - [Scala](#scala-3)
+      - [Clojure](#clojure-3)
+      - [Perl](#perl-3)
+      - [Julia](#julia-3)
+    - [Performance benchmarks and improvements](#performance-benchmarks-and-improvements)
+    - [Bug fixes](#bug-fixes-5)
+    - [Licensing updates](#licensing-updates)
+    - [Improvements](#improvements)
       - [Tutorial](#tutorial)
       - [Example](#example)
       - [Documentation](#documentation-1)
@@ -140,98 +181,544 @@ MXNet Change Log
       - [Installation](#installation)
       - [Build and CI](#build-and-ci)
       - [3rd party](#3rd-party)
-        * [TVM:](#tvm)
-        * [CUDNN:](#cudnn)
-        * [Horovod:](#horovod)
-    + [Deprications](#deprications)
-    + [Other](#other-1)
-    + [How to build MXNet](#how-to-build-mxnet)
-    + [List of submodules used by Apache MXNet (Incubating) and when they were updated last](#list-of-submodules-used-by-apache-mxnet--incubating--and-when-they-were-updated-last)
-  * [1.3.1](#131)
-    + [Bug fixes](#bug-fixes-5)
-    + [Documentation fixes](#documentation-fixes)
-    + [Other Improvements](#other-improvements)
-    + [Submodule updates](#submodule-updates)
-    + [Known issues](#known-issues)
-  * [1.3.0](#130)
-    + [New Features - Gluon RNN layers are now HybridBlocks](#new-features---gluon-rnn-layers-are-now-hybridblocks)
-    + [MKL-DNN improvements](#mkl-dnn-improvements)
-    + [New Features - Gluon Model Zoo Pre-trained Models](#new-features---gluon-model-zoo-pre-trained-models)
-    + [New Features - Clojure package (experimental)](#new-features---clojure-package-experimental)
-    + [New Features - Synchronized Cross-GPU Batch Norm (experimental)](#new-features---synchronized-cross-gpu-batch-norm-experimental)
-    + [New Features - Sparse Tensor Support for Gluon (experimental)](#new-features---sparse-tensor-support-for-gluon-experimental)
-    + [New Features - Control flow operators (experimental)](#new-features---control-flow-operators-experimental)
-    + [New Features - Scala API Improvements (experimental)](#new-features---scala-api-improvements-experimental)
-    + [New Features - Rounding GPU Memory Pool for dynamic networks with variable-length inputs and outputs (experimental)](#new-features---rounding-gpu-memory-pool-for-dynamic-networks-with-variable-length-inputs-and-outputs-experimental)
-    + [New Features - Topology-aware AllReduce (experimental)](#new-features---topology-aware-allreduce-experimental)
-    + [New Features - Export MXNet models to ONNX format (experimental)](#new-features---export-mxnet-models-to-onnx-format-experimental)
-    + [New Features - TensorRT Runtime Integration (experimental)](#new-features---tensorrt-runtime-integration-experimental)
-    + [New Examples - Scala](#new-examples---scala)
-    + [Maintenance - Flaky Tests improvement effort](#maintenance---flaky-tests-improvement-effort)
-    + [Maintenance - MXNet Model Backwards Compatibility Checker](#maintenance---mxnet-model-backwards-compatibility-checker)
-    + [Maintenance - Integrated testing for "the Straight Dope"](#maintenance---integrated-testing-for-the-straight-dope)
-    + [Bug-fixes](#bug-fixes-6)
-    + [Performance Improvements](#performance-improvements-2)
-    + [API Changes](#api-changes)
-    + [Other features](#other-features)
-    + [Usability Improvements](#usability-improvements)
-  * [1.2.0](#120)
-    + [New Features - Added Scala Inference APIs](#new-features---added-scala-inference-apis)
-    + [New Features - Added a Module to Import ONNX models into MXNet](#new-features---added-a-module-to-import-onnx-models-into-mxnet)
-    + [New Features - Added Support for Model Quantization with Calibration](#new-features---added-support-for-model-quantization-with-calibration)
-    + [New Features - MKL-DNN Integration](#new-features---mkl-dnn-integration)
-    + [New Features - Added Exception Handling Support for Operators](#new-features---added-exception-handling-support-for-operators)
-    + [New Features - Enhanced FP16 support](#new-features---enhanced-fp16-support)
-    + [New Features - Added Profiling Enhancements](#new-features---added-profiling-enhancements)
-    + [Breaking Changes](#breaking-changes)
-    + [Bug Fixes](#bug-fixes-7)
-    + [Performance Improvements](#performance-improvements-3)
-    + [API Changes](#api-changes-1)
-    + [Sparse Support](#sparse-support)
-    + [Deprecations](#deprecations)
-    + [Other Features](#other-features)
-    + [Usability Improvements](#usability-improvements-1)
-    + [Known Issues](#known-issues-1)
-  * [1.1.0](#110)
-    + [Usability Improvements](#usability-improvements-2)
-    + [Bug-fixes](#bug-fixes-8)
-    + [New Features](#new-features-3)
-    + [API Changes](#api-changes-2)
-    + [Deprecations](#deprecations-1)
-    + [Performance Improvements](#performance-improvements-4)
-    + [Known Issues](#known-issues-2)
-  * [1.0.0](#100)
-    + [Performance](#performance)
-    + [New Features - Gradient Compression [Experimental]](#new-features---gradient-compression-experimental)
-    + [New Features - Support of NVIDIA Collective Communication Library (NCCL) [Experimental]](#new-features---support-of-nvidia-collective-communication-library--nccl--experimental)
-    + [New Features - Advanced Indexing [General Availability]](#new-features---advanced-indexing-general-availability)
-    + [New Features - Gluon [General Availability]](#new-features---gluon-general-availability)
-    + [New Features - ARM / Raspberry Pi support [Experimental]](#new-features---arm---raspberry-pi-support-experimental)
-    + [New Features - NVIDIA Jetson support [Experimental]](#new-features---nvidia-jetson-support-experimental)
-    + [New Features - Sparse Tensor Support [General Availability]](#new-features---sparse-tensor-support-general-availability)
-    + [Bug-fixes](#bug-fixes-9)
-    + [Doc Updates](#doc-updates)
-  * [0.12.1](#0121)
-    + [Bug-fixes](#bug-fixes-10)
-  * [0.12.0](#0120)
-    + [Performance](#performance-1)
-    + [New Features - Gluon](#new-features---gluon)
-    + [New Features - Autograd](#new-features---autograd)
-    + [New Features - Sparse Tensor Support](#new-features---sparse-tensor-support)
-    + [Other New Features](#other-new-features)
-    + [API Changes](#api-changes-3)
-    + [Bug-fixes](#bug-fixes-11)
-  * [0.11.0](#0110)
-    + [Major Features](#major-features)
-    + [API Changes](#api-changes-4)
-    + [Performance Improvements](#performance-improvements-5)
-    + [Bugfixes](#bugfixes)
-    + [Refactors](#refactors)
-  * [0.10.0](#0100)
-  * [0.9.3](#093)
-  * [v0.8](#v08)
-  * [v0.7](#v07)
-  * [v0.5 (initial release)](#v05-initial-release)
+        - [TVM:](#tvm)
+        - [CUDNN:](#cudnn)
+        - [Horovod:](#horovod)
+    - [Deprications](#deprications)
+    - [Other](#other-1)
+    - [How to build MXNet](#how-to-build-mxnet)
+    - [List of submodules used by Apache MXNet (Incubating) and when they were updated last](#list-of-submodules-used-by-apache-mxnet-incubating-and-when-they-were-updated-last)
+  - [1.3.1](#131)
+    - [Bug fixes](#bug-fixes-6)
+    - [Documentation fixes](#documentation-fixes)
+    - [Other Improvements](#other-improvements)
+    - [Submodule updates](#submodule-updates)
+    - [Known issues](#known-issues-1)
+  - [1.3.0](#130)
+    - [New Features - Gluon RNN layers are now HybridBlocks](#new-features---gluon-rnn-layers-are-now-hybridblocks)
+    - [MKL-DNN improvements](#mkl-dnn-improvements)
+    - [New Features - Gluon Model Zoo Pre-trained Models](#new-features---gluon-model-zoo-pre-trained-models)
+    - [New Features - Clojure package (experimental)](#new-features---clojure-package-experimental)
+    - [New Features - Synchronized Cross-GPU Batch Norm (experimental)](#new-features---synchronized-cross-gpu-batch-norm-experimental)
+    - [New Features - Sparse Tensor Support for Gluon (experimental)](#new-features---sparse-tensor-support-for-gluon-experimental)
+    - [New Features - Control flow operators (experimental)](#new-features---control-flow-operators-experimental)
+    - [New Features - Scala API Improvements (experimental)](#new-features---scala-api-improvements-experimental)
+    - [New Features - Rounding GPU Memory Pool for dynamic networks with variable-length inputs and outputs (experimental)](#new-features---rounding-gpu-memory-pool-for-dynamic-networks-with-variable-length-inputs-and-outputs-experimental)
+    - [New Features - Topology-aware AllReduce (experimental)](#new-features---topology-aware-allreduce-experimental)
+    - [New Features - Export MXNet models to ONNX format (experimental)](#new-features---export-mxnet-models-to-onnx-format-experimental)
+    - [New Features - TensorRT Runtime Integration (experimental)](#new-features---tensorrt-runtime-integration-experimental)
+    - [New Examples - Scala](#new-examples---scala)
+    - [Maintenance - Flaky Tests improvement effort](#maintenance---flaky-tests-improvement-effort)
+    - [Maintenance - MXNet Model Backwards Compatibility Checker](#maintenance---mxnet-model-backwards-compatibility-checker)
+    - [Maintenance - Integrated testing for "the Straight Dope"](#maintenance---integrated-testing-for-%22the-straight-dope%22)
+    - [Bug-fixes](#bug-fixes-7)
+    - [Performance Improvements](#performance-improvements-3)
+    - [API Changes](#api-changes)
+    - [Other features](#other-features)
+    - [Usability Improvements](#usability-improvements)
+  - [1.2.0](#120)
+    - [New Features - Added Scala Inference APIs](#new-features---added-scala-inference-apis)
+    - [New Features - Added a Module to Import ONNX models into MXNet](#new-features---added-a-module-to-import-onnx-models-into-mxnet)
+    - [New Features - Added Support for Model Quantization with Calibration](#new-features---added-support-for-model-quantization-with-calibration)
+    - [New Features - MKL-DNN Integration](#new-features---mkl-dnn-integration)
+    - [New Features - Added Exception Handling Support for Operators](#new-features---added-exception-handling-support-for-operators)
+    - [New Features - Enhanced FP16 support](#new-features---enhanced-fp16-support)
+    - [New Features - Added Profiling Enhancements](#new-features---added-profiling-enhancements)
+    - [Breaking Changes](#breaking-changes)
+    - [Bug Fixes](#bug-fixes-8)
+    - [Performance Improvements](#performance-improvements-4)
+    - [API Changes](#api-changes-1)
+    - [Sparse Support](#sparse-support)
+    - [Deprecations](#deprecations)
+    - [Other Features](#other-features-1)
+    - [Usability Improvements](#usability-improvements-1)
+    - [Known Issues](#known-issues-2)
+  - [1.1.0](#110)
+    - [Usability Improvements](#usability-improvements-2)
+    - [Bug-fixes](#bug-fixes-9)
+    - [New Features](#new-features-4)
+    - [API Changes](#api-changes-2)
+    - [Deprecations](#deprecations-1)
+    - [Performance Improvements](#performance-improvements-5)
+    - [Known Issues](#known-issues-3)
+  - [1.0.0](#100)
+    - [Performance](#performance)
+    - [New Features - Gradient Compression [Experimental]](#new-features---gradient-compression-experimental)
+    - [New Features - Support of NVIDIA Collective Communication Library (NCCL) [Experimental]](#new-features---support-of-nvidia-collective-communication-library-nccl-experimental)
+    - [New Features - Advanced Indexing [General Availability]](#new-features---advanced-indexing-general-availability)
+    - [New Features - Gluon [General Availability]](#new-features---gluon-general-availability)
+    - [New Features - ARM / Raspberry Pi support [Experimental]](#new-features---arm--raspberry-pi-support-experimental)
+    - [New Features - NVIDIA Jetson support [Experimental]](#new-features---nvidia-jetson-support-experimental)
+    - [New Features - Sparse Tensor Support [General Availability]](#new-features---sparse-tensor-support-general-availability)
+    - [Bug-fixes](#bug-fixes-10)
+    - [Doc Updates](#doc-updates)
+  - [0.12.1](#0121)
+    - [Bug-fixes](#bug-fixes-11)
+  - [0.12.0](#0120)
+    - [Performance](#performance-1)
+    - [New Features - Gluon](#new-features---gluon)
+    - [New Features - Autograd](#new-features---autograd)
+    - [New Features - Sparse Tensor Support](#new-features---sparse-tensor-support)
+    - [Other New Features](#other-new-features)
+    - [API Changes](#api-changes-3)
+    - [Bug-fixes](#bug-fixes-12)
+  - [0.11.0](#0110)
+    - [Major Features](#major-features)
+    - [API Changes](#api-changes-4)
+    - [Performance Improvements](#performance-improvements-6)
+    - [Bugfixes](#bugfixes)
+    - [Refactors](#refactors)
+  - [0.10.0](#0100)
+  - [0.9.3](#093)
+  - [v0.8](#v08)
+  - [v0.7](#v07)
+  - [v0.5 (initial release)](#v05-initial-release)
+
+## 1.7.0
+
+### New features
+#### MXNet Extensions: custom operators, partitioning, and graph passes
+
+Adds support for extending MXNet with custom operators, partitioning strategies, and graph passes. These are implemented in a library that can be compiled separately from the MXNet codebase and dynamically loaded at runtime into any prebuilt installation of MXNet.
+
+ - fix for number of inputs/outputs for backward custom ops (#17069)
+ - Enhancements for custom subgraph op (#17194)
+ - Disable flaky test_custom_op_fork (#17481)
+ - fix custom op makefile (#17516)
+ - Update CustomOp doc with changes for GPU support (#17486)
+ - [WIP] MXNet Extensions enhancements (#17885) (#18128)
+ - Dynamic subgraph property (#17034)
+ - Dynamic subgraph property doc (#17585)
+ - [1.7] Backport MXNet Extension PRs (#17623, #17569, #17762) #18063 (#18069)
+
+#### OpPerf utility enabled in the binary distribution
+ - [OpPerf] Add Neural network loss ops (#17482)
+ - [OpPerf] Fixes the issue when you pass NDArray to run_perf_test (#17508)
+ - [OpPerf] Fix markdown for native profile and add profile param in function desc (#17494)
+ - [OpPerf] Add Indexing ops (#16253)
+ - [OpPerf] Implement remaining random sampling ops (#17502)
+ - [OpPerf] Implement remaining GEMM ops (#17501)
+ - [OpPerf] Implement all linalg ops (#17528)
+ - [OpPerf] Fixed native output ordering, added warmup & runs command line args (#17571)
+ - [OpPerf] Add norm, cast ops, remaining optimizer ops (#17542)
+ - [OpPerf] Fixed Python profiler bug (#17642)
+
+#### MKL-DNN
+##### MKL-DNN as the default CPU backend in binary distribution
+##### Branding change to DNNL
+ - Upgrade MKL-DNN dependency to v1.1 (#16823)
+
+##### Support bfloat16 datatype
+ - Add bfloat16 floating-point format support based on AMP (#17265)
+
+#### New operators
+ - [New Op] Add deformable conv v2 (#16341)
+ - Add MXNet Ops for fast multihead attention (#16408)
+ - Support boolean elemwise/broadcast binary add, multiply and true_divide (#16728)
+ - add gammaln, erf, erfinv (#16811)
+ - add aligned roi introduced in Detectron2 (#16619)
+ - Implement atleast_1d/2d/3d (#17099)
+ - Interleaved MHA for CPU path (#17138)
+ - Lamb optimizer update (#16715)
+ - Quantized Embedding (#16691)
+ - Add gelu fuse ops (#18082) (#18092)
+
+### Feature improvements
+#### Numpy compatible interface(experimental)
+ - [NumPy] NumPy support for linalg.inv (#16730)
+ - add numpy op nan_to_num (#16717)
+ - [Numpy] Add sampling method for bernoulli (#16638)
+ - Fix numpy-compatible mean output type for integer inputs (#16792)
+ - [Numpy] Fix collect_params().zero_grad() in gluon numpy interface (#16716)
+ - [Numpy][Operator] 'where' Implementation in MXNet (#16829)
+ - [Numpy] Random.normal() with backward (#16330)
+ - Add OP diag [numpy] (#16786)
+ - Mixed precision binary op backward (use in) for numpy (#16791)
+ - add numpy op diagflat [numpy] (#16813)
+ - add op bitwise_or [numpy] (#16801)
+ - [Numpy] Implementation npx.{sample}_n (#16876)
+ - [Numpy] Add NumPy support for np.linalg.det and np.linalg.slogdet (#16800)
+ - Op Unravel_index PR [Numpy] (#16862)
+ - [Numpy] Fix imperative basic indexing in numpy (#16902)
+ - [Numpy] Basic indexing in symbolic interface of DeepNumpy (#16621)
+ - [Numpy] add op full_like, c++ impl, fix zeros_like, ones_like type inference (#16804)
+ - [Numpy] Implement numpy operator 'average' (#16720)
+ - [Bugfix] [Numpy] Add `kAddTo` and `kNullOp` to Transpose (#16979)
+ - set rtol = 1e-2 and atol = 1e-4 when dtype == np.float32 in test_numpy_op.py:test_np_linalg_solve (#17025)
+ - Op_Diagonal [Numpy] (#16989)
+ - numpy bincount (#16965)
+ - [numpy] add op bitwise_not (#16947)
+ - [Numpy] Modify np.random.shuffle to enable inplace by default (#17133)
+ - [numpy] fix argsort typo (#17150)
+ - [numpy] add op round (#17175)
+ - [numpy]Add op delete (#17023)
+ - [numpy] add op flipud, fliplr (#17192)
+ - [CI] Re-enable testing with numpy 1.18 (#17200)
+ - [Numpy] Add broadcast_to scalar case (#17233)
+ - [Numpy] Random.gamma() implemented (#16152)
+ - [Numpy] add row_stack (=vstack) (#17171)
+ - [Numpy] Add infra for performing constraint check (#17272)
+ - porting numpy-compatible hstack to master and add dstack for interoperability (#17030)
+ - adding asnumpy() to output of gather(implicitly called) to fix gather test in large vector and tensor tests (#17290)
+ - [numpy] add op random.exponential (#17280)
+ - [NumPy] Add NumPy support for norm (#17014)
+ - [numpy] add op random.lognormal (#17415)
+ - Add numpy random weibull operator (#17505)
+ - [numpy] Add np.random.pareto and np.random.power (#17517)
+ - [Numpy] Add sort op (#17393)
+ - [numpy]implement exponential backward (#17401)
+ - [Numpy] Where operator scalar version (#17249)
+ - [numpy] add op matmul (#16990)
+ - [numpy]add op random.logistic, random.gumbel (#17302)
+ - [numpy] add op insert (#16865)
+ - [numpy] add op random.rayleigh (#17541)
+ - [numpy] add fallback ops (#17609)
+ - [numpy] add op pad (#17328)
+ - [numpy] add op fabs, sometrue, round_ (#17619)
+ - Add arange_like to npx (#16883)
+ - try to move shape_array to npx (#16897)
+ - support np.argsort (#16949)
+ - np.broadcast_to extension (#17358)
+ - support bitwise_and (#16861)
+ - fix np.argmax/argmin output data type (#17476)
+ - add op random.beta (#17390)
+ - add op isnan isinf (#17535)
+ - array_split pr (#17032)
+ - Mixed data type binary ops (#16699)
+ - randn implemented (#17141)
+ - refactor and reduce float types for some functions, also add bitwise_xor (#16827)
+ - any/all (#17087)
+ - amax (#17176)
+ - fix format (#17100)
+ - add op empty_like, add nan_to_num to dispatch (#17169)
+ - handle array_like fill_value for np.full; add unit test coverage (#17245)
+ - add np.amin (#17538)
+ - add npx.gather_nd (#17477)
+ - add np.random.chisquare (#17524)
+ - add polyval (#17416)
+ - add isposinf isneginf isfinite (#17563)
+ - Support broadcast assign for `npi_boolean_mask_assign_tensor` (#17131)
+ - Implement Weibull backward (#17590)
+ - support np.dsplit, fix some error msgs and corner cases for hsplit and vsplit, add interoperability tests for h/v/dsplit (#17478)
+ - add np.product (#17489)
+ - Implement np.random.pareto backward (#17607)
+ - add np.ediff1d (#17624)
+ - more support for boolean indexing and assign (#18352)
+ - Fix einsum gradient (#18482)
+ - [v1.7.x] Backport PRs of numpy features (#18653)
+ - [v1.7.x] backport mixed type binary ops to v1.7.x (#18649)
+ - revise activations (#18700)
+
+#### Large tensor support
+ - [Large Tensor] Add support to Random Sample & Pdf ops (#17445)
+ - [Large Tensor] Add LT support for NN optimizers and 1 activation function (#17444)
+ - [Large Tensor] Fixed SoftmaxActivation op (#17634)
+ - [Large Tensor] Fixed col2im op (#17622)
+ - [Large Tensor] Fixed Spatial Transformer op (#17617)
+ - [Large Tensor] Fix ravel_multi_index op (#17644)
+ - Sparse int64 Large tensor support (#16898)
+ - Re-Enabling Large Tensor Nightly on GPU (#16164)
+ - enabling build stage gpu_int64 to enable large tensor nightly runs (#17546)
+ - [Large Tensor] Fixed Embedding op (#17599)
+
+#### MKL-DNN enhancement
+ - MKLDNN FC : Add error info when mkldnn fc bias dimension is wrong (#16692)
+ - [MKLDNN] support mkldnn gelu (#16710)
+ - [MKLDNN] Fix int8 convolution/fc bias overflow (#16734)
+ - [MKLDNN] use dim_t instead of int in slice/transpose operators (#16737)
+ - Mkldnn fullyConnect bwd bug fix (#16890)
+ - Revert Mkldnn fullyConnect bwd bug fix (#16890) (#16907)
+ - [MKLDNN] Use MKLDNNRun (#16772)
+ - [MKLDNN] mkldnn RNN operator enhancement (#17075)
+ - [MKLDNN] enable MaxPooling with full pooling convention (#16860)
+ - update mkldnn to v1.1.2 (#17165)
+ - improve mkldnn doc (#17198)
+ - [MKLDNN] Fix _copyto (#17173)
+ - [MKLDNN] Support channel wise quantization for FullyConnected (#17187)
+ - fixed seed for mkldnn test (#17386)
+ - add mkldnn softmax backward (#17170)
+ - cmake: copy dnnl headers to include/mkldnn (#17647)
+ - [mkldnn]Mkldnn bn opt backport from master to 1.7x (#18009)
+ - [v1.x] Update 3rdparty/mkldnn remote URL and pin to v1.3 (#17972) (#18033)
+ - [v1.x] backport #17900 [MKLDNN] support using any format in pooling backward (#18067)
+ - Static link MKL-DNN library (#16731)
+ - Add large tensor nightly tests for MKL-DNN operators (#16184)
+ - [MKL-DNN] Enable and Optimization for s8 eltwise_add (#16931)
+ - [MKL-DNN] Enhance Quantization Method (#17161)
+ - Static Build and CD for mxnet-cu102/mxnet-cu102mkl (#17074)
+ - MKL-DNN RNN backward path enhancement (#17183)
+ - cmake: check USE_OPENMP and pass proper MKL-DNN build flags (#17356)
+ - update mkl to 2020.0 (#17355)
+ - Enable MKL-DNN by default in pip packages (#16899)
+ - Enable MKL-DNN FullyConnected backward (#17318)
+ - Softmax primitive cache and in-place computation (#17152)
+ - boolean_mask_assign with start_axis (#16886)
+ - use identity_with_cast (#16913)
+ - change error tolerance for bf16 bn (#18110)
+ - [v1.x] Backport #17689 and #17884 to v1.x branch (#18064)
+ - refactor codes and add an option to skip/check weight's version to reduce overhead (#17707) (#18039)
+ - [v1.x] Backport #17702 and #17872 to v1.x branch (#18038)
+
+#### TensorRT integration
+ - Update TensorRT tutorial to build-from-source. (#14860)
+ - Minor fix, use RAII for TensorRT builder and network object (#17189)
+
+#### Quantization
+ - Add silent option to quantization script (#17094)
+
+#### Profiler
+ - Implemented final two binary ops, added default params for functionality (#17407)
+ - Implement remaining nn_activation ops in opperf (#17475)
+ - Implement all miscellaneous ops (#17511)
+ - Implement remaining nn_basic ops in opperf (#17456)
+
+#### ONNX
+ - Fix memory leak reported by ASAN in NNVM to ONNX conversion (#15516)
+ - ONNX export: Gather (#15995)
+ - ONNX export: Slice op - Handle None value for ends (#14942)
+
+#### New models
+ - [Model] Implement Neural Collaborative Filtering with MXNet (#16689)
+ - Further optimization for NCF model (#17148)
+ - HMM Model (#17120)
+
+#### Operator improvements
+ - Faster GPU NMS operator (#16542)
+ - [MXNET-1421] Added (CuDNN)BatchNorm operator to the list of mirrored operators (#16022)
+ - dynamic custom operator support (#15921)
+ - Multi Precision Lamb Update operator (#16885)
+ - Add im2col and col2im operator (#16502)
+ - Quantized Elemwise Mul Operator (#17147)
+ - Enhancements for MXTensor for custom operators (#17204)
+ - Enabling large tensor support for binary broadcast operators (#16755)
+ - Fix operators lying about their number of inputs (#17049)
+ - [WIP] Fallback mechanism for mx.np operators (#16923)
+ - Dynamic custom operator GPU support (#17270)
+ - Fix flaky - test_operator_gpu.test_np_insert (#17620)
+ - MXNet FFI for Operator Imperative Invocation (#17510)
+ - [MXNET-978] Higher Order Gradient Support `log1p`, `expm1`, `square`. (#15416)
+ - [MXNET-978] Higher Order Gradient Support `arcsin`, `arccos`. (#15515)
+ - [MXNET-978] Higher Order Gradient Support `rsqrt`, `rcbrt`. (#15476)
+ - gather_nd: check bound and wrap negative indices (#17208)
+ - Remove dilation restriction for conv3d (#17491)
+ - Fix storage type infer of softmax backward (#17576)
+ - Fix and optimize handling of vectorized memory accesses (#17767) (#18113)
+ - Cherry-pick of #17995 and #17937 to 1.x branch (#18041)
+ - No tensor cores for fp32 interleaved attention, remove div by 8 restriction (#17994) (#18085)
+ - GPU gemms true fp16 (#17466) (#18023)
+ - Add support for boolean inputs to FusedOp (#16796)
+
+#### Bug fixes
+ - [BUG FIX] Always preserve batch dimension in batches returned from dataloader (#16233)
+ - Fix SliceChannel Type inference (#16748)
+ - Open files with encoding=utf-8 in _generate_op_module_signature and get_module_file to fix encoding errors on Chinese Windows systems (#16738)
+ - Fix rtrue_divide grad (#16769)
+ - fix inv test flakiness using random matrices generated by SVD (#16782)
+ - [MXNET-1426] Fix the wrong result of sum, mean, argmin, argmax when inputs contain inf or nan (#16234)
+ - Fix (#16781)
+ - fix expand_dims fall back when input's ndim is 0 (#16837)
+ - [fix] missing input log higher order. (#15331)
+ - Fix IndentationError in setup.py (#16857)
+ - Fix a few np issues (#16849)
+ - Fix InferAttr/InferShapeAttr not calling inference for all nodes in a graph (#16836)
+ - fix for enable model parallelism for non-fp32 data (#16683)
+ - Fix NDArrayIter iteration bug when last_batch_handle='pad' (#16166)
+ - Fix crashing on Windows in ObjectPool ~ctor (#16941)
+ - Fix NDArrayIter can't pad when size is large (#17001)
+ - fix axis=-1 bug (#17016)
+ - Fix CUDNN detection for CMake build (#17019)
+ - Fix omp assert issue (#17039)
+ - mshadow: fix vector access (#17021)
+ - [BUGFIX] Fix race condition in kvstore.pushpull (#17007)
+ - [BUGFIX] Fix trainer param order (#17068)
+ - [BugFix] fix filter channel calculation in ModulatedDeformableConvV2 (#17070)
+ - Fix reshape interoperability test (#17155)
+ - fix norm sparse fallback (#17149)
+ - fix py27 quantization (#17153)
+ - fix int8 add ut (#17166)
+ - Fix and clean up Ubuntu build from source instructions (#17229)
+ - fix lstm layer with projection save params (#17266)
+ - Fix rendering of ubuntu_setup.md codeblocks (#17294)
+ - Fix #17267, add expected and got datatype for concat error msgs (#17271)
+ - [BUGFIX] fix model zoo parallel download (#17372)
+ - fix use int8, uint8, int32, int64 (#17188)
+ - [Fix] Add ctx to the original ndarray and revise the usage of context to ctx (#16819)
+ - Fix ndarray indexing bug (#16895)
+ - fix requantize flaky test (#16709)
+ - Initial checkin (#16856)
+ - Fix flaky test_ndarray.py:test_reduce (#17312)
+ - fix flaky test: boolean index and fix bugs (#17222)
+ - Fix IOT Devices section of Get Started page (#17326)
+ - add logic for no batch size while getting data arrays from executors (#17772) (#18122)
+ - Fix reverse shape inference in LayerNorm (#17683)
+ - fix full and full_like when input is boolean (#17668)
+ - Fix MBCC inference (#17660)
+ - Additional fix for vector access. (#17230)
+ - Cherrypick Fix nightly large_vector test caused by incorrect with_seed path (#18178) (#18220)
+ - [1.7] Pass args fix3 (#18237)
+ - fixing batch_norm and layer_norm for large tensors (#17805) (#18261)
+ - [1.7.x] Backport of LSTM and GRU fix (#17898) and RNN op (#17632) (#18316)
+ - [v1.7.x] backport #18500 - [Bug Fixed] Fix batch norm when grad_req is `add` (#18517)
+ - Fix the monitor_callback invalid issue during calibration with variable input shapes (#18632) (#18703)
+
+### Front end API
+ - Fix the problem in printing feature in c++ API examples : feature_extract (#15686)
+ - updating MXNet version to 1.6.0 in base.h for C APIs (#16905)
+ - [API] unified API for custom kvstores (#17010)
+ - fix parameter names in the estimator api (#17051)
+ - adding docs for 64bit C APIs of large tensor (#17309)
+ - Add API docs to INT64 APIs (#16617)
+
+#### Gluon
+ - [Quantization] Enhance gluon quantization API (#16695)
+ - [Gluon] Improve estimator usability and fix logging logic (#16810)
+ - Fix test_gluon.py:test_sync_batchnorm when number of GPUS > 4 (#16834)
+ - [Gluon] Update contrib.Estimator LoggingHandler to support logging per batch interval (#16922)
+ - Include eval_net the validation model in the gluon estimator api (#16957)
+ - Fix Gluon Estimator nightly test (#17042)
+ - [MXNET-1431] Multiple channel support in Gluon PReLU (#16262)
+ - Fix gluon.Trainer regression if no kvstore is used with sparse gradients (#17199)
+ - refactor gluon.utils.split_data() following np.array_split() (#17123)
+ - Add RandomApply in gluon's transforms (#17242)
+ - Partitioning Gluon HybridBlocks (#15969)
+ - Random rotation (#16794)
+ - bump up atol for gradient check (#16843)
+ - Extend estimator.evaluate() to support event handlers (#16971)
+ - [MXNET-1438] Adding SDML loss function (#17298)
+
+#### Symbol
+ - Add unoptimized symbol to executor for sharing (#16798)
+ - Enforces NDArray type in get_symbol (#16871)
+ - Fix #17164 symbolblock with BatchNorm inside during cast to fp16 (#17212)
+ - autograd video and image link fixes and removing symbol tutorials (#17227)
+ - Fix CosineEmbeddingLoss when symbol API is used (#17308)
+ - Fix Horovod build error due to missing exported symbols (#17348)
+ - Update symbol.py (#17408)
+ - update symbol to json (#16948)
+
+### Language Bindings
+#### Python
+ - Python 2 compatibility fix in base.py
+ - adding stacktrace in Jenkinsfile_utils.groovy to inspect Python2 failure cause in CI (#17065)
+ - Fix image display in python autograd tutorial (#17243)
+ - Fix Python 3 compatibility in example/speech_recognition (#17354)
+ - Stop testing Python 2 on CI (#15990)
+ - Docs: Python tutorials doc fixes (#17435)
+ - pin python dependencies (#17556)
+ - Python 2 cleanup (#17583)
+
+#### C/C++
+ - Simplify C++ flags (#17413)
+
+#### R
+ - fix R docs (#16733)
+ - [R package] Make R package compilation support opencv 4.0 (#16934)
+ - Support R-package with cmake build and fix installation instructions (#17228)
+ - Fix R-package/src/Makevars for OpenCV 4 (#17404)
+ - Fix typo in Install the MXNet Package for R (#17340)
+
+#### Clojure
+
+#### Julia
+ - [MXNET-1440] julia: porting `current_context` (#17142)
+ - julia: porting `context.empty_cache` (#17172)
+ - pin Markdown version to 3.1 in Julia doc build (#17549)
+
+#### Perl
+ - [Perl] - ndarray operator overloading enhancements (#16779)
+ - MXNET-1447 [Perl] Runtime features and large tensor support. (#17610)
+
+#### Scala
+ - Fix scala publish & nvidia-docker cublas issue (#16968)
+ - Fix publishing scala gpu with cpu instance (#16987)
+ - swap wget to curl in Scala scripts (#17041)
+ - [Scala/Java] Remove unnecessary data slicing (#17544)
+ - quantile_scalar (#17572)
+ - Fix get_started scala gpu (#17434)
+ - Fix MBCC & scala publish pipeline (#17643)
+ - Bump up additional scala 1.x branch to 1.7.0 (#17765)
+
+### Performance improvements
+ - Build.py improvement (#16976)
+ - Improvements to config.cmake (#17639)
+ - [Done] BilinearResize2D optimized (#16292)
+ - Speed fused_op compilation by caching ptx and jit-compiled functions (#16783)
+ - Improve the speed of the pointwise fusion graph pass (#17114)
+ - broadcast_axis optimization (#17091)
+ - Optimize AddTakeGrad Tensor Sum (#17906) (#18045)
+
+### Example and tutorials
+ - Add CustomOp tutorial doc (#17241)
+ - Correct the grammar in 1-ndarray tutorial (#17513)
+
+### Website and documentation
+ - Website edits (#17050)
+ - [Website 2.0] Nightly Build for v1.x (#17956)
+ - [docs] Fix runtime feature detection documentation (#16746)
+ - Adding user guidelines for using MXNet built with Large Tensor Support (#16894)
+ - fix typo and doc (#16921)
+ - large tensor faq doc fix (#16953)
+ - [DOC] Add a few tips for running horovod (#17235)
+ - Update NOTICE to fix copyright years (#17330)
+ - [DOC] Fix tutorial link, and better error msg (#17057)
+ - doc fix for argmax & argmin (#17604)
+
+### CI/CD
+ - support mixed-precision true_divide (#16711)
+ - Try to fix CI (#16908)
+ - mixed precision for power (#16859)
+ - Fix desired precision for test_ndarray.py:test_reduce (#16992)
+ - [reproducibility] multi_sum_sq review, AtomicAdd removal (#17002)
+ - fix precision problem in linalg_solve, linalg_tensorinv, linalg_cholesky op test (#16981)
+ - grouping large array tests based on type and updating nightly CI function (#17305)
+ - [LICENSE] fix cpp predict license (#17377)
+ - [CI] Fix static build pipeline (#17474)
+ - skipping tests that cannot fit in nightly CI machine corrected imports (#17450)
+ - Update Windows CI scripts to use syntax compatible with Win 2019 server powershell. (#17526)
+ - Fix Non-ASCII character in docstring (#17600)
+ - [CI] Follow redirects when downloading apache-maven-3.3.9-bin.tar.gz (#17608)
+ - [CI] Upgrade sphinx and autodocsumm (#17594)
+ - Reduce load on CI due to excessive log flood (#17629)
+ - Enable users to specify BLAS (#17648)
+ - [CI] Add AMI id to instance info on builds (#17649)
+ - [v1.7.x] Backport staggered CI builds (#17999 & #18119) (#18142)
+ - [v1.7.x] Backport #17177 to 1.7.x (Fix incorrect calculation results when the C locale is set to a locale that uses commas as the decimal separator) (#18147)
+ - Fix formatting and typos in CD README.md (#16703)
+ - [CD] dynamic libmxnet pipeline fix + small fixes (#16966)
+ - [CD] enable s3 publish for nightly builds in cd (#17112)
+ - [CD] fix CD pipeline (#17259)
+ - [CD] update publish path (#17453)
+ - fix CD and remove leftover from #15990 (#17551)
+ - Fix nightly build (#16773)
+ - Update pypi_publish.py to disable nightly build upload to PyPI (#17082)
+ - [v1.7.x] update jetson dockerfile to support CUDA 10.0 (#18339)
+ - Remove manually created symbolic link to ninja-build (#18437) (#18456)
+ - Increase staggered build timeout to 180 min (#18568) (#18585)
+
+### License
+ - Don't relicense FindCUDAToolkit.cmake (#17334)
+ - fix license and copyright issues (#17364)
+ - Update ps-lite LICENSE (#17351)
+ - remove unused file with license issue (#17371)
+ - Update LICENSE for fonts (#17365)
+ - license np_einsum file under bsd (#17367)
+ - Update Apache License for mshadow (#18109) (#18134)
+ - Julia: remove downloading of the non-ASF binary build (#18489) (#18502)
+ - Add missing license header for md files (#18541)
+ - [v1.7.x] License checker enhancement (#18478)
+
+### Miscellaneous changes
+ - Link fixes4 (#16764)
+ - Refactoring names for mxnet version of nnvm to avoid conflicting with the original tvm/nnvm. (#15303)
+ - minor typo fix (#17008)
+ - Add micro averaging strategy to pearsonr metric (#16878)
+ - introduce gradient update handler to the base estimator (#16900)
+ - fix latency calculation and print issue (#17217)
+ - add inference benchmark script (#16978)
+ - change the wording and log level to be more in line with the general use (#16626)
+ - Updated logos. (#16719)
+ - Pinning rvm version to satisfy Jekyll build (#18016)
+ - Workaround gnu_tls handshake error on Ubuntu 14.04 Nvidia Docker (#18044)
 
 ## 1.6.0
 
diff --git a/docs/static_site/src/.htaccess b/docs/static_site/src/.htaccess
index 679b613b7581..4444d1d6d2fe 100644
--- a/docs/static_site/src/.htaccess
+++ b/docs/static_site/src/.htaccess
@@ -19,15 +19,15 @@ RewriteOptions AllowNoSlash
 
   # Web fonts
   ExpiresByType application/font-woff     "access plus 1 month"
-  
+
 </IfModule>
 
-# Set default website version to current stable (v1.6)
+# Set default website version to current stable (v1.7)
 RewriteCond %{REQUEST_URI} !^/versions/
 RewriteCond %{HTTP_REFERER} !mxnet.apache.org
 RewriteCond %{HTTP_REFERER} !mxnet.incubator.apache.org
 RewriteCond %{HTTP_REFERER} !mxnet.cdn.apache.org
-RewriteRule ^(.*)$ /versions/1.6/$1 [r=307,L]
+RewriteRule ^(.*)$ /versions/1.7/$1 [r=307,L]
 
 # Redirect Chinese visitors to Chinese CDN, temporary solution for slow site speed in China
 RewriteCond %{ENV:GEOIP_COUNTRY_CODE} ^CN$
diff --git a/docs/static_site/src/_includes/get_started/get_started.html b/docs/static_site/src/_includes/get_started/get_started.html
index 8123bf29dcc0..865d0f4784a3 100644
--- a/docs/static_site/src/_includes/get_started/get_started.html
+++ b/docs/static_site/src/_includes/get_started/get_started.html
@@ -1,7 +1,7 @@
 <script>
     /** Defaults **/
     /** See options.js for the full ugly script **/
-    var versionSelect = defaultVersion = 'v1.6.0';
+    var versionSelect = defaultVersion = 'v1.7.0';
     var platformSelect = 'linux';
     var languageSelect = 'python';
     var processorSelect = 'cpu';
@@ -24,13 +24,14 @@ <h2>Platform and use-case specific instructions for using MXNet</h2>
             <div class="col-9 install-right">
                 <div class="dropdown" id="version-dropdown-container">
                     <button class="current-version dropbtn btn" type="button" data-toggle="dropdown">
-                        v1.6.0
+                        v1.7.0
                         <svg class="dropdown-caret" viewBox="0 0 32 32" class="icon icon-caret-bottom" aria-hidden="true">
                             <path class="dropdown-caret-path" d="M24 11.305l-7.997 11.39L8 11.305z"></path>
                         </svg>
                     </button>
                     <ul class="opt-group version-dropdown">
-                        <li class="opt active versions"><a href="#">v1.6.0</a></li>
+                        <li class="opt active versions"><a href="#">v1.7.0</a></li>
+                        <li class="opt versions"><a href="#">v1.6.0</a></li>
                         <li class="opt versions"><a href="#">v1.5.1</a></li>
                         <li class="opt versions"><a href="#">v1.4.1</a></li>
                         <li class="opt versions"><a href="#">v1.3.1</a></li>
diff --git a/docs/static_site/src/_includes/get_started/linux/python/cpu/pip.md b/docs/static_site/src/_includes/get_started/linux/python/cpu/pip.md
index 3ec4f65c591a..8f9d4e04b188 100644
--- a/docs/static_site/src/_includes/get_started/linux/python/cpu/pip.md
+++ b/docs/static_site/src/_includes/get_started/linux/python/cpu/pip.md
@@ -9,17 +9,49 @@ page](https://mxnet.apache.org/get_started/download).
 
 Run the following command:
 
-<div class="v1-6-0">
+<div class="v1-7-0">
 {% highlight bash %}
 pip install mxnet
 {% endhighlight %}
 
+Starting with the 1.7.0 release, oneDNN (previously known as MKL-DNN/DNNL) is
+enabled in pip packages by default.
+
+oneAPI Deep Neural Network Library (oneDNN) is an open-source cross-platform
+performance library of basic building blocks for deep learning applications.
+The library is optimized for Intel Architecture Processors, Intel Processor
+Graphics and Xe architecture-based Graphics. Support for other architectures
+such as Arm* 64-bit Architecture (AArch64) and OpenPOWER* Power ISA (PPC64) is
+experimental.
+
+oneDNN is intended for deep learning applications and framework developers
+interested in improving application performance on Intel CPUs and GPUs. More
+details can be found <a href="https://github.com/oneapi-src/oneDNN">here</a>.
+
+You can find performance numbers in the
+<a href="https://mxnet.apache.org/versions/1.6/api/faq/perf.html#intel-cpu">
+MXNet tuning guide</a>.
+
+To install native MXNet without oneDNN, run the following command:
+
+{% highlight bash %}
+pip install mxnet-native
+{% endhighlight %}
+
+</div> <!-- End of v1-7-0 -->
+
+<div class="v1-6-0">
+{% highlight bash %}
+pip install mxnet==1.6.0
+{% endhighlight %}
+
 MKL-DNN enabled pip packages are optimized for Intel hardware. You can find
-performance numbers
-in the <a href="https://mxnet.io/api/faq/perf#intel-cpu">MXNet tuning guide</a>.
+performance numbers in the
+<a href="https://mxnet.apache.org/versions/1.6/api/faq/perf.html#intel-cpu">
+MXNet tuning guide</a>.
 
 {% highlight bash %}
-pip install mxnet-mkl
+pip install mxnet-mkl==1.6.0
 {% endhighlight %}
 
 </div> <!-- End of v1-6-0 -->
@@ -30,8 +62,9 @@ pip install mxnet==1.5.1
 {% endhighlight %}
 
 MKL-DNN enabled pip packages are optimized for Intel hardware. You can find
-performance numbers
-in the <a href="https://mxnet.io/api/faq/perf#intel-cpu">MXNet tuning guide</a>.
+performance numbers in the
+<a href="https://mxnet.apache.org/versions/1.6/api/faq/perf.html#intel-cpu">
+MXNet tuning guide</a>.
 
 {% highlight bash %}
 pip install mxnet-mkl==1.5.1
@@ -46,8 +79,9 @@ pip install mxnet==1.4.1
 {% endhighlight %}
 
 MKL-DNN enabled pip packages are optimized for Intel hardware. You can find
-performance numbers
-in the <a href="https://mxnet.io/api/faq/perf#intel-cpu">MXNet tuning guide</a>.
+performance numbers in the
+<a href="https://mxnet.apache.org/versions/1.6/api/faq/perf.html#intel-cpu">
+MXNet tuning guide</a>.
 
 {% highlight bash %}
 pip install mxnet-mkl==1.4.1
@@ -61,8 +95,9 @@ pip install mxnet==1.3.1
 {% endhighlight %}
 
 MKL-DNN enabled pip packages are optimized for Intel hardware. You can find
-performance numbers
-in the <a href="https://mxnet.io/api/faq/perf#intel-cpu">MXNet tuning guide</a>.
+performance numbers in the
+<a href="https://mxnet.apache.org/versions/1.6/api/faq/perf.html#intel-cpu">
+MXNet tuning guide</a>.
 
 {% highlight bash %}
 pip install mxnet-mkl==1.3.1
@@ -76,8 +111,9 @@ pip install mxnet==1.2.1
 {% endhighlight %}
 
 MKL-DNN enabled pip packages are optimized for Intel hardware. You can find
-performance numbers
-in the <a href="https://mxnet.io/api/faq/perf#intel-cpu">MXNet tuning guide</a>.
+performance numbers in the
+<a href="https://mxnet.apache.org/versions/1.6/api/faq/perf.html#intel-cpu">
+MXNet tuning guide</a>.
 
 {% highlight bash %}
 pip install mxnet-mkl==1.2.1
diff --git a/docs/static_site/src/_includes/get_started/linux/python/gpu/pip.md b/docs/static_site/src/_includes/get_started/linux/python/gpu/pip.md
index 82049eba9b1c..91d32aed1ccd 100644
--- a/docs/static_site/src/_includes/get_started/linux/python/gpu/pip.md
+++ b/docs/static_site/src/_includes/get_started/linux/python/gpu/pip.md
@@ -10,13 +10,20 @@ page](https://mxnet.apache.org/get_started/download).
 
 Run the following command:
 
-<div class="v1-6-0">
+<div class="v1-7-0">
 {% highlight bash %}
 $ pip install mxnet-cu102
 {% endhighlight %}
 
-</div> <!-- End of v1-6-0 -->
+</div> <!-- End of v1-7-0 -->
 
+<div class="v1-6-0">
+{% highlight bash %}
+$ pip install mxnet-cu102==1.6.0
+{% endhighlight %}
+
+</div> <!-- End of v1-6-0 -->
+
 <div class="v1-5-1">
 {% highlight bash %}
 $ pip install mxnet-cu101==1.5.1
diff --git a/docs/static_site/src/_includes/get_started/pip_snippet.md b/docs/static_site/src/_includes/get_started/pip_snippet.md
index 7278044407e0..9e67acce5eda 100644
--- a/docs/static_site/src/_includes/get_started/pip_snippet.md
+++ b/docs/static_site/src/_includes/get_started/pip_snippet.md
@@ -1,7 +1,7 @@
 You can then <a href="/get_started/validate_mxnet.html">validate your MXNet installation</a>.
 
 <div style="text-align: center">
-    <img src="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/install/pip-packages-1.6.0.png"
+    <img src="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/install/pip-packages-1.7.0.png"
     alt="pip packages"/>
 </div>
 
diff --git a/docs/static_site/src/pages/get_started/download.md b/docs/static_site/src/pages/get_started/download.md
index eb07b3b7dceb..7444a2e9d8e2 100644
--- a/docs/static_site/src/pages/get_started/download.md
+++ b/docs/static_site/src/pages/get_started/download.md
@@ -35,7 +35,8 @@ encouraged to contribute to our development version on
 
 | Version | Source                                                                                                      | PGP                                                                                                             | SHA                                                                                                                |
 |---------|-------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------|
-| 1.6.0   | [Download](https://www.apache.org/dyn/closer.cgi/incubator/mxnet/1.6.0/apache-mxnet-src-1.6.0-incubating.tar.gz)   | [Download](https://downloads.apache.org/incubator/mxnet/1.6.0/apache-mxnet-src-1.6.0-incubating.tar.gz.asc)    |  [Download](https://downloads.apache.org/incubator/mxnet/1.6.0/apache-mxnet-src-1.6.0-incubating.tar.gz.sha512)    |
+| 1.7.0   | [Download](https://www.apache.org/dyn/closer.lua?filename=incubator/mxnet/1.7.0/apache-mxnet-src-1.7.0-incubating.tar.gz&action=download)   | [Download](https://downloads.apache.org/incubator/mxnet/1.7.0/apache-mxnet-src-1.7.0-incubating.tar.gz.asc)    |  [Download](https://downloads.apache.org/incubator/mxnet/1.7.0/apache-mxnet-src-1.7.0-incubating.tar.gz.sha512)    |
+| 1.6.0   | [Download](https://archive.apache.org/dist/incubator/mxnet/1.6.0/apache-mxnet-src-1.6.0-incubating.tar.gz)   | [Download](https://archive.apache.org/dist/incubator/mxnet/1.6.0/apache-mxnet-src-1.6.0-incubating.tar.gz.asc)    |  [Download](https://archive.apache.org/dist/incubator/mxnet/1.6.0/apache-mxnet-src-1.6.0-incubating.tar.gz.sha512)    |
 | 1.5.1   | [Download](https://archive.apache.org/dist/incubator/mxnet/1.5.1/apache-mxnet-src-1.5.1-incubating.tar.gz)   | [Download](https://archive.apache.org/dist/incubator/mxnet/1.5.1/apache-mxnet-src-1.5.1-incubating.tar.gz.asc)    |  [Download](https://archive.apache.org/dist/incubator/mxnet/1.5.1/apache-mxnet-src-1.5.1-incubating.tar.gz.sha512)     |
 | 1.5.0   | [Download](https://archive.apache.org/dist/incubator/mxnet/1.5.0/apache-mxnet-src-1.5.0-incubating.tar.gz)   | [Download](https://archive.apache.org/dist/incubator/mxnet/1.5.0/apache-mxnet-src-1.5.0-incubating.tar.gz.asc)    |  [Download](https://archive.apache.org/dist/incubator/mxnet/1.5.0/apache-mxnet-src-1.5.0-incubating.tar.gz.sha512)     |
 | 1.4.1   | [Download](https://archive.apache.org/dist/incubator/mxnet/1.4.1/apache-mxnet-src-1.4.1-incubating.tar.gz)   | [Download](https://archive.apache.org/dist/incubator/mxnet/1.4.1/apache-mxnet-src-1.4.1-incubating.tar.gz.asc)    | [Download](https://archive.apache.org/dist/incubator/mxnet/1.4.1/apache-mxnet-src-1.4.1-incubating.tar.gz.sha512)      |