This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[MXNET-744] Docs build tools update #11990

Merged
merged 22 commits into from
Aug 15, 2018

Conversation

aaronmarkham
Contributor

@aaronmarkham aaronmarkham commented Aug 2, 2018

Description

While trying to figure out how to debug the Sphinx website theme for #11916, I found the build tools frustratingly slow and nearly impossible to use for problems related to a full site build with the versions dropdown. So, I rewrote parts of it so you can do amazing and practical things like:

⭐ Decide what documentation sets to build on a per version basis!
⭐ Speed up the full version site build by 2.5x (was 43 minutes, now 17 minutes)!
⭐ Make incremental front-end site updates and build 14.3x faster (was 43 minutes, now 3 minutes)!
⭐ Optimize Sphinx configurations and pass these updates to the older versions of the site!
⭐ Not want to rage quit every time the docs build fails 40 minutes in!
⭐ Actually run the R docs build!

I'm pretty stoked. I hope you are too.

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)

Preview

I'm actively running tests here, but you can see the current output (probably): http://34.201.8.176/

Changes

💥 Each version build happens in its own folder. This lets the MXNet build be cached if there's no update to the library. Big win for time savings here. Tradeoff is it takes up more space.
💥 When you build Sphinx docs with make html you can pass a new param, BUILD_VER={tag}, like BUILD_VER=1.2.0, and it'll load the settings according to that version
💥 Settings per version? Yes, in a new file at /docs/settings.ini we now have configs for each version and each document set. I could have also used the Makefile, but this seemed to be better for handling all of the version support variations. You can configure the build to build what you want, and more importantly, turn it off for a version where it isn't possible... like for Clojure and any previous version tag. Only the most recent version or master or your fork might have it.
💥 What's this about your fork? Yes, you can now build docs with your own fork and output different branches of your fork to represent the different version on the website output. ✨ Cool, right? ✨ Gawd, why didn't I do this earlier!?

And I tweaked the css a bit so the h1-h4 tags pop a little more. The mxnet.css file gets pulled from the artifacts folder now, but we can alter this design down the line.

Usage

Building Docs with Sphinx

From the docs folder:
You can just use make html and it'll load the defaults which will build everything:

[document_sets_default]
clojure_docs = 1
doxygen_docs = 1
r_docs = 1
scala_docs = 1

Or you can use a specific version and it'll load the settings for it:

make html USE_OPENMP=1 BUILD_VER=1.0.0

From settings.ini:

[document_sets_1.0.0]
clojure_docs = 0
doxygen_docs = 1
r_docs = 0
scala_docs = 0
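
For illustration, here is a minimal Python sketch of how a per-version lookup with a fallback to the defaults could work. The section and key names come from the settings.ini excerpts above. The helper name `load_document_sets` and the rule of falling back to `document_sets_default` for an unknown version are assumptions for this sketch, not the actual code in mxdoc.py.

```python
import configparser

# Sample content mirroring the settings.ini excerpts shown above.
SAMPLE_SETTINGS = """
[document_sets_default]
clojure_docs = 1
doxygen_docs = 1
r_docs = 1
scala_docs = 1

[document_sets_1.0.0]
clojure_docs = 0
doxygen_docs = 1
r_docs = 0
scala_docs = 0
"""

DOC_SETS = ("clojure_docs", "doxygen_docs", "r_docs", "scala_docs")


def load_document_sets(settings_text, build_ver=None):
    """Return {doc_set: bool} for build_ver.

    Hypothetical helper: the name and the fallback to the default
    section for unknown versions are assumptions of this sketch.
    """
    parser = configparser.ConfigParser()
    parser.read_string(settings_text)
    section = "document_sets_" + (build_ver or "default")
    if not parser.has_section(section):
        section = "document_sets_default"
    # getboolean accepts 1/0, yes/no, true/false, on/off.
    return {name: parser.getboolean(section, name) for name in DOC_SETS}


if __name__ == "__main__":
    print(load_document_sets(SAMPLE_SETTINGS, "1.0.0"))
    print(load_document_sets(SAMPLE_SETTINGS))
```

Keeping the flags in one INI file, keyed by version, means a doc set can be turned off for an old tag without touching the Makefile.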

Building the Full Site with Your Fork

The previous example is a precursor to the more complicated process of building the full site and its many versions.
The scripts that CI uses, and that you can use yourself to build a dev version of the site, are in build_version_doc. They are still build_all_version.sh and update_all_version.sh, but now they take optional params for your fork.

These scripts call make docs, which in turn calls make html from the docs directory. So, they inherit what I just described above using the settings file. The version tags get passed down from the shell script into the docs Makefile invocation and are then read by the Sphinx plugin mxdoc.py. The plugin reads the settings file and then calls all of the other docs generation tools. I had to bubble up the version info all the way to this plugin.

Note: I would certainly like to see this less complicated, but this is the current architecture, so... until we can possibly refactor the whole thing with CMake, this is what we have to contend with...

Building Each Version and Optionally Using Your Fork

Build the content of the 1.2.0 branch in the main project repo to the 1.2.1 folder:

./build_all_version.sh "1.2.0" "1.2.1"

Using the main project repo, map the 1.2.0 branch to output to a 1.2.1 directory; others as is:

./build_all_version.sh "1.2.0;1.1.0;master" "1.2.1;1.1.0;master"

Using a custom branch and fork of the repo, map the branch to master, map 1.2.0 branch to 1.2.1 and leave 1.1.0 in 1.1.0:

./build_all_version.sh "sphinx_error_reduction;1.2.0;1.1.0" "master;1.2.1;1.1.0" https://github.com/aaronmarkham/incubator-mxnet.git
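
The three invocations above all pair a semicolon-separated list of branches or tags to build with a list of folder names to publish them under. The real logic lives in the shell script; the Python sketch below only illustrates that pairing. The function name `pair_tags`, the error on mismatched lengths, and the default of publishing each tag under its own name are assumptions.

```python
def pair_tags(build_tags, display_tags=None):
    """Pair each branch/tag to build with the folder it is published under.

    Hypothetical helper. If display_tags is omitted, each tag is assumed
    to be published under its own name.
    """
    builds = [tag for tag in build_tags.split(";") if tag]
    if display_tags:
        displays = [tag for tag in display_tags.split(";") if tag]
    else:
        displays = list(builds)
    if len(builds) != len(displays):
        raise ValueError(
            "expected %d display names, got %d" % (len(builds), len(displays))
        )
    return list(zip(builds, displays))


if __name__ == "__main__":
    for source, target in pair_tags(
        "sphinx_error_reduction;1.2.0;1.1.0", "master;1.2.1;1.1.0"
    ):
        print("build %s -> publish as %s" % (source, target))
```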

Updating Each Version and Optionally Using Your Fork

Assuming you build 1.2.1, 1.1.0, 1.0.0, and master, you need to inject the versions dropdown to reflect these options for each page on the site:

./update_all_version.sh "1.2.1;1.1.0;1.0.0;master" master http://mxnet.incubator.apache.org/

It doesn't use your fork at this point, but it will pull the latest project README.md from your current branch and use that for the root of the output site.

Comments

Building R Docs

I'm not adding the R deps and making CI changes here. That's just too much in one PR. However, in a followup PR, or if you want to test the pipeline with the R docs build turned on, you need the following in addition to the R deps already in the docs setup scripts in CI or the one in build_version_doc:

sudo apt-get install \
    texinfo \
    texlive \
    texlive-fonts-extra

You would also want to change settings.ini and enable the build of the R docs (per version you want).

Building Scala Docs

The current Scala config is incompatible with version 1.1.0 or earlier. You'll get an error about the namespace change. I'm disabling the build for this in those earlier versions in the provided settings.ini. Please discuss with me if this is a bad choice and if we need to use a custom config for those earlier versions. The function in mxdoc.py for scaladocs could use the new version param, and make some conditional statements for how it builds...
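
As a rough idea of what such a conditional could look like, here is a small Python sketch. It is not code from mxdoc.py: the function name `scala_docs_supported` is hypothetical, and treating non-numeric names such as master or a fork branch as recent enough is an assumption.

```python
def scala_docs_supported(build_ver):
    """Return False for 1.1.0 and earlier, True otherwise.

    Hypothetical check. Non-numeric names (master, default, a fork
    branch) are assumed to carry the current Scala config.
    """
    parts = build_ver.split(".")
    if not all(part.isdigit() for part in parts):
        return True
    return tuple(int(part) for part in parts) > (1, 1, 0)


if __name__ == "__main__":
    for version in ("1.0.0", "1.1.0", "1.2.0", "master"):
        print(version, scala_docs_supported(version))
```

Comparing integer tuples rather than strings avoids ordering mistakes such as "1.10.0" sorting before "1.2.0".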

@aaronmarkham aaronmarkham requested a review from szha as a code owner August 2, 2018 01:47
@aaronmarkham
Contributor Author

@marcoabreu @ankkhedia @kpmurali @szha - Please review.
@vandanavk / @cclauss - I made this compatible with the R docs changes in PR #11970 - this can stack or supersede as needed.

@aaronmarkham
Contributor Author

@marcoabreu this does impact CI. The scripts I modified should work seamlessly with the current jobs and reduce the website build times by half.
We should chat about regularly running docs builds for PRs. The new modes would allow us to configure it to run specific sets, so if you update a Python file you don't run scaladocs. Or those are run separately...

@marcoabreu
Contributor

We are already running docs as part of the PR stage, right? We could move the doc generation into the build stage and then run all flavours. WDYT?

@aaronmarkham
Contributor Author

aaronmarkham commented Aug 2, 2018

@marcoabreu I don't know much about the PR checking logic. Can you point me to it?
It seems like we could scan a PR and if it matches *.scala then you run make scalapkg then scaladocs. (In this PR's features, you'd update the settings file accordingly. I could extend things to use env vars or the settings file, so then Jenkins could set the env var, and that would override the default settings.)
And do a similar match for each kind of doc, rather than run all docs for every update.
Conversely, if I make a .md, .js, or .html change, we don't run the full suite of CI on that PR. If docs changes are a separate pipeline, wouldn't this be possible?
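
To make the idea concrete, here is a hedged Python sketch of matching changed files against patterns to pick doc sets. Only the `*.scala` pattern comes from the comment above; the other patterns, the mapping name `DOC_SET_PATTERNS`, and the function name are illustrative assumptions, and nothing like this exists in the CI scripts today.

```python
from fnmatch import fnmatchcase

# Hypothetical mapping from doc set to file patterns. Only "*.scala"
# is taken from the discussion; the rest are placeholders.
DOC_SET_PATTERNS = {
    "scala_docs": ("*.scala",),
    "clojure_docs": ("*.clj",),
    "r_docs": ("*.R", "*.Rd"),
    "doxygen_docs": ("*.h", "*.cc"),
}


def doc_sets_for_changes(changed_files):
    """Return the set of doc sets whose patterns match any changed file."""
    needed = set()
    for path in changed_files:
        for doc_set, patterns in DOC_SET_PATTERNS.items():
            # fnmatchcase is case-sensitive on every platform, and "*"
            # also matches "/" so full repo paths work.
            if any(fnmatchcase(path, pattern) for pattern in patterns):
                needed.add(doc_set)
    return needed


if __name__ == "__main__":
    print(doc_sets_for_changes(["scala-package/core/Foo.scala", "docs/index.md"]))
```

The resulting set could then be turned into env vars or settings overrides, as suggested above, so Jenkins only runs the matching doc builds.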

@marcoabreu
Contributor

https://github.com/apache/incubator-mxnet/blob/master/Jenkinsfile#L1088
https://github.com/apache/incubator-mxnet/blob/master/ci/docker/runtime_functions.sh#L785

I think your method is something we will get to in the future, but it requires quite some work until we are there.

anirudh2290 and others added 16 commits August 7, 2018 11:25
* Replace cublassgemm with cublassgemmex for >= 7.5

* Add comment for cublassgemmex

Remove fixed seed for test_sparse_nd_save_load (apache#11920)

* Remove fixed seed for test_sparse_nd_save_load

* Add comments related to the commit

Corrections to profiling tutorial (apache#11887)

Corrected a race condition with stopping profiling. Added mx.nd.waitall to ensure all operations have completed, including GPU operations that might otherwise be missing.

Also added alternative code for context selection GPU vs CPU, that had error before on machines with nvidia-smi.

Fix image classification scripts and Improve Fp16 tutorial (apache#11533)

* fix bugs and improve tutorial

* improve logging

* update benchmark_score

* Update float16.md

* update link to dmlc web data

* fix train cifar and add random mirroring

* set aug defaults

* fix whitespace

* fix typo
* adding param for list of tags to display on website

* using new website display argument for artifact placement in version folder

* adding display logic

* remove restricted setting for testing

* update usage instructions

* reverted Jenkinsfile to use restricted nodes

[MXAPPS-581] Fixes for broken Straight Dope tests. (apache#11923)

* Update relative paths pointing to the data directory to point to the
  correct place in the testing temporary folder.

* Enable the notebooks that were previously broken because of relative
  file paths not pointing to the correct place.

* Move some notebooks we do not plan to test to the whitelist. These
  notebooks are not published in the Straight Dope book.

* Clean-up: Convert print statements to info/warn/error logging
  statements. Add some logging statements for better status.

Disable flaky test: test_spatial_transformer_with_type (apache#11930)

apache#11839

Add linux and macos MKLDNN Building Instruction (apache#11049)

* add linux and macos doc

* update doc

* Update MKL_README.md

* Update MKL_README.md

Add convolution code to verify mkldnn backend

* add homebrew link

* rename to MKLDNN_README

* add mkl verify

* trigger

* trigger

* set mac compiler to gcc47

* add VS2017 support experimentally

* improve quality

* improve quality

* modify mac build instruction since prepare_mkldnn.sh has been rm

* trigger

* add some improvement

[MXNET-531] Add download util (apache#11866)

* add changes to example

* place the file to the util

* add retry scheme

* fix the retry logic

* change the DownloadUtil to Util

* Trigger the CI

[MXNET-11241] Avoid use of troublesome cudnnFind() results when grad_req='add' (apache#11338)

* Add tests that fail due to issue 11241

* Fix apache#11241 Conv1D throws CUDNN_STATUS_EXECUTION_FAILED

* Force algo 1 when grad_req==add with large c.  Expand tests.

* Shorten test runtimes.

Improving documentation and error messages for Async distributed training with Gluon (apache#11910)

* Add description about update on kvstore

* add async check for gluon

* only raise error if user set update_on_kvstore

* fix condition

* add async nightly test

* fix case when no kvstore

* add example for trainer creation in doc

[MXNET-641] fix R windows install docs (apache#11805)

* fix R windows install docs

* addressed PR comments

* PR comments

* PR comments

* fixed line wrappings

* fixed line wrappings

a hot fix for mkldnn link (apache#11939)

re-enabling randomized test_l2_normalization (apache#11900)

[MXNET-651] MXNet Model Backwards Compatibility Checker (apache#11626)

* Added MNIST-MLP-Module-API models to check model save and load_checkpoint methods

* Added LENET with Conv2D operator training file

* Added LENET with Conv2d operator inference file

* Added LanguageModelling with RNN training file

* Added LanguageModelling with RNN inference file

* Added hybridized LENET Gluon Model training file

* Added hybridized LENET gluon model inference file

* Added license headers

* Refactored the model and inference files and extracted out duplicate code in a common file

* Added runtime function for executing the MBCC files

* Added JenkinsFile for MBCC to be run as a nightly job

* Added boto3 install for s3 uploads

* Added README for MBCC

* Added license header

* Added more common functions from lm_rnn_gluon_train and inference files into common.py to clean up code

* Added scripts for training models on older versions of MXNet

* Added check for preventing inference script from crashing in case no trained models are found

* Fixed indentation issue

* Replaced Penn Tree Bank Dataset with Sherlock Holmes Dataset

* Fixed indentation issue

* Removed training in models and added smaller models. Now we are simply checking a forward pass in the model with dummy data.

* Updated README

* Fixed indentation error

* Fixed indentation error

* Removed code duplication in the training file

* Added comments for runtime_functions script for training files

* Merged S3 Buckets for storing data and models into one

* Automated the process to fetch MXNet versions from git tags

* Added defensive checks for the case where the data might not be found

* Fixed issue where we were performing inference on state model files

* Replaced print statements with logging ones

* Removed boto install statements and move them into ubuntu_python docker

* Separated training and uploading of models into separate files so that training runs in Docker and upload runs outside Docker

* Fixed pylint warnings

* Updated comments and README

* Removed the venv for training process

* Fixed indentation in the MBCC Jenkins file and also separated out training and inference into two separate stages

* Fixed indentation

* Fixed erroneous single quote

* Added --user flag to check for Jenkins error

* Removed unused methods

* Added force flag in the pip command to install mxnet

* Removed the force-re-install flag

* Changed exit 1 to exit 0

* Added quotes around the shell command

* added packlibs and unpack libs for MXNet builds

* Changed PythonPath from relative to absolute

* Created dedicated bucket with correct permission

* Fix for python path in training

* Changed bucket name to CI bucket

* Added set -ex to the upload shell script

* Now raising an exception if no models are found in the S3 bucket

* Added regex to train models script

* Added check for performing inference only on models trained on same major versions

* Added set -ex flags to shell scripts

* Added multi-version regex checks in training

* Fixed typo in regex

* Now we will train models for all the minor versions for a given major version by traversing the tags

* Added check for validating current_version

[MXNET-531] NeuralStyle Example for Scala (apache#11621)

* add initial neuralstyle and test coverage

* Add two more test and README

* kill comments

* patch on memory leaks fix

* fix formatting issues

* remove redundant files

* disable the Gan example for now

* add ignore method

* add new download scheme to match the changes
make executable
switched to boolean types and added debug messaging

build will copy current config files to each version build

build will copy current config files to each version build

build will copy current config files to each version build

build will copy current config files to each version build

path fix
[MXNET-750] fix nested call on CachedOp. (apache#11951)

* fix nested call on cachedop.

* fix.

extend reshape op to allow reverse shape inference (apache#11956)

Improve sparse embedding index out of bound error message; (apache#11940)

[MXNET-770] Remove fixed seed in flaky test (apache#11958)

* Remove fixed seed in flaky test

* Remove fixed seed in flaky test

Update ONNX docs with the latest supported ONNX version (apache#11936)

Reduced test to 3 epochs and made gpu only (apache#11863)

* Reduced test to 3 epochs and made GPU only

* Moved logger variable so that it's accessible

Fix flaky tests for test_laop_4 (apache#11972)

Updating R client docs (apache#11954)

* Updating R client docs

* Forcing build

Fix install instructions for MXNET-R (apache#11976)

* fix install instructions for MXNET-R

* fix install instructions for MXNET-R

* fix default cuda version for MXNet-R

[MXNET-751] fix ce_loss flaky (apache#11971)

* add xavier initializer

* remove comment line

[MXNET-769] set MXNET_HOME as base for downloaded models through base.data_dir() (apache#11636)

* set MXNET_DATA_DIR as base for downloaded models through base.data_dir()
push joblib to save containers so is not required when running

* MXNET_DATA_DIR -> MXNET_HOME

[MXNET-748] linker fixed on Scala issues (apache#11989)

* put force load back as a temporary solution

* use project.basedir as relative path for OSX linker

[MXNET-772] Re-enable test_module.py:test_module_set_params (apache#11979)

[MXNET-771] Fix Flaky Test test_executor.py:test_dot (apache#11978)

* use assert_almost_equal, increase rtol, reduce matrix size

* remove seed in test_bind

* add seed 0 to test_bind, it is still flaky

* add comments for tracking

remove mod from arity 2 version of load-checkpoint in clojure-package (apache#11808)

* remove mod from arity 2 version of load-checkpoint

* load-checkpoint arity 2 test

Add unit test stage for mxnet cpu in debug mode (apache#11974)

Website broken link fixes (apache#12014)

* fix broken link

* fix broken link

* switch to .md links

* fix broken link

removed seed from flaky test (apache#11975)

Disable ccache log print due to threadunsafety (apache#11997)

Added default tolerance levels for regression checks for MBCC (apache#12006)

* Added tolerance level for assert_almost_equal for MBCC

* Nudge to CI

Disable flaky mkldnn test_requantize_int32_to_int8 (apache#11748)

[MXNET-769] Usability improvements to windows builds (apache#11947)

* Windows scripted build
Adjust Jenkins builds to use ci/build_windows.py

Issues:

    apache#8714
    apache#11100
    apache#10166
    apache#10049

* Fix bug

* Fix non-portable ut

* add xunit

Fix import statement (apache#12005)

array and multiply are undefined. Importing them from
ndarray

Disable flaky test test_random.test_gamma_generator (apache#12022)

[MXNET-770] Fix flaky test: test_factorization_machine_module (apache#12023)

* Remove fixed seed in flaky test

* Remove fixed seed in flaky test

* Update random seed to reproduce the issue

* Fix Flaky unit test and add a training test

* Remove fixed seed in flaky test

* Update random seed to reproduce the issue

* Fix Flaky unit test and add a training test

* Increase accuracy check

disable opencv threading for forked process (apache#12025)

Bug fixes in control flow operators (apache#11942)

Fix data narrowing warning on graph_executor.cc (apache#11969)

Fix flaky tests for test_squared_hinge_loss (apache#12017)

Fix flaky tests for test_hinge_loss (apache#12020)

remove fixed seed for test_sparse_ndarray/test_operator_gpu.test_sparse_nd_pickle (apache#12012)

Removed fixed seed from , test_loss:test_ctc_loss_train (apache#11985)

Removed fixed seed from , test_loss:test_sample_weight_loss (apache#11986)

Fix reduce_kernel_M1 (apache#12026)

* Fix reduce_kernel_M1

* Improve test_norm

Update test_loss.py to remove fixed seed (apache#11995)

[MXNET-23] Adding support to profile kvstore server during distributed training  (apache#11215)

* server profiling

merge with master

cleanup old code

added a check and better info message

add functions for C compatibility

fix doc

lint fixes

fix compile issues

lint fix

build error

update function signatures to preserve compatibility

fix comments

lint

* add part1 of test

* add integration test

Re-enabling test_ndarray/test_cached (apache#11950)

Test passes on CPU and GPU (10000 runs)

make gluon rnn layers hybrid blocks (apache#11482)

* make Gluon RNN layer hybrid block

* separate gluon gpu tests

* remove excess assert_raises_cudnn_disabled usage

* add comments and refactor

* add bidirectional test

* temporarily remove hybridize in test_gluon_rnn.test_layer_fill_shape

[MXNET-751] fix bce_loss flaky (apache#11955)

* add fix to bce_loss

* add comments

* remove unnecessary comments

Doc fix for a few optimizers (apache#12034)

* Update optimizer.py

* Update optimizer.py
@aaronmarkham
Contributor Author

So assuming this passes CI, this PR is good to go. It passed a test here:
http://jenkins.mxnet-ci.amazon-ml.com/job/test-website-build/15/

@lebeg
Contributor

lebeg commented Aug 15, 2018

LGTM

@nswamy nswamy merged commit c220974 into apache:master Aug 15, 2018
@aaronmarkham
Contributor Author

The switch to separate Python 2 and 3 installs in CI is breaking this. I'm working on a patch.

@aaronmarkham
Contributor Author

It was actually the switch to Python 3 in CI that overlapped with this PR. Docs had been building in Python 2 until last week. Anyway, I believe I fixed it. Here's my test run:
http://jenkins.mxnet-ci.amazon-ml.com/job/test-website-build/22/console

See #12195. It needs a review.

@nswamy
Member

nswamy commented Aug 16, 2018

@aaronmarkham I understand that CI changing to Py3 caused your code to break; however, we cannot have CI blocked. Thanks for acting swiftly on this. I'll review #12195 and merge once it passes CI.

aaronmarkham added a commit to aaronmarkham/incubator-mxnet that referenced this pull request Aug 17, 2018
adding tutorial index pages to whitelist

added custom fork feature

adding settings to turn off/on doc sets

using custom fork directory for artifacts

automate upstream branch refresh

switched to boolean types and added debug messaging

build will copy current config files to each version build

build will copy current config files to each version build

stashing config files before checking out new version

put mxnet.css as artifact to be copied during build

fix formatting issues in h tags

refactored to build each version in a different folder

grab latest README from local fork

using settings.ini for document sets per version

fix R doc config for mxnet root

matching conf.py updates to current and excluding 3rdparty folder

align R doc gen bug fix with other PR 11970

pass the current tag in the make args and set to default if empty

fix bug for default version and add BUILD_VER to make html call

turning off scala docs for versions less than 1.2.0

turning off r docs until CI can handle it

enabling new docs build capability in CI

failover to fetching remote branch

Remove stale Keras-MXNet tests from MXNet repo (apache#11902)

Disable flaky cpp test (apache#12056)

Adjusting tolerance level and removing fixed seed for tests: test_ifft, test_fft (apache#12010)

* adjusting tolerance level and removing fixed seed

* CI retrigger

* removing status

[MXNET-774] Flaky test in test_executor.py:test_bind (apache#12016)

* fix test bind, remove fixed seed

* add tracking info

* remove tracking info

fix flaky test_quantization.test_get_optimal_thresholds (apache#12004)

removed fixed seed 1234 (apache#12072)

tested with 100k runs, no failures

improve error message of cudnn operators (apache#11886)

Fix for undefined variable errors (apache#12037)

* Undefined name in initializer

* Fix undefined name in test_mkldnn

* Fix for undefined names in examples

Fix undefined_variable lint errors in examples (apache#12052)

* Fix lint errors in dqn example

* Fix lint error in gluon example

* Fix undefined error in autoencoder example

MXNET-776 [Perl] Better documentation/bug fixes. (apache#12038)

* MXNET-776
1) Several new metric classes.
2) Improved documentation.
3) Bugfixes.

* added links and fixed a typo.

Redesign Jenkinsfiles (apache#12000)

* Rework Jenkinsfile

* Add functionality to assign node labels dynamically

* Extract functions into util file

* Change all Jenkinsfiles to use utils

* Make a new commit...

* Address review comments 1

* Address review comments 2

fix unidirectional model's parameter format (apache#12055)

* fix unidirectional model's parameter format

* Update rnn_layer.py

Fix syntax errors in Jenkinsfiles (apache#12095)

[MXAPPS-581] Straight Dope nightly fixes. (apache#11934)

Enable 3 notebooks that were failing tests after making updates to the
Straight Dope book. We also add pandas required by one of these
notebooks.

Fix jenkinsfile syntax errors (apache#12096)

remove fixed seed for test_triplet_loss (apache#12011)

got rid of fixed seed for test_optimizer/test_operator_gpu.test_ftml (apache#12003)

[MXNET-696] Fix undefined variable errors (apache#11982)

* Fix undefined error in image segmentation

ctx is used undefined. Setting the default ctx to cpu and
editing the comment to let the user know that it can be
changed to GPU as required.

* Fix undefined names in SSD example

maskUtils is disabled. Remove code referencing it.
Initializing start_offset.

got rid of fixed seed for test_optimizer/test_operator_gpu.test_nag (apache#11981)

Fix flaky test for elementwise_sum (apache#11959)

Re-enabling test_operator.test_binary_math_operators (apache#11712) (apache#12053)

Test passes on CPU and GPU (10000 runs)

update docs to explain CPU incompatibilities (apache#11931)

removed fixed from test_optimizer.test_signum (apache#12088)

Add missing object to tests/nightly/model_backwards_compatibility_check/JenkinsfileForMBCC (apache#12108)

Add GetName function in Symbol class for cpp pack (apache#12076)

Add unique number of parameters to summary output in Gluon Block (apache#12077)

* add unique parameters in summary output

* rebuild

Update fully_connected.cc documentation (apache#12097)

[MXNET-244] Update RaspberryPI instructions (apache#11562)

* Update RaspberryPI instructions

[MXNET-749] Correct usages of `CutSubgraph` in 3 control flow operators (apache#12078)

* Fix cut graph

* Copy only when necessary

* Add unittest for while_loop

* Add unittest for foreach

* Add unittest for cond

* Avoid magic number: 0 => kUndefinedStorage

[MXNET-703] TensorRT runtime integration (apache#11325)

* [MXNET-703] TensorRT runtime integration

Co-authored-by: Clement Fuji-Tsang <caenorst@hotmail.com>
Co-authored-by: Kellen Sunderland <kellen.sunderland@gmail.com>

* correctly assign self._optimized_symbol in executor

* declare GetTrtCompatibleSubsets and ReplaceSubgraph only if MXNET_USE_TENSORRT

* add comments in ReplaceSubgraph

* Addressing Haibin's code review points

* Check that shared_buffer is not empty when USE_TENSORRT is set

* Added check that TensorRT binding is for inference only

* Removed redundant decl.

* WIP Refactored TRT integration and tests

* Add more build guards, remove unused code

* Remove ccache report

* Remove redundant const in declaration

* Clean Cmake TRT files

* Remove TensorRT env var usage

We don't want to use environment variables with TensorRT yet, the
logic being that we want to try and have as much fwd compatibility as
possible when working on an experimental feature.  Were we to add
env vars they would have to be guaranteed to work in the future until
a major version change.  Moving the functionality to a contrib call
reduces this risk.

* Use contrib optimize_graph instead of bind

* Clean up cycle detector

* Convert lenet test to contrib optimize

* Protect interface with trt build flag

* Fix whitespace issues

* Add another build guard to c_api

* Move get_optimized_symbol to contrib area

* Ignore gz files in test folder

* Make trt optimization implicit

* Remove unused declaration

* Replace build guards with runtime errors

* Change default value of TensorRT to off

This change applies to both TensorRT and non-TensorRT builds.

* Warn user when TRT not active at runtime

* Move TensorRTBind declaration, add descriptive errors

* Test TensorRT graph execution, fix bugs

* Fix lint and whitespace issues

* Fix typo

* Removed default value for set_use_tensorrt

* Improved documentation and fixed spacing issues

* Move static exec funcs to util files

* Update comments to match util style

* Apply const to loop element

* Fix a few namespace issues

* Make static funcs inline to avoid compiler warning

* Remove unused inference code from lenet5_train

* Add explicit trt contrib bind, update tests to use it

* Rename trt bind call

* Remove documentation that is not needed for trt

* Reorder arguments, allow position calling

Decrease success rate to make test more stable (apache#12092)

I have added this test back to unit test coverage and decreased the success rate even more, to make sure that failures happen even more rarely

Add Clojure to website nav (apache#12075)

* adding clojure to API navigation

* adding clojure to the sidebar

* switched order

Fix flaky tests for quantize and requantize (apache#12040)

[MXNET-703] Use relative path for symbol import (apache#12124)

Fix shared memory with gluon dataloader, add option pin_memory (apache#11908)

* use threading for mp dataloader fetching, allow pin_memory option

* allow pin tuple of data into cpu_pinned

* fix as_in_context if not cpu_pinned

* fix cpu_pinned

* fix unittest for windows, update doc that windows mp is available

* fix pin_memory

* fix lint

* always use simplequeue for data queue

* remove main thread clearing for data_queue

* do not use outside folder as pythonpath but run nosetests inside

* use :MXNET_LIBRARY_PATH= to locate dll

* fix dll path

* correct dll path

reduce a copy for rowsparse parameter.reduce (apache#12039)

GPU Memory Query to C API (apache#12083)

* add support for GPU memory query

* remove lint

take custom dataset into consideration (apache#12093)

[MXNET-782] Fix Custom Metric Creation in R tutorial (apache#12117)

* fix tutorial

* install instructions

* fix typo

[MXAPPS-805] Notebook execution failures in CI. (apache#12068)

* [MXAPPS-805] Notebook execution failures in CI.

* Add a retry policy when starting a notebook executor to handle the failure to
 start a notebook executor (due to a port collision, kernel taking too
 long to start, etc.).

* Change logging level for tests to INFO so that we have more
 informative test output.

* Make retry logic for Jupyter notebook execution specific to the error
message we are looking for to prevent false positives in the retry logic.

rm wrong infertype for AdaptiveAvgPool and BilinearReisze2D (apache#12098)

Document MXNET_LIBRARY_PATH environment variable which was not documented explicitly. (apache#12074)

Generalized reshape_like operator (apache#11928)

* first commit

* fix documentation

* changed static_cast<bool>(end) to end.has_value()
fixed documentation issues

* change begin from int to optional

* test None as lhs

fix cython nnvm include path (apache#12133)

CI scripts refinements. Separate Py2 and Py3 install scripts. Fix perms. (apache#12125)

 zipfian random sampler without replacement  (apache#12113)

* code compiles

* update doc

* fix bug and add test

* fix lint

update dmlc-core (apache#12129)

Fix quantized graphpass bug (apache#11937)

* fix quantized graphpass bug

* add residual quantization testcase

* handle dtype and backend issues

support selu activation function (apache#12059)

Fix flaky test test_operator_gpu:deformable_conv and deformable_psroi_pooling (apache#12070)

[MXNET-767] Fix flaky test for kl_loss (apache#11963)

* Fix flaky test for kl_loss

* remove comment.

[MXNET-788] Fix for issue apache#11733 pooling op test (apache#12067)

* added support to check_consistency function to generate random numbers for a specific datatype (ie. fp16)
this ensures that for tests that compare results among different precisions, that data is generated in the least precise type and casted to the most precise

changed test_pooling_with_type test case to specify fp16 precision for random input data
renamed the 2nd test_pooling_with_type function to test_pooling_with_type2 so it doesnt redefine the first and both are tested

fixed equation formatting issue in pooling operator description

Added myself to the contributors readme file

* updated from latest in master (had old version of the file)

* shortened lines per lint spec

* renamed default_type argument to rand_type for clarity
updated function docstring with argument description

removed rand_type setting for non-max pooling tests

* cleaned up check_consistency function docstring
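The idea behind the check_consistency change above (generate random data in the least precise type, then cast up) can be illustrated with a small standard-library sketch. It does not use MXNet or the real `check_consistency` signature; the helpers `to_fp16` and `random_inputs` are hypothetical names invented for this example, and `struct`'s `"e"` format stands in for an fp16 array type.

```python
import random
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def random_inputs(n, seed=0):
    """Draw n random values that are exactly representable in fp16."""
    rng = random.Random(seed)
    return [to_fp16(rng.uniform(-1.0, 1.0)) for _ in range(n)]


data = random_inputs(4)
# Casting these values up to fp32 or fp64 loses nothing, so runs at every
# precision start from identical inputs.
print(all(to_fp16(v) == v for v in data))
```

If the data were instead drawn in a wider type and cast down for the fp16 run, the inputs themselves would differ between precisions, and the comparison would mix input rounding error with the operator's own error.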

Do not show "needs to register block" warning for registered blocks. (apache#12130)

Fix precision issue of test case test_rnnrelu_bidirectional (apache#12099)

* adjust tolerance only for relu for fixing test case bug

* only adjust tolerance for test_rnnrelu_bidirectional and adjust back on test_rnnrelu_sym

Accelerate the performance of topk for CPU side (apache#12085)

* Accelerate the performance of topk for CPU side

* Add comments for the code changes

Remove unused TensorRT code (apache#12147)

Removing some python code that isn't in the current TensorRT execution paths.
This should make the code more readable and avoid potential linting errors.

Thanks to @vandanavk for pointing out the dead code and @cclauss for a quick
alternative fix.

Co-authored-by: Vandana Kannan <vandanavk@users.noreply.github.com>
Co-authored-by: cclauss <cclauss@bluewin.ch>

Disable test_io.test_CSVIter (apache#12146)

Fix RAT license checker which is broken in trunk (apache#12148)

Remove obsolete CI folder

set bind flag after bind completes (apache#12155)

Fix MXPredReshape in the c_predict_api (apache#11493)

* Fix MXPredReshape in the c_predict_api.

* Add unittest for the C predict API.

* Fix path in the test.

* Fix for Windows.

* Try again to fix for Windows.

* One more try to fix test on Windows.

* Try again with CI.

* Try importing from mxnet first if cannot find the amalgamation lib.

* Add a log message when libmxnet_predict.so is not found.

* Set specific rtol and atol values.

* Fix missing rtol and atol values.

* Empty commit.

* Try again with CI.

* One more try with CI.

* Retry CI.

[Flaky Test] Fix test_gluon_model_zoo.test_models when MXNET_MKLDNN_DEBUG=1  (apache#12069)

* reorder inputs

* use function flatten vs built-in method

* update similar array atol to 0.01

* fix reorder

* enable MXNET_MKLDNN_DEBUG in CI

* add exclude debug flag

* fix lint

* add warning log for excluded op

* retrigger

RAT check readme updated (apache#12170)

update ndarray stack Doc for apache#11925 (apache#12015)

* update ndarray stack Doc

Add worker_fn argument to multiworker function (apache#12177)

* add worker_fn argument to multiworker function

* fix pylint

Remove fixed seed for test_huber tests (apache#12169)

Removed fixed seed and increased learning rate and tolerance for test_nadam (apache#12164)

documentation changes. added full reference (apache#12153)

* documentation changes. added full reference

* fixing lint

* fixing more lint

* jenkins

* adding the coding line utf-8

Partially enable flaky test for norm operator (apache#12027)

add examples for slicing option (apache#11918)

Module predict API can accept NDArray as input (apache#12166)

* forward and predict can accept nd.array np.array

[MXNET-744] Docs build tools update (apache#11990)

[MXNET-696] Fix undefined name errors (apache#12137)

* Fix undefined name error in neural style example

* Fix import exception error

* Fix undefined name in AUCMetric

* Fix undefined name in a3c example

Fix profiler executor when memonger is used (apache#12152)

add handling for grad req type other than kNullOp for indices (apache#11983)

Fix a minor bug in deformable_im2col.cuh (apache#12060)

The function `deformable_col2im_coord` called `deformable_col2im_coord_gpu_kernel` but checked `deformable_col2im_gpu_kernel`.

[MXNet-744] Fix website build pipeline Python 3 issues (apache#12195)

* Fix website build pipeline Python 3 issues (apache#12195)

Fix MKLDNNSum cpp test failure (apache#12080)

bump timeout on Jenkins for docs/website to 120 min (apache#12199)

* bump timeout on Jenkins to 120 min

* add branches to settings using v notation; apply appropriate settings

Fixing typo in python/mxnet/symbol/image.py (apache#12194)

Fix the topk regression issue (apache#12197) (apache#12202)

* Fix the topk regression issue (apache#12197)

* Add comments

pull changes in from master
XinYao1994 pushed a commit to XinYao1994/incubator-mxnet that referenced this pull request Aug 29, 2018
Labels: Build, Doc, pr-awaiting-review (PR is waiting for code review)