Deep Learning - The Straight Dope

Abstract

This repo contains an incremental sequence of notebooks designed to teach deep learning, MXNet, and the gluon interface. Our goal is to leverage the strengths of Jupyter notebooks to present prose, graphics, equations, and code together in one place. If we're successful, the result will be a resource that could be simultaneously a book, course material, a prop for live tutorials, and a resource for plagiarising (with our blessing) useful code. To our knowledge there's no source out there that either (1) teaches the full breadth of concepts in modern deep learning or (2) interleaves an engaging textbook with runnable code. We'll find out by the end of this venture whether or not that void exists for a good reason.

Another unique aspect of this book is its authorship process. We are developing this resource fully in the public view and are making it available for free in its entirety. While the book has a few primary authors to set the tone and shape the content, we welcome contributions from the community and hope to coauthor chapters and entire sections with experts and community members. Already we've received contributions ranging from typo corrections to full working examples.

Implementation with Apache MXNet

Throughout this book, we rely upon MXNet to teach core concepts, advanced topics, and a full complement of applications. MXNet is widely used in production environments owing to its strong reputation for speed. Now with gluon, MXNet's new imperative interface (alpha), doing research in MXNet is easy.

Dependencies

To run these notebooks, you'll want to build MXNet from source. Fortunately, this is easy (especially on Linux) if you follow these instructions. You'll also want to install Jupyter and use Python 3 (because it's 2017).
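
Before opening the notebooks, it can help to confirm that the interpreter and packages mentioned above are visible to Python. The snippet below is a minimal sketch using only the standard library; the helper name `check_environment` is hypothetical, and the top-level module names `mxnet` and `jupyter` are assumptions about how your installation exposes those packages.

```python
import importlib.util
import sys


def check_environment(required=("mxnet", "jupyter")):
    """Report whether Python 3 is in use and whether each package can be located.

    The default module names are assumptions; adjust them to your setup.
    """
    report = {"python3": sys.version_info >= (3,)}
    for name in required:
        # find_spec returns None when a top-level module cannot be located
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for item, ok in check_environment().items():
        print("{}: {}".format(item, "found" if ok else "MISSING"))
```

A `MISSING` entry only means the module could not be located from the current interpreter; it says nothing about whether a build from source succeeded elsewhere on the machine.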

Slides

The authors (& others) are increasingly giving talks that are based on the content in this book. Some of these slide decks (like the 6-hour KDD 2017 deck) are gigantic, so we're collecting them separately in this repo. Contribute there if you'd like to share tutorials or course material based on this book.

Translation

As we write the book, large stable sections are simultaneously being translated into 中文, available in a web version and via GitHub source.

Table of contents

Part 1: Deep Learning Fundamentals

Chapter 1: Crash course
Chapter 2: Introduction to supervised learning
Chapter 3: Deep neural networks (DNNs)
- Multilayer perceptrons (from scratch)
- Multilayer perceptrons (with gluon)
- Dropout regularization (from scratch)
- Dropout regularization (with gluon)
- Introduction to gluon.Block and gluon.nn.Sequential()
- Writing custom layers with gluon.Block
- Serialization: saving and loading models
- Advanced Data IO
- Debugging your neural networks
Chapter 4: Convolutional neural networks (CNNs)
Chapter 5: Recurrent neural networks (RNNs)
- Simple RNNs (from scratch)
- LSTM RNNs (from scratch)
- GRUs (from scratch)
- RNNs (with gluon)
- Roadmap Dropout for recurrent nets
- Roadmap Zoneout regularization
Chapter 6: Optimization
- Introduction to optimization
- Gradient descent and stochastic gradient descent
- SGD with Momentum
- Roadmap AdaGrad
- Roadmap RMSProp
- Roadmap Adam
- Roadmap AdaDelta
- Roadmap SGLD / SGNHT
Chapter 7: Distributed & high-performance learning
- Fast & flexible: combining imperative & symbolic nets with HybridBlocks
- Training with multiple GPUs (from scratch)
- Training with multiple GPUs (with gluon)
- Training with multiple machines
- Roadmap Asynchronous SGD
- Roadmap Elastic SGD

Part 2: Applications

Chapter 8: Computer vision (CV)
- Roadmap Network of networks (inception & co)
- Roadmap Residual networks
- Object detection
- Roadmap Fully-convolutional networks
- Roadmap Siamese (conjoined?) networks
- Roadmap Embeddings (pairwise and triplet losses)
- Roadmap Inceptionism / visualizing feature detectors
- Roadmap Style transfer
- Visual-question-answer
- Fine-tuning
Chapter 9: Natural language processing (NLP)
- Roadmap Word embeddings (Word2Vec)
- Roadmap Sentence embeddings (SkipThought)
- Roadmap Sentiment analysis
- Roadmap Sequence-to-sequence learning (machine translation)
- Roadmap Sequence transduction with attention (machine translation)
- Roadmap Named entity recognition
- Roadmap Image captioning
- Tree-LSTM for semantic relatedness
Chapter 10: Audio processing
- Roadmap Intro to automatic speech recognition
- Roadmap Connectionist temporal classification (CTC) for unaligned sequences
- Roadmap Combining static and sequential data
Chapter 11: Recommender systems
- Introduction to recommender systems
- Roadmap Latent factor models
- Roadmap Deep latent factor models
- Roadmap Bilinear models
- Roadmap Learning from implicit feedback
Chapter 12: Time series
- Roadmap Forecasting
- Roadmap Modeling missing data
- Roadmap Combining static and sequential data

Part 3: Advanced Methods

Chapter 13: Unsupervised learning
- Roadmap Introduction to autoencoders
- Roadmap Convolutional autoencoders (introduce upconvolution)
- Roadmap Denoising autoencoders
- Roadmap Variational autoencoders
- Roadmap Clustering
Chapter 14: Generative adversarial networks (GANs)
- Introduction to GANs
- Deep convolutional GANs (DCGANs)
- Roadmap Wasserstein-GANs
- Roadmap Energy-based GANs
- Roadmap Conditional GANs
- Image transduction GANs (Pix2Pix)
- Roadmap Learning from Synthetic and Unsupervised Images
Chapter 15: Adversarial learning
- Roadmap Two Sample Tests
- Roadmap Finding adversarial examples
- Roadmap Adversarial training
Chapter 16: Tensor Methods
- Introduction to tensor methods
- Roadmap Tensor decomposition
- Roadmap Tensorized neural networks
Chapter 17: Deep reinforcement learning (DRL)
- Roadmap Introduction to reinforcement learning
- Roadmap Deep contextual bandits
- Deep Q-networks
- Roadmap Policy gradient
- Roadmap Actor-critic gradient
Chapter 18: Variational methods and uncertainty
- Roadmap Dropout-based uncertainty estimation (BALD)
- Weight uncertainty (Bayes by Backprop)
- Roadmap Variational autoencoders

Appendices

Appendix 1: Cheatsheets
- Roadmap gluon
- Roadmap PyTorch to MXNet
- Roadmap TensorFlow to MXNet
- Roadmap Keras to MXNet
- Roadmap Math to MXNet

Choose your own adventure

We've designed these tutorials so that you can traverse the curriculum in more than one way.

Anarchist - Choose whatever you want to read, whenever you want to read it.
Imperialist - Proceed through all tutorials in order. In this fashion you will first see each model built from scratch, with all the code written by hand except for the basic linear algebra primitives and automatic differentiation.
Capitalist - If you don't care how things work (or already know) and just want to see working code in gluon, you can skip the (from scratch!) tutorials and go straight to the production-like code using the high-level gluon front end.

Authors

This evolving creature is a collaborative effort (see contributors tab). The lead writers, assimilators, and coders include:

Zachary C. Lipton (@zackchase)
Mu Li (@mli)
Alex Smola (@smolix)
Sheng Zha (@szha)
Aston Zhang (@astonzhang)
Joshua Z. Zhang (@zhreshold)
Eric Junyuan Xie (@piiswrong)
Jean Kossaifi (@JeanKossaifi)
Stephan Rabanser (@steverab)

Inspiration

In creating these tutorials, we've drawn inspiration from some of the resources that allowed us to learn deep / machine learning with other libraries in the past. These include:

Contribute

Already, in the short time this project has been off the ground, we've gotten some helpful PRs from the community with pedagogical suggestions, typo corrections, and other useful fixes. If you're inclined, please contribute!
