Releases · NVIDIA/sentiment-discovery
v0.3.large_batch_stable: Code necessary to reproduce results from our large batch training paper (Latest)
This release is used to reproduce results from our Large Scale LM paper.
v0.3 Release: Speed & Memory Usage improvements + PyTorch 0.5 updates
We've switched our mLSTM model to internally use PyTorch's fused LSTM cell, which provides significantly improved GPU memory usage (allowing for larger-batch training) and slightly improved speed compared to the unfused version included in earlier versions.
In order to convert any models you've trained in the past to be usable with this version, please see this issue.
We've also updated our distributed code to address the recent April 3rd changes made to PyTorch's Tensors and Variables.
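As background on the fused-cell change above: a multiplicative LSTM can be written in terms of a standard LSTM cell by computing a multiplicative intermediate state and feeding it in place of the previous hidden state. The sketch below illustrates that idea with PyTorch's `nn.LSTMCell` (whose GPU path uses the fused cell kernels); it is an illustrative approximation, not this repo's actual mLSTM implementation, and the class and parameter names are made up.

```python
import torch
import torch.nn as nn

class MultiplicativeLSTMCell(nn.Module):
    """Illustrative mLSTM cell: m_t = (W_mx x_t) * (W_mh h_{t-1}),
    then a standard LSTM cell is applied with m_t in place of h_{t-1}."""
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.wx = nn.Linear(input_size, hidden_size, bias=False)
        self.wh = nn.Linear(hidden_size, hidden_size, bias=False)
        # nn.LSTMCell dispatches to the fused LSTM cell kernels when run on GPU.
        self.lstm_cell = nn.LSTMCell(input_size, hidden_size)

    def forward(self, x, state):
        h, c = state
        m = self.wx(x) * self.wh(h)        # multiplicative intermediate state
        return self.lstm_cell(x, (m, c))   # gate computation in one fused call

# toy usage on a batch of embedded bytes (move everything to .cuda() for the fused path)
cell = MultiplicativeLSTMCell(64, 256)
x = torch.randn(8, 64)
state = (torch.zeros(8, 256), torch.zeros(8, 256))
h, c = cell(x, state)
```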
v0.2 Release: FP16, Distributed, and Usability updates
Our main goal with this release is two-fold:
- Address concerns around usability
- Update the repo with new code for FP16 and distributed training
Usability
- We've brought our training/generation code more in line with the PyTorch word language model example
- Provided a PyTorch classifier module/function for classifying sentiment from an input text tensor (see the sketch after this list)
- Provided pretrained classifiers/language models for this module
- Provided a simple standalone classifier script/example capable of classifying an input csv/json and writing results to another csv/json
- Flattened our directory structure to make code easier to find
- Put reusable PyTorch functionality (new RNN API, weight norm functionality, eventually all FP16 functionality) in its own standalone Python module to be published at a later date
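To illustrate the classifier-module idea from the list above (this is not the repo's actual API; the class name, default sizes, and byte-encoding scheme below are assumptions for the sketch): classifying sentiment from an input text tensor amounts to running a byte-level recurrent encoder over the tensor and applying a small linear head to the final hidden state.

```python
import torch
import torch.nn as nn

class SentimentClassifier(nn.Module):
    """Hypothetical sketch: byte-level recurrent encoder + linear sentiment head."""
    def __init__(self, vocab_size=256, embed_size=64, hidden_size=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_size)
        self.rnn = nn.LSTM(embed_size, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)      # single sentiment logit

    def forward(self, text_tensor):
        # text_tensor: LongTensor of byte ids, shape (batch, seq_len)
        emb = self.embed(text_tensor)
        _, (h_n, _) = self.rnn(emb)                # final hidden state summarizes the text
        return torch.sigmoid(self.head(h_n[-1]))   # probability of positive sentiment

# classify a small batch of byte-encoded strings (untrained weights, illustration only)
texts = ["this movie was great", "terrible service"]
batch = torch.zeros(len(texts), 32, dtype=torch.long)
for i, t in enumerate(texts):
    ids = torch.tensor(list(t.encode("utf-8"))[:32])
    batch[i, :len(ids)] = ids
probs = SentimentClassifier()(batch)               # shape (batch, 1)
```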
FP16 + Distributed
- FP16 optimizer wrapper for optimizing FP16 models according to our [best practices](https://github.com/NVIDIA/sentiment-discovery/blob/master/analysis/reproduction.md#fp16-training), available in `fp16/fp16.py` (see the sketch after this list)
- Lightweight distributed wrapper for all-reducing gradients across multiple GPUs with either the NCCL or Gloo backend, available in `model/distributed.py`
- Distributed worker launch script: `multiproc.py`
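The FP16 best practices referenced above boil down to keeping an FP32 master copy of the weights and scaling the loss so small gradients survive FP16's dynamic range; the distributed wrapper's job is essentially to average gradients across workers. The sketch below shows both patterns in plain PyTorch; it is illustrative only and does not mirror the actual `fp16/fp16.py` or `model/distributed.py` interfaces (the function names and fixed loss scale are assumptions).

```python
import torch
import torch.distributed as dist

def fp16_train_step(model_fp16, master_params, optimizer, loss, loss_scale=128.0):
    """Illustrative FP16 step: FP16 backward pass, FP32 master-weight update."""
    # scale the loss so small gradients don't underflow in FP16
    (loss * loss_scale).backward()

    # copy FP16 gradients into the FP32 master params and un-scale them
    for p16, p32 in zip(model_fp16.parameters(), master_params):
        if p16.grad is not None:
            p32.grad = p16.grad.detach().float() / loss_scale

    optimizer.step()            # the optimizer holds the FP32 master params
    model_fp16.zero_grad()

    # copy the updated FP32 master weights back into the FP16 model
    with torch.no_grad():
        for p16, p32 in zip(model_fp16.parameters(), master_params):
            p16.copy_(p32)

def allreduce_gradients(model, world_size):
    """Average gradients across workers (works with the NCCL or Gloo backend)."""
    for p in model.parameters():
        if p.grad is not None:
            dist.all_reduce(p.grad.data)   # sum over all workers
            p.grad.data /= world_size      # then average
```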
Main v0 release
Module updates
- Fused LSTM kernels in the mLSTM module, enabled with `fuse_lstm` flags
Model updates - improved model serialization size and options
- Gradients are no longer saved
- Saving optimizer state is optional (see the sketch after this list)
- Reloading weights trained with weight norm is more stable
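A minimal sketch of that serialization scheme (illustrative; the repo's actual checkpoint format and flag names may differ): saving only `state_dict`s keeps gradients out of the file, and optimizer state is written only when requested.

```python
import torch

def save_checkpoint(model, path, optimizer=None):
    """Save weights only (no gradients); optimizer state is optional."""
    checkpoint = {'model': model.state_dict()}
    if optimizer is not None:
        checkpoint['optimizer'] = optimizer.state_dict()
    torch.save(checkpoint, path)

def load_checkpoint(model, path, optimizer=None):
    """Restore model weights and, if provided and present, optimizer state."""
    checkpoint = torch.load(path, map_location='cpu')
    model.load_state_dict(checkpoint['model'])
    if optimizer is not None and 'optimizer' in checkpoint:
        optimizer.load_state_dict(checkpoint['optimizer'])
```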
Weight Norm/Reparameterization update
- Modified hooks to work with the fused LSTM kernel
Data updates - dataset types (csv, json, etc.) are now parsed automatically; only supervised vs. unsupervised needs to be specified
- Added loose json functionality (see the sketch after this list)
- Tested csv datasets more thoroughly
- Fixed the save names of processed results so that the original file's name stays the same
- Fixed DataParallel/DistributedDP batching of evaluation datasets
- Made it easier to specify validation/test datasets
- Made it easier to specify dataset shards
- Added negative sequence lengths for datasets.
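"Loose json" here refers to newline-delimited JSON (one object per line) rather than a single JSON array. A minimal sketch of reading such a file, with the dataset type inferred from the extension (illustrative only; the actual loaders handle this automatically, and the `text`/`label` keys below are assumptions):

```python
import csv
import json
import os

def load_records(path, text_key='text', label_key='label'):
    """Read (text, label) pairs from a csv file or a loose-json file (one JSON object per line)."""
    ext = os.path.splitext(path)[1].lower()
    records = []
    if ext == '.csv':
        with open(path, newline='') as f:
            for row in csv.DictReader(f):
                records.append((row[text_key], row.get(label_key)))
    elif ext == '.json':
        with open(path) as f:
            for line in f:
                if line.strip():               # skip blank lines
                    obj = json.loads(line)     # each non-blank line is a complete JSON object
                    records.append((obj[text_key], obj.get(label_key)))
    else:
        raise ValueError('unsupported dataset type: %s' % ext)
    return records
```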