OpenSeq2Seq: toolkit for distributed and mixed precision training of sequence-to-sequence models

This is a research project, not an official NVIDIA product.

Documentation: https://nvidia.github.io/OpenSeq2Seq/

OpenSeq2Seq's main goal is to let researchers explore various sequence-to-sequence models as effectively as possible. It achieves this efficiency through full support for distributed and mixed-precision training. OpenSeq2Seq is built on TensorFlow and provides all the building blocks needed to train encoder-decoder models for neural machine translation and automatic speech recognition. We plan to extend it to other modalities in the future.
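As a rough illustration of what data-parallel distributed training means, here is a minimal pure-Python sketch. It is not OpenSeq2Seq code: the toy 1-D model and the helper names `gradient`, `allreduce_mean` and `data_parallel_step` are assumptions made for this example only. Each simulated worker computes a gradient on its own shard of the batch, and the gradients are averaged before a single shared weight update.

```python
def gradient(w, batch):
    """Gradient of the mean squared error for the toy model y = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def allreduce_mean(values):
    """Stand-in for an allreduce: every worker ends up with the average."""
    return sum(values) / len(values)


def data_parallel_step(w, batch, num_workers, lr):
    """One SGD step with the batch split evenly across simulated workers."""
    if len(batch) % num_workers != 0:
        raise ValueError("batch size must be divisible by num_workers")
    shard_size = len(batch) // num_workers
    shards = [batch[i * shard_size:(i + 1) * shard_size]
              for i in range(num_workers)]
    # Each worker computes a gradient on its own shard only.
    local_grads = [gradient(w, shard) for shard in shards]
    # Averaging the local gradients gives the full-batch gradient
    # because all shards have the same size.
    return w - lr * allreduce_mean(local_grads)


# Toy data generated from y = 3 * x.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
single = data_parallel_step(0.0, data, num_workers=1, lr=0.01)
parallel = data_parallel_step(0.0, data, num_workers=4, lr=0.01)
```

With equal-sized shards, `single` and `parallel` agree, which is why adding workers changes throughput but not the update being computed.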

Features

Sequence to sequence learning
  1. Neural Machine Translation
  2. Automatic Speech Recognition
Data-parallel distributed training
  1. Multi-GPU
  2. Multi-node
Mixed precision training for NVIDIA Volta GPUs
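Mixed precision training commonly pairs half-precision arithmetic with loss scaling, so that small gradients do not underflow to zero in fp16. The sketch below illustrates that one idea using only the Python standard library. It is not taken from OpenSeq2Seq: `to_fp16` and `unscaled_gradient` are hypothetical helpers, and the loss scale of 1024 is an arbitrary value chosen for the example.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def unscaled_gradient(true_grad, loss_scale):
    """Simulate storing a gradient in fp16 with loss scaling.

    Multiplying the loss by `loss_scale` multiplies every gradient by the
    same factor (chain rule), so the scaled gradient is what gets stored
    in fp16. It is divided by the scale again in higher precision.
    """
    stored = to_fp16(true_grad * loss_scale)
    return stored / loss_scale


tiny_grad = 1e-8  # below the smallest positive fp16 value (about 6e-8)

without_scaling = to_fp16(tiny_grad)               # underflows to 0.0
with_scaling = unscaled_gradient(tiny_grad, 1024.0)  # close to 1e-8
```

Without scaling, the gradient is lost entirely. With scaling, it survives the fp16 round trip with a small rounding error.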

Requirements

TensorFlow >= 1.7
Horovod >= 0.12.0 (optional, but highly recommended for multi-GPU setups)
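One way to check an environment against these minimums is sketched below. The helper names (`parse_version`, `meets_minimum`, `check_requirements`) are hypothetical, and the sketch assumes the packages are installed under the distribution names `tensorflow` and `horovod`. A GPU build installed under a different distribution name would be reported as not installed.

```python
from importlib import metadata

# Minimum versions from the requirements list above.
MINIMUMS = {"tensorflow": "1.7", "horovod": "0.12.0"}


def parse_version(text):
    """Turn '1.10.0rc1' into (1, 10, 0) using the leading digits of each piece."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, required):
    """Compare versions numerically, so that 1.10 counts as newer than 1.7."""
    a, b = parse_version(installed), parse_version(required)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_requirements(minimums=MINIMUMS):
    """Return a {package: status} report for the installed distributions."""
    report = {}
    for name, required in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
            continue
        report[name] = "ok" if meets_minimum(installed, required) else "too old"
    return report
```

Calling `check_requirements()` returns a dictionary with one status per package. The numeric comparison matters because a plain string comparison would rank "1.10" below "1.7".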

Acknowledgments

The speech-to-text workflow uses some parts of the Mozilla DeepSpeech project.

The text-to-text workflow uses some functions from Tensor2Tensor and the Neural Machine Translation (seq2seq) Tutorial.

Related resources

Paper

If you use OpenSeq2Seq, please cite this paper:

@article{openseq2seq,
  title={OpenSeq2Seq: extensible toolkit for distributed and mixed precision training of sequence-to-sequence models},
  author={Kuchaiev, Oleksii and Ginsburg, Boris and Gitman, Igor and Lavrukhin, Vitaly and Case, Carl and Micikevicius, Paulius},
  journal={arXiv preprint arXiv:1805.10387},
  year={2018}
}

