Mrpatekful/translate

Pytorch-MT

Introduction

Pytorch-MT is a framework for creating and evaluating neural machine translation algorithms. The project aims to provide a general interface for experimenting with different methods, while emphasizing the implementation of one particular translation approach.

Data

Training these machine translation models requires large text corpora. The provided preprocessing scripts were tested on the WMT-2014 English and French corpora. generate.py contains the sequence of functions that create the data, along with the vocabulary and alignment files required by the already implemented translation experiment.
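
As a rough illustration of what a vocabulary file typically encodes, the sketch below maps tokens to integer ids by frequency. It is not taken from generate.py: the function name build_vocabulary, the special tokens, and the whitespace tokenization are all assumptions made for this example.

```python
from collections import Counter


def build_vocabulary(sentences, max_size=None,
                     specials=("<PAD>", "<UNK>", "<SOS>", "<EOS>")):
    """Map tokens to integer ids: special tokens first, then by frequency.

    Hypothetical helper for illustration; the project's generate.py may
    tokenize and order tokens differently.
    """
    # Count whitespace-separated tokens over the whole corpus.
    counts = Counter(token for sentence in sentences
                     for token in sentence.split())
    # Sort by descending frequency, then alphabetically for stable ids.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    if max_size is not None:
        ordered = ordered[:max_size]
    vocab = {token: index for index, token in enumerate(specials)}
    for token, _ in ordered:
        if token not in vocab:
            vocab[token] = len(vocab)
    return vocab


corpus = ["the cat sat", "the dog sat down"]
vocab = build_vocabulary(corpus)
# Tokens missing from the vocabulary fall back to the <UNK> id.
ids = [vocab.get(token, vocab["<UNK>"]) for token in "the bird sat".split()]
```

Sorting ties alphabetically keeps the ids reproducible between runs, which matters when a trained model is later reloaded with a regenerated vocabulary.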

Usage

The entry point of the application is the main.py file.

After constructing the experiment configuration files (described on this page), start training with python main.py <config> -t. If training is interrupted, it resumes from the latest epoch; after a non-memory-related error, it continues from the latest saved state within the epoch.

Running python main.py <config> -t -c always starts training from an untrained state and deletes all previous state and output files.
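
The difference between resuming and a clean start can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual checkpointing code: the file name state.json, the helper names, and the JSON format are hypothetical, and a real training loop would also store model and optimizer weights.

```python
import json
import shutil
import tempfile
from pathlib import Path


def load_or_init_state(output_dir, clear=False):
    """Return the saved training state, or a fresh one.

    With clear=True (mirroring the -c flag), all previous state and
    output files in output_dir are deleted first.
    """
    output_dir = Path(output_dir)
    if clear and output_dir.exists():
        shutil.rmtree(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    checkpoint = output_dir / "state.json"  # hypothetical file name
    if checkpoint.exists():
        return json.loads(checkpoint.read_text())
    return {"epoch": 0, "step": 0}


def save_state(output_dir, state):
    """Persist the training state so a later run can resume from it."""
    (Path(output_dir) / "state.json").write_text(json.dumps(state))


with tempfile.TemporaryDirectory() as workdir:
    run_dir = Path(workdir) / "experiment"
    save_state_target = load_or_init_state(run_dir)      # fresh state
    save_state(run_dir, {"epoch": 3, "step": 120})
    resumed = load_or_init_state(run_dir)                # resumes at epoch 3
    restarted = load_or_init_state(run_dir, clear=True)  # back to epoch 0
```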

After the model has been trained for a sufficient number of steps, it can be tested with python main.py <config> --test or evaluated with python main.py <config> --evaluate.
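
For readers who want to see how the flags above fit together, here is a hypothetical argparse sketch. Only the flag names -t, -c, --test and --evaluate come from this README; the dest names, the help texts, the assumption that the three modes are mutually exclusive, and the file name experiment.json are illustrative and may not match main.py.

```python
import argparse


def build_parser():
    """Build a parser that accepts the command lines shown in this README."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("config",
                        help="path to the experiment configuration file")
    # Assumption: exactly one mode is chosen per invocation.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("-t", dest="train", action="store_true",
                      help="train the model")
    mode.add_argument("--test", action="store_true",
                      help="test the trained model")
    mode.add_argument("--evaluate", action="store_true",
                      help="evaluate the trained model")
    parser.add_argument("-c", dest="clear", action="store_true",
                        help="discard previous state and output files")
    return parser


# Equivalent to: python main.py experiment.json -t -c
args = build_parser().parse_args(["experiment.json", "-t", "-c"])
```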

To visualize the outputs, see the model-evaluation notebook.

Dependencies

Documentation

For more detailed information, see the Wiki or the Documentation.
