Faster tar reading; recipes; stats; multilingual source or target support

@thammegowda thammegowda released this 29 Oct 01:58
· 183 commits to master since this release
4380a10
  • mtdata [list|get]-recipe :: add support for recipes; adds the list-recipe and get-recipe subcommands
  • mtdata stats :: add support for viewing dataset statistics: words, chars, segs
  • FIX: URLs for the UN dev and test sets (the upstream source changed, so the URLs were updated to match)
  • Multilingual experiment support: the ISO 639-3 code mul denotes multilingual on either side, e.g. mul-eng or eng-mul
  • --dev accepts multiple datasets and merges them (useful for multilingual experiments)
  • tar files are extracted before they are read (performance improvement)
  • setup.py: version and description are read via regex
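
For readers unfamiliar with the setup.py item, here is a minimal sketch of reading a package's version and description with a regex instead of importing the package. It is not the actual mtdata code: the attribute names `__version__` and `__description__`, the helper `read_metadata`, and the sample values are all assumptions made for illustration.

```python
import re


def read_metadata(init_text: str) -> dict:
    """Pull version and description out of module source text without importing it.

    Hypothetical helper; assumes the module defines plain string literals
    named __version__ and __description__, each at the start of a line.
    """
    fields = {}
    for name in ("version", "description"):
        # Match e.g.  __version__ = "1.2.3"  (single or double quotes).
        pattern = rf"^__{name}__\s*=\s*['\"]([^'\"]+)['\"]"
        match = re.search(pattern, init_text, flags=re.MULTILINE)
        if match is None:
            raise ValueError(f"__{name}__ not found")
        fields[name] = match.group(1)
    return fields


# Made-up sample standing in for the text of a package's __init__.py.
sample = '__version__ = "1.2.3"\n__description__ = "Example package"\n'
metadata = read_metadata(sample)
```

The point of this approach is that setup.py never imports the package, so installation does not fail when the package's own dependencies are not installed yet.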