AmericasNLP 2021 Shared Task on Open Machine Translation

This is the official repository for the AmericasNLP 2021 Shared Task on Open Machine Translation. All scripts have been tested with Python 3.8.5, and requirements will be updated accordingly.

An example of data in the shared task's format can be found in pilot_data/, and evaluate.py illustrates the metrics and evaluation that will be applied to submitted MT systems.
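To give a feel for how a system output might be scored against references, here is a minimal sketch. It is not the code in evaluate.py: the metric (a simplified character n-gram F-score), the function names, and the assumption that hypotheses and references are line-aligned lists of strings are all illustrative choices, so refer to evaluate.py for the metrics actually used in the shared task.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count the character n-grams of length n in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def char_ngram_fscore(hypothesis, reference, max_n=3, beta=2.0):
    """Simplified character n-gram F-beta score, averaged over n = 1..max_n.

    Illustrative only; this is an assumption, not the official metric.
    """
    scores = []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # one of the strings is shorter than n characters
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precision = overlap / sum(hyp.values())
        recall = overlap / sum(ref.values())
        if precision + recall == 0:
            scores.append(0.0)
            continue
        b2 = beta ** 2
        scores.append((1 + b2) * precision * recall / (b2 * precision + recall))
    return sum(scores) / len(scores) if scores else 0.0


def corpus_score(hypotheses, references):
    """Mean sentence-level score over line-aligned hypotheses and references."""
    if len(hypotheses) != len(references):
        raise ValueError("hypotheses and references must have the same length")
    if not hypotheses:
        return 0.0
    pairs = zip(hypotheses, references)
    return sum(char_ngram_fscore(h, r) for h, r in pairs) / len(hypotheses)
```

With this sketch, an output identical to its reference scores 1.0 and an output sharing no characters with its reference scores 0.0. Setting beta above 1 weights recall more heavily than precision.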

Data sources

If you use one or more of the datasets included in this repository, please cite each of the original papers.

Nahuatl: Gutierrez-Vasques, X., Sierra, G., & Pompa, I. H. (2016). Axolotl: a Web Accessible Parallel Corpus for Spanish-Nahuatl. In LREC.
Hñähñu online corpus: https://tsunkua.elotl.mx/about/
Wixarika: Mager, M., Carrillo, D., & Meza, I. (2018). Probabilistic finite-state morphological segmenter for Wixarika (Huichol) language. Journal of Intelligent & Fuzzy Systems, 34(5), 3081-3087.
Guaraní: Chiruzzo, L., Amarilla, P., Ríos, A., & Lugo, G. G. (2020, May). Development of a Guarani-Spanish Parallel Corpus. In Proceedings of The 12th Language Resources and Evaluation Conference (pp. 2629-2633).
Bribri: Feldman, I., & Coto-Solano, R. (2020, December). Neural Machine Translation Models with Back-Translation for the Extremely Low-Resource Indigenous Language Bribri. In Proceedings of the 28th International Conference on Computational Linguistics (pp. 3965-3976).
Quechua: Agić, Ž., & Vulić, I. (2019). JW300: A wide-coverage parallel corpus for low-resource languages. In ACL 2019.
Aymara (GlobalVoices): Tiedemann, J. (2012, May). Parallel Data, Tools and Interfaces in OPUS. In LREC (Vol. 2012, pp. 2214-2218).
Shipibo-konibo: Galarreta, A. P., Melgar, A., & Oncevay-Marcos, A. (2017, September). Corpus Creation and Initial SMT Experiments between Spanish and Shipibo-konibo. In RANLP (pp. 238-244).
