PyMEANT

PyMEANT is a proof-of-concept Python implementation of a simplified version of the MEANT machine translation evaluation metric presented in Lo et al. (2012). It was originally submitted as a final course project for the Machine Translation class at Johns Hopkins University in spring 2014. You may wish to read the project writeup for more details.

Caveats

Before using PyMEANT, please note the following:

  • PyMEANT is an unoptimized pure-Python implementation, and as a result can be very slow even on modest data sets.

  • Predicate and argument weighting are not implemented. Thus, PyMEANT's results cannot be directly compared with MEANT's.

  • Jaccard similarity is used as the lexical similarity measurement, as described in Tumuluru et al. (2012), instead of the MinMax-MI metric outlined in the original paper.
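For readers unfamiliar with the measure named in the last caveat, here is a minimal sketch of Jaccard similarity in its textbook set form. It is not PyMEANT's own code: the function name `jaccard_similarity` and the example context-word sets are hypothetical, and how PyMEANT actually builds the sets it compares is not described here.

```python
def jaccard_similarity(a, b):
    """Return |A & B| / |A | B| for two iterables of hashable items."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Convention chosen for this sketch: two empty sets score 0.0.
        return 0.0
    return len(set_a & set_b) / float(len(union))


# Hypothetical sets of context words observed around two different words.
contexts_a = {"bank", "river", "water"}
contexts_b = {"bank", "money", "water"}

# Two shared items out of four distinct items overall.
print(jaccard_similarity(contexts_a, contexts_b))  # 0.5
```

The score ranges from 0 (no overlap) to 1 (identical sets), which makes it simple to compute but blind to how often each item occurs.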

Usage

To install, use setup.py:

$ python setup.py install

Before scoring translation hypotheses, you will need to train a lexical similarity model using python -m pymeant train. A parser for Gigaword corpus files is included for convenience:

$ python -m pymeant.formats.gigaword nyt199504.gz | python -m pymeant train - lexsim.pkl

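To score translations, use python -m pymeant score, passing the hypothesis and reference sentences each as a plain text file (one sentence per line) and an ASSERT-tagged parse file. The braces in the command are shell brace expansion, so hypotheses.{txt,parse} stands for hypotheses.txt hypotheses.parse: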

$ python -m pymeant score lexsim.pkl hypotheses.{txt,parse} reference.{txt,parse}

For further information, pass the --help option.
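
Since the plain text inputs hold one sentence per line, it can be worth checking that the hypothesis and reference files have the same number of lines before starting a slow scoring run. The sketch below is not part of PyMEANT; the helper names `count_lines` and `check_parallel` are hypothetical, and it assumes that line i of the hypotheses corresponds to line i of the references.

```python
import io


def count_lines(path):
    """Count the lines in a UTF-8 text file without loading it into memory."""
    with io.open(path, encoding="utf-8") as handle:
        return sum(1 for _ in handle)


def check_parallel(hyp_path, ref_path):
    """Return the shared line count, or raise ValueError if the counts differ.

    Assumption: the two files are meant to be aligned line by line.
    """
    hyp_count = count_lines(hyp_path)
    ref_count = count_lines(ref_path)
    if hyp_count != ref_count:
        raise ValueError(
            "line count mismatch: %s has %d lines, %s has %d"
            % (hyp_path, hyp_count, ref_path, ref_count)
        )
    return hyp_count
```

With the file names from the command above, `check_parallel("hypotheses.txt", "reference.txt")` would return the number of sentence pairs or fail early with a readable error.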

References

  • Chi-kiu Lo, Anand Karthik Tumuluru, and Dekai Wu. 2012. Fully automatic semantic MT evaluation. In Proceedings of the Seventh Workshop on Statistical Machine Translation, pages 243–252. Association for Computational Linguistics.

  • Anand Karthik Tumuluru, Chi-kiu Lo, and Dekai Wu. 2012. Accuracy and robustness in measuring the lexical similarity of semantic role fillers for automatic semantic MT evaluation. In Proceedings of the 26th Pacific Asia Conference on Language, Information and Computation, pages 574–581.