Bridging Background Knowledge Gaps in Translation with Automatic Explicitation

This repository contains our WikiExpl dataset, a semi-automatic collection of naturally occurring explicitations in a Wikipedia bitext corpus, annotated by human translators, from our EMNLP 2023 main conference paper (arXiv).

The JSON files contain the candidates extracted by our detection algorithm. Each candidate is annotated by three annotators, and its label is assigned by majority vote: a candidate counts as a final explicitation if two or more annotators agree. The final explicitations are listed in expl_idx_list. Where annotators marked different explicitation spans, we merge them by maximizing the span coverage.
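The labeling rule above can be sketched in a few lines of Python. This is a minimal illustration, not the code used to build the dataset: the function names, the boolean vote format, and the (start, end) character-offset span format are all assumptions, and "maximizing the span coverage" is read here as taking the widest span that covers every annotated span.

```python
def majority_label(votes, min_agree=2):
    """Return True if at least `min_agree` annotators voted for explicitation.

    `votes` is a hypothetical list of booleans, one per annotator.
    """
    return sum(1 for v in votes if v) >= min_agree


def merge_spans(spans):
    """Merge annotator spans into one span covering all of them.

    `spans` is a hypothetical list of (start, end) offsets; None means the
    annotator marked no span. This is one reading of "maximizing the span
    coverage", not necessarily the exact procedure used for WikiExpl.
    """
    marked = [s for s in spans if s is not None]
    if not marked:
        return None
    return (min(s[0] for s in marked), max(s[1] for s in marked))


# Three annotators: two mark the candidate, with slightly different spans.
votes = [True, True, False]
spans = [(10, 18), (12, 25), None]

print(majority_label(votes))  # True
print(merge_spans(spans))     # (10, 25)
```

With three annotators, requiring two agreeing votes is the same as a strict majority, so ties cannot occur.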

We provide simple tools for easy exploration:

$ python show.py

Example output: the source text is shown in green and the target text in blue. The red span in the source text marks the phrase that is explicitated in the corresponding translation, and the red span in the target text is the explicitation itself.
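For readers who want to reproduce this color scheme in their own scripts, the sketch below highlights a span with standard ANSI escape codes. It is not the implementation in show.py: the `highlight` function, the (start, end) offset format, and the example sentences are invented for illustration and are not taken from the dataset.

```python
# Standard ANSI SGR color codes; most terminals support these.
RED = "\033[31m"
GREEN = "\033[32m"
BLUE = "\033[34m"
RESET = "\033[0m"


def highlight(text, span, base_color):
    """Color `text` with `base_color`, with text[start:end] in red.

    `span` is a hypothetical (start, end) pair of character offsets.
    """
    start, end = span
    return (
        base_color + text[:start]
        + RED + text[start:end]
        + base_color + text[end:]
        + RESET
    )


# Invented example: the target adds "river" to explain the source entity.
source = "La Sambre traverse la ville."
target = "The Sambre river flows through the town."

print(highlight(source, (3, 9), GREEN))   # "Sambre" in red, rest in green
print(highlight(target, (4, 16), BLUE))   # "Sambre river" in red, rest in blue
```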

Reference

@inproceedings{han-etal-2023-auto-explicitation,
    title = "Bridging Background Knowledge Gaps in Translation with Automatic Explicitation",
    author = "Han, HyoJung  and Boyd-Graber, Jordan  and Carpuat, Marine",
    booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing",
    month = dec,
    year = "2023",
    address = "Singapore, Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://openreview.net/pdf?id=PBvSGqYCSa",
}
