Dataset for TALLIP2019 paper "Ancient-Modern Chinese Translation with a New Large Training Dataset"

dayihengliu/a2m_chineseNMT

Ancient-Modern Chinese Translation with a New Large Training Dataset

This repo contains the dataset built in the following paper:

Dayiheng Liu, Kexin Yang, Qian Qu, and Jiancheng Lv. Ancient-Modern Chinese Translation with a New Large Training Dataset. TALLIP 2019. [arXiv]

Overview

We built a new large-scale Ancient-Modern Chinese parallel corpus containing 1.24M bilingual sentence pairs. To the best of our knowledge, this is the first large, high-quality Ancient-Modern Chinese dataset.

Dataset

We plan to release the dataset gradually.

The dataset can be downloaded at the [link].
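
Once downloaded, a parallel corpus like this is typically read as aligned sentence pairs. The sketch below is only an illustration: it assumes the data is stored as two line-aligned UTF-8 text files, one for the ancient side and one for the modern side. This README does not specify the file layout, so the file names and the `read_parallel_pairs` helper are hypothetical and may need adapting to the released format.

```python
from typing import Iterator, Tuple


def read_parallel_pairs(ancient_path: str, modern_path: str) -> Iterator[Tuple[str, str]]:
    """Yield (ancient, modern) sentence pairs from two line-aligned files.

    Assumption (not confirmed by this README): line i of the ancient file
    is the translation counterpart of line i of the modern file.
    """
    with open(ancient_path, encoding="utf-8") as f:
        ancient_lines = [line.rstrip("\r\n") for line in f]
    with open(modern_path, encoding="utf-8") as f:
        modern_lines = [line.rstrip("\r\n") for line in f]

    # A length mismatch means the two files are not aligned line by line.
    if len(ancient_lines) != len(modern_lines):
        raise ValueError(
            f"line count mismatch: {len(ancient_lines)} ancient vs "
            f"{len(modern_lines)} modern"
        )

    for ancient, modern in zip(ancient_lines, modern_lines):
        ancient, modern = ancient.strip(), modern.strip()
        # Skip pairs where either side is empty.
        if ancient and modern:
            yield ancient, modern


if __name__ == "__main__":
    # Hypothetical file names; replace with the paths of the downloaded files.
    for ancient, modern in read_parallel_pairs("ancient.txt", "modern.txt"):
        print(ancient, "\t", modern)
```

Checking the line counts up front makes a misaligned download fail loudly instead of silently pairing unrelated sentences.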
