In this paper, we provide DuRecDial 2.0, a bilingual parallel human-to-human recommendation dialog dataset, to enable researchers to explore the challenging task of multilingual and cross-lingual conversational recommendation. DuRecDial 2.0 differs from existing conversational recommendation datasets in that each data item (Profile, Goal, Knowledge, Context, Response) is annotated in two languages, English and Chinese, whereas other datasets are built in a single-language setting. We collect 8.2k dialogs aligned across English and Chinese (16.5k dialogs and 255k utterances in total), annotated by crowdsourced workers under a strict quality control procedure. DuRecDial 2.0 provides a challenging testbed for future studies of monolingual, multilingual, and cross-lingual conversational recommendation. For a detailed introduction to DuRecDial 2.0, please refer to DuRecDial on IEEE Xplore, ACL Anthology and arXiv.
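The five-part data item described above, annotated in both languages, can be pictured with a small sketch. This is only an illustration under stated assumptions: the class names, field types, the `is_aligned` check, and the placeholder values are all hypothetical and do not reflect the actual file format of the released corpus.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class DataItem:
    """One monolingual view of a data item (hypothetical layout).

    The five fields mirror the (Profile, Goal, Knowledge, Context, Response)
    tuple named in the description; the types are assumptions.
    """
    profile: Dict[str, str]                 # seeker profile as key/value pairs
    goal: str                               # dialog goal description
    knowledge: List[Tuple[str, str, str]]   # assumed (subject, relation, object)
    context: List[str]                      # previous utterances
    response: str                           # utterance to be produced


@dataclass
class ParallelItem:
    """An English/Chinese pair of views of the same data item."""
    en: DataItem
    zh: DataItem

    def is_aligned(self) -> bool:
        # A simple sanity check: both views should have the same number of
        # context utterances and the same number of knowledge entries.
        return (len(self.en.context) == len(self.zh.context)
                and len(self.en.knowledge) == len(self.zh.knowledge))


# Placeholder values for illustration only; not taken from the corpus.
pair = ParallelItem(
    en=DataItem({"name": "A"}, "Greetings", [("s", "r", "o")], ["Hello"], "Hi"),
    zh=DataItem({"姓名": "A"}, "寒暄", [("s", "r", "o")], ["你好"], "你好"),
)
print(pair.is_aligned())
```

Keeping the two language views side by side in one object makes it easy to build monolingual, multilingual, or cross-lingual training pairs from the same record, for example by pairing `pair.en.context` with `pair.zh.response`.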
Our paper is available on ACL Anthology and arXiv. If the corpus is helpful to your research, please kindly cite our paper:
@inproceedings{liu-etal-2021-durecdial,
title = "{D}u{R}ec{D}ial 2.0: A Bilingual Parallel Corpus for Conversational Recommendation",
author = "Liu, Zeming and
Wang, Haifeng and
Niu, Zheng-Yu and
Wu, Hua and
Che, Wanxiang",
booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
month = nov,
year = "2021",
address = "Online and Punta Cana, Dominican Republic",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2021.emnlp-main.356",
doi = "10.18653/v1/2021.emnlp-main.356",
pages = "4335--4347",
}
Note: If the first goal is "Greetings/寒暄", the seeker starts the conversation; otherwise, the user starts the conversation.
An example of the conversation in DuRecDial 2.0:
DuRecDial 2.0 is an extension of DuRecDial. Specifically, we extend DuRecDial to English with the help of crowdsourced workers under a strict quality control procedure.
If DuRecDial is helpful to your research, please kindly cite our papers:
@inproceedings{liu-etal-2020-towards-conversational,
title = "Towards Conversational Recommendation over Multi-Type Dialogs",
author = "Liu, Zeming and
Wang, Haifeng and
Niu, Zheng-Yu and
Wu, Hua and
Che, Wanxiang and
Liu, Ting",
booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
month = jul,
year = "2020",
address = "Online",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2020.acl-main.98",
doi = "10.18653/v1/2020.acl-main.98",
pages = "1036--1049",
}
@ARTICLE{9699426,
author={Liu, Zeming and Zhou, Ding and Liu, Hao and Wang, Haifeng and Niu, Zheng-Yu and Wu, Hua and Che, Wanxiang and Liu, Ting and Xiong, Hui},
journal={IEEE Transactions on Knowledge and Data Engineering},
title={Graph-Grounded Goal Planning for Conversational Recommendation},
year={2023},
volume={35},
number={5},
pages={4923-4939},
doi={10.1109/TKDE.2022.3147210}
}
Apache License 2.0 and CC BY-NC-SA 4.0.
Because DuRecDial 2.0 is licensed under CC BY-NC-SA 4.0, the dataset may not be used for commercial purposes.