German version of the Stanford Natural Language Inference (SNLI) data set, machine-translated using DeepL.
The training set has been downsampled to 100 000 examples, while the development and test sets were kept at their original sizes of 10 000 examples each. The format is similar to that of the original SNLI data set, which can be found at https://nlp.stanford.edu/projects/snli/. The constituency parses were created using the Stanford CoreNLP parser.
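Since the format follows the original SNLI release, a reader along the following lines may be a useful starting point. This is a minimal sketch, not a loader shipped with the data set: it assumes JSON Lines input with the original SNLI field names (`gold_label`, `sentence1`, `sentence2`), which may differ in this German version, and the names `NLIExample` and `read_snli_jsonl` as well as the two sample records are hypothetical and only serve as illustration. As in the original SNLI, pairs whose gold label is `-` (no annotator consensus) are skipped.

```python
import json
from typing import Iterable, Iterator, NamedTuple


class NLIExample(NamedTuple):
    """One premise/hypothesis pair with its gold label (hypothetical helper type)."""
    premise: str
    hypothesis: str
    label: str


def read_snli_jsonl(lines: Iterable[str]) -> Iterator[NLIExample]:
    """Yield examples from SNLI-style JSON Lines.

    Assumes the original SNLI field names; adjust them if the
    German files use different keys.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        record = json.loads(line)
        label = record.get("gold_label", "-")
        if label == "-":
            continue  # "-" marks pairs without a consensus label in SNLI
        yield NLIExample(record["sentence1"], record["sentence2"], label)


# Made-up sample records for illustration only (not taken from the data set).
sample = [
    '{"gold_label": "contradiction", "sentence1": "Ein Mann schläft.", "sentence2": "Ein Mann rennt."}',
    '{"gold_label": "-", "sentence1": "Eine Frau liest.", "sentence2": "Eine Frau isst."}',
]
examples = list(read_snli_jsonl(sample))
```

To read from disk, pass an open file object instead of the list, for example `read_snli_jsonl(open(path, encoding="utf-8"))`, where `path` points to whichever split file you downloaded.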
If you use this data set in your research, please cite the following paper:
@inproceedings{cidm2019,
author = {Sifa, Rafet and Pielka, Maren and Ramamurthy, Rajkumar and Ladi, Anna and Hillebrand, Lars and Bauckhage, Christian},
year = {2019},
title = {Towards Contradiction Detection in German: A Translation-driven Approach},
booktitle = {Proc. of IEEE SSCI 2019},
}