Releases: autogoal/datasets
Semeval 2023 Task 8.1
Semeval 2023 Task 8
SemEval-2023 Task 8 consists of two subtasks. Subtask 1 focuses on identifying causal claims, experiences, etc. in a provided multi-sentence (or single-sentence) text snippet, and Subtask 2 focuses on extracting the PIO frame related to the identified causal claim in the provided text snippet.
Subtask 1: Causal claim identification:
For the provided snippet of text, the first subtask aims to identify the span of text that is either a claim, experience, experience_based_claim or a question. These four categories can be defined as follows:
- Claim: Communicates a causal interaction between an intervention and an outcome.
- Experience: Relates a specific outcome/symptom to an intervention or population based on someone's experience.
- Experience based claim: A claim based on someone's experience.
- Question: Poses a question.
Participants can work at the sentence level and try to classify each sentence into one of the given classes. However, this may serve only as a baseline, because in many examples only part of a sentence is annotated as one of these categories. Please check the image below for more clarity.
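To make the difference between span-level annotation and the sentence-level baseline concrete, here is a minimal sketch. It is not the official task format: the `Span` class, the character-offset convention, the helper names and the example text are all assumptions made up for illustration. Each sentence receives the label of the annotated span that overlaps it most, or `None` if no span overlaps it.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# The four categories named in the task description.
LABELS = ("claim", "experience", "experience_based_claim", "question")


@dataclass(frozen=True)
class Span:
    """Hypothetical span annotation using character offsets into the snippet."""
    start: int  # inclusive
    end: int    # exclusive
    label: str

    def __post_init__(self):
        if self.label not in LABELS:
            raise ValueError(f"unknown label: {self.label!r}")


def make_span(text: str, fragment: str, label: str) -> Span:
    """Build a span by locating the first occurrence of `fragment` in `text`."""
    start = text.index(fragment)
    return Span(start, start + len(fragment), label)


def sentence_labels(
    sentence_bounds: List[Tuple[int, int]], spans: List[Span]
) -> List[Optional[str]]:
    """Sentence-level baseline: label each sentence with its most-overlapping span."""
    labels = []
    for s_start, s_end in sentence_bounds:
        best_label, best_overlap = None, 0
        for span in spans:
            overlap = min(s_end, span.end) - max(s_start, span.start)
            if overlap > best_overlap:
                best_label, best_overlap = span.label, overlap
        labels.append(best_label)
    return labels


# Invented example text; the first annotation covers only part of its sentence.
text = "Honestly, I took magnesium and my migraines stopped. Does it work for everyone?"
spans = [
    make_span(text, "I took magnesium and my migraines stopped", "experience"),
    make_span(text, "Does it work for everyone?", "question"),
]
first_end = text.index(". ") + 1  # end of the first sentence, period included
bounds = [(0, first_end), (first_end + 1, len(text))]
example_labels = sentence_labels(bounds, spans)
```

In this sketch the first span starts after "Honestly," so the sentence-level label is correct but loses the exact boundaries, which is why sentence classification is best treated as a baseline only.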
IMDB 50k movie reviews - Text Classification
Large Movie Review Dataset
This is a dataset for binary sentiment classification containing substantially more data than previous benchmark datasets. We provide 25,000 highly polar movie reviews for training and 25,000 for testing. There is additional unlabeled data for use as well. Both raw text and an already processed bag-of-words format are provided. Please take a look at the README file contained in the release for more details.
Large Movie Review Dataset v1.0
When using this dataset, please cite our ACL 2011 paper [bib].
bib:
@InProceedings{maas-EtAl:2011:ACL-HLT2011,
author = {Maas, Andrew L. and Daly, Raymond E. and Pham, Peter T. and Huang, Dan and Ng, Andrew Y. and Potts, Christopher},
title = {Learning Word Vectors for Sentiment Analysis},
booktitle = {Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies},
month = {June},
year = {2011},
address = {Portland, Oregon, USA},
publisher = {Association for Computational Linguistics},
pages = {142--150},
url = {http://www.aclweb.org/anthology/P11-1015}
}
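As a rough illustration of how the raw-text reviews described above could be read for binary classification, the sketch below loads one split into parallel lists of texts and labels. It assumes a directory layout of `<root>/<split>/pos/*.txt` and `<root>/<split>/neg/*.txt` with one review per file; this layout and the function name `load_split` are assumptions, so check them against the README in the release before relying on them.

```python
from pathlib import Path
from typing import List, Tuple


def load_split(root: str, split: str) -> Tuple[List[str], List[int]]:
    """Read reviews from <root>/<split>/{neg,pos}/*.txt (assumed layout).

    Returns (texts, labels) where label 0 is negative and 1 is positive.
    """
    texts, labels = [], []
    for folder, label in (("neg", 0), ("pos", 1)):
        # Sort for a deterministic order across platforms.
        for path in sorted((Path(root) / split / folder).glob("*.txt")):
            texts.append(path.read_text(encoding="utf-8"))
            labels.append(label)
    return texts, labels
```

With the full dataset, a call such as `load_split(root, "train")` should yield 25,000 labeled reviews if the layout assumption holds; the unlabeled data is deliberately ignored here.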
Contact
For comments or questions on the dataset, please get in touch with Andrew Maas. As you publish papers using the dataset, please let us know so we can post a link on this page.
Publications Using the Dataset
Andrew L. Maas, Raymond E. Daly, Peter T. Pham, Dan Huang, Andrew Y. Ng, and Christopher Potts. (2011). Learning Word Vectors for Sentiment Analysis. The 49th Annual Meeting of the Association for Computational Linguistics (ACL 2011).
Release files for MNIST (CSV format)
All data in this release belongs to its authors.
Please make sure to read the previous link for copyright, citation and redistribution info.
The authors of AutoGOAL make no claim to own the data in this release and take no responsibility for its content.
Release files for UCI Wine Quality
All data in this release belongs to its authors.
Please make sure to read the previous link for copyright, citation and redistribution info.
The authors of AutoGOAL make no claim to own the data in this release and take no responsibility for its content.
Release files for CIFAR 10
All data in this release belongs to its authors.
Please make sure to read the previous link for copyright, citation and redistribution info.
The authors of AutoGOAL make no claim to own the data in this release and take no responsibility for its content.
Release files for Yeast Corpus
All data in this release belongs to its authors.
Please make sure to read the previous link for copyright, citation and redistribution info.
The authors of AutoGOAL make no claim to own the data in this release and take no responsibility for its content.
Release files for Shuttle Corpus
All data in this release belongs to its authors.
Please make sure to read the previous link for copyright, citation and redistribution info.
The authors of AutoGOAL make no claim to own the data in this release and take no responsibility for its content.
Release files for Gisette Corpus
All data in this release belongs to its authors.
Please make sure to read the previous link for copyright, citation and redistribution info.
The authors of AutoGOAL make no claim to own the data in this release and take no responsibility for its content.
Release files for UCI German Credit
All data in this release belongs to its authors.
Please make sure to read the previous link for copyright, citation and redistribution info.
The authors of AutoGOAL make no claim to own the data in this release and take no responsibility for its content.
Release files for Dorothea Corpus
All data in this release belongs to its authors.
Please make sure to read the previous link for copyright, citation and redistribution info.
The authors of AutoGOAL make no claim to own the data in this release and take no responsibility for its content.