This repository contains the code for our NAACL 2022 paper:
*Generative Cross-Domain Data Augmentation for Aspect and Opinion Co-Extraction*
The training data come from three domains: Restaurant (R), Laptop (L), and Device (D).
Following previous work, we remove sentences that have no aspects or opinions when Device is the source domain.
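For illustration, that filtering rule amounts to something like the sketch below; the BIO-style tag format is an assumption, not necessarily the repo's exact data layout:

```python
# Hypothetical sketch of the filter described above; the BIO tag scheme
# ("O" = outside any aspect/opinion span) is an assumed format.
def keep_sentence(tags):
    """Keep a sentence only if it contains at least one labeled span."""
    return any(tag != "O" for tag in tags)

assert keep_sentence(["O", "B-ASP", "I-ASP"])
assert not keep_sentence(["O", "O", "O"])
```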
The in-domain corpora (used for training BERT-E) come from Yelp and Amazon reviews.
Click here to get BERT-E (BERT-Extended); the extraction code is by0i. (Please specify the directory where BERT is stored in modelconfig.py; a hypothetical sketch is shown below.)
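A minimal sketch of what that entry in modelconfig.py might look like; the variable name and paths here are placeholders, not the file's actual contents:

```python
# modelconfig.py -- hypothetical sketch; adapt the names and paths to the
# real file. Point the BERT-E entry at the directory where you unpacked
# the downloaded checkpoint.
BERT_PATHS = {
    "bert-base": "/path/to/bert-base-uncased",
    "bert-e": "/path/to/bert-extended",  # downloaded BERT-E checkpoint
}
```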
Requirements:

```
transformers==4.2.2
pytorch==1.10.0
```
- First, run the following commands to obtain the pseudo-labeled target-domain data (a sketch of the idea follows):

```bash
cd aeoe
cd ae_oe_bert_crf
bash ./run_bert_e_sdl.sh
```
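Conceptually, this step uses a source-trained tagger to label unlabeled target-domain sentences. Below is a minimal sketch of that pseudo-labeling idea, assuming a plain token-classification head (the repo itself uses a BERT-CRF tagger) and an illustrative BIO label set; the model path and label names are assumptions:

```python
import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

LABELS = ["O", "B-ASP", "I-ASP", "B-OPN", "I-OPN"]  # assumed BIO scheme

tokenizer = AutoTokenizer.from_pretrained("/path/to/bert-e")
model = AutoModelForTokenClassification.from_pretrained(
    "/path/to/bert-e", num_labels=len(LABELS)
)
model.eval()

def pseudo_label(sentence):
    """Tag one target-domain sentence with the source-trained model."""
    words = sentence.split()
    enc = tokenizer(words, is_split_into_words=True,
                    return_tensors="pt", truncation=True)
    with torch.no_grad():
        pred = model(**enc).logits[0].argmax(-1).tolist()
    # Map subword predictions back to words (first subword wins).
    tags, seen = [], set()
    for idx, wid in enumerate(enc.word_ids()):
        if wid is not None and wid not in seen:
            seen.add(wid)
            tags.append(LABELS[pred[idx]])
    return list(zip(words, tags))
```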
- Then, run the following commands to obtain the masked data (see the masking sketch after these commands):

```bash
cd ..
bash ./process_data.sh
```
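The masking step prepares BART inputs by hiding the labeled spans. A sketch of the idea, assuming each aspect/opinion span is collapsed into a single `<mask>` token (the repo's actual masking scheme may differ):

```python
MASK = "<mask>"  # BART's mask token

def mask_spans(tokens, tags):
    """Collapse each labeled B-/I- span into a single mask token."""
    out, in_span = [], False
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            out.append(MASK)          # start of an aspect/opinion span
            in_span = True
        elif tag.startswith("I-") and in_span:
            continue                  # still inside the masked span
        else:
            out.append(tok)
            in_span = False
    return " ".join(out)

print(mask_spans(["the", "battery", "is", "great"],
                 ["O", "B-ASP", "O", "B-OPN"]))
# -> "the <mask> is <mask>"
```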
- After that, we train BART for data generation (a minimal generation sketch follows):

```bash
cd ..
cd da
bash ./test.sh
bash ./post_process.sh
```
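Generation then runs the fine-tuned BART over the masked sentences to produce augmented target-style data. A minimal sketch with transformers; the model path and decoding settings are illustrative assumptions, not the scripts' actual configuration:

```python
from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("/path/to/finetuned-bart")

masked = "the <mask> is <mask>"
inputs = tokenizer(masked, return_tensors="pt")
outputs = model.generate(
    **inputs,
    num_beams=5,
    max_length=64,
    num_return_sequences=3,  # several candidates per masked sentence
)
for seq in outputs:
    print(tokenizer.decode(seq, skip_special_tokens=True))
```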
- Finally, we filter the generated data and train on it for the downstream task (a simplified filtering sketch follows):

```bash
cd ..
cd aeoe
cd ae_oe_bert_crf
bash ./run_bert_e_da_filter.sh
bash ./run_co_guess.sh
bash ./run_bert_e_da_train.sh
```
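The filter/co-guess scripts keep only generated sentences whose labels can be trusted. One simplified reading of that idea is to keep a sentence only when two taggers agree on its tags; this illustrates the agreement principle suggested by the script names, not their exact logic:

```python
def agree(tags_a, tags_b):
    """Label sequences are consistent if they match position-wise."""
    return tags_a == tags_b

def filter_generated(sentences, tagger_a, tagger_b):
    """Keep generated sentences whose labels both taggers agree on."""
    kept = []
    for sent in sentences:
        tags_a, tags_b = tagger_a(sent), tagger_b(sent)
        if agree(tags_a, tags_b):
            kept.append((sent, tags_a))  # agreed tags become pseudo labels
    return kept
```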