Correlation Networks for Extreme Multi-label Text Classification
Requirements
- python==3.6.3
- pytorch==1.2.0
- torchgpipe==0.0.5
- click==7.0
- ruamel.yaml==0.16.5
- numpy==1.16.2
- scipy==1.2.1
- scikit-learn==0.20.3
- gensim==3.7.2
- nltk==3.2.4
- tqdm==4.31.1
- joblib==0.13.2
- logzero==1.5.0
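One way to check an existing environment against the pinned versions above is to compare them with the output of `pip freeze`. The following is a minimal sketch, not part of this repository: the helper names `parse_pins` and `find_mismatches` are hypothetical, and it assumes both inputs use the `name==version` form.

```python
def parse_pins(text):
    """Return {name: version} from lines such as '- numpy==1.16.2'.

    Lines without '==' (blank lines, comments, URL installs) are skipped.
    """
    pins = {}
    for line in text.splitlines():
        # Drop surrounding whitespace and an optional leading list dash.
        line = line.strip().lstrip("-").strip()
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins, installed):
    """Return sorted (name, wanted, found) tuples where versions differ.

    `found` is None when the package is not installed at all.
    """
    return [
        (name, wanted, installed.get(name))
        for name, wanted in sorted(pins.items())
        if installed.get(name) != wanted
    ]


# Example with two of the pins listed above and a made-up environment.
wanted = parse_pins("- numpy==1.16.2\n- scipy==1.2.1\n")
found = {"numpy": "1.16.2", "scipy": "1.3.0"}  # illustrative values only
for name, want, have in find_mismatches(wanted, found):
    print(f"{name}: wanted {want}, found {have}")
```

Because `pip freeze` also prints `name==version` lines, its output can be passed through `parse_pins` to build the `installed` mapping. Two caveats: `python` itself is not a pip package, and PyTorch is published on pip as `torch`, so those two entries need to be checked by hand or renamed before comparing.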
Pretrained Word Embeddings in gensim format
Preprocess (the EUR-Lex dataset is already tokenized)
or (the other datasets must first be tokenized with NLTK)
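For the datasets that are not pre-tokenized, the NLTK step could look roughly like the sketch below. It is an illustration only and may differ from this repository's preprocessing script: the function names `tokenize_line` and `tokenize_file`, the one-document-per-line file layout, the lowercasing, and the choice of `wordpunct_tokenize` (picked here because it needs no NLTK data download) are all assumptions.

```python
import re

try:
    # Regex-based NLTK tokenizer; works without downloading NLTK data.
    from nltk.tokenize import wordpunct_tokenize
except ImportError:
    # Fallback when NLTK is not installed: the same regular expression
    # that wordpunct_tokenize is built on.
    def wordpunct_tokenize(text):
        return re.findall(r"\w+|[^\w\s]+", text)


def tokenize_line(text):
    """Lowercase one raw document and return its tokens joined by spaces."""
    return " ".join(token.lower() for token in wordpunct_tokenize(text))


def tokenize_file(raw_path, out_path):
    """Tokenize a file with one raw document per line (assumed layout)."""
    with open(raw_path, encoding="utf-8") as src, \
            open(out_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(tokenize_line(line) + "\n")


print(tokenize_line("Hello, World!"))  # hello , world !
```

Writing tokens back as space-separated text keeps the output in the same shape as a dataset that was tokenized in advance, so both kinds of input can go through the same later steps.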
Train and evaluate
The code for the baseline models is adapted from the following repositories: XML-CNN, BERT, MeSHProbeNet, and AttentionXML.