MeSHProbeNet

MeSHProbeNet: a self-attentive probe net for MeSH indexing

Prerequisites

python==3.6.3
pytorch==1.2.0
torchtext==0.2.1
numpy==1.16.2
scipy==1.2.1

Input data format

Take ./toy_data/ as an example.

train.tsv: The training set, with one document per line. Each line has three tab-separated fields: content word ids separated by spaces, the journal id, and MeSH ids separated by spaces
validation.tsv: The validation set in the same format as train.tsv
vocab_w.txt: The vocabulary file for content words, where each line is content word id + '\t' + content word
vocab_j.txt: The vocabulary file for journal names, where each line is journal id + '\t' + journal name
vocab_m.txt: The vocabulary file for MeSH terms, where each line is MeSH id + '\t' + MeSH term

The validation set is optional. Vocabulary id 0 is reserved for the padding token.
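
The file layout described above can be read with a few lines of Python. This is only a minimal sketch for checking your own data files, not the loader used in dataset.py. The helper names parse_document and parse_vocab are hypothetical, and the sketch assumes that all ids are written as integers.

```python
from typing import Dict, Iterable, List, Tuple


def parse_document(line: str) -> Tuple[List[int], int, List[int]]:
    """Split one train.tsv / validation.tsv line into its three fields.

    Assumes integer ids (an assumption of this sketch, not stated in the README).
    """
    words, journal, mesh = line.rstrip("\n").split("\t")
    word_ids = [int(w) for w in words.split()]
    mesh_ids = [int(m) for m in mesh.split()]
    return word_ids, int(journal), mesh_ids


def parse_vocab(lines: Iterable[str]) -> Dict[int, str]:
    """Map id -> token for a vocab_*.txt file (one 'id<TAB>token' per line)."""
    vocab = {}
    for line in lines:
        # Split on the first tab only, so tokens containing spaces stay intact.
        idx, token = line.rstrip("\n").split("\t", 1)
        vocab[int(idx)] = token
    return vocab


# Made-up example line: four content word ids, one journal id, two MeSH ids.
word_ids, journal_id, mesh_ids = parse_document("12 7 431 7\t3\t15 208\n")
```

To check a real file, pass an open file handle to parse_vocab, or call parse_document on each line of train.tsv. A line that does not have exactly three tab-separated fields raises a ValueError, which makes malformed rows easy to spot.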

Run

Run on the toy data

python main_train.py \
  --do_save \
  --do_eval \
  --train_path ./toy_data/train.tsv \
  --dev_path ./toy_data/validation.tsv \
  --src_vocab_pt ./toy_data/vocab_w.txt \
  --jrnl_vocab_pt ./toy_data/vocab_j.txt \
  --tgt_vocab_pt ./toy_data/vocab_m.txt \
  --expt_path ./toy_data/save \
  --learning_rate 0.0025 \
  --weight_decay 5e-10

XunGuangxu/MeSHProbeNet
