Download the pre-trained models and organize them as follows:
```
code_root/
└── model/
    └── pretrained_model/
        ├── vl-bert-base-e2e.model
        ├── vl-bert-large-e2e.model
        ├── vl-bert-base-prec.model
        ├── vl-bert-large-prec.model
        ├── bert-base-uncased/
        │   ├── vocab.txt
        │   ├── bert_config.json
        │   └── pytorch_model.bin
        ├── bert-large-uncased/
        │   ├── vocab.txt
        │   ├── bert_config.json
        │   └── pytorch_model.bin
        └── resnet101-pt-vgbua-0000.model
```
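A minimal sketch (not part of the repo) for setting up and checking this layout: it creates the directories under a `model/pretrained_model/` path relative to `code_root/` and reports which of the files listed above are still missing after your downloads.

```python
# Sketch only: create the expected layout and report missing pre-trained files.
from pathlib import Path

PRETRAINED_DIR = Path("model/pretrained_model")  # relative to code_root/

EXPECTED_FILES = [
    "vl-bert-base-e2e.model",
    "vl-bert-large-e2e.model",
    "vl-bert-base-prec.model",
    "vl-bert-large-prec.model",
    "bert-base-uncased/vocab.txt",
    "bert-base-uncased/bert_config.json",
    "bert-base-uncased/pytorch_model.bin",
    "bert-large-uncased/vocab.txt",
    "bert-large-uncased/bert_config.json",
    "bert-large-uncased/pytorch_model.bin",
    "resnet101-pt-vgbua-0000.model",
]

def check_layout() -> bool:
    # Create the folders so downloads can be dropped straight in.
    (PRETRAINED_DIR / "bert-base-uncased").mkdir(parents=True, exist_ok=True)
    (PRETRAINED_DIR / "bert-large-uncased").mkdir(parents=True, exist_ok=True)
    missing = [f for f in EXPECTED_FILES if not (PRETRAINED_DIR / f).is_file()]
    for f in missing:
        print(f"missing: {PRETRAINED_DIR / f}")
    return not missing

if __name__ == "__main__":
    print("layout OK" if check_layout() else "some files are missing")
```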
| Model Name | Download Link |
| --- | --- |
| vl-bert-base-e2e | GoogleDrive / BaiduPan |
| vl-bert-large-e2e | GoogleDrive / BaiduPan |
| vl-bert-base-prec | GoogleDrive / BaiduPan |
| vl-bert-large-prec | GoogleDrive / BaiduPan |
Note: in models with the "e2e" suffix, the Fast R-CNN parameters are tuned during pre-training; in "prec" models, Fast R-CNN is kept fixed during pre-training and, for efficiency, the visual features are precomputed using bottom-up-attention.
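If you are unsure which variant a downloaded checkpoint is, a quick way to confirm it deserializes correctly is to load it on CPU and print its top-level structure. This is a sketch only; the internal key layout of the `.model` files is not documented in this README.

```python
# Sketch only: peek at a downloaded checkpoint to confirm it deserializes.
import torch

ckpt_path = "model/pretrained_model/vl-bert-base-prec.model"  # any of the four
ckpt = torch.load(ckpt_path, map_location="cpu")

if isinstance(ckpt, dict):
    # The exact keys depend on how the checkpoint was saved.
    print("top-level keys:", list(ckpt.keys()))
else:
    print("checkpoint object type:", type(ckpt))
```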
Download the following pre-trained BERT and ResNet models and place them under the `model/pretrained_model/` folder (a quick sanity check of the downloaded files is sketched after this list).
- BERT: GoogleDrive / BaiduPan
- ResNet101 pre-trained on Visual Genome: GoogleDrive / BaiduPan (converted from the Caffe model)
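As a sketch of the sanity check mentioned above (assumed helper, not from the repo): count the vocabulary entries and parse the BERT config to verify the downloaded files are intact and consistent.

```python
# Sketch only: sanity-check the downloaded BERT assets.
import json
from pathlib import Path

bert_dir = Path("model/pretrained_model/bert-base-uncased")

# Count vocab entries and read the declared vocab size from the config.
with open(bert_dir / "vocab.txt", encoding="utf-8") as f:
    vocab_entries = sum(1 for _ in f)
config = json.loads((bert_dir / "bert_config.json").read_text())

print(f"vocab entries on disk: {vocab_entries}")
print(f"vocab_size in bert_config.json: {config.get('vocab_size')}")
# The two numbers should match (30522 for the standard uncased BERT vocab).
```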