This repository contains the code and dataset for our ACM MM 2023 paper: Fine-Grained Multimodal Named Entity Recognition and Grounding with a Generative Framework
Our dataset is built on the GMNER dataset.
- The preprocessed CoNLL format files are provided in this repo. For each tweet, the first line is its image id, and the following lines are its textual contents.
- Download the images associated with each tweet from this link: https://drive.google.com/file/d/1PpvvncnQkgDNeBMKVgG2zFYuRhbL873g/view
- Use VinVL to identify all the candidate objects, and put them under the folder named "twitterFMNERG_vinvl_extract36". We have uploaded the features extracted by VinVL to Google Drive and Baidu Netdisk (code: TwVi).
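As a rough illustration of the file layout described above, the sketch below reads CoNLL-style text into one record per tweet. It is not the repo's own loader (see T5_data/format_data.py for that). It assumes that tweets are separated by blank lines, that the image id line looks like `IMGID:<id>`, and that each following line holds a token and its label separated by whitespace. The function name `parse_tweets`, the sample text, and the labels in it are made up for the example.

```python
import re
from typing import Dict, List


def parse_tweets(text: str) -> List[Dict[str, object]]:
    """Split CoNLL-style text into one record per tweet."""
    tweets = []
    # Assumption: tweets are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [ln for ln in block.splitlines() if ln.strip()]
        if not lines:
            continue
        # Assumption: the first line looks like "IMGID:<id>".
        # If there is no colon, the whole line is used as the id.
        image_id = lines[0].split(":", 1)[-1].strip()
        tokens, labels = [], []
        for ln in lines[1:]:
            parts = ln.split()
            tokens.append(parts[0])
            # Assumption: the label is the last column; default to "O".
            labels.append(parts[-1] if len(parts) > 1 else "O")
        tweets.append({"image_id": image_id, "tokens": tokens, "labels": labels})
    return tweets


# Hypothetical sample, not taken from the dataset.
sample = "IMGID:12345\nKevin\tB-PER\nDurant\tI-PER\nscores\tO\n\nIMGID:678\nhello\tO\n"

for record in parse_tweets(sample):
    print(record["image_id"], record["tokens"], record["labels"])
```

The image id of each record can then be used to look up the tweet's image and its VinVL features in "twitterFMNERG_vinvl_extract36".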
- Format the data: python T5_data/format_data.py
- Train the model: sh run.sh
- Evaluate the model: sh eval.sh
- By using the dataset, you confirm that you have read and accepted the copyright terms set by Twitter and the original dataset providers.
- Some of the code is based on the VL-T5 codebase. Many thanks to its authors!