Add MultipanelVQA and POPE vision-language scenarios #2517
@ImKeTT Thanks for adding these! I had a few minor comments. Could you also add the conf file you used to run in the PR description?
Thanks for reviewing, @teetone! I've reframed POPE as an MCQA task and added more detailed descriptions for these two scenarios. Here are the configuration files I used for this PR.
For POPE
Thanks @ImKeTT! could you address one last comment in |
Hello, this PR adds two vision-language scenarios to VHELM: MultipanelVQA from https://arxiv.org/abs/2401.15847 and the POPE benchmark from https://aclanthology.org/2023.emnlp-main.20/.
There are two `subjects` (synthetic or real-world) and two `question_type` values (multiple-choice or open) for MultipanelVQA. I use `get_short_answer_generation_adapter_spec` for open-ended generation and `get_multiple_choice_joint_adapter_spec` for multiple-choice questions. For both scenarios, I use `get_exact_match_metric_specs` for evaluation.

Here's a screenshot after running `./pre-commit.sh`:
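
For reviewers, the adapter choice described above can be sketched as a small lookup. This is only an illustration and not the code in this PR: the helper names are the ones mentioned above but are used purely as string labels (nothing is imported from HELM), the literal values `"open"`, `"multiple-choice"`, `"synthetic"`, and `"real-world"` are assumed spellings, and `choose_adapter` and `exact_match` are hypothetical stand-ins.

```python
from itertools import product

# Helper names come from the PR description; here they are plain string labels,
# not imports from HELM. The key spellings are assumptions for illustration.
ADAPTER_BY_QUESTION_TYPE = {
    "open": "get_short_answer_generation_adapter_spec",
    "multiple-choice": "get_multiple_choice_joint_adapter_spec",
}

# Assumed spellings of the two MultipanelVQA subjects.
SUBJECTS = ("synthetic", "real-world")


def choose_adapter(question_type: str) -> str:
    """Return the name of the adapter helper used for a question type (hypothetical helper)."""
    try:
        return ADAPTER_BY_QUESTION_TYPE[question_type]
    except KeyError:
        raise ValueError(f"Unknown question_type: {question_type!r}") from None


def exact_match(prediction: str, reference: str) -> bool:
    """Simplified stand-in for exact-match scoring: equality after stripping whitespace."""
    return prediction.strip() == reference.strip()


# The four MultipanelVQA configurations: every subject paired with every question type.
CONFIGS = [
    (subject, question_type, choose_adapter(question_type))
    for subject, question_type in product(SUBJECTS, ADAPTER_BY_QUESTION_TYPE)
]

if __name__ == "__main__":
    for subject, question_type, adapter in CONFIGS:
        print(f"{subject:<12} {question_type:<16} {adapter}")
```

Keeping the mapping in one dictionary means an unsupported `question_type` fails loudly instead of silently falling back to one of the two adapters.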
Here are several screenshots and the `scenario_state.json` files from toy runs of the two scenarios (Qwen-VL-Chat on 25 instances):

POPE
pope_scenario_state.json
MultipanelVQA-real-world
mpvqa-real-open-scenario_state.json
mpvqa-real-mc-scenario_state.json
MultipanelVQA-synthetic
mpvqa-syn-open-scenario_state.json
mpvqa-syn-mc-scenario_state.json
Please let me know how I can improve it.
Thanks!