Add MMMU scenario and support multimodal multiple choice adaptation #2259
Conversation
options: List[str] = row["options"]
answer: str = row["answer"]

# Create the question. Questions can have text and images interleaved
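For context on what "interleaved" means here, below is a minimal sketch of how a question could be assembled from an MMMU row, assuming the question text contains `<image N>` placeholders and the images live in separate `image_1`..`image_7` columns. The field names and placeholder convention are assumptions for illustration, not the exact code in this PR.

```python
import re
from typing import Any, Dict, List

# Placeholder token assumed to mark where an image appears in the question text.
IMAGE_TOKEN = re.compile(r"<image (\d)>")

def build_interleaved_content(row: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Split the question text on image placeholders so text and images stay in order."""
    content: List[Dict[str, Any]] = []
    question: str = row["question"]
    last_end = 0
    for match in IMAGE_TOKEN.finditer(question):
        text = question[last_end : match.start()]
        if text.strip():
            content.append({"type": "text", "text": text})
        # Images are assumed to be stored in separate columns: image_1 ... image_7
        content.append({"type": "image", "image": row[f"image_{match.group(1)}"]})
        last_end = match.end()
    trailing = question[last_end:]
    if trailing.strip():
        content.append({"type": "text", "text": trailing})
    return content
```

Splitting on the placeholder preserves the original ordering, so an image can appear before, between, or after pieces of text, and the same approach can be applied to each answer choice.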
Oh, I am pretty sure in the PR for Llava and OpenFlamingo we assumed the image was at the top
I don't think we can evaluate Llava and OpenFlamingo with MMMU. There can be multiple images (up to 7) in the question and even in the answer choices.
@JosselinSomervilleRoberts I answered your concerns. Let me know if something is unclear!
Thanks for the changes. Just make sure to make the change when loading the Hugging Face dataset; otherwise, looks good!
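For reference, the Hugging Face loading being discussed would look roughly like the sketch below; the `MMMU/MMMU` dataset path and the per-subject config name are assumptions about how the data is published, not necessarily what this PR ends up using.

```python
from datasets import load_dataset

# Hedged sketch: "MMMU/MMMU" and the subject config (e.g. "Art") are assumptions;
# the scenario would load whichever subjects and split it actually evaluates.
dataset = load_dataset("MMMU/MMMU", "Art", split="validation")

for row in dataset:
    options = row["options"]  # answer choices (may themselves reference images)
    answer = row["answer"]    # gold label, e.g. "A"
```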
Resolves #2114, #2068