Skip to content

CMMMU-Benchmark/CMMMU

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

16 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

CMMMU

🌐 Homepage | 🤗 Paper | 📖 arXiv | 🤗 Dataset | 🏆 EvalAI | GitHub

This repo contains the evaluation code for the paper "CMMMU: A Chinese Massive Multi-discipline Multimodal Understanding Benchmark"

News

🔥[2024-04-26]: Thanks to the support of VLMEvalKit team, now everyone can use VLMEvalKit to easily conduct evaluations!

🔥[2024-03-14]: Thanks to the support of lmms-eval team, now everyone can use lmms-eval to easily conduct evaluations!

🌟[2024-03-06]: Our evaluation server for the test set is now available on EvalAI. We welcome all submissions and look forward to your participation! 😆

Introduction

CMMMU includes 12k manually collected multimodal questions from college exams, quizzes, and textbooks, covering six core disciplines: Art & Design, Business, Science, Health & Medicine, Humanities & Social Science, and Tech & Engineering, like its companion, MMMU. These questions span 30 subjects and comprise 39 highly heterogeneous image types, such as charts, diagrams, maps, tables, music sheets, and chemical structures.

Alt text

Evaluation

Please refer to our eval folder for more details.

🏆 Mini-Leaderboard

Model Val (900) Test (11K)
GPT-4V(ision) (Playground) 42.5 43.7
Qwen-VL-PLUS* 39.5 36.8
Yi-VL-34B 36.2 36.5
Yi-VL-6B 35.8 35.0
InternVL-Chat-V1.1* 34.7 34.0
Qwen-VL-7B-Chat 30.7 31.3
SPHINX-MoE* 29.3 29.5
InternVL-Chat-ViT-6B-Vicuna-7B 26.4 26.7
InternVL-Chat-ViT-6B-Vicuna-13B 27.4 26.1
CogAgent-Chat 24.6 23.6
Emu2-Chat 23.8 24.5
Chinese-LLaVA 25.5 23.4
VisCPM 25.2 22.7
mPLUG-OWL2 20.8 22.2
Frequent Choice 24.1 26.0
Random Choice 21.6 21.6

*: results provided by the authors.

Disclaimers

The guidelines for the annotators emphasized strict compliance with copyright and licensing rules from the initial data source, specifically avoiding materials from websites that forbid copying and redistribution. Should you encounter any data samples potentially breaching the copyright or licensing regulations of any site, we encourage you to contact us. Upon verification, such samples will be promptly removed.

Contact

Citation

BibTeX:

@article{zhang2024cmmmu,
        title={CMMMU: A Chinese Massive Multi-discipline Multimodal Understanding Benchmark},
        author={Ge, Zhang and Xinrun, Du and Bei, Chen and Yiming, Liang and Tongxu, Luo and Tianyu, Zheng and Kang, Zhu and Yuyang, Cheng and Chunpu, Xu and Shuyue, Guo and Haoran, Zhang and Xingwei, Qu and Junjie, Wang and Ruibin, Yuan and Yizhi, Li and Zekun, Wang and Yudong, Liu and Yu-Hsuan, Tsai and Fengji, Zhang and Chenghua, Lin and Wenhao, Huang and Jie, Fu},
        journal={arXiv preprint arXiv:2401.20847},
        year={2024},
      }

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages