EVE Series: Encoder-Free VLMs from BAAI

  • EVEv1 - Unveiling Encoder-Free Vision-Language Models (NeurIPS 2024, 2024/09)

  • EVEv2 - EVEv2: Improved Baselines for Encoder-Free Vision-Language Models (arXiv 2025, 2025/02)

💡 Motivation

  • Can we remove the vision encoder from VLMs?

  • How to transfer an LLM to an encoder-free VLM efficiently and stably?

  • How to bridge the performance gap between encoder-free and encoder-based VLMs?

📜 News

[2025/02/09] 🔥🔥🔥 The paper, weights, and code of EVEv2 are released! 💥
[2024/09/26] Our EVE has been accepted to NeurIPS 2024 as a spotlight! 💥
[2024/06/18] The paper, weights, and code of EVE are released! 💥

💡 Highlights

  • 🔥 Superior Capability: An encoder-free LVLM built from scratch that supports arbitrary image aspect ratios, outperforming its encoder-free counterparts and approaching existing modular encoder-based LVLMs.

  • 🔥 Data Efficiency: We filter and recaption fewer than 100M publicly available samples from OpenImages, SAM, LAION, and Datacomp for pre-training.

  • 🔥 Pioneering Route: We attempt to provide an efficient, transparent, and practical training strategy and procedure for developing a pure decoder-only architecture across modalities; a minimal sketch of the idea follows this list.
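
Below is a minimal, hypothetical PyTorch sketch of what "encoder-free" means in practice: raw image patches pass through a lightweight patch-embedding layer and join the text tokens in a single decoder-only transformer, with no separate pretrained vision encoder. All module names and dimensions here are illustrative assumptions, not EVE's actual implementation.

import torch
import torch.nn as nn

class EncoderFreeVLMSketch(nn.Module):
    """One decoder-only transformer consumes image patches and text tokens."""
    def __init__(self, vocab_size=32000, dim=512, patch=16, layers=4, heads=8):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, dim)
        # A lightweight patch embedding stands in for a full vision encoder.
        self.patch_emb = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        block = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.blocks = nn.TransformerEncoder(block, num_layers=layers)
        self.lm_head = nn.Linear(dim, vocab_size)

    def forward(self, image, text_ids):
        # Turn the image into a patch-token sequence: (B, N_patches, dim).
        vis = self.patch_emb(image).flatten(2).transpose(1, 2)
        txt = self.token_emb(text_ids)
        seq = torch.cat([vis, txt], dim=1)  # one unified multimodal sequence
        # A causal mask makes the stack behave as a decoder-only transformer.
        n = seq.size(1)
        causal = torch.triu(torch.full((n, n), float("-inf")), diagonal=1)
        h = self.blocks(seq, mask=causal)
        return self.lm_head(h[:, vis.size(1):])  # logits over text positions

model = EncoderFreeVLMSketch()
image = torch.randn(1, 3, 224, 224)        # any size divisible by the patch works
text_ids = torch.randint(0, 32000, (1, 12))
print(model(image, text_ids).shape)        # torch.Size([1, 12, 32000])

Because the patch embedding is just a strided convolution, the same forward pass accepts images of arbitrary aspect ratio, which is one ingredient behind the flexibility claim above.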

✒️ Citation

If EVE is helpful for your research, please consider giving it a star ⭐ and a citation 📝:

@article{diao2024EVE,
  title={Unveiling Encoder-Free Vision-Language Models},
  author={Diao, Haiwen and Cui, Yufeng and Li, Xiaotong and Wang, Yueze and Lu, Huchuan and Wang, Xinlong},
  journal={arXiv preprint arXiv:2406.11832},
  year={2024}
}
@article{diao2025EVEv2,
  title={EVEv2: Improved Baselines for Encoder-Free Vision-Language Models},
  author={Diao, Haiwen and Li, Xiaotong and Cui, Yufeng and Wang, Yueze and Deng, Haoge and Pan, Ting and Wang, Wenxuan and Lu, Huchuan and Wang, Xinlong},
  journal={arXiv preprint arXiv:2502.06788},
  year={2025}
}

📄 License

The content of this project itself is licensed under the terms of the LICENSE file.