InternVideo: General Video Foundation Models via Generative and Discriminative Learning (https://arxiv.org/abs/2212.03191)

Richard-61/InternVideo

InternVideo

This repo provides the official implementation of 'InternVideo: General Video Foundation Models via Generative and Discriminative Learning', by Yi Wang, Kunchang Li, Yizhuo Li, Yinan He, Bingkun Huang, Zhiyu Zhao, Jilan Xu, Yi Liu, Zun Wang, Sen Xing, Guo Chen, Junting Pan, Jiashuo Yu, Hongjie Zhang, Yali Wang, Limin Wang, and Yu Qiao.

Updates

  • Jan 18, 2023: The code of vision-language navigation is released.
  • Jan 16, 2023: The code of video question answering, zero-shot action recognition, and zero-shot multiple choice is released.
  • Jan 1, 2023: The code & model of spatio-temporal action localization are released.
  • Dec 27, 2022: The code & model of partial pretraining (VideoMAE) and downstream applications (video-text retrieval, temporal action localization, open-set action recognition, and ego4d related tasks) are released.
  • Dec 6, 2022: The technical report of InternVideo is released.
  • Sep 2, 2022: Press releases (official | 163 news | qq news).

Code & model

Citation

If this work is helpful for your research, please consider citing InternVideo.

@article{wang2022internvideo,
  title={InternVideo: General Video Foundation Models via Generative and Discriminative Learning},
  author={Wang, Yi and Li, Kunchang and Li, Yizhuo and He, Yinan and Huang, Bingkun and Zhao, Zhiyu and Zhang, Hongjie and Xu, Jilan and Liu, Yi and Wang, Zun and Xing, Sen and Chen, Guo and Pan, Junting and Yu, Jiashuo and Wang, Yali and Wang, Limin and Qiao, Yu},
  journal={arXiv preprint arXiv:2212.03191},
  year={2022}
}
