
video-captioning

Here are 89 public repositories matching this topic...

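This is a deep learning model for video captioning, implemented in PyTorch with a Transformer architecture. The video captioning task takes a video as input and outputs a single sentence describing the content of the whole video (assuming the video is short and can be described in one sentence). The main goal of this repo is to help visually impaired people enjoy online videos and perceive their surroundings, and to promote the development of "accessible video".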

  • Updated Mar 12, 2022
  • Python

We introduce temporal working memory (TWM), which aims to enhance the temporal modeling capabilities of multimodal foundation models (MFMs). This plug-and-play module can be easily integrated into existing MFMs. With TWM, nine state-of-the-art models show significant performance improvements across QA, captioning, and retrieval tasks.

  • Updated Jan 26, 2025
  • Python
