
Coin Stream Set #18

Closed
shachoi opened this issue Jul 29, 2024 · 9 comments
shachoi commented Jul 29, 2024

Hi @chenjoya,
First of all, thank you for your prompt response and support!!

I’m also looking to train using the COIN dataset. Currently, the only released COIN-related resource is for its benchmark, correct?
Could you release the COIN stream set, which generates online dialogue from the COIN offline annotations?

chenjoya (Collaborator) commented

Hi shachoi, many thanks for your interest! Due to company policy, we cannot release anything derived from the YouTube data. But I have released the streaming dialogue generation scripts: https://github.com/showlab/videollm-online/tree/main/data/livechat. Could you try them? I will respond promptly if you run into any problems!


shachoi commented Jul 29, 2024

Thanks for the response!

shachoi closed this as completed Jul 29, 2024
yankee624 commented

@chenjoya According to Section 3.2 of the paper (and also the code), the "Offline Annotations to Video Streaming Dialogue" step inserts multiple queries at random points in the video. But in Supplementary B.1 and the B.2 COIN Stream Example, only a single query is inserted at the beginning. Can you provide code or an explanation of how this single-query dataset is generated? Is it just a single fixed template, The video is about to [action description]. Please remind me when the related action starts, summarizes when it ends, as well as forecasts the next action., used for all videos?

chenjoya (Collaborator) commented Sep 8, 2024

OK, it seems you are asking about the COIN stream set. It is very simple:

  1. Video: the COIN video (no segmenting);
  2. Prompt: [action description] = the COIN task label;
  3. Annotated response: (1) after the frame where a step label begins, emit the response "The action of {steps[i]} starts."; (2) after the frame where a step label ends, emit the response "The action of {steps[i]} ends. The next is {steps[i+1]}."

As you can see, it does not use an LLM; it is purely rule-based. The livechat generation is recommended instead.
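The rule-based procedure above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function name, the `(start_sec, end_sec, step_label)` step format, the frame rate, and the prompt template wording are all assumptions based on the description in this thread and the paper.

```python
def coin_stream_dialogue(task_label, steps, fps=2):
    """Hypothetical sketch of rule-based COIN stream-set generation.

    steps: list of (start_sec, end_sec, step_label) tuples, as in the
    COIN offline annotations. Returns the initial user query followed by
    (frame_index, response) pairs.
    """
    turns = [("user",
              f"The video is about to {task_label}. "
              "Please remind me when the related action starts, "
              "summarizes when it ends, as well as forecasts the next action.")]
    for i, (start, end, label) in enumerate(steps):
        # (1) response just after the frame where the step begins
        turns.append((round(start * fps) + 1, f"The action of {label} starts."))
        # (2) response just after the frame where the step ends,
        #     forecasting the next step when there is one
        reply = f"The action of {label} ends."
        if i + 1 < len(steps):
            reply += f" The next is {steps[i + 1][2]}."
        turns.append((round(end * fps) + 1, reply))
    return turns
```

Because every response is filled from a fixed template, no LLM call is involved, which matches the "purely rule-based" characterization above.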


yankee624 commented Sep 8, 2024

@chenjoya Thank you for the prompt reply!
Livechat is good for better-quality responses, but as you said in the paper, multi-turn dialogue is hard to evaluate, so I was wondering how to generate the COIN stream set for stable evaluation.

  • For the initial prompt (user query), did you just use the raw task label (e.g., "InstallCeilingFan")? Or did you use some kind of template, like The video is about to install ceiling fan. Please remind me when the related action starts, summarizes when it ends, as well as forecasts the next action., as shown in the paper?

  • Do you think training and evaluating the model only with this COIN stream set would be unreliable? (Maybe the metrics are unreliable because the dataset has too rigid a structure?)


chenjoya commented Sep 9, 2024

Hi, thank you!

  • Yes, the task label has been split by the `_clean_task(text)` helper in the code.
  • If using only the COIN Stream set, the data size is too small. Mixed training is recommended.

yankee624 commented

Do you mean that the user query is just "Install ceiling fan"?
Then would it go as follows?

[System prompt]
User: Install ceiling fan
[F][F]...[F] Assistant: xxx
[F][F]...[F] Assistant: xxx

This is different from the example in the paper, which includes some sentences in the user query ("The video is about to..."). Is that example just for demonstration purposes, not a real example?
[screenshot: COIN stream example from the paper]


chenjoya (Collaborator) commented Sep 9, 2024

Sorry for the confusion. What I mentioned is the code used to split the task label. The prompt in the paper is correct.

yankee624 commented

Thank you!
