
Coin Stream Set #18

Closed
shachoi opened this issue Jul 29, 2024 · 9 comments
shachoi commented Jul 29, 2024

Hi @chenjoya,
First of all, thank you for your prompt response and support!!

I’m also looking to train using the COIN dataset. Currently, the only released COIN-related resource is for its benchmark, correct?
Could you release the COIN stream set, which generates online dialogue from the COIN offline annotations?

chenjoya (Collaborator) commented

Hi shachoi, many thanks for your interest! Due to company policy, we cannot release anything derived from the YouTube data. But I have released the streaming dialogue generation scripts: https://github.com/showlab/videollm-online/tree/main/data/livechat. Could you try them? I will respond promptly if you run into any problems!


shachoi commented Jul 29, 2024

Thanks for the response!

shachoi closed this as completed Jul 29, 2024
yankee624 commented

@chenjoya According to Section 3.2 of the paper (and also the code), the "Offline Annotations to Video Streaming Dialogue" step inserts multiple queries at random points in the video. But in Supplementary B.1 and the B.2 COIN Stream Example, only a single query is inserted at the beginning. Can you provide code or an explanation of how this single-query dataset is generated? Is it just a single fixed template, The video is about to [action description]. Please remind me when the related action starts, summarizes when it ends, as well as forecasts the next action., used for all videos?

chenjoya (Collaborator) commented Sep 8, 2024

OK, it seems you are asking about the COIN stream set. It is very simple:

  1. Video: the COIN video (no segmenting);
  2. Prompt: [action description] = the COIN task label;
  3. Annotated response: (1) after the frame where a step label begins, emit the response "The action of {steps[i]} starts."; (2) after the frame where a step label ends, emit the response "The action of {steps[i]} ends. The next is {steps[i+1]}."

As you can see, it does not use an LLM; it is purely rule-based. The livechat generation is recommended instead.
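The rule-based procedure above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function name, the `(start_sec, end_sec, step_label)` step format, the frame rate, and the prompt template wording are all assumptions based on the description in this thread and the paper.

```python
def coin_stream_dialogue(task_label, steps, fps=2):
    """Hypothetical sketch of rule-based COIN stream-set generation.

    steps: list of (start_sec, end_sec, step_label) tuples, as in the
    COIN offline annotations. Returns the initial user query followed by
    (frame_index, response) pairs.
    """
    turns = [("user",
              f"The video is about to {task_label}. "
              "Please remind me when the related action starts, "
              "summarizes when it ends, as well as forecasts the next action.")]
    for i, (start, end, label) in enumerate(steps):
        # (1) response just after the frame where the step begins
        turns.append((round(start * fps) + 1, f"The action of {label} starts."))
        # (2) response just after the frame where the step ends,
        #     forecasting the next step when there is one
        reply = f"The action of {label} ends."
        if i + 1 < len(steps):
            reply += f" The next is {steps[i + 1][2]}."
        turns.append((round(end * fps) + 1, reply))
    return turns
```

Because every response is filled from a fixed template, no LLM call is involved, which matches the "purely rule-based" characterization above.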


yankee624 commented Sep 8, 2024

@chenjoya Thank you for the prompt reply!
Livechat is good for better-quality responses, but as you said in the paper, multi-turn dialogue is hard to evaluate, so I was wondering how to generate the COIN stream set for stable evaluation.

  • For the initial prompt (user query), did you just use the raw task label (e.g., "InstallCeilingFan")? Or did you use some kind of template, like The video is about to install ceiling fan. Please remind me when the related action starts, summarizes when it ends, as well as forecasts the next action., as shown in the paper?

  • Do you think training and evaluating the model only with this COIN stream set would be unreliable? (Maybe the metrics are unreliable because the dataset has too rigid a structure?)


chenjoya commented Sep 9, 2024

Hi, thank you!

  • Yes, the task label has been split by the `_clean_task(text)` helper in the code.
  • If using only the COIN Stream set, the data size is too small. Mixed training is recommended.

yankee624 commented

Do you mean that the user query is just "Install ceiling fan"?
Then would it go as follows?

[System prompt]
User: Install ceiling fan
[F][F]...[F] Assistant: xxx
[F][F]...[F] Assistant: xxx

This is different from the example in the paper, which includes some sentences in the user query ("The video is about to..."). Is that example just for demonstration purposes, not a real example?
[screenshot: COIN stream example from the paper]


chenjoya (Collaborator) commented Sep 9, 2024

Sorry for the confusion. What I mentioned is the code used to split the task label. The prompt in the paper is correct.

yankee624 commented

Thank you!
