
How is the evaluation on downstream tasks carried out? #7

Open
ee2110 opened this issue Feb 3, 2024 · 1 comment

Comments


ee2110 commented Feb 3, 2024

Hi, thank you for the great work and interesting ideas!

  1. Are the validation/test sets from the COIN & CrossTask datasets used during evaluation?
  2. Are the downstream models (MLP / Transformer) trained with COIN & CrossTask data before evaluation?
  3. During evaluation for task recognition, are all annotated video segments from a video fed into the pre-trained model e(.), or is only one specific segment from a video used? I wonder how the accuracy was calculated.

I hope to get more information about these; I enjoyed reading your work.

Below is a screenshot of a diagram taken from the paper.

Thank you.

@hongluzhou
Contributor

Thank you for your interest in our work and for your kind words!

  1. We used the train/test sets from COIN (`if split == 'train' and self.coin_json['database'][video_sid]['subset'] == 'training':`) and created train/test sets for CrossTask on our own using random splits (`def get_task_cls_train_test_splits(cross_task_video_dir, train_ratio=0.8):`).

  2. Yes, downstream models were trained on the train set of the downstream datasets before evaluating them on the downstream test set.

  3. We used the pre-trained model to extract features of the video segments that contain steps. These features served as the input to the downstream models (`video_feats = np.load(`).
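The random CrossTask split mentioned in point 1 could look something like the following minimal sketch. Only the function signature comes from the linked repository code; the body, the seeding, and the assumption that each video is a file under `cross_task_video_dir` are illustrative guesses:

```python
import os
import random

def get_task_cls_train_test_splits(cross_task_video_dir, train_ratio=0.8, seed=0):
    """Randomly split CrossTask video IDs into train/test sets.

    Hypothetical reconstruction: the actual repository function may differ
    in file layout, ordering, and seeding. Sorting before shuffling makes
    the split reproducible for a fixed seed.
    """
    video_ids = sorted(os.listdir(cross_task_video_dir))
    rng = random.Random(seed)
    rng.shuffle(video_ids)
    n_train = int(len(video_ids) * train_ratio)
    return video_ids[:n_train], video_ids[n_train:]
```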
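And as a hedged illustration of how task-recognition accuracy might then be computed from the pre-extracted segment features in point 3: all names below are hypothetical, and the mean-pooling over segments is an assumption (the actual downstream model, e.g. a Transformer over segments, may aggregate differently):

```python
import numpy as np

def task_recognition_accuracy(video_feats_list, labels, classify_fn):
    """Accuracy of a downstream task classifier over a set of videos.

    Hypothetical sketch: each element of `video_feats_list` holds the
    pre-extracted features of all annotated step segments of one video,
    shape (num_segments, feat_dim). Segment features are mean-pooled into
    a single video-level vector, which the downstream model `classify_fn`
    maps to a predicted task label.
    """
    correct = 0
    for feats, label in zip(video_feats_list, labels):
        video_vec = feats.mean(axis=0)   # pool the step segments
        pred = classify_fn(video_vec)    # downstream MLP / Transformer head
        correct += int(pred == label)
    return correct / len(labels)
```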
