Hi, thank you for the great work and interesting ideas!
Are the validation/test sets of the COIN & CrossTask datasets used during evaluation?
Are the downstream models (MLP / Transformer) trained with COIN & CrossTask data before evaluation?
During evaluation for task recognition, are all annotated video segments from a video fed into the pre-trained model e(.), or is only one specific segment from each video used? I would like to understand how the accuracy was calculated.
I hope to get more information about these points. I enjoyed reading your work.
Below is the screenshot of a diagram taken from the paper
Thank you.
Yes, downstream models were trained on the train set of the downstream datasets before evaluating them on the downstream test set.
We used the pre-trained model to extract features of the video segments that contain steps. These features served as the input to the downstream models (
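To make the pipeline described above concrete, here is a rough sketch of one way it could look. Everything in it is an assumption for illustration, not the authors' code: the reply does not say how segment features are aggregated per video or how accuracy is computed, so this sketch mean-pools the segment features of each video and reports the fraction of test videos whose task is predicted correctly. A nearest-centroid classifier stands in for the MLP / Transformer downstream model to keep the example dependency-free, and all function names and toy feature values are hypothetical.

```python
from statistics import fmean


def pool_segments(segment_features):
    """Mean-pool per-segment feature vectors into one video-level vector.

    ASSUMPTION: mean pooling is only one possible aggregation; the thread
    does not state which one was actually used.
    """
    return [fmean(dim) for dim in zip(*segment_features)]


def fit_centroids(train_videos):
    """Fit a nearest-centroid classifier on the downstream train set.

    train_videos: list of (segment_features, task_label) pairs, where
    segment_features are vectors already extracted by the pre-trained model.
    This is a stand-in for the MLP / Transformer downstream model.
    """
    by_label = {}
    for segments, label in train_videos:
        by_label.setdefault(label, []).append(pool_segments(segments))
    return {label: pool_segments(vecs) for label, vecs in by_label.items()}


def predict(centroids, segment_features):
    """Predict the task whose centroid is closest to the pooled video vector."""
    video = pool_segments(segment_features)

    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(video, centroid))

    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, test_videos):
    """Fraction of test videos whose predicted task matches the label."""
    correct = sum(predict(centroids, segs) == label for segs, label in test_videos)
    return correct / len(test_videos)


# Toy 2-D "features" standing in for the pre-trained model's outputs.
train = [
    ([[1.0, 0.0], [0.8, 0.2]], "make_coffee"),
    ([[0.0, 1.0], [0.2, 0.8]], "change_tire"),
]
test = [
    ([[0.9, 0.1]], "make_coffee"),
    ([[0.1, 0.9], [0.3, 0.7]], "change_tire"),
]

centroids = fit_centroids(train)
print(accuracy(centroids, test))
```

In this sketch the pre-trained model stays frozen: only the downstream classifier is fit on the train split, and the test split is touched only when computing accuracy.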