Hi,
Thank you for sharing your interesting work!
I am trying to reproduce the reported results on the MSRVTT dataset.
I followed the instructions, used the mistral_best.pth checkpoint, and ran mistral_evaluation.sh.
Then I ran evaluate_zero_shot.sh with GPT-4o to get the score and accuracy.
I got very low results (acc ~0.18) on 100K samples (I didn't try the other 100K).
Could you help me reproduce the results reported in the paper and in this repo?
Thanks in advance,
Nimrod
Hello, I've also been replicating related benchmarks recently. Most of them rely on GPT-assisted evaluation, which seems quite costly. Approximately how much does each of your evaluation runs cost?