
Could you provide the weights of the models for extracting features from scratch? #1

Open
AndreJJXu opened this issue Jun 1, 2024 · 4 comments


@AndreJJXu

In your section "Extracting Features from Scratch", I find that you have leveraged the pre-trained models fine-tuned by yourself. Since I want to run the whole structure of your work, can you provide these weights or provide more details about "Extracting Features from Scratch"? Thanks!

dkurzend (Owner) commented Jun 1, 2024

Hi @AndreJJXu, I finetuned the feature extraction models during my research, which is why you see `if args.finetuned_model == True:` in the code. However, for the paper I did not finetune them, so you can ignore this.

To extract e.g. the features for UCF yourself, you would have to run

```
python clip_feature_extraction/get_clip_features_ucf.py --finetuned_model False
```

Also, you would have to adjust the paths (for the dataset, save_path, WavCaps paths, etc.) in the script.
I hope that helps.

@AndreJJXu (Author)
When I try to load weights from the files downloaded from https://github.com/XinhaoMei/WavCaps (specifically `WavCaps/retrieval/pretrained_models/audio_encoders/HTSAT_BERT_zero_shot.pt`), I always get this error:

```
RuntimeError: Error(s) in loading state_dict for ASE:
    Unexpected key(s) in state_dict: "text_encoder.text_encoder.embeddings.position_ids".
```

I rebuilt my conda environment, but still got the same problem. It is driving me crazy; could you tell me how to get rid of it?

dkurzend (Owner) commented Jul 13, 2024

Hi, did you use the right conda environment?
I created a separate conda environment for the feature extraction:

```
conda env create -f clipclap_feature_extraction.yml
```

Also, you have to adjust the model path in the scripts where the features are created. For UCF it would be `clip_feature_extraction/get_clip_features_ucf.py`, line 121:

```python
else:
    cp_path = '/home/aoq234/dev/CLIP-GZSL/WavCaps/retrieval/pretrained_models/audio_encoders/HTSAT_BERT_zero_shot.pt' # <- adjust this path
    state_dict_key = 'model'

cp = torch.load(cp_path)
wavcaps_model.load_state_dict(cp[state_dict_key])
wavcaps_model.eval()
print("Model weights loaded from {}".format(cp_path))
```

carankt commented Sep 23, 2024

@AndreJJXu I was facing a similar problem. I created a new env from https://github.com/XinhaoMei/WavCaps/blob/master/retrieval/work.yaml, and that solved the issue for me.
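For anyone who cannot rebuild the environment: to my understanding, this error typically appears when the checkpoint was saved with an older `transformers` release that still stored the BERT `position_ids` buffer, while a newer release no longer expects that key. As a workaround sketch (not part of this repo; the helper name is hypothetical), the unexpected key can be dropped before loading, or the mismatch can be ignored with `strict=False`:

```python
# Hypothetical helper (not from the repo): drop state-dict entries that the
# current model no longer expects, e.g. the removed BERT position_ids buffer.
def strip_unexpected_keys(state_dict, unexpected):
    """Return a copy of state_dict without the listed keys."""
    drop = set(unexpected)
    return {k: v for k, v in state_dict.items() if k not in drop}

# With the WavCaps checkpoint it would look roughly like (cp_path and
# wavcaps_model as in the snippet above):
#   cp = torch.load(cp_path)
#   sd = strip_unexpected_keys(
#       cp['model'],
#       ['text_encoder.text_encoder.embeddings.position_ids'],
#   )
#   wavcaps_model.load_state_dict(sd)
# Alternatively:
#   wavcaps_model.load_state_dict(cp['model'], strict=False)
```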
