All real/authentic audios in real subfolder are classified as 'fake' with the pre-trained model #10
Yeah, I got the same thing; the pre-trained model seems to be inaccurate.
Hi guys, Thanks,
Thanks @ranasac19878. Just for clarity, was the pre-trained model trained on the test dataset? The model does quite well on the test dataset. The test set does contain 'out-of-distribution' audio files, since some of the fake audio files in the test set were generated by different deepfake audio models. My hunch is that the variety of accents in the dataset (train + test) is limited, and the model therefore may not work well with different accents.
@irdance the model was not trained on the test dataset, but the test set was used as a validation set to tune the hyperparameters of the neural network. It is not technically correct to do so, but it was otherwise very difficult to get a model to perform well on the test set, since the distribution of the test set differs from that of the validation set. Going forward, we will be working to make the model more resilient using adversarial training and other data augmentation techniques. Yes, speech accent is definitely one indication of the distribution difference, but there may be other small differences in the distribution, such as the number of pauses, the time between pauses, etc., that the model might have overfitted on given the training data. Sachin
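As an illustration of the data-augmentation direction mentioned above, here is a minimal sketch in Python/NumPy of two common waveform augmentations (noise injection and random time shift). The function names, the SNR default, and the shift fraction are all hypothetical choices, not the project's actual code:

```python
import numpy as np

def add_noise(wave: np.ndarray, snr_db: float = 20.0, rng=None) -> np.ndarray:
    """Inject white Gaussian noise at a target signal-to-noise ratio (in dB)."""
    rng = rng or np.random.default_rng()
    signal_power = np.mean(wave ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=wave.shape)
    return wave + noise

def time_shift(wave: np.ndarray, max_frac: float = 0.1, rng=None) -> np.ndarray:
    """Randomly roll the waveform by up to max_frac of its length."""
    rng = rng or np.random.default_rng()
    max_shift = int(len(wave) * max_frac)
    shift = rng.integers(-max_shift, max_shift + 1)
    return np.roll(wave, shift)
```

Applying such perturbations to the training audio would expose the model to small distribution shifts (background noise, pause placement) rather than letting it overfit to the clean training recordings.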
Hi, I placed all my audio files (both natural and synthesized; 280 in total) in the path "/data/inference_data/unlabeled" and used the pre-trained model for classification. Since I am using terminal mode (Ubuntu), I can't see the "print out with information on predictions of the model, the accuracy of the model on your provided data." However, the output shows likelihood values (correct me if I'm wrong) with the sentence "The probability of the clip being real is: 0.00%". How can I interpret these results?
Hi Thaya, Thanks for the info. Currently, the pre-trained model works well only on data it was trained/validated on. If the data distribution changes, the model will always default its prediction to 'fake'. The likelihood value is the model's propensity score for a clip being 'real'. Thanks,
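To make the interpretation concrete, a minimal sketch of turning the printed "probability of being real" into a class label. The 50% threshold and the function name are assumptions for illustration, not the project's documented cutoff:

```python
def label_from_real_prob(p_real: float, threshold: float = 0.5) -> str:
    """Map the model's 'probability of the clip being real' to a label.

    threshold is a hypothetical decision boundary; in practice it should
    be tuned on labeled data from the target distribution.
    """
    return "real" if p_real >= threshold else "fake"

# "The probability of the clip being real is: 0.00%" corresponds to
# p_real = 0.0, which this sketch maps to "fake".
```

So a score of 0.00% means the model is maximally confident the clip is synthetic, which matches the default-to-'fake' behavior described above for out-of-distribution audio.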
The link to download the ASV data in this project seems to be invalid. Can you provide the data, or a working link, in the project?
Hi, when I ran the inference.py file with all the audio files in the 'real' subfolder, it misclassified them as 'fake'. I just wanted to check that the pre-trained model is the correct one?