The data used in these experiments comes from the VoxCeleb dataset. From all the speakers in the dataset, we chose 100 at random and took at most 5 minutes of speech from each. The audio is then converted to MFCC features.
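The subset selection described above can be sketched as follows. This is a minimal illustration, not the exact script used here: it assumes the utterances are already indexed as a mapping from speaker ID to a list of (path, duration in seconds) pairs, and it reads "within 5 minutes" as a cap of at most 300 seconds per speaker. The function name `select_subset`, its parameters, and the fixed seed are hypothetical.

```python
import random

NUM_SPEAKERS = 100      # number of speakers drawn at random
MAX_SECONDS = 5 * 60    # assumed per-speaker cap (5 minutes)


def select_subset(utterances, num_speakers=NUM_SPEAKERS,
                  max_seconds=MAX_SECONDS, seed=0):
    """Pick speakers at random and cap the audio kept for each one.

    utterances: dict mapping speaker_id -> list of (path, duration_seconds).
    Returns a dict mapping each chosen speaker_id -> list of kept paths.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    # Sort the IDs first so the sample does not depend on dict ordering.
    chosen = rng.sample(sorted(utterances), num_speakers)

    subset = {}
    for speaker in chosen:
        total = 0.0
        kept = []
        for path, duration in utterances[speaker]:
            # Stop before the running total would exceed the cap.
            if total + duration > max_seconds:
                break
            kept.append(path)
            total += duration
        subset[speaker] = kept
    return subset
```

Calling `select_subset(index)` on such an index returns the file paths to keep for each of the 100 sampled speakers. Fixing the seed is a design choice that makes the random draw repeatable across runs.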
Specifically, we extract 40-dimensional MFCC features from the raw audio and feed them directly into the neural network. The training scheme is a standard one. We made an iteration with 100 epo