Reconstructions sound very similar even with different training files #31

annalina111 · 2022-04-04T19:02:01Z

Hi all! I've been trying for a little while to get meaningful output from training the model and using the reconstruction script.

The timbres of the output files generated by reconstruction.py all sound quite similar (kind of metallic), even with checkpoints generated from different types of sounds. I originally thought this made sense as my first checkpoints were generated from drum machine hi-hat sounds. But then I got very similar results reconstructing the same audio with checkpoints from a synthesizer recording with a different timbre.

Attaching the Dropbox folder here with inputs and outputs... Maybe this phenomenon sounds familiar to somebody?

The one thing I can guess is that I didn't provide enough training material (in minutes of audio); I'll be trying that now.

https://www.dropbox.com/scl/fo/t5m3lpc3gryhiqycxact4/h?dl=0&rlkey=3vmpyewf1nrqk3e8dqjojxptl

caillonantoine · 2022-04-05T08:46:51Z

Collaborator

Hi! I think you are still in stage 1 of training; you should wait until stage 2 before expecting good-sounding output (see the article for more info). Another problem could be your dataset size: I've read somewhere that it's about 15 min long, which is far from enough. Something like ~2 hours would be a better fit :)
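For anyone wanting to compare their own dataset against the ~2 hours suggested above, here is a rough sketch that adds up the length of the audio in a folder. It is not part of the RAVE repository: `total_wav_duration_seconds` is a hypothetical helper name, and it assumes the training material is stored as uncompressed PCM `.wav` files, since it only relies on Python's standard `wave` module.

```python
import wave
from pathlib import Path


def total_wav_duration_seconds(folder):
    """Sum the duration of every readable PCM .wav file under `folder`.

    Hypothetical helper for illustration; assumes uncompressed PCM WAV input.
    """
    total = 0.0
    for path in sorted(Path(folder).rglob("*.wav")):
        try:
            with wave.open(str(path), "rb") as wav_file:
                # duration = number of frames / frames per second
                total += wav_file.getnframes() / wav_file.getframerate()
        except (wave.Error, EOFError):
            # The stdlib reader cannot parse compressed or damaged files,
            # so those are skipped rather than counted.
            print(f"skipped (not plain PCM WAV): {path}")
    return total
```

Calling `total_wav_duration_seconds("path/to/dataset") / 60` gives the total in minutes (the path here is just a placeholder). Files in other formats would need a different reader.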

2 replies

annalina111 Apr 5, 2022
Author

Hi Antoine, thanks for checking it out! Yes, I'm now trying with 1.5 hrs of material... But how do I know when the second stage is occurring? I looked at TensorBoard but could not find that information. Should I just keep trying reconstructions with the newest checkpoints until something makes sense?

Also very curious: how are the best checkpoints evaluated? And what do the version_n output folders correspond to?

Thanks for your patience with the questions from an ML newb!!!

vidret Apr 6, 2022

Without looking into or changing anything, you will notice when stage 2 starts because the iterations per second become far slower.
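If you would rather not watch the progress bar, the same idea can be sketched in a few lines: note down the iterations-per-second readings and flag the first one that falls well below the recent baseline. This is only an illustration, not something the training script provides; `find_slowdown`, the window of 5 readings and the 0.7 ratio are all assumptions picked for the example, and the readings themselves are made up.

```python
from statistics import median


def find_slowdown(rates, window=5, drop_ratio=0.7):
    """Return the index of the first reading that is below
    drop_ratio * median(previous `window` readings), or None if none is.

    Hypothetical helper; `window` and `drop_ratio` are arbitrary defaults.
    """
    for i in range(window, len(rates)):
        baseline = median(rates[i - window:i])
        if rates[i] < drop_ratio * baseline:
            return i
    return None


# Made-up it/s readings: steady at first, then a clear drop.
readings = [10.0, 10.2, 9.9, 10.1, 10.0, 10.1, 4.0, 4.1]
print(find_slowdown(readings))  # -> 6
```

A short hiccup (for example, another process using the GPU) could also trigger this, so treat a hit as a hint to go and check rather than as proof that stage 2 has started.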

Give an update later on how the training went, I've had some issues with training myself (metallic, samey sounds).
