Full song results #6

Closed
MaxMax2016 opened this issue Jun 14, 2022 · 4 comments

Comments

@MaxMax2016
Collaborator

MaxMax2016 commented Jun 14, 2022

reg_pit_song.mp4

The F0 is generated by rules, with no vibrato curve, so the result sounds strongly robotic.

@dutchsing009

Hello, good to see you opened the repository again; I emailed you about it two days ago. The audio above sounds robotic and not natural at all. Can you tell me exactly what the problem is, so I can see whether I can solve it?
Regarding the other issue, #3, where you said "but need more detail of f0": I think I have a solution for you.
You need to extract the fundamental frequency (F0) from the audio during preprocessing. You can extract it from the waveform using https://github.com/NVIDIA/mellotron/blob/master/yin.py
compute_yin() computes the fundamental frequency from the waveform.
w_step should be the same size as the STFT hop size.
[Attached image: unknown (1)]
Those parameters may need to be changed for SVS, since F0 can be higher than in TTS.
That was my recommendation regarding #3. Please tell me whether it works, or whether there are any other problems, especially regarding the robotic audio above :)
Best regards
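To make the suggestion concrete, here is a minimal, standard-library-only sketch of frame-wise F0 extraction in the spirit of YIN. It is not the code from mellotron's yin.py: the function names `yin_f0_frame` and `track_f0`, the defaults, and the `f0_max=1000.0` ceiling (chosen on the assumption that singing needs a higher limit than TTS) are all hypothetical. The only point it illustrates from the comment above is that the hop `w_step` should equal the STFT hop size, so that F0 frames line up with spectrogram frames.

```python
import math


def yin_f0_frame(frame, sr, f0_min=100.0, f0_max=1000.0, thresh=0.1):
    """Estimate F0 in Hz for one frame with a simplified YIN; 0.0 means unvoiced."""
    tau_min = max(1, int(sr / f0_max))   # shortest period (in samples) to consider
    tau_max = int(sr / f0_min)           # longest period (in samples) to consider
    n = len(frame) - tau_max             # samples compared at every lag
    if n <= 0:
        raise ValueError("frame is too short for the requested f0_min")

    # Difference function d(tau): energy of the frame minus its shifted copy.
    diff = [0.0] * tau_max
    for tau in range(1, tau_max):
        diff[tau] = sum((frame[j] - frame[j + tau]) ** 2 for j in range(n))

    # Cumulative mean normalized difference, which removes the bias toward lag 0.
    cmnd = [1.0] * tau_max
    running = 0.0
    for tau in range(1, tau_max):
        running += diff[tau]
        cmnd[tau] = diff[tau] * tau / running if running > 0.0 else 1.0

    # Take the first dip below the threshold and follow it to its local minimum.
    tau = tau_min
    while tau < tau_max:
        if cmnd[tau] < thresh:
            while tau + 1 < tau_max and cmnd[tau + 1] < cmnd[tau]:
                tau += 1
            return sr / tau
        tau += 1
    return 0.0  # no clear periodicity found: treat the frame as unvoiced


def track_f0(signal, sr, w_len=1024, w_step=256, **kwargs):
    """Return one F0 value per frame; set w_step to the STFT hop size."""
    return [
        yin_f0_frame(signal[start:start + w_len], sr, **kwargs)
        for start in range(0, len(signal) - w_len + 1, w_step)
    ]
```

This version only resolves integer lags and runs in pure Python, so it is slow and coarse; for real preprocessing an existing implementation such as the linked compute_yin() is the more sensible choice.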

@MaxMax2016
Collaborator Author

I mean that a DNN is needed to predict a smooth F0 with vibrato. The audio above sounds robotic because its F0 was produced by hand (mapped from MIDI) without vibrato. Alternatively, use https://github.com/stakira/OpenUtau to draw a smooth F0 with vibrato.
[Attached image: vibrato]
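As a rough illustration of what "F0 with vibrato" means, the sketch below superimposes a sinusoidal pitch modulation on a flat, MIDI-mapped F0 contour. This is a simple rule-based stand-in, not the DNN predictor described above and not how OpenUtau works; the function names, the 5.5 Hz rate, the 50-cent depth, and the onset and ramp times are all assumed values picked for illustration.

```python
import math


def midi_to_hz(note):
    """Convert a MIDI note number to Hz (A4 = note 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def add_vibrato(f0, frame_rate, rate_hz=5.5, depth_cents=50.0,
                onset_s=0.2, ramp_s=0.2):
    """Return a copy of f0 (Hz per frame, 0 = unvoiced) with sinusoidal vibrato.

    The vibrato clock restarts after every unvoiced gap, stays off for
    onset_s seconds, then fades in linearly over ramp_s seconds.
    """
    out = []
    voiced_frames = 0
    for hz in f0:
        if hz <= 0.0:
            voiced_frames = 0      # reset the vibrato clock at unvoiced frames
            out.append(0.0)
            continue
        t = voiced_frames / frame_rate
        if ramp_s > 0.0:
            fade = min(1.0, max(0.0, (t - onset_s) / ramp_s))
        else:
            fade = 1.0 if t >= onset_s else 0.0
        # Modulate in cents so the depth is the same musical interval at any pitch.
        cents = depth_cents * fade * math.sin(2.0 * math.pi * rate_hz * t)
        out.append(hz * 2.0 ** (cents / 1200.0))
        voiced_frames += 1
    return out


# Example: one second of A4 at 100 frames per second, followed by silence.
flat_f0 = [midi_to_hz(69)] * 100 + [0.0] * 10
smooth_f0 = add_vibrato(flat_f0, frame_rate=100.0)
```

A fixed sinusoid like this still sounds mechanical compared with real singing, where vibrato rate and depth vary over the note, which is presumably why a learned F0 predictor is preferred here.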

@dutchsing009

Ah, OK, so now you are back to issue #1 again: you need vibrato so that it does not sound robotic.

@MaxMax2016
Collaborator Author

@dutchsing009 I use an NN model to predict the F0; the song is in #8.
