Draft ONNX export for VITS #2563

Merged: 4 commits into dev from vits_onnxx, May 15, 2023
Conversation

@erogol (Member) commented Apr 27, 2023

Could not get it to output the variable length sequence. Dynamic shapes do not work for some reason. If anyone knows how, feel free to jump in.

Thanks to @manmay-nakhashi, I fixed the issue.

This is how it works:

from TTS.tts.models.vits import Vits
from TTS.tts.configs.vits_config import VitsConfig
from TTS.utils.audio.numpy_transforms import save_wav

import numpy as np

config = VitsConfig()
config.load_json("config.json")
vits = Vits.init_from_config(config)
vits.load_checkpoint(config, "model.pth")

vits.export_onnx()
vits.load_onnx("coqui_vits.onnx")

text = "This is a test"
text_inputs = np.asarray(
    vits.tokenizer.text_to_ids(text, language="en"),
    dtype=np.int64,
)[None, :]

audio = vits.inference_onnx(text_inputs)
print(audio.shape)

save_wav(wav=audio[0], path="coqui_vits.wav", sample_rate=config.audio.sample_rate)
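
If you want to sanity-check the exported graph directly with onnxruntime, outside the Vits wrapper, here is a minimal sketch. It assumes the file name from the snippet above and an onnxruntime install, and it only inspects the graph's input/output names instead of assuming what they are called:

import onnxruntime as ort

# Load the exported graph and list the tensor names it expects;
# the exact names depend on how export_onnx() builds the graph.
sess = ort.InferenceSession("coqui_vits.onnx", providers=["CPUExecutionProvider"])
print([i.name for i in sess.get_inputs()])
print([o.name for o in sess.get_outputs()])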

erogol added 2 commits April 27, 2023 14:36
Could not get it work to output variable length sequence
@erogol requested a review from manmay-nakhashi May 8, 2023 10:44
@NeonBohdan
Copy link

@erogol when I load a model with "init_discriminator": false, it returns an error here

@manmay-nakhashi (Collaborator) left a comment

everything looks good to me.

@erogol (Member, Author) commented May 8, 2023

@NeonBohdan can you give me sample code to reproduce?

@NeonBohdan

@erogol sure
download this model
and run onnx_run.py.txt

Got AttributeError: 'Vits' object has no attribute 'disc'

@erogol (Member, Author) commented May 11, 2023

@NeonBohdan can you post the code here in the thread? I don't want to download a file that I don't know.

@NeonBohdan commented May 11, 2023

With this model

from TTS.tts.models.vits import Vits
from TTS.tts.configs.vits_config import VitsConfig
from TTS.utils.audio.numpy_transforms import save_wav

import numpy as np


config = VitsConfig()
config.load_json("model/config.json")
vits = Vits.init_from_config(config)
vits.load_checkpoint(config, "model/model_file.pth")

vits.export_onnx()
vits.load_onnx("coqui_vits.onnx")

text = "This is a test"
text_inputs = np.asarray(
    vits.tokenizer.text_to_ids(text, language="en"),
    dtype=np.int64,
)[None, :]

audio = vits.inference_onnx(text_inputs)
print(audio.shape)

save_wav(wav=audio[0], path="coqui_vits.wav", sample_rate=config.audio.sample_rate)

@erogol (Member, Author) commented May 11, 2023

I think your file has discriminator layers, but the config sets the discriminator to false. I could not reproduce the issue with default settings.
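
One way to settle this is to inspect the checkpoint directly. A minimal sketch, assuming the usual Coqui TTS checkpoint layout (a dict whose "model" entry is the state_dict, with discriminator parameters prefixed by "disc.") and the file name from the snippet above; adjust the key names if your checkpoint differs:

import torch

# Look for discriminator weights in the saved state_dict.
ckpt = torch.load("model/model_file.pth", map_location="cpu")
state_dict = ckpt["model"] if "model" in ckpt else ckpt
has_disc = any(k.startswith("disc.") for k in state_dict.keys())
print("discriminator weights in checkpoint:", has_disc)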

@NeonBohdan commented May 11, 2023

It's not my file; it's a single-speaker model trained by Eren Gölge (@erogol). It's available in the TTS package model list and weighs 140 MiB, so the discriminator weights were removed.
It cannot be converted with this code because of this line.

@erogol (Member, Author) commented May 11, 2023

I don't know Eren Golge. This sucker needs to do a better job. Let him know.

Which model is this? What's the name?

@NeonBohdan

No problem, maybe the line "With this model" was lost previously.

@alessandropettenuzzo96

Hi @erogol, thank you for this PR, it's saving me a lot of time. Could it also be used with multi-speaker YourTTS?
It gives me an error when executing vits.export_onnx():

RuntimeError: Given groups=1, weight of size [196, 196, 1], expected input[1, 192, 100] to have 196 channels, but got 192 channels instead


Thank you

@NeonBohdan

@alessandropettenuzzo96 a code comment in this PR mentions that it's only for single-speaker models for now.

@karelnagel

No problem, maybe the line "With this model" was lost previously.

Did you get it working? I have the same issue

@erogol merged commit 4de797b into dev May 15, 2023
@erogol deleted the vits_onnxx branch May 15, 2023 23:07
@SystemPanic (Contributor) commented May 17, 2023

@erogol ONNX inference is 6-7 times slower than PyTorch due to the EXHAUSTIVE convolution algorithm search on cuDNN, which is the default mode in ONNX Runtime when CUDAExecutionProvider is specified as the provider on its own.

Can you change from CUDAExecutionProvider to ("CUDAExecutionProvider", {"cudnn_conv_algo_search": "DEFAULT"}) here?

With DEFAULT as the cudnn_conv_algo_search, ONNX Runtime performs ~20% better than the standard VITS PyTorch mode.

This performance issue is detailed in microsoft/onnxruntime#12880 (comment)
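
For reference, a minimal sketch of how that provider option can be passed when building the ONNX Runtime session; the file name is taken from the example above, and an onnxruntime-gpu install is assumed:

import onnxruntime as ort

# Use DEFAULT cuDNN convolution algorithm search instead of the slow
# EXHAUSTIVE default, falling back to CPU if CUDA is unavailable.
providers = [
    ("CUDAExecutionProvider", {"cudnn_conv_algo_search": "DEFAULT"}),
    "CPUExecutionProvider",
]
sess = ort.InferenceSession("coqui_vits.onnx", providers=providers)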

Thanks,

Javier.

@karelnagel

No problem, maybe the line "With this model" was lost previously.

Did you get it working? I have the same issue

If anyone else is in this situation, then changing the vits.py file at line 1771 to this:

# rollback values
_forward = self.forward
disc = None
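# guard: models exported with "init_discriminator": false never create a disc attribute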
if hasattr(self, 'disc'):
    disc = self.disc
training = self.training

worked for me (recommended on Discord by Jpg#0419). Also make sure that you have onnx installed (pip install onnx worked for me) and that your export script isn't named onnx.py.

@jpg-gamepad commented May 21, 2023

@erogol
Hey thanks for the update! The ONNX exporter works great!

@SchweitzerGAO

@alessandropettenuzzo96 hello, have you solved this? I encountered the same problem.

Tindell pushed a commit to pugtech-co/TTS that referenced this pull request Sep 4, 2023
* Draft ONNX export for VITS

Could not get it work to output variable length sequence

* Fixup for onnx constant output

* Make style

* Remove commented code