automatic reference text transcription #28

cocktailpeanut · 2024-11-26T02:49:04Z

Currently the voice cloning feature doesn't seem to work unless you provide an accurate transcription of the reference audio, which is too tedious.

This can be fixed by incorporating Whisper to automatically transcribe the reference audio instead of making users enter the reference text manually. This would make a huge difference, since most people are too lazy to transcribe audio clips.

edwko · 2024-11-26T17:16:17Z

Thanks for the suggestion, I’ll look into adding this.

Added Whisper-based transcription for speaker creation when `transcript` is None (#28).

edwko · 2024-11-30T14:56:39Z

Added in the 0.2.1 release :)

edwko added the enhancement and todo labels Nov 26, 2024

edwko added a commit that referenced this issue Nov 30, 2024

Whisper integration for speaker generation

c5a75f4

Added Whisper-based transcription for speaker creation when `transcript` is None (#28).

edwko closed this as completed Nov 30, 2024
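
For readers wondering what "Whisper-based transcription ... when `transcript` is None" could look like, here is a minimal sketch. It is not the code from commit c5a75f4: the function names `whisper_transcribe` and `resolve_transcript`, the `transcribe_fn` parameter, and the default `"base"` model are all assumptions made for illustration. The only external calls are `whisper.load_model` and `model.transcribe` from the `openai-whisper` package.

```python
from typing import Callable, Optional


def whisper_transcribe(audio_path: str, model_name: str = "base") -> str:
    """Transcribe an audio file with the openai-whisper package.

    Hypothetical helper: the model size used by the project is not stated
    in the thread, so "base" is only a placeholder default.
    """
    # Imported lazily so resolve_transcript() works without whisper installed
    # whenever the caller supplies a transcript.
    import whisper

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path)
    return result["text"].strip()


def resolve_transcript(
    audio_path: str,
    transcript: Optional[str] = None,
    transcribe_fn: Callable[[str], str] = whisper_transcribe,
) -> str:
    """Return the caller's transcript, or transcribe the audio if it is None."""
    if transcript is not None:
        # A manually supplied transcript always takes precedence.
        return transcript
    # No transcript given: fall back to automatic transcription.
    return transcribe_fn(audio_path)
```

With this shape, `resolve_transcript("reference.wav")` would run Whisper on the clip, while `resolve_transcript("reference.wav", "some known text")` would skip transcription entirely. Passing the transcriber in as a parameter is a design choice for this sketch only: it lets the fallback logic be tested without downloading a model.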
