Releases: edwko/OuteTTS

OuteTTS v0.2.3

14 Dec 10:58

Release Notes v0.2.3

  • Split WavTokenizer into encoder (82MB) and decoder (248MB) components
  • [WIP] Streaming support

OuteTTS v0.2.1

30 Nov 14:55

Release Notes v0.2.1

New Features and Improvements:

  1. Support for ExLlamaV2

    • Integrated support for ExLlamaV2
    • Pull request: #37
  2. Whisper Integration for Speaker Generation

    • Added Whisper-based transcription for generating speakers when no transcript is provided.
    • Suggested in: #28
    • If transcript is None, the audio is now transcribed automatically using Whisper:
    def create_speaker(
        self, 
        audio_path: str, 
        transcript: str = None, 
        whisper_model: str = "turbo",
        whisper_device = None
    )
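
The fallback described above can be sketched as follows. This is not the library's implementation: `resolve_transcript` and the `transcribe` callback are hypothetical names, and the callback stands in for whatever Whisper call the library makes internally.

```python
from typing import Callable, Optional


def resolve_transcript(
    audio_path: str,
    transcript: Optional[str],
    transcribe: Callable[[str], str],
) -> str:
    """Return the given transcript, or transcribe the audio when it is None."""
    if transcript is not None:
        # The caller supplied a transcript, so no transcription is needed.
        return transcript
    # Hypothetical stand-in for the Whisper-based transcription step.
    return transcribe(audio_path).strip()


def fake_transcribe(path: str) -> str:
    """Placeholder transcriber used only to make the sketch runnable."""
    return f" transcript of {path} "
```

Passing the transcriber in as a callback keeps the sketch independent of any speech model; a real caller would pass a function that wraps the chosen Whisper model and device.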

OuteTTS v0.2.0 Release

25 Nov 12:08

OuteTTS v0.2.0 Release Notes

Major Changes

  • New Model Support: Added support for OuteTTS-0.2-500M model
  • Speaker Management: Introduced default speaker presets for each supported language
  • Breaking Changes:
    • Speaker files created with earlier versions (<0.2.0) are not compatible with this release
    • Interface usage has been significantly revised (see README.md for the new usage)

New Features

  • Added voice cloning guidelines and interface usage recommendations in README.md
  • Implemented Gradio example playground for OuteTTS-0.2-500M
  • Multi-language alignment support
  • Enhanced speaker management:
    • New methods: interface.print_default_speakers() and interface.load_default_speaker(name="male_1")
    • Switched from pickle to JSON format for speaker saving
    • Added speaker language information in saved files
  • Option to load WavTokenizer from custom path (resolves issue #24)
  • Initialization of multiple interface versions through a single function
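
To illustrate the switch from pickle to JSON for speaker files, here is a minimal sketch of a JSON round trip that also records the speaker's language. The functions `save_speaker` and `load_speaker` and the field names `language`, `text`, and `words` are assumptions for illustration; the actual file layout used by the library may differ. One common reason for such a switch is that JSON files are human-readable and, unlike pickle, loading them does not execute code.

```python
import json
from pathlib import Path
from typing import Any, Dict


def save_speaker(speaker: Dict[str, Any], path: str) -> None:
    """Write a speaker profile to disk as UTF-8 JSON (hypothetical layout)."""
    Path(path).write_text(
        json.dumps(speaker, ensure_ascii=False, indent=2),
        encoding="utf-8",
    )


def load_speaker(path: str) -> Dict[str, Any]:
    """Read a speaker profile and check that it carries a language field."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if "language" not in data:
        # Files without a language field are rejected in this sketch.
        raise ValueError(f"{path} has no 'language' field")
    return data
```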

Improvements

  • Restructured library files for better organization
  • Implemented hash verification for WavTokenizer downloads (resolves issue #3)
  • Reworked interface for better usability
  • Made sounddevice optional with improved error handling for sound playback
  • Added data preparation examples for training
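
Hash verification of a downloaded file can be sketched as below. The release notes do not say which hash algorithm or API the library uses, so SHA-256 and the names `sha256_of` and `verify_download` are assumptions made for this example.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Chunked reads keep memory use flat for large checkpoint files.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: str, expected_sha256: str) -> bool:
    """Return True when the file's digest matches the expected value."""
    return sha256_of(path) == expected_sha256.strip().lower()
```

A caller would typically delete the file and download it again when `verify_download` returns False, since a mismatch usually indicates a truncated or corrupted download.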

Error Handling

  • Added validation for audio token detection
  • Improved error messages for long input text and early EOS cases
  • Enhanced overall library error handling and feedback

How to Upgrade

  • Update your library via pip:
    pip install --upgrade outetts
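
After upgrading, the installed version can be checked from Python with the standard library. The helper name `installed_version` is hypothetical; `importlib.metadata` is available in Python 3.8 and later.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the installed outetts version, or None if it is not installed.
    print(installed_version("outetts"))
```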