Release v25.2.0 #264

ROBERT-MCDOWELL · 2025-02-12T14:35:38Z

CHANGELOG

version 25.2.0:

version structure is now based on YEAR.MONTH.PATCH_NUMBER
admin privileges are no longer needed on Windows to install ebook2audiobook packages (replaced Chocolatey with Scoop)
added MPS processor
added custom models dropdown list
added voices dropdown list and play button to listen to each of them
added voice extractor for uploaded voices (separates vocals from background and music)
added delete button for voices, custom models and audiobooks list
added builtin voices to the voices list; they can be used with all TTS models
added --output_dir for custom output folder in headless mode
added directory options for ebook upload batch files in gradio/gui mode
added new output audio formats: ['m4b', 'm4a', 'mp4', 'webm', 'mov', 'mp3', 'flac', 'wav', 'ogg', 'aac'].
More can be added on demand.
added cancellation of a running conversion via the ebook upload gradio component (when the "X" is clicked)
new global config settings:
tmp_expire: how long an inactive session is kept before cleanup, in days
max_custom_model: maximum number of custom models in the list (per session id)
max_custom_voices: maximum number of custom voices in the list (per session id)
tts_default_settings: fine-tuned XTTS default parameters
(refer to ./lib/conf.py for all new configuration settings)
gradio GUI settings are now saved and restored on refresh and browser exit
resume conversion in headless and gradio GUI mode when the client page/connection is lost or reloaded
(however, the user must restart the process manually with the same session id)
math symbols and numbers are now converted to phonemes on all TTS engines
(languages not covered are pronounced with the default_language_code set in ./lib/conf.py;
PRs are welcome to fix missing translations)
audio filtering, normalization and improvement of all uploaded voices and the final audiobook
to get the best sound presence and clarity.
fixed custom model upload
fixed missing pages in conversion
fixed modules and libraries missing during the installation (regex, mecab, etc.)
various gradio design improvements
optimized multi language sentence splitting to minimize hallucinations and unnatural pauses
numbers and math symbols are now spoken for fairseq and XTTSv2
the TTS model is now loaded once per script run and shared by all users of the same model
added coqui-tts builtin voices for all TTS engines and as standard in all languages
added new modal alerts for info, error, exception and warnings
removed docker_utils which was a docker with ffmpeg and calibre only
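
The YEAR.MONTH.PATCH_NUMBER scheme from the first changelog entry sorts naturally as a tuple of integers. A minimal sketch, assuming a two-digit year as in 25.2.0; `Version` and `parse_version` are hypothetical helpers for illustration, not part of ebook2audiobook, and release-candidate suffixes such as rc4 are deliberately not handled:

```python
from typing import NamedTuple


class Version(NamedTuple):
    year: int   # two-digit year, e.g. 25 (assumption based on "25.2.0")
    month: int  # 1-12
    patch: int  # patch number within that month


def parse_version(text: str) -> Version:
    """Parse 'YEAR.MONTH.PATCH_NUMBER', tolerating a leading 'v'."""
    parts = text.strip().lstrip("v").split(".")
    if len(parts) != 3:
        raise ValueError(f"expected YEAR.MONTH.PATCH_NUMBER, got {text!r}")
    # int() raises ValueError for non-numeric parts such as '0rc4'
    year, month, patch = (int(p) for p in parts)
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range in {text!r}")
    return Version(year, month, patch)
```

Because `Version` is a named tuple, two parsed versions compare field by field, so `parse_version("25.2.0") < parse_version("25.10.1")` holds even though the strings would sort the other way.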

Many more fixes and new features, but I don't remember them all... see for yourself ;)
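
The changelog entry about loading the TTS model once for all users of the same model can be pictured as a cache keyed by model name. This is only a sketch of that general idea, not the code in this PR: `get_shared_model`, the key strings and the `loader` callback are hypothetical names.

```python
import threading
from typing import Any, Callable, Dict

_models: Dict[str, Any] = {}   # one loaded model per key, shared by all sessions
_lock = threading.Lock()       # guards the cache against concurrent first loads


def get_shared_model(key: str, loader: Callable[[], Any]) -> Any:
    """Return the cached model for `key`, calling `loader` only on first use."""
    with _lock:
        # Holding the lock while loading keeps the sketch simple; it also
        # means a slow load of one model blocks requests for other models.
        if key not in _models:
            _models[key] = loader()
        return _models[key]
```

Every session asking for the same key gets the same object back, so the expensive load happens once per process rather than once per user or per conversion.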

Currently in development:

add terminal output console to gradio/gui
implement more TTS engines (list not decided yet)
apprise notification
implement chapter summarizing to create background music and sounds
implement indices in the metadata for each sentence in the final file
to eventually improve the pronunciation and replace it with the new sentence.
add builtin voice list of xttsv2
add Czech, Croatian and others with cv/vits
add music interlude between chapters
add chapter names (if chapters are detected correctly) in place of numbers in the final metadata
split the output into multiple files if > 12 hours # chapters as final
installation of the right torch and cuda version if GPU available so deepspeed can be used
automatic user crash bug report by email via a URL request
create a legends.py file for all gradio/gui legends to manage multilanguage
mark each sentence number in the metadata with the timecode so
the user can re-convert a single sentence before exporting the audiobook
(this requires keeping the ebook temp folder)
use websocat in cmd and sh scripts to connect in headless mode via gradio and avoid reloading the TTS model on each command
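
The planned split of outputs longer than 12 hours could, under one reading of that item, group whole chapters into parts that each stay under the limit. The sketch below is an assumption about how that might work, not the planned implementation; `group_chapters` and `MAX_PART_SECONDS` are hypothetical names and chapter durations are taken as already known, in seconds.

```python
from typing import List

MAX_PART_SECONDS = 12 * 60 * 60  # the 12-hour limit mentioned in the list above


def group_chapters(durations: List[float],
                   limit: float = MAX_PART_SECONDS) -> List[List[int]]:
    """Greedily group chapter indices into parts of at most `limit` seconds.

    Chapters are never cut: a single chapter longer than `limit`
    ends up alone in its own part.
    """
    parts: List[List[int]] = []
    current: List[int] = []
    total = 0.0
    for index, seconds in enumerate(durations):
        # Close the current part when adding this chapter would exceed the limit.
        if current and total + seconds > limit:
            parts.append(current)
            current, total = [], 0.0
        current.append(index)
        total += seconds
    if current:
        parts.append(current)
    return parts
```

Each inner list holds the chapter indices for one output file, in reading order.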

ROBERT-MCDOWELL added 30 commits January 6, 2025 03:19

v2.1.0rc1

823fbec

Merge branch 'dev-2.5' of https://github.com/ROBERT-MCDOWELL/ebook2au…

7e82625

…diobook into dev-2.5

Merge branch 'DrewThomasson:main' into dev-2.5

6dd1d2c

added pip purge cache and typos fixes

e61f5ef

Merge branch 'DrewThomasson:main' into dev-2.5

8193e6e

typo fixes, various improvements

f288d3b

various fixes

3d5f17e

added old cookie safe condition

0867909

fix default fine_tuned

dc7757d

enforce safe None value on cookie restore

ae33622

fix audiobooks_dir typo

e5a0e5a

interface_shared_tmp_expire fix

0b1effa

typos fixes, added default settings

5457ecb

added listen info IP, dynamic tts engine according to language

1985bb6

fixed non supported iso639-2 language exception

503f72c

fixed 400 tokens bug, get_chapters() optimization

c7fc4bf

...

a6b3d8d

Added math symbols to phonemes

47d8bea

Merge branch 'DrewThomasson:main' into dev-2.5

f009ee7

Update README.md

6fc0efb

discord broken img

fixed --tts_engine

ac0ff36

replaced functools lru_cache by fastapi

a2c81b5

Merge branch 'dev-2.5' of https://github.com/ROBERT-MCDOWELL/ebook2au…

7bc1f8a

…diobook into dev-2.5

fixed get_sentences

044fa12

enforced check virtual env

27f0118

moved some import to avoid confusion at first run

9bd4208

added lib/classes folder

60ef2c9

...

e7f0967

changed message in is_virtual_env()

a56395b

re-changed warning messages

76866c0

ROBERT-MCDOWELL added 25 commits January 16, 2025 18:42

batch directory gradio added, still missing progress bar for all batch

8e0f890

...

ed6a653

fixed typos and recurrent tts loading in batch mode

61ea27d

...

e58f140

lock coqui-tts version. wrap up batch directory option

b91b6e8

Merge branch 'main' into dev-2.5

43e4f96

added flac and more audio format processing!

53eae7c

optimized ffmpeg settings for various audio export formats

8a33bb5

...

48d8af5

added play button for voices, various fixes

c6e537c

Merge branch 'DrewThomasson:main' into dev-2.5

b462f10

Update README.md

05aaa50

Drew, I merged manually from your discord changes since my README is the latest version with the new --options and typo fixes

Update README.md

9ca627e

Add note to tell users to remove themselves any text they don't want to be converted in audio.

Update README.md

7f5f7e0

added --num_beams option

v25.2.0rc4

c9c9290

Merge branch 'dev-2.5' of https://github.com/ROBERT-MCDOWELL/ebook2au…

f816ba1

…diobook into dev-2.5

...

efd13b2

...

98f5359

...

a8a1ea6

fixed typos

1ecdaa5

Merge branch 'main' into v25

d3d6dd2

...

1c68e46

Merge branch 'v25' of https://github.com/ROBERT-MCDOWELL/ebook2audiobook

17d6d4c

into v25

allow to run with python OS if version is compatible

d4e5ade

fix typos

57353f5

ROBERT-MCDOWELL self-assigned this Feb 13, 2025

ROBERT-MCDOWELL added the Release label Feb 13, 2025

ROBERT-MCDOWELL added 3 commits February 12, 2025 18:08

...

85e33bc

...

2071f25

...

4359dbd
