Merge the project-s branch into the main branch #1029

Hiroshiba · 2024-01-27T20:15:18Z

Description

This merges the project-s branch into the main branch.

Other

…024-01-14

* update metas (add style type)
* update engine manifest (add frame rate)
* add sing api to core wrapper
* add sing api to core adapter
* add models for sing api
* add sing process to tts engine
* add sing api
* fix miss
* add fixme comment

  Co-authored-by: Hiroshiba <hihokaruta@gmail.com>
* remove sing type
* fix typo
* remove optional
* translate error detail
* get -> create
* fix docs
* Revert "remove optional"

  This reverts commit 12b8fc6.
* fix pytest
* add comment
* add fixme comment

  Co-authored-by: Hiroshiba <hihokaruta@gmail.com>
* improve models

---------

Co-authored-by: Hiroshiba <hihokaruta@gmail.com>

github-actions · 2024-01-27T20:19:05Z

Coverage Result

Open Result

Name	Stmts	Miss
run.py	497	313
voicevox_engine/__init__.py	1	0
voicevox_engine/cancellable_engine.py	94	72
voicevox_engine/core_adapter.py	81	34
voicevox_engine/core_initializer.py	59	30
voicevox_engine/core_wrapper.py	257	183
voicevox_engine/dev/core/__init__.py	2	0
voicevox_engine/dev/core/mock.py	36	8
voicevox_engine/dev/tts_engine/__init__.py	2	0
voicevox_engine/dev/tts_engine/mock.py	28	0
voicevox_engine/engine_manifest/EngineManifest.py	35	0
voicevox_engine/engine_manifest/EngineManifestLoader.py	12	0
voicevox_engine/engine_manifest/__init__.py	3	0
voicevox_engine/library_manager.py	92	4
voicevox_engine/metas/Metas.py	36	0
voicevox_engine/metas/MetasStore.py	18	6
voicevox_engine/metas/__init__.py	2	0
voicevox_engine/model.py	180	9
voicevox_engine/morphing.py	71	46
voicevox_engine/part_of_speech_data.py	5	0
voicevox_engine/preset/Preset.py	13	0
voicevox_engine/preset/PresetError.py	2	0
voicevox_engine/preset/PresetManager.py	80	2
voicevox_engine/preset/__init__.py	4	0
voicevox_engine/setting/Setting.py	11	0
voicevox_engine/setting/SettingLoader.py	17	0
voicevox_engine/setting/__init__.py	3	0
voicevox_engine/tts_pipeline/acoustic_feature_extractor.py	34	0
voicevox_engine/tts_pipeline/kana_converter.py	88	1
voicevox_engine/tts_pipeline/mora_list.py	7	0
voicevox_engine/tts_pipeline/text_analyzer.py	146	6
voicevox_engine/tts_pipeline/tts_engine.py	264	93
voicevox_engine/user_dict.py	145	12
voicevox_engine/utility/__init__.py	5	0
voicevox_engine/utility/connect_base64_waves.py	37	0
voicevox_engine/utility/core_version_utility.py	8	1
voicevox_engine/utility/mutex_utility.py	13	0
voicevox_engine/utility/path_utility.py	26	8
voicevox_engine/utility/run_utility.py	10	7
TOTAL	2424	835

Hiroshiba

There are a few places where the reasoning behind a decision isn't explained, so for now I've written notes on whatever I can remember.

Hiroshiba · 2024-01-28T20:05:16Z

run.py

@@ -640,6 +642,69 @@ def _synthesis_morphing(
            background=BackgroundTask(delete_file, f.name),
        )

+    @app.post(
+        "/sing_frame_audio_query",


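We could have added the singing-related data to audio_query, but that gets complicated once compatibility is considered, so I split it into a separate query.
It could also have been named sing_audio_query. However, audio_query is not a frame-level data structure, and a frame-level talk query may appear in the future, so it seemed better to make it clear that this structure is frame-level. That is why frame is in the name.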

Hiroshiba · 2024-01-28T20:06:25Z

run.py

+    )
+    def sing_frame_audio_query(
+        score: Score,
+        style_id: StyleId = Query(alias="speaker"),  # noqa: B008


Written this way, the API parameter is named speaker.
I'm still undecided about changing it to style_id (since this hasn't been released yet).
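
As a rough client-side illustration of the comment above: with the alias in place, the query parameter on the wire is `speaker`, not `style_id`. The sketch below only builds the request and does not send it. The base URL, the style ID, the note values, and the helper name `build_sing_query_request` are assumptions for illustration; the note field names follow the `Note` model in this PR.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: a locally running engine; adjust to your environment.
BASE_URL = "http://127.0.0.1:50021"


def build_sing_query_request(score: dict, style_id: int) -> Request:
    """Build (but do not send) a POST request for /sing_frame_audio_query."""
    # The Python argument is style_id, but the HTTP query parameter is "speaker".
    query = urlencode({"speaker": style_id})
    body = json.dumps(score).encode("utf-8")
    return Request(
        f"{BASE_URL}/sing_frame_audio_query?{query}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical score: one section without a note, followed by one note.
example_score = {
    "notes": [
        {"key": None, "frame_length": 10, "lyric": ""},
        {"key": 60, "frame_length": 45, "lyric": "ド"},
    ]
}
request = build_sing_query_request(example_score, style_id=1)
```

If the parameter were later renamed to style_id, only the key passed to `urlencode` would change on the client side.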

Hiroshiba · 2024-01-28T20:07:31Z

run.py

+        core_version: str | None = None,
+    ) -> FrameAudioQuery:
+        """
+        歌唱音声合成用のクエリの初期値を得ます。ここで得られたクエリはそのまま歌唱音声合成に利用できます。各値の意味は`Schemas`を参照してください。


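In the domain term 歌唱音声合成用のクエリ ("query for singing voice synthesis"), 歌唱 (singing) and 音声 (voice) overlap, so I think 歌唱合成用の ("for singing synthesis") might also be fine.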

Hiroshiba · 2024-01-28T20:08:53Z

run.py

+        },
+        tags=["音声合成"],
+    )
+    def frame_synthesis(


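It isn't named sing_synthesis because this processing doesn't depend on singing and could, in the future, also be used on the TTS side, for example. To give it a general-purpose name, I named it after what the processing does rather than its current role.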

Hiroshiba · 2024-01-28T20:11:52Z

voicevox_engine/engine_manifest/EngineManifest.py

@@ -57,6 +57,7 @@ class EngineManifest(BaseModel):
    url: str = Field(title="エンジンのURL")
    icon: str = Field(title="エンジンのアイコンをBASE64エンコードしたもの")
    default_sampling_rate: int = Field(title="デフォルトのサンプリング周波数")
+    frame_rate: float = Field(title="エンジンのフレームレート")


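The frame rate is added to the manifest.

I also considered including it in FrameAudioQuery and letting clients read it from the response, but then the frame rate could not be known without running synthesis once, so I avoided that.

By the way, another option was to use the reciprocal of the frame rate, a "frame second", but VOICEVOX's default frame rate of 93.75 does not convert to a finite decimal number of seconds, so I went with the frame rate.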
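
The arithmetic behind that last remark can be checked with a small sketch. It uses exact fractions to show that 93.75 frames per second is a finite decimal while its reciprocal is not. The helper names `has_finite_decimal` and `seconds_to_frames` are hypothetical and are not part of the engine.

```python
from fractions import Fraction

FRAME_RATE = Fraction("93.75")  # frames per second, as quoted in the comment

# Seconds per frame as an exact fraction.
frame_seconds = 1 / FRAME_RATE


def has_finite_decimal(value: Fraction) -> bool:
    """True if the fraction terminates in base 10.

    That is the case exactly when the reduced denominator has no
    prime factors other than 2 and 5.
    """
    denominator = value.denominator
    for prime in (2, 5):
        while denominator % prime == 0:
            denominator //= prime
    return denominator == 1


def seconds_to_frames(seconds: float, frame_rate: float) -> int:
    """Hypothetical helper: round a duration to the nearest whole frame."""
    return round(seconds * frame_rate)


print(frame_seconds)                      # 4/375
print(has_finite_decimal(FRAME_RATE))     # True
print(has_finite_decimal(frame_seconds))  # False
print(seconds_to_frames(0.5, 93.75))      # 47
```

Since 375 = 3 × 5³, the seconds-per-frame value 4/375 is the repeating decimal 0.010666..., which is why storing the rate itself keeps the manifest value exact.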

Hiroshiba · 2024-01-28T20:26:58Z

voicevox_engine/engine_manifest/EngineManifest.py

@@ -57,6 +57,7 @@ class EngineManifest(BaseModel):
    url: str = Field(title="エンジンのURL")
    icon: str = Field(title="エンジンのアイコンをBASE64エンコードしたもの")
    default_sampling_rate: int = Field(title="デフォルトのサンプリング周波数")
+    frame_rate: float = Field(title="エンジンのフレームレート")
    terms_of_service: str = Field(title="エンジンの利用規約")
    update_infos: List[UpdateInfo] = Field(title="エンジンのアップデート情報")
    dependency_licenses: List[LicenseInfo] = Field(title="依存関係のライセンス情報")


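The only data added to the engine manifest is the frame rate.

As on the TTS side, I considered also listing whether specific singing-related capabilities are supported, but decided against it.
The reason (which may not be very convincing) was, as far as I recall, this: if capabilities are added half-heartedly now, then when features are added later, a multi-engine-capable editor would need VOICEVOX-engine-specific logic to decide what the default values are, which would be painful.
In that case, I judged it better to enumerate the needed capabilities and implement them all at once when there is more time.

(Maybe just a flag for whether sing is supported would have been fine...?)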

Hiroshiba · 2024-01-28T20:31:27Z

voicevox_engine/model.py

+    key: int | None = Field(title="音階")
+    frame_length: int = Field(title="音符のフレーム長")
+    lyric: str = Field(title="音符の歌詞")


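A key of None represents a section with no note.
lyric could have been a mora, but I made it a str because multiple languages may be supported in the future, and because engines that accept two or more moras per note are conceivable (NEUTRINO, for example).

(...Thinking about it now, for languages like English whose writing system is not phonetic, a phoneme sequence might have been better. In that case, it should be enough to allow phonemes to be added as well.)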
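
A minimal sketch of how these fields might be used, assuming a plain dataclass stand-in for the PR's pydantic `Note` model. `NoteSketch`, the helper functions, and all note values are hypothetical; the 93.75 frames per second figure is the default quoted elsewhere in this review.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NoteSketch:
    """Stand-in for the PR's Note model (the real one is a pydantic BaseModel)."""

    key: Optional[int]  # None means there is no note in this section
    frame_length: int   # length in frames
    lyric: str          # free-form string rather than a single mora


def total_frames(notes: List[NoteSketch]) -> int:
    """Total length of the sequence, including sections without a note."""
    return sum(note.frame_length for note in notes)


def sounding_frames(notes: List[NoteSketch]) -> int:
    """Length of only the sections that actually contain a note."""
    return sum(note.frame_length for note in notes if note.key is not None)


notes = [
    NoteSketch(key=None, frame_length=15, lyric=""),  # no note
    NoteSketch(key=60, frame_length=45, lyric="ド"),
    NoteSketch(key=62, frame_length=45, lyric="レ"),
]

print(total_frames(notes))     # 105
print(sounding_frames(notes))  # 90

# Assumed frame rate of 93.75 frames per second.
duration_seconds = total_frames(notes) / 93.75
```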

Hiroshiba · 2024-01-28T20:34:44Z

voicevox_engine/model.py

+class Score(BaseModel):
+    """
+    楽譜情報
+    """
+
+    notes: List[Note] = Field(title="音符のリスト")


I made this a class with only a single field so that more information can be added in the future.
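
One way to see the benefit, sketched with plain JSON rather than the engine's models: a client that reads `notes` from an object keeps working if a field is added later, which would not hold if the request body were a bare list. The `tempo` field and the `parse_score` helper are invented for illustration and are not part of this PR.

```python
import json


def parse_score(payload: str) -> list:
    """Read the notes from a Score-shaped JSON object, ignoring unknown keys."""
    data = json.loads(payload)
    return data["notes"]


# Today's shape: an object with a single "notes" field.
current = '{"notes": [{"key": 60, "frame_length": 45, "lyric": "ド"}]}'

# Hypothetical future shape with an extra field; "tempo" is made up
# here purely for illustration.
future = (
    '{"notes": [{"key": 60, "frame_length": 45, "lyric": "ド"}], '
    '"tempo": 120}'
)

# The same reader handles both payloads unchanged.
notes_now = parse_score(current)
notes_later = parse_score(future)
```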

y-chan and others added 5 commits January 14, 2024 00:30

Merge remote-tracking branch 'upstream/master' into merge/project-s-2…

6ceebc7

…024-01-14

[project-s] Merge the main branch (#1007)

8d23bd3

Merge branch 'master' into merge-master

9abe963

[project-s] Merge the master branch (#1028)

2f4c1ff

Hiroshiba requested a review from a team as a code owner January 27, 2024 20:15

Hiroshiba requested review from y-chan and removed request for a team January 27, 2024 20:15

Hiroshiba merged commit ea76515 into master Jan 27, 2024
6 checks passed

Hiroshiba deleted the project-s branch January 27, 2024 20:16
