Large separation errors for recordings longer than 15 seconds #49

Open
yhbsdtc opened this issue Jan 13, 2025 · 3 comments

Comments

@yhbsdtc

yhbsdtc commented Jan 13, 2025

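Could the code be optimized? In my tests, recordings of 15 seconds or less are separated with essentially zero error. Would it be possible, once all the audio to be processed has been loaded, to first split it into multiple 15-second files, separate them one by one, and then merge the results?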

@lukeewin

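> Could the code be optimized? In my tests, recordings of 15 seconds or less are separated with essentially zero error. Would it be possible, once all the audio to be processed has been loaded, to first split it into multiple 15-second files, separate them one by one, and then merge the results?

That is easy to do; you can implement it yourself in Python.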
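For reference, the split / separate / merge idea from the original request could be sketched roughly as below. This is only an illustration, not code from ClearerVoice-Studio: `split_into_chunks`, `separate_long_recording` and the `separate_fn` callback are hypothetical names, and `separate_fn` is assumed to take one chunk of mono samples and return one list of samples per separated speaker, each as long as the chunk. Plain Python lists are used to keep the sketch dependency-free; real code would more likely use NumPy arrays.

```python
from typing import Callable, List, Optional, Sequence

Signal = List[float]


def split_into_chunks(samples: Sequence[float], sample_rate: int,
                      chunk_seconds: float = 15.0) -> List[Signal]:
    """Cut a mono signal into consecutive chunks of at most chunk_seconds."""
    chunk_len = int(sample_rate * chunk_seconds)
    if chunk_len <= 0:
        raise ValueError("chunk length must be at least one sample")
    return [list(samples[i:i + chunk_len])
            for i in range(0, len(samples), chunk_len)]


def separate_long_recording(samples: Sequence[float], sample_rate: int,
                            separate_fn: Callable[[Signal], List[Signal]],
                            chunk_seconds: float = 15.0) -> List[Signal]:
    """Separate each chunk with separate_fn (hypothetical) and concatenate
    the outputs per source index, in chunk order."""
    merged: Optional[List[Signal]] = None
    for chunk in split_into_chunks(samples, sample_rate, chunk_seconds):
        sources = separate_fn(chunk)
        if merged is None:
            # First chunk decides how many output sources there are.
            merged = [[] for _ in sources]
        if len(sources) != len(merged):
            raise ValueError("separate_fn returned a varying number of sources")
        for out, src in zip(merged, sources):
            out.extend(src)
    return merged if merged is not None else []
```

Note that this naive merge simply appends output 0 to output 0 and output 1 to output 1. It assumes the model emits the speakers in the same order for every chunk, which is generally not guaranteed for separately processed chunks.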

@alibabasglab
Collaborator

The code already has a configuration option for this: https://github.com/modelscope/ClearerVoice-Studio/blob/main/clearvoice/config/inference/MossFormer2_SS_16K.yaml . Change 'decode_window: 30' to 'decode_window: 15'; here 15 means that each pass processes a 15-second speech segment.
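As an excerpt, the change described here would look roughly like this. Only the `decode_window` key is taken from the comment above; the file's other keys are not shown and are assumed to stay unchanged.

```yaml
# clearvoice/config/inference/MossFormer2_SS_16K.yaml (excerpt, other keys omitted)
decode_window: 15  # previously 30; length in seconds of each speech segment processed
```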

@gaoyiyeah

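> The code already has a configuration option for this: https://github.com/modelscope/ClearerVoice-Studio/blob/main/clearvoice/config/inference/MossFormer2_SS_16K.yaml . Change 'decode_window: 30' to 'decode_window: 15'; here 15 means that each pass processes a 15-second speech segment.

In that case, how is the permutation problem handled?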


4 participants