Existing speech tokens were not explicitly designed for speech language modeling, and their suitability for building speech language models has not been explored. To address this gap, we build the Speech Language Model Token Benchmark (SLMTokBench), which assesses how suitable speech tokens are for constructing speech language models. The benchmark evaluates the alignment between speech tokens and text by estimating their mutual information, and it assesses how well speech tokens preserve speech information by evaluating the quality of resynthesized speech.
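To make the text-alignment idea concrete, here is a minimal sketch of a plug-in mutual information estimate over paired discrete sequences. It is an illustration only and not necessarily the estimator the benchmark uses: the function name `plugin_mutual_information` is hypothetical, and the sketch assumes each speech token has already been paired with a text-side label (for example a frame-level phoneme label), which is an assumption not stated in the description above.

```python
import math
from collections import Counter


def plugin_mutual_information(tokens, labels):
    """Plug-in estimate of I(tokens; labels) in bits from paired samples.

    Hypothetical helper: `tokens` are discrete speech token ids and `labels`
    are text-side labels aligned one-to-one with them (an assumption).
    """
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    n = len(tokens)
    if n == 0:
        raise ValueError("need at least one paired sample")

    joint = Counter(zip(tokens, labels))  # counts of (token, label) pairs
    token_counts = Counter(tokens)        # marginal counts of tokens
    label_counts = Counter(labels)        # marginal counts of labels

    mi = 0.0
    for (tok, lab), count in joint.items():
        # p(t, l) * log2( p(t, l) / (p(t) * p(l)) ), written with raw counts
        mi += (count / n) * math.log2(
            count * n / (token_counts[tok] * label_counts[lab])
        )
    return mi


if __name__ == "__main__":
    aligned = plugin_mutual_information([0, 0, 1, 1], ["a", "a", "b", "b"])
    unrelated = plugin_mutual_information([0, 0, 1, 1], ["a", "b", "a", "b"])
    print(f"aligned: {aligned:.3f} bits, unrelated: {unrelated:.3f} bits")
```

Under this sketch, a higher value means the tokens carry more information about the paired text labels. Plug-in estimates are biased upward on small samples, so they are best compared across tokenizers on the same data rather than read as absolute values.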
0nutation/SLMTokBench
About
SLMTokBench for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models"