Some benchmarks on the datasets #5
@akreal You can open the
Overall, this model is not overfitted and there is no post-processing yet.
Perfect, thank you!
As you can see, the model is not fully fitted yet (we are still in the exploratory phase). Obviously, I exclude the following datasets from the file:
Now, if we exclude the "bad" files from here, we will get more interesting results.
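For illustration, a minimal pandas sketch of that kind of filtering, assuming the benchmark file has per-file `dataset` and `cer` columns (the column names, file name, and excluded dataset names here are assumptions, not the actual schema):

```python
import pandas as pd

# Minimal sketch: drop rows belonging to excluded ("bad") datasets and recompute mean CER.
# The column names "dataset"/"cer" and the dataset names below are assumptions for illustration.
df = pd.read_csv("benchmark_v05_public.csv")

excluded = {"noisy_dataset_a", "noisy_dataset_b"}   # hypothetical names of excluded datasets
kept = df[~df["dataset"].isin(excluded)]

print("mean CER, all files:       ", df["cer"].mean())
print("mean CER, after exclusions:", kept["cer"].mean())
```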
Almost finished collecting v05 and searching for hyper-params; I will be posting new benchmarks and new data soon.
@snakers4 What model did you use for the benchmark?
@m1ckyro5a
@snakers4 How about DeepSpeech2? Which model is better?
It is hard to tell yet. Some benchmarks we ran on LibriSpeech:
I will structure the benchmark files a bit from now on.
Please note that the exclusion files in #7 were previously based on these benchmarks as well. All charts contain CER.

Dataset benchmark v05
Model: CNN trained with CTC loss
Charts per dataset: Youtube, Audio books, TTS, Academic datasets, ASR datasets, Pranks (very noisy by default), Radio
Strict exclude file for distillation
An idea on how to set thresholds:
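CER here is the character error rate (character-level edit distance divided by the reference length). One hedged sketch of a threshold rule of this kind, with assumed column names (`dataset`, `cer`, `file`) and an arbitrary percentile, keeps only the files whose CER lies below a per-dataset percentile and writes the rest out as exclusion candidates:

```python
import pandas as pd

# Sketch of one possible threshold rule: per dataset, keep files below the 90th CER
# percentile and treat the rest as candidates for a "strict exclude" list.
# Column names and the 0.90 percentile are assumptions made for illustration.
df = pd.read_csv("benchmark_v05_public.csv")

thresholds = df.groupby("dataset")["cer"].quantile(0.90)   # per-dataset CER threshold
keep_mask = df["cer"] < df["dataset"].map(thresholds)      # True for files under the threshold

df.loc[~keep_mask, "file"].to_csv("exclude_candidates.csv", index=False)
```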
Also a comment: the model was not over-fitted; it was selected based on optimal generalization.
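As a hedged illustration of that selection criterion (the checkpoint names and CER values below are made up): pick the checkpoint with the lowest validation CER, not the lowest training error.

```python
# Hypothetical checkpoints with train/validation CER; the values are invented for illustration.
checkpoints = {
    "epoch_10.pt": {"train_cer": 0.21, "val_cer": 0.28},
    "epoch_20.pt": {"train_cer": 0.14, "val_cer": 0.24},
    "epoch_30.pt": {"train_cer": 0.09, "val_cer": 0.26},  # best on train, but generalizes worse
}

# "Optimal generalization": select by validation CER rather than training CER.
best = min(checkpoints, key=lambda name: checkpoints[name]["val_cer"])
print("selected checkpoint:", best)  # epoch_20.pt
```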
https://ru-open-stt.ams3.digitaloceanspaces.com/benchmark_v05_public.csv.zip is in fact a gzip-compressed file (not a zip-compressed one), so one should decompress it with gzip instead. Unzipping it fails with:
After gzip-decompression, the first line contains some weird stuff:
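For reference, a minimal sketch of reading the archive under that assumption (pandas can be told the compression explicitly instead of inferring it from the .zip suffix; whether extra parsing options are needed for the odd first line is not clear from the thread):

```python
import pandas as pd

# The file is gzip-compressed despite its .zip extension, so set compression explicitly
# rather than letting pandas infer it from the file suffix.
url = "https://ru-open-stt.ams3.digitaloceanspaces.com/benchmark_v05_public.csv.zip"
df = pd.read_csv(url, compression="gzip")
print(df.head())
```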
Hi! What datasets have speaker labels?
We decided not to update and/or maintain these for reasons.
Below I will post some of the results on the public part of the dataset (both train and validation). Hope this will inspire the community to share their results and models.