Replies: 2 comments 1 reply
-
I have tried larger weights, but the result is worse.
-
These results seem good overall, especially for a relatively short wakeword like "hey luna" with just three syllables. It's possible that by continuing to tweak hyperparameters you can incrementally improve performance. However, in my own experiments I also notice that different metrics often diverge late in training. In particular, it is very difficult to keep the validation false positive rate low during extended training, as even small models will start to overfit on the training data. I find that often the best way to reduce false positives is to identify what types of words and phrases trigger false activations during normal usage, and then manually add those phrases to the config file when training.
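As a sketch of what that looks like in practice, here is a config fragment listing observed trigger phrases as explicit negatives. The key names (`target_phrase`, `custom_negative_phrases`) are assumed from openWakeWord's example training config, and the phrases are made-up illustrations; verify both against your version before use.

```yaml
# Hypothetical training-config fragment: phrases observed to cause
# false activations, added as explicit adversarial negatives.
target_phrase:
  - "hey luna"
custom_negative_phrases:
  - "hey lunch"
  - "hay tuna"
  - "a luna"
```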
-
I want to train a model for the wakeword "hey luna". If I set the audio augmentation parameters to {AddColoredNoise:[10:30], AddBackgroundNoise:[0:25], RIR[ave]}, the training metrics look fantastic. I use 200,000 positive samples and the ACAV100M_2000_hrs_16bit dataset for training. The gray line uses a weight of 50 and the orange line a weight of 150. On the training set the model's false positive rate seems to decrease, but on the validation set the FPR increases.
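The thread does not say which parameter the weights 50 and 150 are; if they are class weights in the loss (an assumption), their effect can be sketched with a weighted binary cross-entropy, where `negative_weight` is an illustrative knob that penalizes false activations on negative clips more heavily. This is a conceptual sketch, not openWakeWord's actual loss code.

```python
import numpy as np

def weighted_bce(y_true, y_pred, negative_weight=50.0, eps=1e-7):
    """Binary cross-entropy where errors on negative (non-wakeword)
    examples are penalized `negative_weight` times more heavily."""
    y_pred = np.clip(y_pred, eps, 1 - eps)
    per_example = -(y_true * np.log(y_pred)
                    + negative_weight * (1 - y_true) * np.log(1 - y_pred))
    return per_example.mean()

y_true = np.array([1.0, 0.0, 0.0])   # one positive, two negatives
y_pred = np.array([0.9, 0.2, 0.1])   # model scores; 0.2 is a near-miss
# A larger negative_weight amplifies the penalty for the 0.2 score,
# pushing training toward fewer false activations (at some recall cost).
print(weighted_bce(y_true, y_pred, negative_weight=50.0))
print(weighted_bce(y_true, y_pred, negative_weight=150.0))
```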

If I reduce these parameters to {AddColoredNoise:[20:30], AddBackgroundNoise:[20:30], RIR[peak]}, the model trains to a satisfactory false positive rate (1 per hour) and recall (90%) within 100,000 steps.
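For anyone following along, ranges like AddBackgroundNoise:[20:30] express a signal-to-noise ratio interval in dB: noise is mixed into each clip at a random SNR drawn from that range, so [20:30] adds much milder noise than [0:25]. A minimal NumPy sketch of that mixing step (a reimplementation of the idea, not the augmentation library's code):

```python
import numpy as np

def mix_at_snr(signal, noise, min_snr_db, max_snr_db, rng=None):
    """Scale `noise` so the mixed clip has an SNR drawn uniformly
    from [min_snr_db, max_snr_db] dB, then add it to `signal`."""
    if rng is None:
        rng = np.random.default_rng()
    snr_db = rng.uniform(min_snr_db, max_snr_db)
    sig_power = np.mean(signal ** 2)
    noise_power = np.mean(noise ** 2)
    # P_signal / P_noise = 10^(snr_db / 10)  =>  required noise power:
    target_noise_power = sig_power / (10 ** (snr_db / 10))
    return signal + noise * np.sqrt(target_noise_power / noise_power)

rng = np.random.default_rng(0)
clip = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)  # 1 s at 16 kHz
background = rng.standard_normal(16000)
augmented = mix_at_snr(clip, background, 20, 30, rng=rng)  # mild noise
```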
Also, I notice that the phrase "hi luna" performs worse in the model than "hey luna"; the reason is unknown.