A curated list of awesome voice activity detection (VAD) projects
- wiseman/py-webrtcvad : Python interface to the WebRTC Voice Activity Detector
- mwv/vad : This is a straightforward re-implementation of Bowon Lee’s Voice Activity Detector.
- halleytl/pyvad : A Python VAD (Voice Activity Detector) implementation that performs endpoint detection on streaming data as it is read in real time
- jtkim-kaist/VAD : This toolkit provides the voice activity detection (VAD) code and our recorded dataset.
- snakers4/silero-vad : Silero VAD: pre-trained enterprise-grade Voice Activity Detector, Language Classifier and Spoken Number Detector
- marsbroshok/VAD-python : Voice Activity Detector in Python
- hcmlab/vadnet : Real-time Voice Activity Detection in Noisy Environments using Deep Neural Networks
- eesungkim/Voice_Activity_Detector : A statistical model-based Voice Activity Detection
- jymsuper/VAD_tutorial : Simple DNN-based Voice Activity Detection (VAD) using PyTorch
- mounalab/LSTM-RNN-VAD : Voice Activity Detection LSTM-RNN learning model
- RicherMans/GPV : Repository for our Interspeech 2020 general-purpose voice activity detection (GPVAD) paper
- MaigoAkisame/cmu-thesis : Code for Yun Wang's PhD Thesis: Polyphonic Sound Event Detection with Weak Labeling
- amsehili/auditok : An audio/acoustic activity detection and audio segmentation tool
- athena-team/athena-signal : Athena-signal is an open-source implementation of speech signal processing algorithms. It aims to help researchers and engineers who want to use speech signal processing algorithms in their own projects. Athena-signal is mainly implemented in C and called from Python.
- SIP-Lab/CNN-VAD : A Convolutional Neural Network based Voice Activity Detector for Smartphones
- filippogiruzzi/voice_activity_detection : Voice Activity Detection based on Deep Learning & TensorFlow
- nicklashansen/voice-activity-detection : Voice Activity Detection (VAD) using deep learning.
- marianne-m/brouhaha-vad : Brouhaha: multi-task training for voice activity detection, speech-to-noise ratio, and C50 room acoustics estimation (2023)
- Picovoice/cobra : On-device voice activity detection (VAD) powered by deep learning.
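- iic/speech_fsmn_vad_zh-cn-16k-common-pytorch : Deep-FSMN for large vocabulary continuous speech recognition. FSMN-Monophone VAD is an efficient voice endpoint detection model proposed by the DAMO Academy speech team. It detects the start and end times of valid speech in the input audio and passes the detected speech segments to the recognition engine, reducing recognition errors caused by non-speech audio.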
- Revai/reverb-diarization-pipeline-v2 : This repository contains two new speaker diarization models built on the PyAnnote framework. The models are trained for, and intended to be used with, an ASR system (speaker-attributed ASR).
- pyannote/speaker-diarization-3.1 : pyannote.audio is an open-source toolkit written in Python for speaker diarization. Based on the PyTorch machine learning framework, it comes with state-of-the-art pretrained models and pipelines that can be further fine-tuned on your own data for even better performance.