Joshua Ball
Edward E. Whitacre Jr. College of Engineering
Department of Computer Science
IEEE
#94164957
Lubbock, USA
joshua.ball@ttu.edu
Voice Activity Detection (VAD) plays a pivotal role in numerous digital audio applications. Distinguishing speech from non-speech in noisy environments remains a challenge. This research aims to design an efficient VAD system that demonstrates high accuracy across various noise conditions.
Voice Activity Detection, at its core, determines the presence or absence of human speech in an audio signal. Its applications span speech recognition, telecommunication systems, and audio compression. The need for a robust VAD system is accentuated in real-world scenarios where noise is omnipresent. This research delves into the design and validation of such a system.
Historically, VAD techniques have ranged from basic energy thresholds and zero-crossing rates to more advanced machine learning approaches. With the advent of deep learning, neural network-based VAD systems have emerged.
- Traditional VAD techniques.
- Recent advancements in neural network-based VAD.
- Challenges and shortcomings of existing methods.
Despite advancements, VAD in noisy conditions remains challenging. This research aims to:
- Analyze the spectral and temporal characteristics of speech and non-speech.
- Design a VAD system resilient to varying noise conditions.
- Validate the system's performance against benchmark datasets.
A dataset comprising clean speech, non-speech, and speech with various noise backgrounds is employed.
Audio signals are resampled and segmented for consistent analysis.
Features like MFCCs, spectral contrast, and chroma features are extracted.
An SVM classifier is trained on the extracted features to differentiate between speech and non-speech segments.
The VAD system's performance is validated against various noise conditions, including white noise, street noise, and cafe ambiance. Performance metrics such as accuracy, precision, recall, and F1-score are computed.
Insights from experimental results, challenges faced during research, and comparisons with existing VAD systems are discussed.
A summary of the research, highlighting contributions and potential future work. The proposed VAD system's viability for real-world applications is discussed.
- [Typical paper on basic VAD techniques.]
- [Recent paper on neural networks for VAD.]
- [Review paper on challenges in VAD.]