Add NAS KWS model, Dynamic Augmentation and Automated Evaluation Notebook #280

Merged on Feb 1, 2024 (19 commits)

@alicangok alicangok commented Jan 1, 2024

Major changes:

  • Dynamic Augmentation is introduced
    • Instead of fixing the augmented examples during dataset creation, the dataset loader now generates unique training examples during each epoch, significantly boosting robustness against noise and time shifts.
    • The more costly "speed augmentation" remains fixed, carried out once during dataset creation.
    • For stability of validation results across epochs, the validation examples (original + augmentations) are also fixed; they are constructed during initial dataset creation.
  • Changed the dataset filename (dataset2.pt->dataset3.pt) to avoid potential mix-ups, as this PR introduces a major change
    • Added "shift_limits" property to each sample (for possible future feature compatibility, regarding voice activity detection)
    • The generated dataset contains the following:
      • The original training samples from Google Speech Commands, and 2 augmented versions of each sample with different speeds.
      • Extra training samples from LibriSpeech, used as further examples for the "background" class.
      • The original validation samples from Google Speech Commands, and 2 augmented versions of each sample with different speeds, time shifts, and added white noise.
      • The original test samples from Google Speech Commands without any augmentation.
  • Dataset creation is significantly faster (90 mins -> 4 mins), thanks to more efficient operations done in batches.
  • The network found via "Neural Architecture Search" is introduced. It is significantly more accurate than its predecessors (v2 & v3), at the cost of a higher parameter count, slightly more MACs, and higher latency (3.2 ms -> 3.9 ms).
  • From @EyubogluMerve: Added an automated evaluation notebook for specified noise types and SNR levels.
    • Added a new dataset (signalmixer.py)
    • Modified msnoise.py to:
      • include "Tradeshow" as another type of noise
      • carry out proper train/test splits
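
The dynamic augmentation described in the list above can be sketched roughly as follows. This is a minimal NumPy illustration, not the code in datasets/kws20.py: the class name `DynamicAugmentDataset`, the helpers `random_shift` and `add_white_noise`, and the default shift and SNR ranges are all assumptions made for the example. Training samples get a fresh random time shift and white-noise level on every access, while validation samples are returned unchanged because their augmentations are fixed at dataset creation.

```python
import numpy as np


def random_shift(wave, max_shift, rng):
    """Shift the waveform by a random number of samples, zero-padding the gap."""
    shift = int(rng.integers(-max_shift, max_shift + 1))
    out = np.zeros_like(wave)
    if shift > 0:
        out[shift:] = wave[:-shift]   # delay: pad the start with zeros
    elif shift < 0:
        out[:shift] = wave[-shift:]   # advance: pad the end with zeros
    else:
        out[:] = wave
    return out


def add_white_noise(wave, snr_db, rng):
    """Add Gaussian noise scaled so the result has roughly the requested SNR (dB)."""
    signal_power = np.mean(wave ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=wave.shape)
    return wave + noise


class DynamicAugmentDataset:
    """Hypothetical wrapper: re-augments training samples on every access."""

    def __init__(self, waves, labels, train,
                 max_shift=1600, snr_range=(5.0, 30.0), seed=0):
        # max_shift and snr_range are illustrative defaults, not values from the PR.
        self.waves = waves
        self.labels = labels
        self.train = train
        self.max_shift = max_shift
        self.snr_range = snr_range
        self.rng = np.random.default_rng(seed)

    def __len__(self):
        return len(self.waves)

    def __getitem__(self, idx):
        wave = self.waves[idx]
        if self.train:
            # New random shift and noise level each time the sample is read.
            wave = random_shift(wave, self.max_shift, self.rng)
            snr_db = self.rng.uniform(*self.snr_range)
            wave = add_white_noise(wave, snr_db, self.rng)
        # Validation/test samples pass through untouched for stable metrics.
        return wave, self.labels[idx]
```

Because the randomness is drawn inside `__getitem__`, two reads of the same training index give different waveforms, which is how each epoch ends up with unique examples without regenerating the dataset file.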

Summary of Improvements:

Along with the previous PR, we have improved the KWS20 accuracy from ~86.5% to 92.5% on the validation set which includes augmented samples, and from 87.6% to 93.7% on the clean test set.

The impact of each change on the KWS20 accuracy is as follows:

  • pytsmod tempo augmentation -> torchaudio speed augmentation: +1%
  • v3 -> v2 model: +1.5%
  • v2 -> NAS model: +2.5%
  • Dynamic noise & shift augmentation: +1%
  • Total: +6% absolute change in accuracy (86.5% -> 92.5%)
    • 44% relative decrease in error rate (13.5% -> 7.5%), with an even larger reduction in false alarm rates.
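
The arithmetic behind the totals above can be checked with a few lines. This is only a worked example using the accuracy figures quoted in this description; the function name `relative_error_reduction` is made up for illustration.

```python
def relative_error_reduction(acc_before, acc_after):
    """Relative drop in error rate (percent), given accuracies in percent."""
    err_before = 100.0 - acc_before
    err_after = 100.0 - acc_after
    return 100.0 * (err_before - err_after) / err_before


# Per-change accuracy gains listed above (percentage points).
gains = [1.0, 1.5, 2.5, 1.0]
total_gain = sum(gains)  # 6.0 points, matching 86.5% -> 92.5%

# Validation set (includes augmented samples): error 13.5% -> 7.5%.
val_reduction = relative_error_reduction(86.5, 92.5)   # about 44.4

# Clean test set figures quoted in the summary: 87.6% -> 93.7%.
test_reduction = relative_error_reduction(87.6, 93.7)  # about 49
```

The 44% figure is relative to the original error rate, which is why it is much larger than the 6-point absolute gain.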

@alicangok alicangok marked this pull request as draft January 8, 2024 11:32

alicangok commented Jan 8, 2024

Changed the pull request to "draft" mode while awaiting the requested changes at alicangok#1.

rotx-eva and others added 2 commits January 9, 2024 10:12
* changed files are added

* name changes are done

* Update msnoise.py copyrights

* Update signalmixer.py copyright notices

* Update Automated_Evaluation_KWS.ipynb copyright notices

* signalmixer parameters are updated

* Notebook is updated using current paths

* Correct os.path.join usage for non-Linux operating systems

* Define `data_path` once

---------

Co-authored-by: Alican Gök <alicangok@gmail.com>
@alicangok alicangok marked this pull request as ready for review January 15, 2024 12:06
@alicangok alicangok changed the title from "Add NAS KWS model and Dynamic Augmentation" to "Add NAS KWS model, Dynamic Augmentation and Automated Evaluation Script" Jan 15, 2024
@alicangok alicangok changed the title from "Add NAS KWS model, Dynamic Augmentation and Automated Evaluation Script" to "Add NAS KWS model, Dynamic Augmentation and Automated Evaluation Notebook" Jan 15, 2024

@ermanok ermanok left a comment

Added minor comments.

datasets/kws20.py (outdated, resolved)
datasets/kws20.py (outdated, resolved)

@aniktash aniktash left a comment

Some minor text updates suggested

notebooks/KWS_Noise_Evaluation.ipynb (outdated, resolved)
notebooks/KWS_Noise_Evaluation.ipynb (outdated, resolved)
notebooks/KWS_Noise_Evaluation.ipynb (outdated, resolved)

@MaximGorkem MaximGorkem left a comment

Only a few small comments; looks good and trains nicely.

datasets/kws20.py (outdated, resolved)
datasets/kws20.py (outdated, resolved)
@alicangok alicangok marked this pull request as draft January 24, 2024 22:30
@alicangok alicangok marked this pull request as ready for review January 24, 2024 22:44
@rotx-eva rotx-eva merged commit 3a4a661 into analogdevicesinc:develop Feb 1, 2024
2 checks passed
6 participants