PyTorch implementation of ASR frontends such as PCEN (per-channel energy normalization) and mel filterbank features.

TeaPoly/asr_frontend

ASR Frontend

A PyTorch implementation of ASR frontends, such as PCEN (per-channel energy normalization) and mel filter bank log energy.
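For orientation, mel filter bank log energy can be sketched in plain PyTorch as below. This is only an illustration under stated assumptions, not this repository's code: the function names (`mel_filterbank`, `log_mel_energy`), the HTK-style mel formula, and the defaults (16 kHz audio, 25 ms window, 10 ms hop, 40 channels) are all chosen for the example.

```python
import math

import torch


def hz_to_mel(hz):
    # HTK-style mel scale (an assumption; other conventions exist).
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    # Inverse of hz_to_mel; works on floats and tensors.
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular mel filters with shape (n_mels, n_fft // 2 + 1)."""
    n_bins = n_fft // 2 + 1
    fft_freqs = torch.linspace(0.0, sample_rate / 2.0, n_bins)
    # n_mels + 2 points equally spaced on the mel axis: edges plus centers.
    mel_pts = torch.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_mels + 2)
    hz_pts = mel_to_hz(mel_pts)
    lower, center, upper = hz_pts[:-2, None], hz_pts[1:-1, None], hz_pts[2:, None]
    rising = (fft_freqs[None, :] - lower) / (center - lower)
    falling = (upper - fft_freqs[None, :]) / (upper - center)
    return torch.clamp(torch.minimum(rising, falling), min=0.0)


def log_mel_energy(waveform, sample_rate=16000, n_fft=400, hop=160, n_mels=40, eps=1e-6):
    """waveform: (batch, samples) -> log mel energies: (batch, frames, n_mels)."""
    window = torch.hann_window(n_fft)
    spec = torch.stft(waveform, n_fft=n_fft, hop_length=hop, window=window,
                      return_complex=True)
    power = spec.abs() ** 2                          # (batch, bins, frames)
    mel = torch.matmul(mel_filterbank(n_mels, n_fft, sample_rate), power)
    return torch.log(mel + eps).transpose(1, 2)      # (batch, frames, n_mels)
```

The output layout (batch, frames, channels) matches the input shape used in the PCEN example in the Usage section. Note that PCEN is typically applied to the mel energies before the logarithm, so the `torch.log` step would be dropped in that case.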

Usage

The following is an example of using PCEN:

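import numpy as np
import torch

import pcen  # module provided by this repository

# Batch size, sequence length (frames), number of filterbank channels.
b, s, d = 32, 100, 40
# Random positive values standing in for real filterbank energies.
filterbanks = np.random.uniform(low=0.5, high=13.3, size=(b, s, d))
filterbanks = torch.from_numpy(filterbanks.astype(dtype=np.float32))
# PCEN layer over d channels, applied to the (b, s, d) input.
trainable_pcen = pcen.Pcen(d)
pcen_features = trainable_pcen(filterbanks)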
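To make clearer what a PCEN layer computes, here is a minimal reference sketch of the formula from the paper cited below: a per-channel first-order smoother M(t) = (1 - s) M(t-1) + s E(t), followed by PCEN(t) = (E(t) / (eps + M(t))^alpha + delta)^r - delta^r. The function name `pcen_reference`, the fixed (non-trainable) parameters, and initializing the smoother with the first frame are assumptions made for this example; the `pcen.Pcen` module in this repository may differ in parameterization and defaults.

```python
import torch


def pcen_reference(energy, s=0.025, alpha=0.98, delta=2.0, r=0.5, eps=1e-6):
    """Fixed-parameter PCEN over a (batch, frames, channels) tensor.

    energy must be non-negative (e.g. mel filterbank energies).
    """
    # Assumption: start the smoother from the first frame.
    m = energy[:, 0, :]
    smoothed = []
    for t in range(energy.shape[1]):
        # First-order IIR smoothing along time, independently per channel.
        m = (1.0 - s) * m + s * energy[:, t, :]
        smoothed.append(m)
    m_all = torch.stack(smoothed, dim=1)  # (batch, frames, channels)
    # Adaptive gain control followed by root compression.
    return (energy / (eps + m_all) ** alpha + delta) ** r - delta ** r


# Same input layout as the usage example above.
features = pcen_reference(torch.rand(32, 100, 40) + 0.5)
```

In a trainable variant, `alpha`, `delta`, `r`, and optionally `s` would be learned per channel rather than fixed as they are here.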

Citation

Wang, Yuxuan, Pascal Getreuer, Thad Hughes, Richard F. Lyon, and Rif A. Saurous. Trainable frontend for robust and far-field keyword spotting. In Acoustics, Speech and Signal Processing (ICASSP), 2017 IEEE International Conference on, pp. 5670-5674. IEEE, 2017.

@inproceedings{wang2017trainable,
  title={Trainable frontend for robust and far-field keyword spotting},
  author={Wang, Yuxuan and Getreuer, Pascal and Hughes, Thad and Lyon, Richard F and Saurous, Rif A},
  booktitle={Acoustics, Speech and Signal Processing (ICASSP), 2017 IEEE International Conference on},
  pages={5670--5674},
  year={2017},
  organization={IEEE}
}
