An open-source speaker embedding extractor for various tasks.
2015, Heigold et al., Google, End-to-End Text-Dependent Speaker Verification
Application: Text Dependent
Feature: 40 dim Filterbank
Neural Net Architecture: Maxout-DNN(4-layer), embedding layer before softmax
Loss Function: Cross entropy Loss
Normalization: L2 Norm
Dataset Size: 646 speakers
Baseline: i-vector
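The entry above trains frame-level embeddings with cross entropy and scores them after L2 normalization. A minimal numpy sketch of that scoring step (utterance averaging, L2 norm, cosine score); the function names are illustrative, not from the paper:

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    # L2 Norm as listed above: scale each embedding to unit length.
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def utterance_dvector(frame_embeddings):
    # Average frame-level embeddings (taken before softmax), then L2-normalize.
    # frame_embeddings: (num_frames, dim)
    return l2_normalize(frame_embeddings.mean(axis=0))

def verification_score(enroll_dvectors, test_dvector):
    # Enrollment model = mean of per-utterance d-vectors; score = cosine similarity.
    model = l2_normalize(np.mean(enroll_dvectors, axis=0))
    return float(np.dot(model, test_dvector))
```

At evaluation time the cosine score is compared against a threshold to accept or reject the trial.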
2018, Wan et al., Google, Generalized End-to-End Loss for Speaker Verification
Application: Text Dependent ("OK Google") and Text Independent
Feature: 40 dim Filterbank
Neural Net Architecture: 3 layer LSTM
Loss Function: GE2E (Generalized end-to-end loss)
Normalization: L2 Norm
Window Size: 1.6 second overlap 50%, element wise average
Dataset Size: 1000 speakers, avg. 6.3 enrollment and 7.2 evaluation utterances per speaker
Baseline: TE2E (Tuple based end-to-end loss)
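The GE2E loss scores every utterance embedding against every speaker centroid, excluding the utterance itself from its own centroid. A minimal numpy sketch of the softmax variant, assuming a batch of N speakers with M utterances each and learnable scale/offset fixed here at w=10, b=-5 for illustration:

```python
import numpy as np

def ge2e_loss(embeddings, w=10.0, b=-5.0):
    # embeddings: (N_speakers, M_utts, dim)
    N, M, _ = embeddings.shape
    centroids = embeddings.mean(axis=1)  # (N, dim)
    loss = 0.0
    for j in range(N):
        for i in range(M):
            e = embeddings[j, i]
            sims = np.empty(N)
            for k in range(N):
                if k == j:
                    # Exclude e_ji from its own centroid to avoid a trivial solution.
                    c = (embeddings[j].sum(axis=0) - e) / (M - 1)
                else:
                    c = centroids[k]
                cos = np.dot(e, c) / (np.linalg.norm(e) * np.linalg.norm(c) + 1e-8)
                sims[k] = w * cos + b
            # Softmax variant: pull e_ji toward its own centroid, push from all others.
            loss += -sims[j] + np.log(np.sum(np.exp(sims)))
    return loss / (N * M)
```

For well-separated speaker clusters the loss approaches zero; if all speakers collapse to the same embedding it approaches log N.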
2017, Li et al., Baidu, Deep Speaker: An End-to-End Neural Speaker Embedding System
Application: Text Dependent and Independent
Feature: 64 dim Filterbank
Neural Net Architecture: ResNet CNN and GRU
Pre-training: Yes (softmax pre-training)
Loss Function: Triplet loss with cosine distance metric
Feature Normalization: Zero mean unit variance
Dataset: Mandarin and English (not public)
Dataset Size: 250,000 speakers
Baseline: DNN i-vector system
GPUs: 16 K40 GPUs
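The triplet loss with a cosine distance metric listed above can be sketched in a few lines of numpy; the margin value here is illustrative, not taken from the paper:

```python
import numpy as np

def cosine_triplet_loss(anchor, positive, negative, margin=0.1):
    # Triplet loss on cosine similarity: require the anchor to be more similar
    # to the positive (same speaker) than to the negative (different speaker)
    # by at least `margin`.
    def cos(a, b):
        return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)
    return max(0.0, margin + cos(anchor, negative) - cos(anchor, positive))
```

In training, triplets are mined from a batch; the softmax pre-training listed above gives the network a good initialization before switching to this loss.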
2017, Snyder et al., JHU, Deep Neural Network Embeddings for Text-Independent Speaker Verification
Feature: 20 dim MFCC
Neural Net Architecture: n-layer LSTM
Loss Function: Cross entropy Loss
Eval Set: SRE2018 Dev Set
Dataset: 1 TB held out from SRE2018
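Several entries above train the embedding network with cross entropy over speaker labels. A minimal numpy sketch of that loss for a single example, with the usual max-subtraction for numerical stability:

```python
import numpy as np

def softmax_cross_entropy(logits, label):
    # Cross entropy over speaker classes: -log softmax(logits)[label].
    z = logits - logits.max()          # stabilize before exponentiating
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[label]
```

With uniform logits over N speakers the loss is exactly log N, which is a useful sanity check that a model has started learning.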
Dependencies:
kaldi_io
numpy
PyTorch