Automatic LInguistic Unit Count Estimator (ALICE)

Introduction

ALICE is a tool for estimating the number of adult-spoken linguistic units in child-centered audio recordings captured by microphones worn by children. It is meant as an open-source alternative to the LENA adult word count (AWC) estimator [1].

ALICE uses SylNet [2] for feature extraction and a voice type classifier [3] for broad-class speaker diarization. The model used for linguistic unit counts has been optimized across four languages: Argentinian Spanish, Tseltal, Yélî Dnye, and English (American and UK variants). The SylNet model has been adapted for daylong child-centered audio, starting from the baseline model available in standard SylNet.

ALICE outputs estimates of the number of phonemes, syllables, and words in the input. Only speech classified as spoken by adult male or female talkers contributes to the counts.

Unit counts from ALICE are not (and are not meant to be) accurate at short time-scales; they are optimized for counting across several minutes of audio. Also note that ALICE is NOT designed for "typical" high-quality audio recordings and may not operate properly on such data.
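Because the estimates are only meant to be used over longer stretches of audio, downstream analyses would typically sum them over windows of several minutes. The following is a minimal sketch of such aggregation, not part of ALICE itself: the `SegmentEstimate` record, the `aggregate_counts` function, and the 5-minute default window are all hypothetical and do not reflect ALICE's actual output format.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class SegmentEstimate(NamedTuple):
    """Hypothetical per-segment estimate (NOT ALICE's actual output format)."""
    onset_s: float     # segment start time in seconds
    phonemes: float    # estimated phoneme count for the segment
    syllables: float   # estimated syllable count for the segment
    words: float       # estimated word count for the segment


def aggregate_counts(segments: Iterable[SegmentEstimate],
                     window_s: float = 300.0) -> Dict[int, Dict[str, float]]:
    """Sum unit-count estimates within fixed-length windows.

    The 300-second default is an assumption chosen for illustration only.
    Returns a mapping from window index to summed counts.
    """
    totals: Dict[int, Dict[str, float]] = defaultdict(
        lambda: {"phonemes": 0.0, "syllables": 0.0, "words": 0.0}
    )
    for seg in segments:
        # Assign each segment to the window containing its onset.
        window = int(seg.onset_s // window_s)
        totals[window]["phonemes"] += seg.phonemes
        totals[window]["syllables"] += seg.syllables
        totals[window]["words"] += seg.words
    return dict(totals)
```

For example, segments starting at 10 s and 200 s fall into window 0, while a segment starting at 310 s falls into window 1; analyses would then use the per-window sums rather than the individual segment values.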

How to use ?

How to cite ?

If you use ALICE or its derivatives, please cite the following paper:

Räsänen, O., Seshadri, S., Lavechin, M., Cristia, A. & Casillas, M. (in press): ALICE: An open-source tool
for automatic linguistic unit count estimation from child-centered daylong recordings. Behavior Research Methods.
Online open access: https://link.springer.com/article/10.3758/s13428-020-01460-x

If you use the speaker diarization output (e.g., to compute conversational turns), please cite the following paper:

Lavechin, M., Bousbib, R., Bredin, H., Dupoux, E., & Cristia, A. (2020).
An open-source voice type classifier for child-centered daylong recordings. Interspeech.
Online open access: https://www.isca-archive.org/interspeech_2020/lavechin20_interspeech.pdf

References

[1] Xu, D., Yapanel, U., Gray, S., Gilkerson, J., Richards, J., & Hansen, J. (2008).
    Signal processing for young child speech language development.
    Proceedings of the 1st Workshop on Child Computer and Interaction (WOCCI-2008), Chania, Crete, Greece.
    (https://www.lena.org/)

[2] Seshadri, S., & Räsänen, O. (2019). SylNet: An Adaptable End-to-End Syllable Count Estimator for Speech.
    IEEE Signal Processing Letters, vol. 26, pp. 1359-1363. (https://github.com/shreyas253/SylNet)

[3] Lavechin, M., Bousbib, R., Bredin, H., Dupoux, E., & Cristia, A. (2020).
    An open-source voice type classifier for child-centered daylong recordings. Interspeech.
    (https://github.com/MarvinLvn/voice-type-classifier)
