GSSP analysis notebooks

Notebook Description
0.1_EDA This notebook contains basic Exploratory Data Analysis (EDA) of the web_app_data and its corresponding metadata. Specifically:

- The participant metadata file is parsed
- The arousal and valence data files are parsed and further analyzed in this notebook
- Exploratory visualizations of (whole recording) utterance durations and audio sample rates
- Parsing of event data
0.2_Arousal_Valence Visual analysis of arousal and valence values. Specifically, it covers:
- Arousal & valence over time
- Arousal & valence for each picture stimulus (compared across groups)

A statistical analysis of the arousal and valence values can be found in the r-script folder.
Speech data processing
1_transform_audio - loads and scales the audio data
- resamples audio to 16 kHz using PyTorch sinc interpolation
- scales the data to float32 [0, 1]
- saves the data as .wav files and .npy arrays
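The scaling and .npy export steps listed above could look roughly like the sketch below. It assumes min-max scaling is what "float32 [0, 1]" refers to, which the notebook may not actually do; the function name `scale_to_unit_range`, the file name, and the sample values are hypothetical. Resampling and .wav export are left out.

```python
import os
import tempfile

import numpy as np


def scale_to_unit_range(samples):
    """Min-max scale a 1-D signal to float32 values in [0, 1] (assumed scaling)."""
    x = np.asarray(samples, dtype=np.float32)
    lo, hi = x.min(), x.max()
    if hi == lo:
        # Constant signal: nothing to scale, return zeros of the same shape.
        return np.zeros_like(x)
    return (x - lo) / (hi - lo)


# Hypothetical toy signal standing in for a resampled recording.
scaled = scale_to_unit_range([-2.0, 0.0, 2.0])

# Store the scaled signal as a .npy array and read it back.
out_path = os.path.join(tempfile.mkdtemp(), "utterance.npy")
np.save(out_path, scaled)
restored = np.load(out_path)
```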
2_audio_quality_assessment This notebook contains the visualizations that were used to create the GSSP_analysis. It also includes some additional visualizations that focus on single utterances (after manual inspection was completed).
3_VAD_slicing Applies an open-access (speechbrain/huggingface) Voice Activity Detection model to find the outer voiced bounds of each segment. Afterwards, these bounds are padded with a margin to ensure that the segments
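As a minimal sketch of the "outer voiced bounds plus margin" idea from 3_VAD_slicing: given the voiced intervals returned by a VAD model, take the earliest onset and the latest offset, pad both, and clip to the recording. The function names, the margin value, and the example intervals are hypothetical; the notebook's actual margin and VAD output format are not stated here.

```python
from typing import List, Optional, Tuple


def outer_voiced_bounds(
    segments: List[Tuple[float, float]],
    margin_s: float,
    duration_s: float,
) -> Optional[Tuple[float, float]]:
    """Return padded (start, end) in seconds spanning all voiced segments."""
    if not segments:
        return None  # no speech detected
    start = min(s for s, _ in segments)
    end = max(e for _, e in segments)
    # Pad with the margin, but never step outside the recording.
    return max(0.0, start - margin_s), min(duration_s, end + margin_s)


def to_sample_indices(bounds: Tuple[float, float], sample_rate: int) -> Tuple[int, int]:
    """Convert second-based bounds to sample indices for array slicing."""
    return int(round(bounds[0] * sample_rate)), int(round(bounds[1] * sample_rate))


# Hypothetical VAD output (seconds) for a 4.2 s recording.
vad_segments = [(1.2, 2.0), (0.5, 0.9), (3.0, 4.1)]
bounds = outer_voiced_bounds(vad_segments, margin_s=0.25, duration_s=4.2)
start_idx, end_idx = to_sample_indices(bounds, sample_rate=16000)
```

Slicing a 16 kHz signal array with `signal[start_idx:end_idx]` would then yield the padded voiced region.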
Speech feature extraction
5_OpenSMILE feature extraction Performs (fixed duration) feature extraction on the 16 kHz VAD-sliced audio segments
Speech analysis
OpenSMile Visualization Visualizes the fixed duration OpenSMILE GeMAPSv01b functional features with respect to the speech acquisition task.
Note: this notebook contains the visualizations of the full utterance duration OpenSMILE features.
ECAPA-TDNN Extracts and projects (using t-SNE) ECAPA-TDNN embeddings from fixed and whole duration utterances.

A machine learning model is also utilized to assess the speech style separability of the embeddings.
External validation
CGN Parsing This notebook parses the orthographic descriptions and speaker recordings of the Corpus Gesproken Nederlands (CGN).
CGN Feature extraction This notebook allows listening to excerpts of the selected CGN components and extracts the OpenSMILE GeMAPSv01b functional features.
OpenSmile ML This notebook performs a validation within the GSSP web app dataset using the OpenSMILE GeMAPSv01b functional features.

Afterwards, a subset of these functional features is selected to train on the whole web app dataset and predict on the CGN dataset, which hints at speech style generalizability.
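The train-on-web-app, predict-on-CGN protocol described above can be illustrated with a small sketch. It is not the notebook's model: a nearest-centroid classifier stands in for whatever estimator is actually used, and the feature matrices, labels, and selected column indices are made-up toy values. The point it shows is that the feature subset and the normalization statistics are fixed on the training set only and then reused unchanged on the external set.

```python
import numpy as np


def fit_standardizer(X):
    """Compute per-feature mean and std on the training data only."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant features
    return mean, std


def fit_centroids(X, y):
    """Stand-in classifier: one centroid per class."""
    classes = np.unique(y)
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids


def predict(X, classes, centroids):
    """Assign each row to the class of its nearest centroid."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]


# Hypothetical functional features: rows are utterances, columns are features.
X_webapp = np.array([[0, 0, 5], [0, 1, 1], [1, 0, 9],
                     [10, 10, 2], [10, 11, 7], [11, 10, 4]], dtype=float)
y_webapp = np.array([0, 0, 0, 1, 1, 1])  # two hypothetical speech styles
X_cgn = np.array([[0.5, 0.5, 3], [9, 9, 8]], dtype=float)

selected = [0, 1]  # hypothetical feature subset chosen on the web app data

# Fit everything on the web app data, then apply it unchanged to CGN.
mean, std = fit_standardizer(X_webapp[:, selected])
classes, centroids = fit_centroids((X_webapp[:, selected] - mean) / std, y_webapp)
cgn_pred = predict((X_cgn[:, selected] - mean) / std, classes, centroids)
```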