Notebook | Description |
---|---|
0.1_EDA | This notebook contains some basic Exploratory Data Analysis (EDA) of the web_app_data and its corresponding metadata. Specifically: - The participant metadata file is parsed - The arousal and valence data files are parsed and further analyzed in this notebook - Exploratory visualizations of (whole-recording) utterance durations and audio sample rate - Parsing of event data |
0.2_Arousal_Valence | Visual analysis of arousal and valence values. Specifically, it covers: - Arousal & valence over time - Arousal & valence for each picture stimulus (compared across groups) A statistical analysis of the arousal and valence values can be found in the r-script folder. |
Speech data processing | |
1_transform_audio | - Loads and scales the audio data - Resamples the audio to 16 kHz using PyTorch sinc interpolation - Scales the data to float32 [0, 1] - Saves the data as .wav and .npy arrays |
2_audio_quality_assessment | This notebook contains the visualizations that were used to create the GSSP_analysis. Furthermore, some additional visualizations are included which focus on single utterances (after manual inspection was completed). |
3_VAD_slicing | Applies an open-access (SpeechBrain/Hugging Face) Voice Activity Detection model to find the outer voiced bounds of each segment. Afterwards, these bounds are padded with a margin to ensure that the segments |
Speech feature extraction | |
5_OpenSMILE feature extraction | Performs (fixed-duration) feature extraction on the 16 kHz VAD-sliced audio segments. |
Speech analysis | |
OpenSMile Visualization | Visualizes the fixed-duration OpenSMILE GeMAPSv01b functional features with respect to the speech acquisition task. Note: this notebook also contains the visualizations of the full-utterance-duration OpenSMILE features. |
ECAPA-TDNN | Extracts and projects (using t-SNE) ECAPA-TDNN embeddings from fixed- and whole-duration utterances. A machine learning model is also used to assess the speech style separability of the embeddings. |
External validation | |
CGN Parsing | This notebook parses the orthographic description and speaker recordings of the Corpus Gesproken Nederlands (CGN). |
CGN Feature extraction | This notebook allows listening to excerpts of the selected CGN components and extracts the OpenSMILE GeMAPSv01b functional features. |
OpenSmile ML | This notebook performs a within-dataset validation on the GSSP web app dataset using the OpenSMILE GeMAPSv01b functional features. Afterwards, a subset of these functional features is selected to train on the whole web app dataset and predict on the CGN dataset, hinting at speech style generalizability. |
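
The padding step described for `3_VAD_slicing` can be sketched roughly as follows. This is a minimal illustration and not the notebook's actual code: the function names `padded_outer_bounds` and `slice_samples`, the 0.25 s default margin, and the fallback for recordings without detected speech are all assumptions. It also assumes the VAD model's output has already been converted to a list of `(start, end)` pairs in seconds.

```python
from typing import List, Sequence, Tuple


def padded_outer_bounds(
    voiced: List[Tuple[float, float]],
    duration_s: float,
    margin_s: float = 0.25,  # assumed value; the notebook's margin is not stated here
) -> Tuple[float, float]:
    """Return the (start, end) in seconds of the padded outer voiced region."""
    if not voiced:
        # Assumption: when no speech is detected, keep the whole recording.
        return 0.0, duration_s
    # Outer bounds: earliest voiced onset and latest voiced offset.
    start = min(s for s, _ in voiced)
    end = max(e for _, e in voiced)
    # Pad with the margin, then clip to the recording limits.
    return max(0.0, start - margin_s), min(duration_s, end + margin_s)


def slice_samples(
    samples: Sequence, sample_rate: int, bounds: Tuple[float, float]
) -> Sequence:
    """Cut the samples that fall inside the (start, end) bounds in seconds."""
    start_s, end_s = bounds
    first = int(round(start_s * sample_rate))
    last = int(round(end_s * sample_rate))
    return samples[first:last]


if __name__ == "__main__":
    # Two voiced regions in a 5 s recording sampled at 16 kHz.
    bounds = padded_outer_bounds([(1.0, 2.0), (3.0, 4.5)], duration_s=5.0)
    segment = slice_samples(list(range(5 * 16000)), 16000, bounds)
    print(bounds, len(segment))
```

Clipping to `[0, duration_s]` keeps the padded bounds valid when speech starts or ends close to the edge of the recording.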