The goal of this project was to build an interactive physical interface that controls the articulatory and acoustic parameters of an articulatory speech synthesizer in real time to produce vowel sounds. The interface was built as part of the course UBC CPEN 541 (Human Interface Technologies), April 2018. The project was later extended, implemented in Artisynth, and presented as a poster at the 176th joint meeting of the ASA-CAA, Victoria, BC, Canada.
- The JASS SDK is used to control its built-in one-dimensional, area-function-based articulatory speech synthesizer.
- The synthesizer is driven by vocal tract parameters (changes in the area function) and vocal fold parameters (changes in source frequency and gain).
- To control the vocal tract parameters, we designed a physical tube whose cross-sectional area can be changed by the user. The interaction is captured with a document camera, and the change in area is computed in real time using an image processing algorithm.
- To control the vocal fold parameters, two interfaces were built and compared in terms of their flexibility and usability:
  - A mouse-based controller (using a two-dimensional interactive pad)
  - A slider sensor (based on slider positions)
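The area computation itself is done in MATLAB in this project, but the idea can be illustrated with a minimal Java sketch. Everything below (class and method names, the calibration factor) is hypothetical: given a thresholded camera frame in which `true` marks tube-interior pixels, each image column's pixel count is scaled into an area sample of the one-dimensional area function.

```java
// Hypothetical illustration only; the actual project computes the area
// function with a MATLAB image-processing pipeline.
public class AreaEstimator {

    // For each column of the binary frame, count interior pixels and scale
    // by a calibration factor (area units per pixel) to get an area sample.
    public static double[] areaFunction(boolean[][] frame, double areaPerPixel) {
        int rows = frame.length, cols = frame[0].length;
        double[] area = new double[cols];
        for (int c = 0; c < cols; c++) {
            int count = 0;
            for (int r = 0; r < rows; r++) {
                if (frame[r][c]) count++;
            }
            area[c] = count * areaPerPixel;
        }
        return area;
    }

    public static void main(String[] args) {
        boolean[][] frame = new boolean[4][3];
        frame[1][0] = true; frame[2][0] = true; // column 0: 2 interior pixels
        frame[1][1] = true;                     // column 1: 1 interior pixel
        double[] a = areaFunction(frame, 0.5);
        System.out.println(a[0] + " " + a[1] + " " + a[2]); // prints 1.0 0.5 0.0
    }
}
```

In the real setup, each area sample would then be written into the synthesizer's tract sections so that squeezing the physical tube reshapes the acoustic tube model.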
Tools Used: JASS SDK (Java-based), Artisynth, MATLAB (Image Processing Toolbox), Processing (p5.js), computer mouse, Arduino, Phidgets slider sensor, document camera, and others (boards, paper, black tape, etc.)
- Download the JASS SDK and configure it using Eclipse or any other IDE.
- Update the `VTNTDemo.java` file with the provided code.
- Experiment with the different source-controller options (slider sensor / mouse) by commenting out either of them:
```java
// SLIDER CONTROLLER
// valRosenberg[0] = ((double[]) proxy.getVariable("freq"))[0];
// valRosenberg[3] = ((double[]) proxy.getVariable("gain"))[0];
// END

// MOUSE CONTROLLER
// Map the pad position (a 675 x 675 px sketch) to source frequency and gain.
valRosenberg[0] = (sketch_obj.circleX / 675) * 1000;
valRosenberg[3] = sketch_obj.circleY / 675;
// END
```
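The mouse-controller lines above amount to a linear map from a normalized pad position to a source frequency and gain. As a standalone sketch (the class name, the 0–1000 Hz range, and the example coordinates are illustrative assumptions, not values taken from `VTNTDemo.java` beyond what the snippet shows):

```java
// Hypothetical sketch of the source-parameter mapping used by the
// controllers: a normalized position in [0, 1] becomes a frequency and gain.
public class SourceMapping {

    // Linear map from a normalized value in [0, 1] to the range [lo, hi].
    public static double map(double normalized, double lo, double hi) {
        return lo + normalized * (hi - lo);
    }

    public static void main(String[] args) {
        double x = 337.5 / 675.0;            // e.g. mouse X over pad width
        double y = 168.75 / 675.0;           // e.g. mouse Y over pad height
        double freq = map(x, 0.0, 1000.0);   // source frequency in Hz
        double gain = y;                     // gain taken directly, as in the snippet
        System.out.println(freq + " Hz, gain " + gain); // prints 500.0 Hz, gain 0.25
    }
}
```

The same `map` shape applies to the slider controller: the Phidgets slider reading, once normalized, can drive `valRosenberg[0]` and `valRosenberg[3]` in exactly the same way.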
[Note] The Arduino code for reading the slider sensor output is not provided, since you may be using different I/O pins; the code itself is simple and straightforward.
Team: Debasish Ray Mohapatra, Pramit Saha, Praneeth SV
My Contribution: Helped design the physical tube (vocal tract), Arduino programming (feeding data to the JASS SDK/Artisynth), and designing the mouse-controlled source model (vocal fold) using Processing in a Java environment.
For more details, please refer to the following papers:
[1] SOUND STREAM: Towards vocal sound synthesis via dual-handed simultaneous control of articulatory parameters. Course submission, classroom presentation.
[2] SOUND-STREAM II: Towards real-time gesture-controlled articulatory sound synthesis. Conference paper, 176th joint meeting of the ASA-CAA, Victoria, BC, Canada.
- The demo video demonstrates the working principle of our interface.
- A short PowerPoint presentation of our work.
In case of any queries, please contact debasishiter@gmail.com.