20 - Speech Detection

Native Speech Recognition

The idea is to use the browser's native speech recognition to build a kind of speech notebook where each sentence ends up in its own paragraph: once the user stops talking, we append a new p element.

Notes

This exercise needs to be run on a server in order to have access to the media devices!

The Web Speech API has two components: SpeechRecognition and SpeechSynthesis. In this first project we focus on the first component.

We initialize it like this:

```js
// Chrome exposes the constructor under a webkit prefix
window.SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;

const recognition = new SpeechRecognition();
recognition.interimResults = true; // gives values as you speak rather than waiting until you finish talking
```
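
To hook it up to the page, the exercise keeps a reference to the paragraph currently being written and the container it lives in, then starts listening. A minimal sketch, assuming the container is a div with the class words (that selector is an assumption, not part of the snippet above):

```js
// Assumed container element; adjust the selector to match your markup
const words = document.querySelector('.words');

// Paragraph that interim results are written into
let p = document.createElement('p');
words.appendChild(p);

recognition.start();
```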

Handling the result event is the most important part of this exercise. Most of the relevant information lives in e.results: a list of results where each one exposes a transcript, a confidence value between 0 and 1, and an isFinal boolean that tells you the user has stopped talking, which is our cue to start a new paragraph.

```js
recognition.addEventListener('result', e => {
    // Stitch the transcript together from every result received so far
    const transcript = Array.from(e.results)
        .map(result => result[0])
        .map(result => result.transcript)
        .join('');

    p.textContent = transcript;

    // Once the utterance is final, start writing into a fresh paragraph
    if (e.results[0].isFinal) {
        p = document.createElement('p');
        words.appendChild(p);
    }
});
```
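
Each result alternative also carries the confidence score mentioned above. A small sketch that logs it alongside the transcript, purely for inspection and not part of the exercise itself:

```js
recognition.addEventListener('result', e => {
    // Log how confident the engine is about the first alternative of each result
    Array.from(e.results).forEach(result => {
        console.log(result[0].transcript, result[0].confidence); // confidence is between 0 and 1
    });
});
```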

Finally, we append the new paragraph to the div and restart the recognition when it ends, so we keep receiving result events (see the sketch after the list below).

Events

  • result -> fires with the transcription; after an utterance the recognition stops listening on its own
  • end -> we use it to start the recognition again 😀
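
A minimal sketch of that restart, assuming the recognition instance from above:

```js
// When the recognition stops (end of speech / silence), start it again
recognition.addEventListener('end', () => recognition.start());
```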

To Do's

  • Implement some basic Siri-like functionality (a starting-point sketch follows the list):
    • Siri weather
    • Siri NBA
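
One possible starting point, sketched under the assumption that we simply scan the final transcript for keywords; the keywords and the console.log actions are placeholders, not an actual implementation:

```js
recognition.addEventListener('result', e => {
    const transcript = Array.from(e.results)
        .map(result => result[0].transcript)
        .join('')
        .toLowerCase();

    // Only react once the sentence is final
    if (e.results[0].isFinal) {
        if (transcript.includes('weather')) {
            console.log('TODO: fetch and display the weather'); // placeholder action
        }
        if (transcript.includes('nba')) {
            console.log('TODO: fetch and display NBA scores'); // placeholder action
        }
    }
});
```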