Speech-To-Text using Azure Cognitive Service

The Speech-to-Text feature of Azure Cognitive Services transcribes spoken words from a microphone or an audio recording.

This tutorial shows how to use Azure Cognitive Services to transcribe a single file or a batch of files.

Fun fact: I am using this very feature of Azure Cognitive Services, embedded in Microsoft Word, to dictate part of this tutorial.

The sample code in this example is adapted from the samples in the following repo: [https://github.com/Azure-Samples/cognitive-services-speech-sdk]

Prerequisites

1 - Azure subscription ("Azure account")
2 - Deploy a Speech-to-text resource from the Azure portal
3 - Get the resource key and region. After your Speech resource is deployed, select Go to resource to view and manage keys. For more information about Cognitive Services resources, see Get the keys for your resource.

How to run this example?

Use the steps below to transcribe a .wav file and an MP3 recording.
I transcribe English from the audio recordings, but the Azure Cognitive Service supports more than 100 languages and variants.

Step 1 : Create the "utils.py" file

Create the "utils.py" file and add the AZURE_SPEECH_KEY (resource key) and AZURE_SERVICE_REGION (e.g. "eastus" or "westus") values retrieved from the Azure portal.
Keep the file at the root of the project folder, as shown below.

/.
/_utils.py
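
The contents of "utils.py" are not shown in this README, so the following is only a minimal sketch of what the file could look like. The two constant names come from the step above; reading them from environment variables and the placeholder defaults are assumptions made here to keep the key out of version control.

```python
# utils.py (hypothetical contents; the original file is not shown here)
import os

# Assumption: values come from environment variables of the same name.
# The fallback strings are placeholders, not working credentials.
AZURE_SPEECH_KEY = os.environ.get("AZURE_SPEECH_KEY", "<your-resource-key>")
AZURE_SERVICE_REGION = os.environ.get("AZURE_SERVICE_REGION", "eastus")
```

The other scripts could then import the values with `from utils import AZURE_SPEECH_KEY, AZURE_SERVICE_REGION`.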

Step 2: Create your virtual environment

For my virtual environment, I use pipenv. To set it up, see [https://pypi.org/project/pipenv/]. In Python, a virtual environment provides isolation, so you can test your application without changing the configuration of your workstation (local machine or remote server).

Step 3 : Running Batch transcript (.wav)

Run the command below in your terminal to transcribe the ".wav" files listed in "testcase.py".

    python callback_batch_transcript.py
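
The script itself is not reproduced in this README. As a rough idea of what a callback-based batch transcription can look like with the Speech SDK for Python (the `azure-cognitiveservices-speech` package), here is a sketch. The function names, the list-of-paths input, and the "output/<name>.txt" layout are assumptions for illustration, not the actual contents of "callback_batch_transcript.py" or "testcase.py".

```python
import threading
from pathlib import Path


def output_path_for(audio_path, output_dir="output"):
    """Map e.g. audio/sample.wav to output/sample.txt (layout is assumed)."""
    return Path(output_dir) / (Path(audio_path).stem + ".txt")


def transcribe_wav(audio_path, key, region, language="en-US"):
    """Transcribe one .wav file using continuous recognition and callbacks."""
    # Imported here so the helper above works without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    speech_config.speech_recognition_language = language
    audio_config = speechsdk.audio.AudioConfig(filename=str(audio_path))
    recognizer = speechsdk.SpeechRecognizer(
        speech_config=speech_config, audio_config=audio_config
    )

    phrases = []
    done = threading.Event()
    # Each finalized phrase arrives through the `recognized` event.
    recognizer.recognized.connect(lambda evt: phrases.append(evt.result.text))
    # Stop waiting when the session ends or is canceled (end of file, error).
    recognizer.session_stopped.connect(lambda evt: done.set())
    recognizer.canceled.connect(lambda evt: done.set())

    recognizer.start_continuous_recognition()
    done.wait()
    recognizer.stop_continuous_recognition()
    return " ".join(p for p in phrases if p)


def transcribe_batch(audio_paths, key, region):
    """Transcribe each file in turn and write the text to its output path."""
    for audio_path in audio_paths:
        text = transcribe_wav(audio_path, key, region)
        target = output_path_for(audio_path)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(text, encoding="utf-8")
```

Continuous recognition is used rather than a single `recognize_once()` call because the latter stops after the first utterance, which is usually too short for a full recording. To transcribe a language other than English, pass a different locale (for example "fr-FR") as `language`.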

Step 4 : Transcribe mp3 to file

To transcribe MP3 files on Windows, you will need to install a few more packages. See "How to use compressed input audio" in the Speech SDK documentation to learn more.

Run the command below in your terminal to transcribe the file named "GRF19740809_64kb.mp3" located in the "audio" folder.

    python transcribe-mp3-to-file.py
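
For readers who want to see how compressed input differs from the .wav case, here is a hedged sketch using the Speech SDK's push stream with an MP3 container format. It is not the content of "transcribe-mp3-to-file.py"; the function names and chunk size are made up for illustration, and the SDK relies on GStreamer being installed to decode compressed audio.

```python
import threading


def read_chunks(path, chunk_size=4096):
    """Yield the file's bytes in chunks of at most chunk_size."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk


def transcribe_mp3(mp3_path, key, region, language="en-US"):
    """Push MP3 bytes into the recognizer and return the recognized text."""
    # Imported here so read_chunks can be used without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    speech_config.speech_recognition_language = language

    # Tell the SDK the stream carries MP3 data instead of raw PCM.
    mp3_format = speechsdk.audio.AudioStreamFormat(
        compressed_stream_format=speechsdk.AudioStreamContainerFormat.MP3
    )
    stream = speechsdk.audio.PushAudioInputStream(stream_format=mp3_format)
    audio_config = speechsdk.audio.AudioConfig(stream=stream)
    recognizer = speechsdk.SpeechRecognizer(
        speech_config=speech_config, audio_config=audio_config
    )

    phrases = []
    done = threading.Event()
    recognizer.recognized.connect(lambda evt: phrases.append(evt.result.text))
    recognizer.session_stopped.connect(lambda evt: done.set())
    recognizer.canceled.connect(lambda evt: done.set())

    recognizer.start_continuous_recognition()
    for chunk in read_chunks(mp3_path):
        stream.write(chunk)
    stream.close()  # signals end of audio so the session can finish
    done.wait()
    recognizer.stop_continuous_recognition()
    return " ".join(p for p in phrases if p)
```

The returned text can then be written to a file in the "output" folder, for example with `pathlib.Path("output/transcript.txt").write_text(text, encoding="utf-8")` (the file name here is illustrative).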

What next?

Transcribe a conversation
Deploy the solution on Azure using GitHub Actions
