azure_stt

API Wrapper for the Microsoft Azure Speech Services Speech-to-text REST API 3.1 (Cognitive Services).

Installation

Add this line to your application's Gemfile:

gem 'azure_stt'

And then execute:

bundle

Or install it yourself as:

gem install azure_stt

Azure Speech-to-text Subscription key

To use the gem, you need a subscription key, which you can generate from your Azure account.

If you don't have an Azure account, you can create one for free on this page.
Once logged in to the Azure portal, subscribe to Speech in Microsoft Cognitive Services.
You will find two subscription keys ('KEY 1' and 'KEY 2') under 'RESOURCE MANAGEMENT > Keys'.

Usage

Configuration

Two environment variables are used:

'REGION': the region of your subscription
'SUBSCRIPTION_KEY': the API key you can generate on your Azure account.

You can look at the file env.sample and change the values. If you do not want to use environment variables, you can configure the values like so:

AzureSTT.configure do |config|
  config.region = 'your_region'
  config.subscription_key = 'your_key'
end

Finally, the class AzureSTT::Session uses the values from the configuration by default, but you can initialize the session with custom values:

session = AzureSTT::Session.new(region: 'your_region', subscription_key: 'your_key')
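
If you read the settings yourself, it can help to fail early when one of them is missing. The following is a minimal sketch, not part of the gem: the helper name azure_stt_settings is hypothetical, and it only assumes the two environment variable names listed above.

```ruby
# Hypothetical helper, not part of azure_stt: reads the two documented
# environment variables and raises when one is missing or blank.
def azure_stt_settings(env = ENV)
  settings = {
    region: env['REGION'],
    subscription_key: env['SUBSCRIPTION_KEY']
  }

  # Collect the names of the settings that are nil or contain only whitespace.
  missing = settings.select { |_name, value| value.nil? || value.strip.empty? }.keys
  unless missing.empty?
    raise ArgumentError, "Missing Azure STT settings: #{missing.join(', ')}"
  end

  settings
end
```

The returned hash uses the same keys as the keyword arguments shown above, so it can be passed along as AzureSTT::Session.new(**azure_stt_settings).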

Start a transcription

require 'azure_stt'

properties = {
  "diarizationEnabled" => false,
  "wordLevelTimestampsEnabled" => false,
  "punctuationMode" => "DictatedAndAutomatic",
  "profanityFilterMode" => "Masked"
}

content_urls = [ 'https://path.com/audio.ogg', 'https://path.com/audio1.ogg']

session = AzureSTT::Session.new

transcription = session.create_transcription(
  content_urls: content_urls,
  properties: properties,
  locale: 'en-US',
  display_name: 'The name of the transcription')

# You can then retrieve the results of your transcription with the id
puts transcription.id
# Outputs 'your_transcription_id'

Get a transcription

require 'azure_stt'

session = AzureSTT::Session.new

transcription = session.get_transcription('your_transcription_id')

# Returns
# #<AzureSTT::Transcription id="d35a802d-70ae-4358-a35d-b5faa0c75457"
# # model="" properties={"diarizationEnabled"=>false,
# # "wordLevelTimestampsEnabled"=>false, "channels"=>[0, 1],
# # "punctuationMode"=>"DictatedAndAutomatic", "profanityFilterMode"=>"Masked",
# # "duration"=>"PT5M18S"}
# # links={"files"=>"https://uscentral.api.cognitive.microsoft.com/speechtotext/v3.1/transcriptions/d35a802d-70ae-4358-a35d-b5faa0c75457/files"}
# # last_action_date_time=#<Date: 2020-05-31 ((2459366j,0s,0n),+0s,2299161j)> created_date_time=#<Date: 2020-05-31 ((2459366j,0s,0n),+0s,2299161j)>
# # status="Succeeded" locale="en-US" display_name="Transcription name" files=[]>

if transcription.succeeded?
  # You can then access the text, for instance:
  result = transcription.results.first
  puts result.text
end
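
When a transcription was started with several content URLs, you may want the text of every result rather than only the first. Here is a small sketch under that assumption; the helper name transcription_texts is hypothetical, and it relies only on the calls used above (succeeded?, results and text).

```ruby
# Hypothetical helper, not part of azure_stt. It returns the text of every
# result, or an empty array when the transcription did not succeed.
def transcription_texts(transcription)
  return [] unless transcription.succeeded?

  transcription.results.map(&:text)
end
```

Returning an empty array for an unfinished or failed transcription keeps the caller free of nil checks; raising an error instead would be an equally reasonable choice.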

Delete a transcription

require 'azure_stt'

session = AzureSTT::Session.new

deleted = session.delete_transcription('your_transcription_id')

The API doesn't seem to return a 404 error when the id is unknown; it always sends a 204 response. As a result, Session#delete_transcription returns true even when the transcription did not exist.

Starting a transcription, fetching the results and deleting the transcription

require 'azure_stt'

session = AzureSTT::Session.new

properties = {
  "diarizationEnabled" => false,
  "wordLevelTimestampsEnabled" => false,
  "punctuationMode" => "DictatedAndAutomatic",
  "profanityFilterMode" => "Masked"
}

content_urls = ['https://path.com/audio.ogg']

transcription = session.create_transcription(
  content_urls: content_urls,
  properties: properties,
  locale: 'en-US',
  display_name: 'The name of the transcription')

id = transcription.id

# Poll every 30 seconds until the transcription has finished
until transcription.finished?
  sleep(30)
  transcription = session.get_transcription(id)
end

puts transcription.results.first.text if transcription.succeeded?

session.delete_transcription(id)
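
The polling loop above runs until the transcription finishes, however long that takes. If you would rather give up after a while, the loop can be wrapped in a helper with a deadline. This is only a sketch: wait_for_transcription and its interval and timeout parameters are hypothetical names, and the helper assumes nothing beyond get_transcription and finished? as used above.

```ruby
# Hypothetical helper, not part of azure_stt. `session` must respond to
# get_transcription(id), and the returned object to finished?.
def wait_for_transcription(session, id, interval: 30, timeout: 3600)
  # A monotonic clock is not affected by system clock adjustments.
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout

  loop do
    transcription = session.get_transcription(id)
    return transcription if transcription.finished?

    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise "Transcription #{id} did not finish within #{timeout} seconds"
    end

    sleep(interval)
  end
end
```

With a real session, the call would look like transcription = wait_for_transcription(session, id, timeout: 600), followed by the same succeeded? check as above.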

Development

After checking out the repo, run bin/setup to install dependencies. You can also run bin/console for an interactive prompt that will allow you to experiment.

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/PerfectMemory/azure_stt. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the Contributor Covenant code of conduct.

Code of Conduct

Everyone interacting in the AzureStt project’s codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.
