Commit

More intuitive session behaviour for api.ai
Uberi committed May 22, 2016
1 parent 49eec3f commit 14ef93a
Showing 2 changed files with 7 additions and 6 deletions.
6 changes: 3 additions & 3 deletions reference/library-reference.rst
@@ -221,16 +221,16 @@ Returns the most likely transcription if ``show_all`` is false (the default). Ot

Raises a ``speech_recognition.UnknownValueError`` exception if the speech is unintelligible. Raises a ``speech_recognition.RequestError`` exception if the speech recognition operation failed, if the key isn't valid, or if there is no internet connection.

-``recognizer_instance.recognize_api(audio_data, client_access_token, language = "en", session_id = "session", show_all = False)``
----------------------------------------------------------------------------------------------------------------------------------
+``recognizer_instance.recognize_api(audio_data, client_access_token, language = "en", session_id = None, show_all = False)``
+----------------------------------------------------------------------------------------------------------------------------

Perform speech recognition on ``audio_data`` (an ``AudioData`` instance), using the api.ai Speech to Text API.

The api.ai API client access token is specified by ``client_access_token``. Unfortunately, this is not available without `signing up for an account <https://console.api.ai/api-client/#/signup>`__ and creating an api.ai agent. To get the API client access token, go to the agent settings, go to the section titled "API keys", and look for "Client access token". API client access tokens are 32-character lowercase hexadecimal strings.

Although the recognition language is specified when creating the api.ai agent in the web console, it must also be provided in the ``language`` parameter as an RFC5646 language tag like ``"en"`` (US English) or ``"fr"`` (International French), defaulting to US English. A list of supported language values can be found in the `API documentation <https://api.ai/docs/reference/#languages>`__.

-The ``session_id`` is a string of up to 36 characters used to identify the client making the requests; api.ai can make use of previous requests that used the same session ID to give more accurate results for future requests.
+The ``session_id`` is an optional string of up to 36 characters used to identify the client making the requests; api.ai can make use of previous requests that used the same session ID to give more accurate results for future requests. If ``None``, sessions are not used; every query is interpreted as if it is the first one.

Returns the most likely transcription if ``show_all`` is false (the default). Otherwise, returns the `raw API response <https://api.ai/docs/reference/#a-namepost-multipost-query-multipart>`__ as a JSON dictionary.
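
The token format described above (32 lowercase hexadecimal characters) could be checked locally before a request is sent. The following is only a minimal sketch: ``looks_like_client_access_token`` is a hypothetical helper for illustration and is not part of the library, which only asserts that the token is a string.

```python
import re

# Format stated in the documentation: 32 lowercase hexadecimal characters.
_TOKEN_PATTERN = re.compile(r"[0-9a-f]{32}")


def looks_like_client_access_token(token):
    """Hypothetical helper: True if `token` matches the documented format."""
    return isinstance(token, str) and _TOKEN_PATTERN.fullmatch(token) is not None
```

A check like this only catches obvious mistakes such as a truncated or upper-cased token; it cannot tell whether api.ai will accept the token.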

7 changes: 4 additions & 3 deletions speech_recognition/__init__.py
@@ -799,15 +799,15 @@ def recognize_bing(self, audio_data, key, language = "en-US", show_all = False):
if "header" not in result or "lexical" not in result["header"]: raise UnknownValueError()
return result["header"]["lexical"]

-def recognize_api(self, audio_data, client_access_token, language = "en", session_id = "session", show_all = False):
+def recognize_api(self, audio_data, client_access_token, language = "en", session_id = None, show_all = False):
"""
Perform speech recognition on ``audio_data`` (an ``AudioData`` instance), using the api.ai Speech to Text API.
The api.ai API client access token is specified by ``client_access_token``. Unfortunately, this is not available without `signing up for an account <https://console.api.ai/api-client/#/signup>`__ and creating an api.ai agent. To get the API client access token, go to the agent settings, go to the section titled "API keys", and look for "Client access token". API client access tokens are 32-character lowercase hexadecimal strings.
Although the recognition language is specified when creating the api.ai agent in the web console, it must also be provided in the ``language`` parameter as an RFC5646 language tag like ``"en"`` (US English) or ``"fr"`` (International French), defaulting to US English. A list of supported language values can be found in the `API documentation <https://api.ai/docs/reference/#languages>`__.
-The ``session_id`` is a string of up to 36 characters used to identify the client making the requests; api.ai can make use of previous requests that used the same session ID to give more accurate results for future requests.
+The ``session_id`` is an optional string of up to 36 characters used to identify the client making the requests; api.ai can make use of previous requests that used the same session ID to give more accurate results for future requests. If ``None``, sessions are not used; every query is interpreted as if it is the first one.
Returns the most likely transcription if ``show_all`` is false (the default). Otherwise, returns the `raw API response <https://api.ai/docs/reference/#a-namepost-multipost-query-multipart>`__ as a JSON dictionary.
@@ -816,7 +816,7 @@ def recognize_api(self, audio_data, client_access_token, language = "en", sessio
assert isinstance(audio_data, AudioData), "Data must be audio data"
assert isinstance(client_access_token, str), "`username` must be a string"
assert isinstance(language, str), "`language` must be a string"
-assert isinstance(session_id, str) and len(session_id) <= 36, "`session_id` must be a string of up to 36 characters"
+assert session_id is None or (isinstance(session_id, str) and len(session_id) <= 36), "`session_id` must be a string of up to 36 characters"

wav_data = audio_data.get_wav_data(convert_rate = 16000, convert_width = 2) # audio must be 16-bit mono 16 kHz
url = "https://api.api.ai/v1/query"
@@ -827,6 +827,7 @@ def recognize_api(self, audio_data, client_access_token, language = "en", sessio
if boundary.encode("utf-8") not in wav_data:
break

+if session_id is None: session_id = uuid.uuid4().hex
data = (
b"--" + boundary.encode("utf-8") + b"\r\n" +
b"Content-Disposition: form-data; name=\"request\"\r\n" +
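
The change in this commit accepts ``None`` for ``session_id`` and then substitutes a random UUID hex string, so a call without an explicit ID does not share history with any earlier request. The sketch below isolates that logic under some assumptions: ``resolve_session_id`` and ``MAX_SESSION_ID_LENGTH`` are hypothetical names used only for illustration, and the sketch raises ``ValueError`` where the library itself uses ``assert``.

```python
import uuid

MAX_SESSION_ID_LENGTH = 36  # limit stated in the docstring


def resolve_session_id(session_id=None):
    """Hypothetical helper mirroring the validation and fallback in the diff."""
    if session_id is None:
        # No session requested: use a throwaway random ID so this request
        # is not linked to any previous one.
        return uuid.uuid4().hex
    if not isinstance(session_id, str) or len(session_id) > MAX_SESSION_ID_LENGTH:
        raise ValueError("`session_id` must be None or a string of up to 36 characters")
    return session_id


first = resolve_session_id()                      # fresh random ID
second = resolve_session_id()                     # a different random ID
pinned = resolve_session_id("kitchen-assistant")  # caller-chosen ID is kept
```

Passing the same explicit string on every call is what would let api.ai relate consecutive requests; omitting it gives each call its own one-off session.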