Add AzureCognitiveServicesToolkit to call Azure Cognitive Services API #5012

Merged 27 commits on May 23, 2023

whiskyboy
Contributor

Add AzureCognitiveServicesToolkit to call Azure Cognitive Services API: achieve some multimodal capabilities

This PR adds a toolkit named AzureCognitiveServicesToolkit which bundles the following tools:

  • AzureCogsImageAnalysisTool: calls Azure Cognitive Services image analysis API to extract caption, objects, tags, and text from images.
  • AzureCogsFormRecognizerTool: calls Azure Cognitive Services form recognizer API to extract text, tables, and key-value pairs from documents.
  • AzureCogsSpeech2TextTool: calls Azure Cognitive Services speech to text API to transcribe speech to text.
  • AzureCogsText2SpeechTool: calls Azure Cognitive Services text to speech API to synthesize text to speech.

This toolkit can be used to process image, document, and audio inputs.
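
To illustrate the bundling idea described above, here is a minimal, dependency-free sketch. It is not the code from this PR: the `Tool` stand-in, the `_stub` helper, the `AzureCognitiveServicesToolkitSketch` class, and the example URL are all hypothetical, and the stubs only echo their input instead of calling Azure. Only the four tool names and their descriptions come from the list above; the `get_tools()` method is assumed to follow the usual toolkit convention.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Tool:
    """Hypothetical stand-in for a tool: a name, a description, and a callable."""
    name: str
    description: str
    func: Callable[[str], str]

    def run(self, query: str) -> str:
        return self.func(query)


def _stub(service: str) -> Callable[[str], str]:
    # Placeholder for a real Azure Cognitive Services call (assumption):
    # it only reports what would have been sent to the service.
    def call(query: str) -> str:
        return f"[{service}] would process: {query}"
    return call


def _default_tools() -> List[Tool]:
    # Tool names and descriptions mirror the PR description; bodies are stubs.
    return [
        Tool("AzureCogsImageAnalysisTool",
             "Extract caption, objects, tags, and text from images.",
             _stub("image analysis")),
        Tool("AzureCogsFormRecognizerTool",
             "Extract text, tables, and key-value pairs from documents.",
             _stub("form recognizer")),
        Tool("AzureCogsSpeech2TextTool",
             "Transcribe speech to text.",
             _stub("speech to text")),
        Tool("AzureCogsText2SpeechTool",
             "Synthesize text to speech.",
             _stub("text to speech")),
    ]


@dataclass
class AzureCognitiveServicesToolkitSketch:
    """Hypothetical toolkit that bundles the four stubbed tools."""
    tools: List[Tool] = field(default_factory=_default_tools)

    def get_tools(self) -> List[Tool]:
        # Return a copy so callers cannot mutate the toolkit's own list.
        return list(self.tools)


toolkit = AzureCognitiveServicesToolkitSketch()
tools_by_name = {tool.name: tool for tool in toolkit.get_tools()}
result = tools_by_name["AzureCogsImageAnalysisTool"].run("https://example.com/photo.png")
print(result)
```

An agent would receive the list from `get_tools()` and pick a tool by its name and description, which is why each tool carries a short description of the input it handles.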

@hwchase17 and @vowelparrot, I would be glad to hear your thoughts!

Contributor

@hwchase17 hwchase17 left a comment

this looks really solid! there are some lint errors, but we can help fix those up if you don't get to it

thanks for this!! really excited about this

@whiskyboy
Contributor Author

whiskyboy commented May 21, 2023

Thanks @hwchase17 for reviewing it!

I just reformatted and linted the code, so hopefully it passes the checks now.

@hwchase17 @vowelparrot Could you approve the workflow runs so the checks can start? Thanks!

@dev2049 added the "03 enhancement" and "lgtm" labels on May 22, 2023
Contributor

@dev2049 dev2049 left a comment

couple small comments but overall looks great!

langchain/tools/azure_cognitive_services/image_analysis.py (outdated, resolved)
@dev2049 dev2049 merged commit d7f807b into langchain-ai:master May 23, 2023
vowelparrot pushed a commit that referenced this pull request May 24, 2023
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
@danielchalef danielchalef mentioned this pull request Jun 5, 2023
This was referenced Jun 25, 2023
Labels: 03 enhancement, lgtm
3 participants