
azure-ai-inference sdk support for Managed Online Endpoint LLM deployments. AML, AI Foundry #39025

Closed
jakeatmsft opened this issue Jan 3, 2025 · 7 comments
Assignees
Labels
  • AI Model Inference: Issues related to the client library for Azure AI Model Inference (\sdk\ai\azure-ai-inference)
  • AI
  • customer-reported: Issues that are reported by GitHub users external to the Azure organization.
  • needs-team-attention: Workflow: This issue needs attention from Azure service team or SDK team
  • question: The issue doesn't require a change to the product in order to be resolved. Most issues start as that
  • Service Attention: Workflow: This issue is responsible by Azure service team.

Comments

@jakeatmsft

  • Package Name: azure-ai-inference
  • Package Version:
  • Operating System:
  • Python Version:

Describe the bug
Customer has deployed multiple Managed Online Endpoints in Azure ML as well as AI Foundry. The endpoints expose a /score inference route but are not compatible with the azure-ai-inference SDK. When specifying the scoring endpoint and using a key credential, the SDK returns a "Failed Dependency" error.

To Reproduce
Steps to reproduce the behavior:

  1. Deploy an LLM to a Managed Online Endpoint.
  2. Configure the key and endpoint URL: https://endpointname.westus2.inference.ml.azure.com/score
  3. Run the simple completion example from the docs (sketched below): https://learn.microsoft.com/en-us/azure/ai-studio/reference/reference-model-inference-api?tabs=python#inference-sdk-support

Result:
Error:
HttpResponseError: Operation returned an invalid status 'Failed Dependency'
Content: {"detail":"Not Found"}
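
For reference, a minimal sketch of the failing call, assuming the quickstart pattern from the linked docs; the endpoint name and key are placeholders:

```python
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import UserMessage
from azure.core.credentials import AzureKeyCredential

# Scoring URL from step 2, including the /score suffix (hypothetical endpoint name).
client = ChatCompletionsClient(
    endpoint="https://endpointname.westus2.inference.ml.azure.com/score",
    credential=AzureKeyCredential("<your-endpoint-key>"),
)

# Raises HttpResponseError ("Failed Dependency") against this endpoint.
response = client.complete(messages=[UserMessage(content="How many feet are in a mile?")])
```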

Expected behavior
Completion response returned

@github-actions github-actions bot added the customer-reported, needs-triage, and question labels on Jan 3, 2025
@xiangyan99 added the Service Attention, AI, and AI Model Inference labels and removed the needs-triage label on Jan 3, 2025
@github-actions github-actions bot added the needs-team-attention label on Jan 3, 2025

github-actions bot commented Jan 3, 2025

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @dargilco.

@dargilco (Member) commented Jan 3, 2025

Adding @santiagxf

@dargilco (Member) commented Jan 3, 2025

@santiagxf to correct me if I'm wrong, but my understanding is that the /score route is for the model-provider-specific REST API, not the common "Azure AI Model Inference API" that the Python azure-ai-inference SDK supports. The common Inference API route for chat completions is /chat/completions. @jakeatmsft, what AI model are you using? Please try removing /score from the endpoint you are using; the SDK automatically appends /chat/completions to the given endpoint when making a ChatCompletionsClient.complete() call.
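
A minimal sketch of the corrected call, assuming a key-authenticated Managed Online Endpoint; the endpoint name and key are placeholders:

```python
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

# Base endpoint only: no /score suffix. The client appends /chat/completions
# to this URL when complete() is called.
client = ChatCompletionsClient(
    endpoint="https://endpointname.westus2.inference.ml.azure.com",
    credential=AzureKeyCredential("<your-endpoint-key>"),
)

response = client.complete(
    messages=[
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="How many feet are in a mile?"),
    ]
)
print(response.choices[0].message.content)
```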

@santiagxf (Member)

Thanks for reaching out @jakeatmsft. Can you please confirm the model you have deployed and that you are trying to use?

@jakeatmsft (Author)

> @santiagxf to correct me if I'm wrong, but my understanding is that the /score route is for the model-provider-specific REST API, not the common "Azure AI Model Inference API" that the Python azure-ai-inference SDK supports. The common Inference API route for chat completions is /chat/completions. @jakeatmsft, what AI model are you using? Please try removing /score from the endpoint you are using; the SDK automatically appends /chat/completions to the given endpoint when making a ChatCompletionsClient.complete() call.

This worked for my customer deploying llama-3.1-70b-instruct. I'll submit a PR to update the instructions.

@dargilco (Member) commented Jan 6, 2025

Jake's PR: #39038

@santiagxf (Member)

I'm closing this issue as we have clarified that it was due to an incorrect URL being used in the inference client.
