
azure-ai-inference sdk support for Managed Online Endpoint LLM deployments. AML, AI Foundry #39025

Closed
jakeatmsft opened this issue Jan 3, 2025 · 7 comments
Assignees
Labels
  • AI Model Inference: Issues related to the client library for Azure AI Model Inference (\sdk\ai\azure-ai-inference)
  • AI
  • customer-reported: Issues that are reported by GitHub users external to the Azure organization.
  • needs-team-attention: Workflow: This issue needs attention from Azure service team or SDK team
  • question: The issue doesn't require a change to the product in order to be resolved. Most issues start as that
  • Service Attention: Workflow: This issue is responsible by Azure service team.

Comments

@jakeatmsft

  • Package Name: azure-ai-inference
  • Package Version:
  • Operating System:
  • Python Version:

Describe the bug
Customer has deployed multiple Managed Online Endpoints in Azure ML as well as AI Foundry. The endpoints expose a /score inference route but are not compatible with the azure-ai-inference SDK. When specifying the scoring endpoint and using a key credential, the SDK returns a "Failed Dependency" error.

To Reproduce
Steps to reproduce the behavior:

  1. Deploy an LLM to a Managed Online Endpoint.
  2. Configure the key and endpoint URL: https://endpointname.westus2.inference.ml.azure.com/score
  3. Run the simple completion example from the docs (sketched below): https://learn.microsoft.com/en-us/azure/ai-studio/reference/reference-model-inference-api?tabs=python#inference-sdk-support

Result:
Error:
HttpResponseError: Operation returned an invalid status 'Failed Dependency'
Content: {"detail":"Not Found"}
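
For reference, a minimal sketch of the failing call, assuming the quickstart pattern from the linked docs; the endpoint name and key are placeholders:

```python
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import UserMessage
from azure.core.credentials import AzureKeyCredential

# Scoring URL from step 2, including the /score suffix (hypothetical endpoint name).
client = ChatCompletionsClient(
    endpoint="https://endpointname.westus2.inference.ml.azure.com/score",
    credential=AzureKeyCredential("<your-endpoint-key>"),
)

# Raises HttpResponseError ("Failed Dependency") against this endpoint.
response = client.complete(messages=[UserMessage(content="How many feet are in a mile?")])
```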

Expected behavior
Completion response returned

@github-actions github-actions bot added the customer-reported, needs-triage, and question labels on Jan 3, 2025
@xiangyan99 added the Service Attention, AI, and AI Model Inference labels and removed the needs-triage label on Jan 3, 2025
@github-actions github-actions bot added the needs-team-attention label on Jan 3, 2025

github-actions bot commented Jan 3, 2025

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @dargilco.

@dargilco (Member) commented Jan 3, 2025

Adding @santiagxf

@dargilco (Member) commented Jan 3, 2025

@santiagxf to correct me if I'm wrong, but my understanding is that the /score route is for the model-provider-specific REST API, not the common "Azure AI Model Inference API" that the Python azure-ai-inference SDK supports. The common Inference API route for chat completions is /chat/completions. @jakeatmsft, what AI model are you using? Please try removing /score from the endpoint you are using; the SDK automatically appends /chat/completions to the given endpoint when making a ChatCompletionsClient.complete() call.
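
A minimal sketch of the corrected call, assuming a key-authenticated Managed Online Endpoint; the endpoint name and key are placeholders:

```python
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

# Base endpoint only: no /score suffix. The client appends /chat/completions
# to this URL when complete() is called.
client = ChatCompletionsClient(
    endpoint="https://endpointname.westus2.inference.ml.azure.com",
    credential=AzureKeyCredential("<your-endpoint-key>"),
)

response = client.complete(
    messages=[
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="How many feet are in a mile?"),
    ]
)
print(response.choices[0].message.content)
```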

@santiagxf (Member)

Thanks for reaching out @jakeatmsft. Can you please confirm the model you have deployed and that you are trying to use?

@jakeatmsft (Author)

> @santiagxf to correct me if I'm wrong, but my understanding is that the /score route is for the model-provider-specific REST API, not the common "Azure AI Model Inference API" that the Python azure-ai-inference SDK supports. The common Inference API route for chat completions is /chat/completions. @jakeatmsft, what AI model are you using? Please try removing /score from the endpoint you are using; the SDK automatically appends /chat/completions to the given endpoint when making a ChatCompletionsClient.complete() call.

This worked for my customer deploying llama-3.1-70b-instruct. I'll submit a PR to update the instructions.

@dargilco (Member) commented Jan 6, 2025

Jake's PR: #39038

@santiagxf (Member)

I'm closing this issue as we have clarified that it was due to an incorrect URL being used in the inference client.
