-
Dear mlx community, I am working on a Mac with Apple silicon and ran into an issue running local LLMs with the mlx server. (The same code below works against a llama-cpp server.) Below is the code I used:

```python
openai_client = openai.OpenAI(api_key="placeholder-api", base_url="http://localhost:8080")
response = openai_client.chat.completions.create(
```

I confirm that the mlx server starts successfully, and running the following on the command line works:

```
curl localhost:8080/v1/chat/completions
```

However, when I send a query via the OpenAI API, it gives me `NotFoundError: Not Found`. Any help would be sincerely appreciated!
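For reference, a minimal sketch of a complete request against a local OpenAI-compatible server. The model name, the message contents, and the `/v1` suffix on `base_url` are assumptions rather than details from the original post; they reflect the fact that the OpenAI Python client appends `/chat/completions` to `base_url`, so the path it builds has to match the route the server actually exposes.

```python
import openai

# Hypothetical sketch: the client posts to {base_url}/chat/completions,
# so base_url typically needs the "/v1" prefix to reach
# localhost:8080/v1/chat/completions.
openai_client = openai.OpenAI(
    api_key="placeholder-api",          # local servers usually ignore the key
    base_url="http://localhost:8080/v1",
)

response = openai_client.chat.completions.create(
    model="local-model",  # placeholder name; assumption, not from the post
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```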
Replies: 1 comment
-
Fix is in #877