
[Bug]: ollama_chat/ provider does not honor timeout #8333

Open
paul-gauthier opened this issue Feb 6, 2025 · 3 comments
Labels: bug (Something isn't working)

Comments

@paul-gauthier
Contributor

What happened?

I can pass timeout to completion() and most models seem to honor it. Models from the ollama_chat/ provider do not.

Relevant log output

import litellm

def doit(model):
    messages=[{"role": "user", "content": "hi"}]
    try:
        comp = litellm.completion(model, messages, timeout=0.1)
        print(model, comp.choices[0].message.content)
    except Exception as e:
        print(model, type(e))

doit("gpt-4o") 
# outputs: gpt-4o <class 'litellm.exceptions.Timeout'>

doit("ollama/llama3.2:3b-instruct-q5_K_S")
# outputs: ollama/llama3.2:3b-instruct-q5_K_S <class 'litellm.exceptions.APIConnectionError'>

doit("ollama_chat/llama3.2:3b-instruct-q5_K_S")
# outputs: ollama_chat/llama3.2:3b-instruct-q5_K_S Hello! How can I assist you today?
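
Until the provider honors timeout, one way to bound the wait client-side is to run the call in a worker thread and cap how long we wait for the result. This is only a sketch: the underlying request still runs to completion in the background thread; it just stops blocking the caller.

import concurrent.futures

import litellm

def completion_with_hard_timeout(model, messages, timeout):
    # Run the completion in a worker thread and cap how long we wait for it.
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(litellm.completion, model, messages, timeout=timeout)
    try:
        return future.result(timeout=timeout)
    finally:
        # Do not block on the worker; the request may still be in flight.
        pool.shutdown(wait=False)

try:
    comp = completion_with_hard_timeout(
        "ollama_chat/llama3.2:3b-instruct-q5_K_S",
        [{"role": "user", "content": "hi"}],
        timeout=0.1,
    )
    print(comp.choices[0].message.content)
except concurrent.futures.TimeoutError:
    print("timed out client-side")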

Are you a ML Ops Team?

No

What LiteLLM version are you on ?

v1.60.5

Twitter / LinkedIn details

No response

@vmajor

vmajor commented Feb 6, 2025

Since I was copied into this, I am guessing at least one of my reports made it through, even though I cannot see them anywhere.

I do not use ollama; I use the OG llama-server, and aider also ignores any and all --timeout settings that I tried, timing out the session mid-response.

I have yet to see any advice on how to stop this from happening. Is there a separate setting in model config that talks directly to LiteLLM? I do not have a standalone LiteLLM installation, only what was installed by Aider itself.

Ideally, in any situation where the API is on localhost, any and all timeout settings should be disabled, since we are directly in control of the API's health and behaviour. Aider/LiteLLM should not interfere with this.

@paul-gauthier
Contributor Author

@vmajor Please follow up back in the aider issue:
Aider-AI/aider#276

@krrishdholakia
Contributor

Thanks for the issue @paul-gauthier. I believe we just need to refactor ollama_chat to also use the base_llm_http_handler; that should fix this.
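
For reference, a minimal sketch of what honoring the timeout looks like once the request goes through an httpx-based handler. The endpoint URL and payload shape below are assumptions based on Ollama's /api/chat API, not LiteLLM's actual handler code.

import httpx

def ollama_chat_request(model, messages, timeout):
    # Hypothetical helper: forwards the caller's timeout to the HTTP client
    # that talks to the local Ollama server (default port 11434).
    try:
        resp = httpx.post(
            "http://localhost:11434/api/chat",
            json={"model": model, "messages": messages, "stream": False},
            timeout=timeout,
        )
        resp.raise_for_status()
        return resp.json()
    except httpx.TimeoutException as exc:
        # A handler wired this way can surface the timeout (e.g. as
        # litellm.exceptions.Timeout) instead of letting the call hang.
        raise TimeoutError(f"Ollama chat request timed out after {timeout}s") from exc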

@krrishdholakia krrishdholakia self-assigned this Feb 7, 2025