
Is the timeout logic handling correct? #454

Closed
tim-tribble-ai opened this issue Oct 27, 2023 · 8 comments
Labels
question Further information is requested

Comments

@tim-tribble-ai

I've got the following config setup:

        "request_timeout": 30,
        "max_retry_period": 200,
        "retry_wait_time": 30

However, I'm noticing that when there's a read timeout, there's no retry happening:

Exception: Timeout: Request timed out: HTTPSConnectionPool(host='blah123.openai.azure.com', port=443): Read timed out. (read timeout=30)

I'm unsure if the timeout handling in oai/completion.py is correct:

except (RateLimitError, Timeout) as err:
    time_left = max_retry_period - (time.time() - start_time + retry_wait_time)
    if (
        time_left > 0
        and isinstance(err, RateLimitError)
        or time_left > request_timeout
        and isinstance(err, Timeout)
        and "request_timeout" not in config
    ):
        if isinstance(err, Timeout):
            request_timeout <<= 1
        request_timeout = min(request_timeout, time_left)
        logger.info(f"retrying in {retry_wait_time} seconds...", exc_info=1)
        sleep(retry_wait_time)
    elif raise_on_ratelimit_or_timeout:
        raise

In this case it's not a rate-limit exception, just a Timeout. Why is there a check for `and "request_timeout" not in config`? It seems to skip the retry for non-rate-limit timeouts.

Can that check be removed, and within the except logic,

request_timeout = request_timeout << 1 if "request_timeout" not in config else request_timeout

?
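For what it's worth, since `and` binds tighter than `or`, the condition above parses into two arms. A minimal sketch (the `should_retry` helper and the sample values are mine, purely to illustrate the parsing; the expression itself mirrors the snippet above) showing that a Timeout is never retried once `"request_timeout"` is set in the config, no matter how much time is left:

```python
def should_retry(err_is_rate_limit, err_is_timeout, time_left, request_timeout, config):
    # Same boolean expression as in oai/completion.py; "and" binds tighter
    # than "or", so this is (rate-limit arm) or (timeout arm).
    return (
        time_left > 0
        and err_is_rate_limit
        or time_left > request_timeout
        and err_is_timeout
        and "request_timeout" not in config
    )

# A Timeout with request_timeout present in the config is never retried:
print(should_retry(False, True, time_left=150, request_timeout=30, config={"request_timeout": 30}))  # False
# A RateLimitError is retried whenever any time is left:
print(should_retry(True, False, time_left=150, request_timeout=30, config={}))  # True
# A Timeout without request_timeout in the config is retried while time_left > request_timeout:
print(should_retry(False, True, time_left=150, request_timeout=30, config={}))  # True
```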

@tim-tribble-ai
Author

Follow-on: in the else branch of the snippet above, I think the response = -1 is also incorrect:

else:
    response = -1
    if use_cache and isinstance(err, Timeout):
        cls._cache.set(key, response)
    logger.warning(
        f"Failed to get response from openai api due to getting RateLimitError or Timeout for {max_retry_period} seconds."
    )
    return response

This causes problems when that else branch returns -1 to the caller; I'm getting TypeError: 'int' object is not subscriptable:

...
File "/Users/my_stuff/.venv/lib/python3.10/site-packages/autogen/agentchat/conversable_agent.py", line 609, in generate_oai_reply
  return True, oai.ChatCompletion.extract_text_or_function_call(response)[0]
File "/Users/my_stuff/.venv/lib/python3.10/site-packages/autogen/oai/completion.py", line 1070, in extract_text_or_function_call
  choices = response["choices"]

So the caller should probably check whether response is a dict and act accordingly?
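A minimal sketch of such a caller-side guard (the `safe_extract` helper and the non-dict check are hypothetical, not autogen's API; the -1 sentinel and the `response["choices"]` indexing are from the snippets above):

```python
def safe_extract(response):
    # Mirror the indexing done in extract_text_or_function_call, but bail
    # out on the -1 sentinel (or any non-dict) instead of raising TypeError.
    if not isinstance(response, dict):
        return None
    return response["choices"]

print(safe_extract(-1))                      # None
print(safe_extract({"choices": ["hello"]}))  # ['hello']
```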

@gagb
Collaborator

gagb commented Oct 27, 2023

@afourney is this related to an observation in your test suite?

@gagb added the question and bug labels Oct 27, 2023
@afourney
Member

afourney commented Oct 27, 2023

Yes I was getting many timeouts, but I hadn’t tracked down the source of the errors. To mitigate the problem, I extended the timeout.

Let me investigate.

This logic will also be replaced after #203 is merged since it uses OpenAI v1.0.

@Quintas168

I have run into the same issue but still don't know why.

@tim-watcha

openai.error.Timeout: Request timed out: HTTPSConnectionPool(host='api.openai.com', port=443): Read timed out. (read timeout=60)
I'm getting the same error.

@mwahl217

I'm seeing this error too: openai.error.Timeout: Request timed out: HTTPSConnectionPool(host='api.openai.com', port=5001): Read timed out. (read timeout=60). However, I am using autogen with LM Studio.

I did try increasing the timeouts to see if that would help, but I'm not sure I'm addressing the root cause.

llm_config = {
    "request_timeout": 800,
    "max_retry_period": 50,
    "retry_wait_time": 10,
    # "seed": 0,
    "seed": 44,
    "config_list": config_list,
    "temperature": 0.2,
}

@sonichi
Contributor

sonichi commented Dec 3, 2023

In v0.2 these settings changed: https://microsoft.github.io/autogen/docs/Installation#python
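A sketch of roughly equivalent settings under v0.2, assuming the renamed keys from the linked migration notes (`request_timeout` becomes `timeout`, and per-entry `max_retries` replaces the `max_retry_period`/`retry_wait_time` knobs) — worth double-checking against the docs:

```python
# Hypothetical v0.2-style config; key names assumed from the migration notes,
# not taken verbatim from this thread.
llm_config = {
    "timeout": 60,  # replaces "request_timeout"
    "config_list": [
        {
            "model": "gpt-4",
            "api_key": "...",     # placeholder
            "max_retries": 5,     # replaces max_retry_period / retry_wait_time
        }
    ],
    "temperature": 0.2,
}
```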

@thinkall
Collaborator

We are closing this issue due to inactivity; please reopen if the problem persists.

jackgerrits pushed a commit that referenced this issue Oct 2, 2024
…r samples #454 (#469)

* create model context component, remove chat memory component, refactor samples #454

* Fix bugs in samples.

* Fix

* Update docs

* add unit tests