This repository has been archived by the owner on Feb 5, 2023. It is now read-only.

[New Feature] Add default retry to every /api/ask endpoint to utilize connection pool. #23

Open
ahmetkca opened this issue Dec 28, 2022 · 11 comments

Comments

@ahmetkca

I think we can add a default number of retries to each incoming request to the /api/ask endpoint. Instead of returning Content-Type: 'application/json' we could return Content-Type: 'text/event-stream'. With this change it might be slightly slower than the original, but we would try at least 3 times with different agents from the connection pool.

For example, proposed change:

The /api/ask endpoint's content type would be text/event-stream instead of application/json:

curl "http://localhost:8080/api/ask" -X POST --header 'Authorization: <API_KEY>' -d '{"content": "Hello world"}'

The default minimum number of retries is 3.

In this example the /api/ask endpoint failed with the first two agents and succeeded with the third:

data: retry #1 failed.

data: retry #2 failed.

data: {"message": { ... }, "conversation_id": " ... "}

data: [DONE]

I believe this would utilize the connection pool even better.

@acheong08
Owner

The retry functionality has been added to https://github.com/ChatGPT-Hackers/ChatGPT-API-server/tree/dev

@acheong08
Owner

I'm still testing it out

@acheong08
Owner

There is a bug I don't know how to fix

@0xRaduan
Contributor

0xRaduan commented Dec 29, 2022

IMO, you need to use some sort of blocking mechanism, so you can quickly retrieve websockets that are available at the moment, or just passively wait for a new one to become available.

One way to do that is to use a sort of priority queue, where you grade connections on just this one parameter (available/not available). And make sure that updates on whether a websocket is in use can be written to it in a multi-threaded fashion.

Also, a good feature to have would be tracking the number of requests already made to a particular connection. I am not aware of the exact number of requests per hour that are permitted, but in my testing I have hit limits within an hour. In that case we don't really want to route requests to that websocket.

@acheong08
Owner

I was going to write a queue system but I'm not quite sure how to implement it correctly. Gin is inherently multi-threaded, though, and there is already a blocking mechanism in place for the connection pool.

@acheong08
Owner

Also, a good feature to have would be tracking the number of requests already made to a particular connection. I am not aware of the exact number of requests per hour that are permitted, but in my testing I have hit limits within an hour. In that case we don't really want to route requests to that websocket.

Since it cycles through connections oldest first, each connection should see a similar number of requests. If limits are hit, all existing connections would also be rate limited on subsequent requests.

@icycandy

icycandy commented Jan 4, 2023

In my experience, one conversation_id is bound to one OpenAI account, and one conversation_id can be used multiple times, representing a long multi-round conversation. So different accounts may be used at different frequencies.

Another observation is that when the API server has not received a request for a period of time, the first request returns {"id": "65f76efa-e0cb-47c1-a054-6f6b5fd5888d", "message": "error", "data": "Wrong response code"}, but the immediately following request returns normally. My guess is that the connection becomes invalid after being idle for a long time (does the Firefox tab need a refresh?).

If we track the request rate of each account, then:

  • if one account is rate limited, we can redirect requests to other accounts
  • if one account has had no requests for a period of time (such as 10 minutes), we can send a fake request to keep it alive

@acheong08
Owner

The error handling can be done on the client side: if you get an "error" message, sleep a second or two and then try again. Doing this from the server could clog up the connections and compete with real requests, introducing additional downtime and errors.

@acheong08
Owner

if one account was rate limited, we can redirect request to other accounts

Possible. Will consider it.

@icycandy

icycandy commented Jan 4, 2023

How about the other one: regularly send fake request to idle connection?

@icycandy

icycandy commented Jan 4, 2023

How about the other one: regularly send fake request to idle connection?

I'm not sure whether regularly sending a fake request will help keep the connection alive. Ignore me.


4 participants