Resilience and AI Service #748
Comments
I am wondering if we can, behind the scenes, "move" the resilience declared on the AiService to the underlying client...
From the Zulip discussion, another interesting idea is smallrye/smallrye-fault-tolerance#259
- Rewrite the prompt to work with Granite 7B Instruct
- Add a retry strategy (however, we are hitting quarkiverse/quarkus-langchain4j#748)
- Add a readme with the instructions to run the application locally
Shouldn't the state used to make a call avoid being mutated before the call has completed? Wouldn't that avoid the "growing"?
So you are essentially proposing that the chat memory only be added to when the call succeeds, right? That could potentially work...
It's actually a lot trickier than I thought, because there are potentially multiple API calls that go into implementing an AI service, and these add to (and even remove from) memory.
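The "only add to memory when the call succeeds" idea discussed above can be sketched for the simplest case of a single model call per invocation. This is a minimal illustration, not the quarkus-langchain4j implementation: `CommitOnSuccessDemo`, its `chat` method, and the plain `List<String>` used as memory are all hypothetical names, and the sketch does not cover the multi-call case mentioned in the previous comment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for an AI service with chat memory; not a real library API.
final class CommitOnSuccessDemo {
    private final List<String> memory = new ArrayList<>();

    /** Runs the model against a working copy and commits it only if the call returns normally. */
    String chat(String userMessage, Function<List<String>, String> model) {
        List<String> working = new ArrayList<>(memory); // copy, so a failure leaves memory untouched
        working.add("USER: " + userMessage);
        String reply = model.apply(working);            // may throw; nothing has been committed yet
        working.add("AI: " + reply);
        memory.clear();
        memory.addAll(working);                         // commit only after success
        return reply;
    }

    List<String> memory() {
        return List.copyOf(memory);
    }

    public static void main(String[] args) {
        CommitOnSuccessDemo service = new CommitOnSuccessDemo();
        int[] calls = {0};
        // Simulated model that fails twice before answering.
        Function<List<String>, String> flaky = context -> {
            if (calls[0]++ < 2) {
                throw new IllegalStateException("model misbehaved");
            }
            return "ok";
        };
        String reply = null;
        for (int attempt = 0; attempt < 3 && reply == null; attempt++) {
            try {
                reply = service.chat("large prompt", flaky);
            } catch (IllegalStateException e) {
                // simulated retry: loop again
            }
        }
        System.out.println(service.memory());
    }
}
```

With this approach the two failed attempts leave no trace, so the memory holds a single user message and a single reply after the retries.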
#764 fixes this
Ensure that @Retry works properly with chat memory
This issue discusses resilience in AI Services and its impact on the memory/context.
Context:
I'm using Granite 7B Instruct, which has a relatively limited context size (2048 tokens). My prompt (user message) is relatively large.
I was using @Retry on the AI Service method, as the model sometimes misbehaves, and retrying improves reliability (response time is not a factor in my context).
My AI Service calls happen during HTTP request processing, so they belong to the request scope associated with the HTTP request.
Problem:
When retrying, the context grows: the user message is added again on each attempt, and the context eventually exceeds the context size.
Let's try to describe it:
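A minimal simulation of the effect, assuming the user message is appended to memory before each attempt and is not removed when the attempt fails. The class `RetryGrowthDemo`, the method `callWithRetries`, and the plain list used as memory are hypothetical stand-ins, not the actual AI Service or chat memory API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a retried AI Service call; not the quarkus-langchain4j API.
final class RetryGrowthDemo {

    /** Simulates a retried call where memory is mutated before the model call completes. */
    static List<String> callWithRetries(String userMessage, int failuresBeforeSuccess) {
        List<String> memory = new ArrayList<>();
        int attempts = 0;
        while (true) {
            attempts++;
            memory.add("USER: " + userMessage); // added on every attempt, including failed ones
            if (attempts > failuresBeforeSuccess) {
                memory.add("AI: ok");           // the attempt that finally succeeds
                return memory;
            }
            // simulated failure: the retry loops again, but the message stays in memory
        }
    }

    public static void main(String[] args) {
        List<String> memory = callWithRetries("large prompt", 2);
        System.out.println(memory.size() + " entries: " + memory);
    }
}
```

With two failures before success, the memory ends up holding the same user message three times; with a large prompt and a 2048-token window, a few retries are enough to overflow the context.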
While my issue was with @Retry, it may also happen when using @CircuitBreaker, and so on.
Some ideas: