
LLM SQL interface + Fixes to TOKEN limits in prompts. #81

Merged: 8 commits into main from sql on May 22, 2023

Conversation

raulraja (Contributor)

LLM SQL interface

The centerpiece of this implementation is the SQL interface that leverages a Large Language Model (LLM), such as GPT-3.5 Turbo, to transform natural language inputs into SQL queries.

Here is an overview of how this process works (a code sketch follows the steps):

User Input: The interface accepts natural language inputs from users. This could be in the form of a question or a statement that requires data from an SQL database.

LLM SQL: The input is passed through the LLM and translated into SQL queries.

Execution of SQL Query: The translated SQL query is then executed on the appropriate SQL database. This step is abstracted away from the user, providing a seamless experience, though an additional operator is also provided for passing raw SQL as input.

Context Extension: The results from the SQL query execution are then added back to the conversation context as documents when used with extendContext or context. The operators can also be used independently of the context for manual memory management. This helps maintain the conversation's continuity and provides the LLM with additional information that can be used to generate more accurate and contextual responses.

User Output: Finally, the result is presented back to the user in natural language, completing the conversation loop.
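Put together, the flow can be sketched roughly as follows. This is a minimal illustration, not the actual xef API: every name here (NaturalLanguageToSql, LlmSqlInterface, QueryResult) is a hypothetical stand-in.

```kotlin
// Minimal sketch of the five steps above. All names are hypothetical
// stand-ins for illustration, not the real xef API.
fun interface NaturalLanguageToSql {
    fun toSql(input: String): String
}

data class QueryResult(val rows: List<Map<String, Any?>>)

class LlmSqlInterface(
    private val translator: NaturalLanguageToSql,           // LLM SQL step
    private val execute: (sql: String) -> QueryResult,      // execution step
    private val answer: (input: String, context: List<String>) -> String
) {
    private val context = mutableListOf<String>()

    // Natural-language path: translate, execute, extend context, answer.
    fun ask(input: String): String {
        val sql = translator.toSql(input)
        val result = execute(sql)
        context += result.rows.map { it.toString() }        // context extension
        return answer(input, context)                       // user output
    }

    // Escape hatch for the raw-SQL operator mentioned above.
    fun raw(sql: String): QueryResult = execute(sql)
}
```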

The DatabaseExample.kt file demonstrates this concept in practice with a CLI chat application. In the example, the user interacts with the SQL database using natural language inputs. Each input from the user is treated as a natural language query, transformed into an SQL query via the LLM, and executed against the database. The results are then added to the context, ready for the next user input.
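A hypothetical session loop in the spirit of that example, reusing the LlmSqlInterface sketch above; the TODOs mark where the real LLM and JDBC calls would go, and the actual wiring in DatabaseExample.kt will differ:

```kotlin
// Hypothetical CLI loop in the spirit of DatabaseExample.kt.
fun main() {
    val db = LlmSqlInterface(
        translator = { input -> TODO("ask the LLM to turn '$input' into SQL") },
        execute = { sql -> TODO("run '$sql' against the JDBC connection") },
        answer = { input, ctx -> TODO("answer '$input' from ${ctx.size} context docs") }
    )
    while (true) {
        print("user> ")
        val line = readLine()?.trim() ?: break
        if (line == "exit") break
        println("llmdb> ${db.ask(line)}")
    }
}
```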

While working on this feature, I ran into issues with prompts, so I also attempted to fix the part that was making prompt messages fail repeatedly when we send large prompts.

TOKEN limits in prompts:

  • Changes related to improving how token limits are managed when sending messages:

AIError.kt: A new error class PromptExceedsMaxTokenLength is introduced. It contains the error prompt, the number of prompt tokens, and the maximum token limit. It generates an error message indicating that the prompt exceeds the max token length.
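For reference, a sketch of what such an error might look like; the shape and field names are inferred from the description above, not copied from AIError.kt:

```kotlin
// Sketch of the new error, inferred from the PR description; the exact
// declaration in AIError.kt may differ.
sealed class AIError(val reason: String) {
    class PromptExceedsMaxTokenLength(
        val prompt: String,
        val promptTokens: Int,
        val maxTokens: Int
    ) : AIError(
        "Prompt exceeds max token length: $promptTokens prompt tokens " +
            "requested but the model allows at most $maxTokens"
    )
}
```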

DeserializerLLMAgent.kt & LLMAgent.kt: Changes have been made to include a new argument in the prompt function named minResponseTokens, with a default value of 500. This argument reserves room in the token budget so the model can produce a response of at least that many tokens.
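Assuming a signature along these lines (illustrative only; the real prompt functions take more parameters):

```kotlin
// Illustrative signature; the real declarations live in
// DeserializerLLMAgent.kt and LLMAgent.kt.
suspend fun prompt(
    question: String,
    minResponseTokens: Int = 500  // tokens reserved for the model's reply
): String = TODO("budget tokens, truncate context, call the chat endpoint")
```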

LLMAgent.kt: Major changes have been introduced. A new function, createPromptWithContextAwareOfTokens, is defined to handle cases when the context information exceeds the model's maximum context length. It truncates the context to make sure it fits within the limit. Two helper functions, callCompletionEndpoint and callChatEndpoint, have also been refactored to take the new token accounting into account.
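The truncation idea amounts to keeping whole context documents while they fit in the remaining budget. A hedged sketch, with countTokens standing in for the model's tokenizer:

```kotlin
// Sketch of the truncation behind createPromptWithContextAwareOfTokens;
// countTokens is a hypothetical stand-in for the model's tokenizer.
fun truncateContext(
    docs: List<String>,
    maxContextTokens: Int,
    countTokens: (String) -> Int
): List<String> {
    var used = 0
    // Keep whole documents while the running token count fits the budget.
    return docs.takeWhile { doc ->
        used += countTokens(doc)
        used <= maxContextTokens
    }
}
```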

LLMAgent.kt: The promptWithContextAndRemainingTokens function is introduced to handle cases where the context needs to be truncated based on the model's maximum context length. It calculates the number of tokens used and remaining, logging this information for debugging.
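The accounting behind the DEBUG lines in the sample session below can be sketched like this (names are illustrative, not the exact code):

```kotlin
// Illustrative token accounting; the real logic sits in
// promptWithContextAndRemainingTokens in LLMAgent.kt.
data class TokenBudget(val used: Int, val modelMax: Int) {
    val left: Int get() = modelMax - used
}

fun debugLog(budget: TokenBudget) {
    // Same shape as the DEBUG lines in the sample session:
    //   Tokens: used: 2650, model max: 4097, left: 1447
    println("Tokens: used: ${budget.used}, model max: ${budget.modelMax}, left: ${budget.left}")
}
```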

models.kt: The LLMModel class is modified to include a modelType property of type ModelType, replacing the contextLength property. The purpose is to encapsulate more detailed information about the model, such as its maximum context length and tokenization method, within the ModelType class. Predefined instances of LLMModel, like GPT_4, GPT_4_0314, etc., are updated accordingly.
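A sketch of the reshaped metadata, with field names inferred from the description rather than copied from models.kt:

```kotlin
// Field names inferred from the description, not copied from models.kt.
// 4097 matches the "model max" in the sample session; the encodings follow
// OpenAI's published tokenizers.
enum class ModelType(val maxContextLength: Int, val encoding: String) {
    GPT_3_5_TURBO(4097, "cl100k_base"),
    GPT_4(8192, "cl100k_base")
}

data class LLMModel(val name: String, val modelType: ModelType)

val GPT_4 = LLMModel("gpt-4", ModelType.GPT_4)
val GPT_3_5_TURBO = LLMModel("gpt-3.5-turbo", ModelType.GPT_3_5_TURBO)
```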

These changes reflect a move towards ensuring that the prompt, context, and responses adhere to the maximum token length limitations of the AI model. A new AIError was also added so that we bail out early when a prompt that is too long is sent.
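As a concrete check of the arithmetic: in the sample session below, the model max is 4097 tokens and the recommended-prompts call uses 2650 of them, so 4097 - 2650 = 1447 tokens are left for the response, exactly what the DEBUG line reports.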

@raulraja changed the title from "Adapt request to token limits + sql module" to "LLM SQL interface + Fixes to TOKEN limits in prompts." on May 21, 2023
@raulraja (Contributor, Author)

Here is some example output from the example program, using bogus data I set up in a local database.

llmdb> Welcome to the LLMDB (An LLM interface to your SQL Database) !
llmdb> You can ask me questions about the database and I will try to answer them.
llmdb> You can type `exit` to exit the program.
llmdb> Loading recommended prompts...
22:17:27.580 [DefaultDispatcher-worker-7] DEBUG c.xebia.functional.xef.auto.LLMAgent -- Tokens: used: 2650, model max: 4097, left: 1447
llmdb> 1. What are the columns in the `auctions` table?
2. Can you provide information about the `newauctions` table?
3. How many records are in the `stats` table?
user> auctions near postal code 11100
22:17:52.705 [DefaultDispatcher-worker-11] DEBUG c.xebia.functional.xef.auto.LLMAgent -- Tokens: used: 246, model max: 4097, left: 3851
22:17:53.431 [DefaultDispatcher-worker-1] DEBUG c.x.f.xef.sql.jdbc.JDBCSQLImpl -- Selected tables: [auctions]
22:17:54.077 [DefaultDispatcher-worker-1] DEBUG c.xebia.functional.xef.auto.LLMAgent -- Tokens: used: 546, model max: 4097, left: 3551
22:17:59.807 [DefaultDispatcher-worker-6] DEBUG c.x.f.xef.sql.jdbc.JDBCSQLImpl -- SQL: SELECT id, auctionID, calle, localidad, codigo_postal, provincia_sort, provincia, comunidad_autonoma, lat, lon 
FROM auctions 
WHERE codigo_postal = '11100' 
LIMIT 50;
22:17:59.974 [DefaultDispatcher-worker-6] DEBUG c.x.f.xef.sql.jdbc.JDBCSQLImpl -- Found: 23 records
22:17:59.977 [DefaultDispatcher-worker-6] DEBUG c.x.f.xef.sql.jdbc.JDBCSQLImpl -- Split into: 23 documents
22:18:01.398 [DefaultDispatcher-worker-11] DEBUG c.xebia.functional.xef.auto.LLMAgent -- Tokens: used: 877, model max: 4097, left: 3220
llmdb> The context provides information about auctions near postal code 11100 in the province of Cádiz, Spain. There are three auction IDs mentioned: 1, 2, and 3. The addresses of the auctions are mentioned as Calle El Futuro 2, Calle El Presente 3, and an unspecified address. The context also provides information about the latitude and longitude of the auctions, as well as the community city and zip code.
user> 

@raulraja (Contributor, Author)

@xebia-functional/team-ai

@raulraja (Contributor, Author)

@nomisRev @realdavidvega ready for another review or merge whenever CI is green

@realdavidvega (Contributor)

LGTM! Thanks @raulraja! 🚀

* refactor: reduce complexity and reuse variables
* feat: add debug logger on open ai client for token usage
* feat: add tokens counting by model
* style: spotless happiness
* style: more spotless happiness
@realdavidvega realdavidvega merged commit 796a69b into main May 22, 2023
@realdavidvega realdavidvega deleted the sql branch May 22, 2023 16:31