Adding chat history to RAG app and refactor to better utilize LangChain #648

Open: wants to merge 23 commits into base: main

alpha-amundson
Collaborator

See commit log for full description. tl;dr: added chat history to rag-frontend app.

…eep track of and retrieve chat history from Cloud SQL.

main.py - removed the old LangChain code and the logic to retrieve context, and replaced them with the new chain from rag_chain.py. Introduced a browser session with a 30-minute TTL. The session ID is stored in the session cookie and is then used to retrieve chat history. Chat history is cleared when the timeout is reached.
cloud_sql.py - now includes a method to create a PostgresEngine for storing and retrieving history, plus a CustomVectorStore to perform the query embedding and vector search. Old code paths that are no longer needed were removed.
rag_chain.py - contains the helper method create_chain to create, update and delete the end-to-end RAG chain with history.
Various tf files - increased max input and total tokens on HF TGI for Mistral. Threaded through some parameters needed to instantiate the PostgresEngine.
requirements.txt - added some dependencies needed for LangChain.
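
The session behaviour described for main.py can be pictured with a small sketch. This is not the code from this PR: the PR keeps the session ID in the browser session cookie and stores history in Cloud SQL, whereas the hypothetical `SessionHistoryStore` below uses an in-memory dict and only the standard library, purely to illustrate how a 30-minute expiry might clear the history. The class name, method names and the sliding (refresh on access) expiry are all assumptions.

```python
import time
import uuid

SESSION_TTL_SECONDS = 30 * 60  # 30-minute TTL, as in the PR description


class SessionHistoryStore:
    """Hypothetical in-memory stand-in for session-keyed chat history."""

    def __init__(self, ttl=SESSION_TTL_SECONDS, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._sessions = {}  # session_id -> (last_seen, [messages])

    def new_session(self):
        """Create an empty history and return its session ID."""
        session_id = uuid.uuid4().hex
        self._sessions[session_id] = (self._clock(), [])
        return session_id

    def history(self, session_id):
        """Return the history for a session, clearing it if the TTL has passed."""
        now = self._clock()
        last_seen, messages = self._sessions.get(session_id, (now, []))
        if now - last_seen >= self._ttl:
            messages = []  # timeout reached: start again from an empty history
        # Assumption: every access refreshes the timeout (sliding expiry).
        self._sessions[session_id] = (now, messages)
        return messages

    def append(self, session_id, role, text):
        """Record one turn of the conversation for the given session."""
        self.history(session_id).append((role, text))
```

Passing the clock in as a parameter is only a convenience for testing the expiry; a real deployment would rely on the web framework's session lifetime and a persistent store.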
@alpha-amundson alpha-amundson changed the title Also introduced a basic session history mechanism in the browser to k… Adding chat history to RAG app and refactor to better utilize LangChain May 3, 2024
@imreddy13
Collaborator

/gcbrun

@alpha-amundson
Collaborator Author

/gcbrun

@alpha-amundson
Collaborator Author

/gcbrun

* Working on improvements for the RAG application:
    - Working on missing TODOs
    - Fixing an issue with credentials
    - Refactoring vector_storages so that different vector storages can be added
      TODO: Vector Storage factory
    - Unit tests will be added in a future PR

* Updating changes with db

* Refactoring the app so it can be executed using gunicorn

* Refactoring the code as a Flask application package

* Fixing bugs
- Reviewing an issue with IPtypes; currently the fix is to check whether this is a development environment so that a public cloud_sql instance can be used.
- Fixing an issue with the Flask App Factory
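
The IPtypes workaround mentioned in the last commit message could look roughly like the sketch below. It is an illustration only: the function name `select_ip_type` is hypothetical, and the value `"development"` is an assumption, since the commit message does not say which value of `ENVIRONMENT` is checked. The returned strings only name the public/private distinction; mapping them onto the connector's own IP type setting is left out.

```python
import os


def select_ip_type(environ=os.environ):
    """Pick a Cloud SQL IP type name from the ENVIRONMENT variable.

    Assumption: the value "development" marks a local run; the PR does not
    state which value it checks for.
    """
    if environ.get("ENVIRONMENT", "").lower() == "development":
        return "PUBLIC"  # local development: reach a public cloud_sql instance
    return "PRIVATE"  # default when ENVIRONMENT is unset or has another value
```

Defaulting to the private path when the variable is unset keeps a missing setting from silently selecting the public instance.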
@german-grandas
Collaborator

/gcbrun

* Working on a custom HuggingFace interface
     - Adding a custom chat model to send requests to the HuggingFace TGI API
     - Applying formatting to the code.
applications/rag/frontend/container/main.py Dismissed
* Improving the CloudSQL vector_storage
applications/rag/frontend/container/main.py Dismissed
main.py Fixed
main.py Fixed
@german-grandas
Collaborator

/gcbrun

@german-grandas
Collaborator

/gcbrun

@german-grandas
Collaborator

/gcbrun

@german-grandas
Collaborator

Some prompt and answer examples using meta-llama/Llama-2-7b-hf:

[Image: prompt_with_meta_llama_7b]

Some prompt and answer examples using meta-llama/Llama-2-7b-chat-hf:

[Image: prompt_with_meta_llama_7b_chat]

@gongmax
Collaborator

gongmax commented Aug 15, 2024

/gcbrun

level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s"
)

ENVIRONMENT = os.environ.get("ENVIRONMENT")
Collaborator

Is this an environment variable that Flask sets?

Collaborator

@gongmax it's just for local development purposes; the variable was added because of an issue on line 66.

Collaborator

I mean, where does the ENVIRONMENT env variable get set?

cloudbuild.yaml Outdated
)

chain = setup_and_retrieval | prompt | model
chain_with_history = RunnableWithMessageHistory(
Collaborator

Can you add some comments explaining how chain.invoke works with this chain_with_history and the user input, especially how the setup_and_retrieval component works?
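
For readers following this thread, the data flow being asked about can be pictured with a plain-Python stand-in. It deliberately uses no LangChain API and is not the code from rag_chain.py: `make_chain_with_history`, the fake retriever, prompt renderer and model, and the key names `context` and `user_prompt` are all hypothetical. It only mimics the order of steps implied by `setup_and_retrieval | prompt | model` wrapped with message history.

```python
def make_chain_with_history(retrieve, render_prompt, model, histories):
    """Hypothetical plain-Python stand-in for a history-wrapped RAG chain."""

    def invoke(user_input, session_id):
        # The history wrapper looks up prior turns by session ID.
        history = histories.setdefault(session_id, [])
        # 1. Setup and retrieval: fetch context for the question and pass the
        #    question itself through unchanged.
        inputs = {"context": retrieve(user_input), "user_prompt": user_input}
        # 2. Prompt: combine context, prior turns and the new question.
        prompt = render_prompt(history=history, **inputs)
        # 3. Model: generate the answer from the rendered prompt.
        answer = model(prompt)
        # 4. History wrapper: record both turns for the next call.
        history.append(("human", user_input))
        history.append(("ai", answer))
        return answer

    return invoke


# Fake components so the sketch runs without a vector store or an LLM.
def fake_retrieve(question):
    return "docs about " + question


def fake_render_prompt(history, context, user_prompt):
    return f"{len(history)} prior turns | {context} | {user_prompt}"


def fake_model(prompt):
    return "answer to: " + prompt


histories = {}
chain = make_chain_with_history(fake_retrieve, fake_render_prompt, fake_model, histories)
first = chain("what is GKE?", session_id="abc")
second = chain("and Ray?", session_id="abc")  # sees the two turns from the first call
```

The point of the sketch is the ordering: retrieval only sees the current question, while the prompt step is where history and retrieved context meet.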

@gongmax
Collaborator

gongmax commented Aug 16, 2024

Please also resolve the conflicts

@german-grandas
Collaborator

/gcbrun

@@ -269,7 +269,7 @@ steps:
kubectl exec -it -n rag-$SHORT_SHA-$_BUILD_ID jupyter-admin -c notebook -- jupyter nbconvert --to script /data/rag-kaggle-ray-sql-interactive.ipynb
kubectl exec -it -n rag-$SHORT_SHA-$_BUILD_ID jupyter-admin -c notebook -- ipython /data/rag-kaggle-ray-sql-interactive.py

- python3 ./applications/rag/tests/test_rag.py "http://127.0.0.1:8081/prompt"
+ # python3 ./applications/rag/tests/test_rag.py "http://127.0.0.1:8081/prompt" Ignoring while the test approach is reviewed
Collaborator

Can we add this back to ensure the e2e test passes?

@@ -23,6 +23,14 @@ locals {
})
}

resource "random_string" "application_secret_key" {
length = var.project_id
Collaborator

length should be a number, and CI complains on this line:

Error: Incorrect attribute value type

  on frontend/main.tf line 27, in resource "random_string" "application_secret_key":
  27:   length  = var.project_id
    ├────────────────
    │ var.project_id is "gke-ai-eco-dev"

Inappropriate value for attribute "length": a number is required.


7 participants