
RAG: Upgrade text embedding from text-embedding-004 to text-multilingual-embedding-002 #48

Closed
MrCsabaToth opened this issue Aug 31, 2024 · 18 comments
Labels: enhancement (New feature or request), RAG (Retrieval Augmented Generation related)

@MrCsabaToth
Member

CHANDRA got me thinking about the new text-embedding-preview-0815 model as an upgrade from text-embedding-004. However, https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/generative_ai/text_embedding_new_api.ipynb shows that:

  • "text-embedding-004" for English.
  • "text-multilingual-embedding-002" for i18n.
  • "text-embedding-preview-0815" for English with text and code embeddings for Python and Java.

Python and Java code (along with the new CODE_RETRIEVAL_QUERY task type that text-embedding-preview-0815 introduces) is not a significant use case for the app right now, even though users may screenshot a computer screen with code and ask about it. The current multimodal embedding has severe limitations, so we'll probably go with transcribing images for vector indexing. The database models are already prepared for that.

So both the text-embedding-preview-0815 and text-embedding-004 models turn out to be English only. To support international use I decided to try text-multilingual-embedding-002, hoping that newer versions of it will arrive soon as well. Also note that with the introduction of the new dimensionality folding (#47) we control the vector size regardless of the embedding model's output vector length.

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @multirequest.json \
     "https://us-central1-aiplatform.googleapis.com/v1/projects/duet-ai-roadshow-415022/locations/us-central1/publishers/google/models/text-multilingual-embedding-002:predict" > multiresult.json

multirequest.json:

{
  "instances": [
    {
      "task_type": "RETRIEVAL_DOCUMENT",
      "title": "Chat request",
      "content": "I would like embeddings for this text!"
    },
    {
      "task_type": "RETRIEVAL_DOCUMENT",
      "title": "Beszélgetés kérelem",
      "content": "Ehhez a szöveghez szeretnék beágyazásokat!"
    }
  ]
}

multiresult.json

@MrCsabaToth added the enhancement and RAG labels Aug 31, 2024
@MrCsabaToth self-assigned this Aug 31, 2024
@MrCsabaToth
Member Author

MrCsabaToth commented Aug 31, 2024

A quick cursory look at multiresult.json shows that the two vectors (the English one and its Hungarian equivalent) are very close to each other: the signs of the elements match and the values are mostly very close. The dimensionality of the multilingual model is 768, just like the newer English models.

@MrCsabaToth
Member Author

There's a problem: models/text-multilingual-embedding-002 is not found for API version v1beta, or is not supported for embedContent. Call ListModels to see the list of available models and their supported methods.

Apparently https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/text-embeddings-api is Vertex AI only, and for the Gemini API only text-embedding-004 and embedding-001 are mentioned: https://ai.google.dev/gemini-api/docs/models/gemini#text-embedding

@MrCsabaToth reopened this Sep 3, 2024
@MrCsabaToth
Member Author

I tested, and unfortunately text-embedding-004 doesn't seem to produce close vectors for an English vs. Hungarian sentence pair the way text-multilingual-embedding-002 did.
I filed an issue: google-gemini/generative-ai-dart#209

@MrCsabaToth
Member Author

Looks like we can use the Firebase Vertex AI Flutter package (https://pub.dev/packages/firebase_vertexai/example) for the multilingual embedding: https://firebase.google.com/docs/vertex-ai/gemini-models#input-output-comparison.

If we can transition to Firebase, we might be able to eliminate the Gemini API key in favor of Firebase. (I still want to avoid login if possible though, so maybe we'll use a cloud function - we already have a pair for STT and TTS.)

However, the Flutter Firebase Vertex AI package indicates it doesn't support function calling and structured output for Gemini 1.5 Flash? https://firebase.google.com/docs/vertex-ai/gemini-models#capabilities-features-comparison

@MrCsabaToth
Member Author

MrCsabaToth commented Sep 3, 2024

It'd be good to try preview models. I tried the "latest" alias before and that was a step back. We should also test experimental models; there's a fresh 0827 experimental version for 1.5 Pro and 1.5 Flash: https://ai.google.dev/gemini-api/docs/models/experimental-models#available-models

The question is whether the Google AI or the Firebase Vertex AI package supports those.

List of Vertex AI embedding models regardless of SDK: https://cloud.google.com/vertex-ai/generative-ai/docs/learn/model-versions#embeddings_stable_model_versions

@MrCsabaToth
Member Author

Note that https://pub.dev/packages/googleai_dart is by LangChain Dart, but google_generative_ai has almost achieved feature parity now, so googleai_dart will be retired.

@MrCsabaToth
Member Author

MrCsabaToth commented Sep 4, 2024

Firebase Vertex AI Flutter setup:
https://firebase.google.com/docs/vertex-ai/get-started?platform=flutter

Code sample: https://firebase.google.com/docs/vertex-ai/locations?platform=flutter#code-samples

Unfortunately so far Firebase Vertex AI 404s for embedContent: firebase/flutterfire#13269


MrCsabaToth added a commit that referenced this issue Sep 4, 2024
…ling, reasoning and multi modal capabilities #48 #45

(There's an 0827 /Aug 27, 2024/ experimental model, while the current stable is 0514 /May 14/)
@MrCsabaToth
Member Author

The workaround will be a cloud function; we'll have to establish another one anyway for reranking as well (#39).

@MrCsabaToth changed the title from "Upgrade text embedding from text-embedding-004 to text-multilingual-embedding-002" to "RAG: Upgrade text embedding from text-embedding-004 to text-multilingual-embedding-002" Sep 5, 2024
@MrCsabaToth
Member Author

Multilingual embedding code:

But now I'm thinking I should rather go for multimodal embedding! That way we wouldn't have to describe images and videos.

from vertexai.language_models import TextEmbeddingInput, TextEmbeddingModel
from vertexai.vision_models import Image as VMImage
from vertexai.vision_models import MultiModalEmbeddingModel, MultiModalEmbeddingResponse
from vertexai.vision_models import Video as VMVideo
from vertexai.vision_models import VideoSegmentConfig

# Multimodal embedding model for images/videos (optionally with contextual text),
# plus a plain text embedding model.
mm_embedding_model = MultiModalEmbeddingModel.from_pretrained("multimodalembedding")
text_embedding_model = TextEmbeddingModel.from_pretrained("textembedding-gecko@003")

# image_path, video_path, video_segment_config, contextual_text and dimension
# come from the caller; any of them may be None.
image = VMImage.load_from_file(image_path) if image_path else None
video = VMVideo.load_from_file(video_path) if video_path else None

embeddings = mm_embedding_model.get_embeddings(
    image=image,
    video=video,
    video_segment_config=video_segment_config,
    contextual_text=contextual_text,
    dimension=dimension,  # 128, 256, 512 or 1408 for the multimodal model
)

@MrCsabaToth
Member Author

Note that we perform dimensionality reduction with folding (instead of truncation), which currently leads to non-normalized vectors. This means that the dot product (potentially the most cost-effective distance) is no longer a valid distance measure: https://cloud.google.com/firestore/docs/vector-search#choose-distance-measure
So maybe we should normalize after the folding?

@MrCsabaToth
Member Author

Maybe the user should decide whether they prefer multilingual embedding or multimodal. Currently it seems the two together are not possible.

  1. In case of multilingual embedding we would need to perform an image / audio / video transcription and embed that.
  2. In case of multimodal embedding we'd need to translate everything to English.

MrCsabaToth added a commit that referenced this issue Oct 18, 2024
MrCsabaToth added a commit that referenced this issue Oct 18, 2024
See notebooks/official/generative_ai/vertex_sdk_llm_snippets.ipynb and https://cloud.google.com/vertex-ai/docs/start/install-sdk
MrCsabaToth added a commit that referenced this issue Oct 20, 2024
(we don't override dimensionality, but the empty parameter should be passed differently)
MrCsabaToth added a commit that referenced this issue Oct 20, 2024
MrCsabaToth added a commit that referenced this issue Oct 20, 2024
Maybe we'll indeed need to have those kwargs. The model has 768 dimensions.
MrCsabaToth added a commit that referenced this issue Oct 20, 2024
@MrCsabaToth
Member Author

The new embedding function uses both multimodal embedding (for media) and multilingual embedding (for text).

@MrCsabaToth
Member Author

MrCsabaToth commented Dec 15, 2024

The documentation explicitly states that embedding is not available via Firebase Vertex AI: https://firebase.google.com/docs/vertex-ai/gemini-models#capabilities-features-comparison

"Note: Context caching, fine tuning a model, embeddings, and semantic retrieval are supported by various models or the Vertex AI Gemini API, but they're not supported by the Vertex AI in Firebase SDKs."


So indeed the only way to go is our own cloud functions.
