
RAG: Upgrade text embedding from text-embedding-004 to text-multilingual-embedding-002 #48

Closed
MrCsabaToth opened this issue Aug 31, 2024 · 18 comments
Labels: enhancement (New feature or request), RAG (Retrieval Augmented Generation related)

@MrCsabaToth
Member

CHANDRA got me thinking about the new text-embedding-preview-0815 model as an upgrade from text-embedding-004. However, https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/generative_ai/text_embedding_new_api.ipynb shows that:

  • "text-embedding-004" for English.
  • "text-multilingual-embedding-002" for i18n.
  • "text-embedding-preview-0815" for English with text and code embeddings for Python and Java.

Python and Java code (along with the new CODE_RETRIEVAL_QUERY task type that text-embedding-preview-0815 introduces) is not a significant use case for the app right now, even though users may screenshot a computer screen with code and ask about it. The current multimodal embedding has severe limitations, so we'll probably go with transcribing images for vector indexing. The database models are already prepared for that.

So both the text-embedding-preview-0815 and text-embedding-004 models turn out to be English only. To support international use I decided to try text-multilingual-embedding-002, hoping that newer versions of it will arrive soon as well. Also note that with the introduction of the new dimensionality folding (#47) we control the vector size regardless of the embedding model's output vector length.

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @multirequest.json \
     "https://us-central1-aiplatform.googleapis.com/v1/projects/duet-ai-roadshow-415022/locations/us-central1/publishers/google/models/text-multilingual-embedding-002:predict" > multiresult.json

multirequest.json:

{
  "instances": [
    {
      "task_type": "RETRIEVAL_DOCUMENT",
      "title": "Chat request",
      "content": "I would like embeddings for this text!"
    },
    {
      "task_type": "RETRIEVAL_DOCUMENT",
      "title": "Beszélgetés kérelem",
      "content": "Ehhez a szöveghez szeretnék beágyazásokat!"
    }
  ]
}

multiresult.json

@MrCsabaToth added the enhancement and RAG labels Aug 31, 2024
@MrCsabaToth self-assigned this Aug 31, 2024
@MrCsabaToth
Member Author

MrCsabaToth commented Aug 31, 2024

A quick cursory look at multiresult.json shows that the two vectors (the English one and its Hungarian equivalent) are very close to each other: the signs of the elements match and the values are mostly very close. The dimensionality of the multilingual model is 768, just like the newer English models.

@MrCsabaToth
Member Author

There's a problem: models/text-multilingual-embedding-002 is not found for API version v1beta, or is not supported for embedContent. Call ListModels to see the list of available models and their supported methods.

Apparently https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/text-embeddings-api is Vertex AI only, and for the Gemini API only text-embedding-004 and embedding-001 are mentioned: https://ai.google.dev/gemini-api/docs/models/gemini#text-embedding

@MrCsabaToth reopened this Sep 3, 2024
@MrCsabaToth
Member Author

I tested, and unfortunately text-embedding-004 doesn't seem to produce close vectors for an English vs. Hungarian sentence pair the way text-multilingual-embedding-002 did.
I filed an issue: google-gemini/generative-ai-dart#209

@MrCsabaToth
Member Author

Looks like we can use the Firebase Vertex AI Flutter package (https://pub.dev/packages/firebase_vertexai/example) for the multilingual embedding: https://firebase.google.com/docs/vertex-ai/gemini-models#input-output-comparison.

If we can transition to Firebase, we might be able to eliminate the Gemini API key in favor of Firebase. (I still want to avoid login if possible though, so maybe we'll use a cloud function - we already have a pair for STT and TTS.)

However, the Flutter Firebase Vertex AI package indicates it doesn't support function calling and structured output for Gemini 1.5 Flash? https://firebase.google.com/docs/vertex-ai/gemini-models#capabilities-features-comparison

@MrCsabaToth
Member Author

MrCsabaToth commented Sep 3, 2024

It'd be good to try preview models. I tried the "latest" alias before and that was a step back. We should also test experimental models; there's a fresh 0827 experimental version for 1.5 Pro and 1.5 Flash: https://ai.google.dev/gemini-api/docs/models/experimental-models#available-models

The question is whether the Google AI or the Firebase Vertex AI package supports those.

List of Vertex AI embedding models regardless of SDK: https://cloud.google.com/vertex-ai/generative-ai/docs/learn/model-versions#embeddings_stable_model_versions

@MrCsabaToth
Member Author

Note that https://pub.dev/packages/googleai_dart is by LangChain Dart, but google_generative_ai has almost achieved feature parity now, so googleai_dart will be retired.

@MrCsabaToth
Member Author

MrCsabaToth commented Sep 4, 2024

Firebase Vertex AI Flutter setup:
https://firebase.google.com/docs/vertex-ai/get-started?platform=flutter

Code sample: https://firebase.google.com/docs/vertex-ai/locations?platform=flutter#code-samples

Unfortunately so far Firebase Vertex AI 404s for embedContent: firebase/flutterfire#13269


MrCsabaToth added a commit that referenced this issue Sep 4, 2024
…ling, reasoning and multi modal capabilities #48 #45

(There's an 0827 /Aug 27, 2024/ experimental model, while the current stable is 0514 /May 14/)
@MrCsabaToth
Member Author

The workaround will be a cloud function; we'll have to establish another one anyway for reranking as well (#39).

@MrCsabaToth changed the title from "Upgrade text embedding from text-embedding-004 to text-multilingual-embedding-002" to "RAG: Upgrade text embedding from text-embedding-004 to text-multilingual-embedding-002" Sep 5, 2024
@MrCsabaToth
Member Author

Multilingual embedding code:

But now I'm thinking I should rather go for multimodal embedding! That way we wouldn't have to describe images and videos.

from vertexai.language_models import TextEmbeddingInput, TextEmbeddingModel
from vertexai.vision_models import Image as VMImage
from vertexai.vision_models import MultiModalEmbeddingModel, MultiModalEmbeddingResponse
from vertexai.vision_models import Video as VMVideo
from vertexai.vision_models import VideoSegmentConfig

# Multimodal embedding model for images/videos (optionally with contextual text),
# plus a plain text embedding model.
mm_embedding_model = MultiModalEmbeddingModel.from_pretrained("multimodalembedding")
text_embedding_model = TextEmbeddingModel.from_pretrained("textembedding-gecko@003")

# image_path, video_path, video_segment_config, contextual_text and dimension
# come from the caller; any of them may be None.
image = VMImage.load_from_file(image_path) if image_path else None
video = VMVideo.load_from_file(video_path) if video_path else None

embeddings = mm_embedding_model.get_embeddings(
    image=image,
    video=video,
    video_segment_config=video_segment_config,
    contextual_text=contextual_text,
    dimension=dimension,  # 128, 256, 512 or 1408 for the multimodal model
)

@MrCsabaToth
Member Author

Note that we perform dimensionality reduction with folding (instead of truncation), which currently leads to non-normalized vectors. This means that the dot product (potentially the most cost-effective distance) is no longer a valid distance measure: https://cloud.google.com/firestore/docs/vector-search#choose-distance-measure
So maybe we should normalize after the folding?

@MrCsabaToth
Member Author

Maybe the user should decide whether they prefer multilingual embedding or multimodal. Currently it seems the two together are not possible.

  1. In case of multilingual embedding we would need to perform an image / audio / video transcription and embed that.
  2. In case of multimodal embedding we'd need to translate everything to English.

MrCsabaToth added a commit that referenced this issue Oct 18, 2024
MrCsabaToth added a commit that referenced this issue Oct 18, 2024
See notebooks/official/generative_ai/vertex_sdk_llm_snippets.ipynb and https://cloud.google.com/vertex-ai/docs/start/install-sdk
MrCsabaToth added a commit that referenced this issue Oct 20, 2024
(we don't override dimensionality, but the empty parameter should be passed differently)
MrCsabaToth added a commit that referenced this issue Oct 20, 2024
MrCsabaToth added a commit that referenced this issue Oct 20, 2024
Maybe we'll indeed need to have those kwargs. The model has 768 dimensions.
MrCsabaToth added a commit that referenced this issue Oct 20, 2024
@MrCsabaToth
Member Author

The new embedding function uses both multimodal embedding (for media) and multilingual embedding (for text).

@MrCsabaToth
Member Author

MrCsabaToth commented Dec 15, 2024

The documentation explicitly states that embedding is not available via Firebase Vertex AI: https://firebase.google.com/docs/vertex-ai/gemini-models#capabilities-features-comparison

"Note: Context caching, fine tuning a model, embeddings, and semantic retrieval are supported by various models or the Vertex AI Gemini API, but they're not supported by the Vertex AI in Firebase SDKs."


So indeed the only way to go is our own cloud functions.
