Perform dimensionality reduction on the embeddings #47
Thoughts about truncation vs folding vs other techniques:

I'm gravitating towards addition-based folding instead of average or max. The folding would mean 512 floating point addition operations per embedding. Simple.
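A minimal sketch of the addition-based fold (in Python for illustration; the function name is mine). A 768-dimensional embedding is reduced to 256 by summing each run of three consecutive components, which is 2 additions per output dimension × 256 outputs = 512 additions per embedding:

```python
def fold_embedding(embedding: list[float], factor: int = 3) -> list[float]:
    """Reduce dimensionality by summing each consecutive run of `factor` values."""
    assert len(embedding) % factor == 0, "length must be divisible by the fold factor"
    return [sum(embedding[i:i + factor]) for i in range(0, len(embedding), factor)]

vec = [0.1] * 768
folded = fold_embedding(vec)
print(len(folded))  # 256
# Additions per embedding: (factor - 1) * len(folded) = 2 * 256 = 512
```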
While researching #46 I saw that since text-embedding-004 the API supports outputDimensionality reduction; it's part of the parameters section of the payload: https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/text-embeddings-api#advanced-use

According to the docs, the "reduction" is a simple truncation. Note that a code example suggests reduction to 256. Note also that autoTruncate is on by default.

If we go for a dimensionality reduction to 256, that would cut the storage size to one third (768 / 3 = 256) and cut the retrieval processing time as well. But since this affects accuracy and precision, we'd definitely benefit from a reranking step: https://github.com/CsabaConsulting/InspectorGadgetApp/issues/39
Since this parameter is not available through the Gemini Dart API anyway, and the bandwidth aspect of the saving is probably not that important for us (requests are only sporadic, although the history can pile up over time, so we'd rather benefit from the storage and processing time savings), we'd use the workaround of performing the reduction ourselves. However, I think we should perform a fold instead of a truncation. That way we'd merge every three dimensions into one, potentially not discarding any dimension entirely. The merging would still cause precision loss, but I suspect not as much as simply throwing out two thirds of the dimensions (512 of 768).
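To make the truncation-vs-fold trade-off measurable, here is a toy comparison of how well each method preserves pairwise cosine similarity. Note the heavy caveat: random Gaussian vectors do not model real embedding geometry (where trailing dimensions may carry less information), so this only illustrates the mechanics of such a measurement, not which method actually loses less on our data:

```python
import math
import random

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def truncate(v: list[float], dims: int = 256) -> list[float]:
    return v[:dims]

def fold(v: list[float], factor: int = 3) -> list[float]:
    return [sum(v[i:i + factor]) for i in range(0, len(v), factor)]

random.seed(42)
pairs = [([random.gauss(0, 1) for _ in range(768)],
          [random.gauss(0, 1) for _ in range(768)]) for _ in range(50)]

# Mean absolute change in cosine similarity introduced by each reduction.
trunc_err = sum(abs(cosine(a, b) - cosine(truncate(a), truncate(b)))
                for a, b in pairs) / len(pairs)
fold_err = sum(abs(cosine(a, b) - cosine(fold(a), fold(b)))
               for a, b in pairs) / len(pairs)
print(f"mean |delta cosine| truncation: {trunc_err:.4f}, folding: {fold_err:.4f}")
```

Running the same harness over a sample of our real stored embeddings would give an evidence-based answer before committing to either reduction.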