use float32 in embeddings #133

Merged
merged 1 commit into from
Mar 8, 2023
Conversation

sashabaranov (Owner) commented Mar 8, 2023

sashabaranov merged commit c46ebb2 into master Mar 8, 2023
sashabaranov deleted the float32-embeddings branch March 8, 2023 10:08
@@ -103,7 +103,7 @@ var stringToEnum = map[string]EmbeddingModel{
 // then their vector representations should also be similar.
 type Embedding struct {
 	Object    string    `json:"object"`
-	Embedding []float64 `json:"embedding"`
+	Embedding []float32 `json:"embedding"`


We are trying to track down a bug, and I think this change causes problems when you use Python to store embeddings and golang to compute similarity scores against them. Python seems to use float64 (or something with even more precision), and when we use those values with the embeddings from this library we get wildly different results than when we run the same code in Python only. I am not sure whether the endpoint returns 64-bit values, but it would seem that avoiding the downcast to 32-bit makes more sense than losing precision you can't get back.
