How to approach clustering? #56
Unanswered
OmarShehata asked this question in Q&A
Replies: 1 comment
Figured it out, very simple with |
Hello! I'm using vectra to do local, fast vector search after getting embeddings from OpenAI etc., and it's incredible!! Love how extremely easy, portable, and fast it is.
I know how to query for the distances to a given vector, but how would you approach clustering? That is, I define some threshold and then see which groups of vectors are most similar (my use case: I have a DB of 15k tweets and I want to cluster them semantically). The "naive" way would be to take every single vector, find the closest ones to it, and repeat that for all of them?
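To make the "naive" threshold idea concrete, here is a minimal sketch in plain Python. It does not use vectra's API at all: it assumes the embeddings are already loaded as lists of floats, and the names `cosine_similarity`, `threshold_clusters`, and `toy_vectors` are made up for illustration. It links every pair whose cosine similarity meets the threshold and treats each connected group as a cluster, which is only one possible reading of "groups of most similar vectors".

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def threshold_clusters(vectors, threshold):
    """Group vector indices into connected components.

    Two vectors are linked when their cosine similarity is >= threshold;
    a cluster is every index reachable through such links (union-find).
    """
    parent = list(range(len(vectors)))

    def find(i):
        # Walk up to the root, shortening the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Naive all-pairs comparison: O(n^2) similarity computations.
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine_similarity(vectors[i], vectors[j]) >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(vectors)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy 2-D "embeddings" (hypothetical data, not real tweet vectors).
toy_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
clusters = threshold_clusters(toy_vectors, threshold=0.9)
print(clusters)
```

For 15k vectors the double loop is roughly 112 million pairwise comparisons, so this is fine as a one-off experiment but is the part you would want to replace with nearest-neighbor queries if it gets too slow. Linking by connected components can also chain loosely related items into one big cluster when the threshold is low.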
(Is this where you would do something like PCA or k-means? Is this out of scope for vectra and vector databases like it in general? Apologies for the noob question, and thank you for your time!!)
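For comparison, this is roughly what k-means looks like when run directly on the raw vectors, outside of any vector database. It is a bare-bones sketch under stated assumptions: the function name `kmeans`, the toy `points`, the fixed iteration count, and seeding the centroids with the first `k` vectors are all illustrative choices, not anything from vectra. Unlike the threshold approach, you pick the number of clusters `k` up front instead of a similarity cutoff.

```python
import math


def kmeans(vectors, k, iterations=20):
    """Plain k-means with Euclidean distance.

    Assumption for this sketch: centroids are seeded with the first k
    vectors so the result is deterministic. Real code would usually
    seed randomly (or with k-means++) and stop on convergence.
    """
    centroids = [list(v) for v in vectors[:k]]
    assignments = [0] * len(vectors)

    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        for i, v in enumerate(vectors):
            assignments[i] = min(
                range(k), key=lambda c: math.dist(v, centroids[c])
            )

        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]

    return assignments, centroids


# Toy 2-D points (hypothetical data) forming two obvious groups.
points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
labels, centers = kmeans(points, k=2)
print(labels, centers)
```

PCA would be a separate, optional step here: it reduces the dimensionality of the embeddings before clustering or plotting, but it does not group anything by itself.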