Necessary if I want to match the dtype when searching. And if I want to be strict about it, I'd remove the ability to set a "search" device that is different from the one that created the vectordb. However:
If the VectorDB was created with cuda:
- It can be created using float32, float16, or bfloat16 (bfloat16 only if CUDA compute capability 8.0+ is supported).
- The user checks/unchecks "half" during creation to choose either float32 or float16/bfloat16.
- Again, whether float16 or bfloat16 is ultimately used depends on the user's GPU capabilities.

If the VectorDB was created with cpu:
- It can only be created using float32 (unless I subsequently add some specialized backends).
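
The creation rules above could be sketched roughly as follows. This is only an illustration, not the project's actual code: the function name `choose_creation_dtype` and its parameters are hypothetical, and the compute capability is assumed to be passed in as a `(major, minor)` tuple (with PyTorch it could come from `torch.cuda.get_device_capability()`).

```python
from typing import Optional, Tuple


def choose_creation_dtype(
    device: str,
    half: bool,
    compute_capability: Optional[Tuple[int, int]] = None,
) -> str:
    """Return the dtype name a new vectordb would be created with.

    Hypothetical helper: names and signature are assumptions for illustration.
    """
    # cpu creation, or "half" unchecked, always means float32.
    if device == "cpu" or not half:
        return "float32"
    # "half" is checked on cuda: use bfloat16 only on compute capability 8.0+.
    if compute_capability is not None and compute_capability[0] >= 8:
        return "bfloat16"
    # Older GPUs (or unknown capability) fall back to float16.
    return "float16"
```

For example, `choose_creation_dtype("cuda", True, (7, 5))` falls back to float16, while the same call with `(8, 6)` selects bfloat16.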
Since CPUs don't support float16/bfloat16, ideally we'd only want to allow querying with the cpu if:
- the user created the vectordb with a GPU but "half" was unchecked; or
- the user created it using cpu.
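
A minimal sketch of that strict rule, assuming the creation device and the "half" choice are known at query time. The function name `cpu_query_allowed` is hypothetical.

```python
def cpu_query_allowed(creation_device: str, half: bool) -> bool:
    """Return True if querying on cpu would match the stored dtype.

    Hypothetical helper: both allowed cases mean the vectors are float32.
    """
    if creation_device == "cpu":
        # cpu creation is float32 only, so cpu querying always matches.
        return True
    # Created on a GPU: only allowed when "half" was unchecked (float32).
    return not half
```

Both permitted cases reduce to "the vectors are stored in float32", which is why recording the precision itself would be enough to make this decision.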
Basically, querying in float32 is inefficient if the vectors are stored in float16/bfloat16. It works, just inefficiently, and until now it's been done this way.
Add the precision that a vectordb was created with to its `config.yaml` file to allow better handling of the dtype when querying. Either this, or stick to "cuda" all the time (if available) or "cpu" for both creation and querying to simplify matters.
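
One way the recorded precision might be used at query time is sketched below. It works on the config as a plain dict (loading and saving `config.yaml` is left out), and the `precision` key name, both function names, and the fallback behavior for older configs are all assumptions, not the existing implementation.

```python
HALF_DTYPES = {"float16", "bfloat16"}


def record_precision(config: dict, dtype_name: str) -> dict:
    """Return a copy of the config with the creation dtype stored.

    The "precision" key is a hypothetical name chosen for this sketch.
    """
    updated = dict(config)
    updated["precision"] = dtype_name
    return updated


def resolve_query_device(config: dict, requested_device: str, cuda_available: bool) -> str:
    """Pick the device to query on, given the precision stored at creation."""
    precision = config.get("precision")  # None for configs written before this change
    if not cuda_available:
        # Only cpu is possible; with half-precision vectors this works but is inefficient.
        return "cpu"
    if precision in HALF_DTYPES:
        # Vectors are stored in half precision, so prefer the GPU regardless of the request.
        return "cuda"
    # float32 (or unknown precision): honor whatever the user asked for.
    return requested_device
```

With this shape, a database created with "half" checked would be queried on cuda whenever a GPU is present, while float32 databases keep the current freedom to choose either device.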