add dtype to config.yaml when creating vectordb #341

BBC-Esq · 2025-01-02T16:56:42Z

This is necessary if I want to match the dtype when searching. To be strict about it, I'd also remove the ability to set a "search" device that differs from the one used to create the vectordb. However, the options differ by creation device:

If the vectordb was created with cuda:

- It can be created using float32, float16, or bfloat16 (bfloat16 only if CUDA compute capability 8+ is supported).
- The user checks/unchecks "half" during creation to choose either float32 or float16/bfloat16.
  - Again, whether float16 or bfloat16 is ultimately used depends on the user's GPU capabilities.

If the vectordb was created with cpu:

- It can only be created using float32 (unless I subsequently add some specialized backends).
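The creation rules above can be sketched as a small function. This is only an illustration: the function name `choose_creation_dtype` and its parameters are hypothetical and not taken from the project, and the choice to prefer bfloat16 over float16 whenever compute capability 8+ is available is an assumption about how "depends on the GPU" would be resolved.

```python
from typing import Optional, Tuple


def choose_creation_dtype(
    device: str,
    half: bool,
    compute_capability: Optional[Tuple[int, int]] = None,
) -> str:
    """Return the name of the dtype a new vectordb would be created with.

    Hypothetical helper: `device` is "cuda" or "cpu", `half` mirrors the
    "half" checkbox, and `compute_capability` is a (major, minor) tuple.
    """
    if device == "cpu" or not half:
        # CPU creation is float32 only; so is GPU creation with "half" unchecked.
        return "float32"
    # "half" is checked on a CUDA device.
    # Assumption: prefer bfloat16 when compute capability 8+ is reported.
    if compute_capability is not None and compute_capability[0] >= 8:
        return "bfloat16"
    return "float16"
```

With PyTorch, the `(major, minor)` tuple could come from `torch.cuda.get_device_capability()`; the sketch takes it as a plain argument so it stays independent of any GPU library.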

Since CPUs don't handle float16/bfloat16 efficiently, ideally querying with the cpu would only be allowed if:

- The user created the vectordb with a GPU but "half" was unchecked; or
- The user created it using cpu.
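Both conditions above reduce to "the vectordb was stored in float32", so a strict check at query time could look roughly like the following. The function name, the dtype strings, and the decision to raise an error (rather than warn or silently fall back) are all assumptions made for illustration.

```python
HALF_DTYPES = {"float16", "bfloat16"}


def check_query_device(stored_dtype: str, query_device: str) -> None:
    """Raise ValueError if a CPU query is requested for a half-precision vectordb.

    Hypothetical helper: `stored_dtype` is the dtype name recorded at creation
    time and `query_device` is the device the user selected for searching.
    """
    if query_device == "cpu" and stored_dtype in HALF_DTYPES:
        raise ValueError(
            f"vectordb was created in {stored_dtype}; "
            "querying on cpu is only allowed for float32 databases"
        )
```

A softer variant could log a warning instead of raising, which would preserve the current behavior where such queries work but are inefficient.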

Basically, querying in float32 is inefficient if the vectors are stored in float16/bfloat16. It works, but it's inefficient, and this is how it has been done until now.

Add the precision that a vectordb was created with to its `config.yaml` file to allow better handling of the dtype when querying. The alternative is to simplify matters by always using "cuda" (if available), or otherwise "cpu", for both creation and querying.

BBC-Esq added the enhancement approved repository owner use only label Jan 2, 2025
