Add `model2vec` to config.json #134

davidmezzetti · 2024-11-25T18:27:11Z

Hello.

I'm planning to add a change to txtai to autodetect model2vec models. The best idea I have right now is to read the config.json file and check whether it has the keys apply_pca and apply_zipf.

While I believe this will be pretty unique, have you guys considered adding something to the config.json file to signal it's a model2vec file?
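For illustration, the key-based heuristic described above might look roughly like the sketch below. The function name `looks_like_model2vec` is hypothetical and not part of txtai or model2vec; the only assumption taken from this thread is that model2vec configs contain both `apply_pca` and `apply_zipf`.

```python
def looks_like_model2vec(config: dict) -> bool:
    """Heuristic check: both model2vec-specific keys must be present.

    `config` is the already-parsed content of a config.json file.
    The values are ignored; only key presence is checked.
    """
    return "apply_pca" in config and "apply_zipf" in config
```

A check like this relies on no other library ever using the same pair of keys, which is why an explicit marker in the config would be more robust.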


Pringled · 2024-11-26T07:59:52Z

Hey @davidmezzetti,

That's a great suggestion! We will add the following to our config.json files:

"model_type": "model2vec",
"architectures": [
    "StaticModel"
],

And then you can use the model_type key to check for model2vec models. I'll ping you once we've made that change.

davidmezzetti · 2024-11-26T10:42:56Z

Sounds great! This will make it easier in my case as txtai is working with multiple vectorization libraries. Once this change is in, txtai will be able to automatically infer the vectorization method for model2vec models.

import txtai
embeddings = txtai.Embeddings(path="minishlab/potion-base-8M")

Pringled · 2024-11-27T18:31:01Z

Hey @davidmezzetti, I just updated all our configs to include the changes. Let me know if everything works as intended!

davidmezzetti · 2024-11-27T20:35:07Z

Excellent. Looks like this should work. I'll integrate this into txtai and report back. Thank you!

davidmezzetti · 2024-12-01T13:33:11Z

This change has been made, thanks again!

Just as an FYI, the potion-science models don't appear to have the updated config. But the other models are all working as expected.

Pringled · 2024-12-01T14:04:40Z

Hey @davidmezzetti, great, no problem! I just updated the science models as well, thanks for pointing that out!

Pringled self-assigned this Nov 26, 2024

Pringled added the enhancement label Nov 26, 2024

davidmezzetti mentioned this issue Nov 27, 2024

Autodetect Model2Vec model paths neuml/txtai#822

Closed

davidmezzetti closed this as completed Dec 1, 2024
