feat: add max_sim operator for IR tasks to support multi-vector models - #1560 #1563

sam-hey · 2024-12-07T18:54:17Z

This PR adds support for ColBERT models that implement SentenceTransformer.
An example can be found in the pylate library.
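For readers unfamiliar with the operator, the sketch below shows the late-interaction (MaxSim) score the title refers to: each query token keeps only its best-matching document token, and those maxima are summed. It is a minimal pure-Python illustration, not the code added in this PR; the names `dot` and `max_sim` and the toy vectors are assumptions made for the example.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def max_sim(query_tokens: Sequence[Vector], doc_tokens: Sequence[Vector]) -> float:
    """Sum, over query tokens, of the best similarity against any document token."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


# Toy multi-vector embeddings: one vector per token (made-up values).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
doc_b = [[0.5, 0.5]]

score_a = max_sim(query, doc_a)
score_b = max_sim(query, doc_b)
print(score_a, score_b)
```

A tensor implementation would compute the same quantity in batches; the list version is only meant to make the definition explicit.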

Fixes #1560

Checklist

Run tests locally to make sure nothing is broken using `make test`.
Run the formatter to format the code using `make lint`.

Samoed · 2024-12-07T19:10:28Z

I think you could add a wrapper for ColBERT models in the models folder for better integration with MTEB. You can refer to model2vec as an example.

Samoed

Great! I've suggested a few changes

mteb/models/colbert_models.py

Samoed

Great! Can you resolve conflicts?

README.md

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

isaac-chung

Nice! Just some small suggestions before we merge.

README.md

mteb/evaluation/evaluators/RetrievalEvaluator.py

KennethEnevoldsen · 2024-12-08T02:13:13Z

Seems like Isaac took this one. I have unassigned myself, but let me know if you need me to take a look.

Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>

sam-hey · 2024-12-08T06:25:40Z

Thanks for the suggestions! I’ve cleaned up the README as requested. I appreciate your time and feedback—hopefully, everything is in good shape now!

Best
Sam

mteb/evaluation/evaluators/utils.py

isaac-chung · 2024-12-08T14:17:31Z

@sam-hey Just pushed a few changes, and now the example script runs and gives ndcg_at_10: 0.27872 on NFCorpus, which is close to the 0.338 reported in the ColBERTv2 paper. Can you see if you can find where the rest of the gap comes from?

sam-hey · 2024-12-08T14:20:29Z

@isaac-chung Yes, I moved it back to draft because I found a major bug: the encode function needs is_query to work. I am really sorry. I am working on a fix...
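To illustrate why a missing is_query flag matters, here is a minimal sketch in which the backend marks queries and documents differently, so encoding everything as a document silently changes the query representation. All names (`PromptType`, `ToyLateInteractionBackend`, `ToyWrapper`) are hypothetical stand-ins for this example and are not the wrapper or the PyLate API from the PR; the `[Q]`/`[D]` markers are used only to make the difference visible.

```python
from enum import Enum
from typing import List, Optional


class PromptType(str, Enum):
    """Hypothetical flag describing which side of retrieval is being encoded."""
    query = "query"
    passage = "passage"


class ToyLateInteractionBackend:
    """Stand-in for a ColBERT-style encoder that treats queries and documents differently."""

    def encode(self, sentences: List[str], is_query: bool) -> List[str]:
        marker = "[Q]" if is_query else "[D]"
        # A real model would return token embeddings; strings keep the sketch readable.
        return [f"{marker} {s}" for s in sentences]


class ToyWrapper:
    """Forwards the evaluator's prompt type to the backend as is_query."""

    def __init__(self, backend: ToyLateInteractionBackend) -> None:
        self.backend = backend

    def encode(self, sentences: List[str], prompt_type: Optional[PromptType] = None) -> List[str]:
        is_query = prompt_type == PromptType.query
        return self.backend.encode(sentences, is_query=is_query)


wrapper = ToyWrapper(ToyLateInteractionBackend())
print(wrapper.encode(["what is nfcorpus?"], prompt_type=PromptType.query))
print(wrapper.encode(["NFCorpus is a medical retrieval dataset."], prompt_type=PromptType.passage))
```

If the wrapper dropped `prompt_type`, both calls would take the document path, which is the kind of bug that lowers retrieval scores without raising an error.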

isaac-chung · 2024-12-08T14:21:44Z

@sam-hey no worries. Take your time, and ping us once it's ready :)

mteb/models/colbert_models.py

sam-hey · 2024-12-10T18:54:08Z

> @sam-hey Just pushed a few changes, and now the example script runs and gives ndcg_at_10: 0.27872 on NFCorpus, which is close to the 0.338 reported in the ColBERTv2 paper. Can you see if you can find where the rest of the gap comes from?

@isaac-chung I’m currently getting an ndcg_at_10 score of 0.32989 for ColBERTv2.

For comparison, jinaai/jina-colbert-v2 gave an ndcg_at_10 of 0.35528, while the reported score on NFCorpus is 0.346.

There is some deviation, but it doesn't appear to be related to the max_sim() function.
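As background for the ndcg_at_10 numbers compared above, the following is a minimal sketch of nDCG@k with linear gain. It is an illustration under stated assumptions, not the evaluator code MTEB runs; the function names and the toy relevance lists are made up, and some definitions use an exponential gain (2^rel - 1) instead.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the first k results (linear gain, log2 discount)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_relevances: Sequence[float], all_relevances: Sequence[float], k: int = 10) -> float:
    """DCG of the system ranking divided by the DCG of the ideal ranking."""
    ideal_dcg = dcg_at_k(sorted(all_relevances, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0
    return dcg_at_k(ranked_relevances, k) / ideal_dcg


# Toy example: two relevant documents exist; the system ranks them at positions 1 and 3.
judged = [1, 1, 0]
system_ranking = [1, 0, 1]
print(round(ndcg_at_k(system_ranking, judged, k=10), 5))
```

Because the discount is logarithmic in rank, small changes in where relevant documents land near the top can move the score by a few points, which is the scale of the deviations discussed here.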

isaac-chung · 2024-12-11T20:31:56Z

Looks good, thanks @sam-hey! I skimmed the ColBERTv2 paper and couldn't find any mention of special prompts. Just pinging @Samoed for a second pair of eyes, and then I think it's good to go.

Samoed · 2024-12-11T21:45:07Z

I think the part with prompts still needs to be updated. If prompts aren't specified, they won't be applied. In future releases, they should be specified, or at the very least, the encode function's signature should match the interface.
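The behaviour described here ("if prompts aren't specified, they won't be applied") can be sketched as a lookup that returns nothing when no prompts are configured. This is a hypothetical illustration only: `resolve_prompt_name`, the key format, and the example prompt strings are assumptions for the sketch, not the MTEB interface.

```python
from typing import Dict, Optional


def resolve_prompt_name(
    model_prompts: Optional[Dict[str, str]],
    task_name: str,
    prompt_type: Optional[str],
) -> Optional[str]:
    """Return the most specific configured prompt key, or None if nothing applies."""
    if not model_prompts:
        return None  # no prompts configured, so none are applied
    candidates = []
    if prompt_type is not None:
        candidates.append(f"{task_name}-{prompt_type}")  # task- and side-specific
    candidates.append(task_name)                         # task-specific
    if prompt_type is not None:
        candidates.append(prompt_type)                   # side-specific fallback
    for key in candidates:
        if key in model_prompts:
            return key
    return None


# Made-up prompt configuration for illustration.
prompts = {"query": "query: ", "NFCorpus-passage": "medical passage: "}
print(resolve_prompt_name(prompts, "NFCorpus", "query"))
print(resolve_prompt_name(prompts, "NFCorpus", "passage"))
print(resolve_prompt_name(None, "NFCorpus", "query"))
```

Keeping the lookup explicit makes the "no prompt configured" case visible to the caller instead of failing silently inside encode.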

mteb/models/colbert_models.py

sam-hey · 2024-12-12T08:18:09Z

@isaac-chung and @Samoed,
Thanks for your support! 😊

I've added handling for prompt_name and integrated jina-colbert-v2 (128). Let me know your thoughts!

Samoed · 2024-12-12T09:03:10Z

Great! Can you run some tasks to make sure that everything works?

sam-hey · 2024-12-13T10:06:21Z

@Samoed ran additional tasks, and the results were as expected. Added a note to the documentation indicating that MaxSim becomes resource-intensive with large datasets.

A solution is already underway—PyLate will support PLAID indexing, enabling better scalability in the future. For details, see PyLate Issue #72.

In the meantime, I have a WIP branch (mteb: colbert-with-index) that can be utilized as soon as a more efficient index is available for PyLate.
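The resource note above can be made concrete with some back-of-the-envelope arithmetic: exhaustive MaxSim compares every query token with every document token, so a dense similarity tensor grows with queries × documents × query tokens × document tokens. The sizes below are made-up illustrative values, not measurements from this PR, and `similarity_tensor_bytes` is a hypothetical helper.

```python
def similarity_tensor_bytes(
    n_queries: int,
    n_docs: int,
    query_tokens: int,
    doc_tokens: int,
    bytes_per_value: int = 4,  # float32
) -> int:
    """Bytes needed for a dense (n_queries, n_docs, query_tokens, doc_tokens) similarity tensor."""
    return n_queries * n_docs * query_tokens * doc_tokens * bytes_per_value


# Illustrative sizes only: 1,000 queries, 100,000 documents, 32 and 180 tokens each.
full_bytes = similarity_tensor_bytes(1_000, 100_000, 32, 180)
# Scoring in chunks (32 queries against 1,000 documents at a time) bounds peak memory.
chunk_bytes = similarity_tensor_bytes(32, 1_000, 32, 180)

print(f"all pairs at once: {full_bytes / 1e12:.2f} TB")
print(f"one chunk: {chunk_bytes / 1e6:.0f} MB")
```

Chunking keeps peak memory manageable but does not reduce the total number of token comparisons, which is why an index such as PLAID is the longer-term answer mentioned in the comment.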

isaac-chung

Thanks for iterating!

README.md

* feat: add new arctic v2.0 models (#1574)
* feat: add new arctic v2.0 models
* chore: make lint
* 1.24.0 Automatically generated by python-semantic-release
* fix: Add namaa MrTydi reranking dataset (#1573)
* Add dataset class and file requirements
* pass tests
* make lint changes
* adjust meta data and remove load_data
---------
Co-authored-by: Omar Elshehy <omarelshehy@Omars-MacBook-Pro.local>
* Update tasks table
* 1.24.1 Automatically generated by python-semantic-release
* fix: Eval langs not correctly passed to monolingual tasks (#1587)
* fix SouthAfricanLangClassification.py
* add check for langs
* lint
* 1.24.2 Automatically generated by python-semantic-release
* feat: Add ColBert (#1563)
* feat: add max_sim operator for IR tasks to support multi-vector models
* docs: add doc for Model2VecWrapper.__init__(...)
* feat: add ColBERTWrapper to models & add ColBERTv2
* fix: resolve issues
* fix: resolve issues
* Update README.md Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
* Update README.md Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>
* Update README.md Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>
* Update mteb/evaluation/evaluators/RetrievalEvaluator.py Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>
* Update README.md Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>
* README.md: rm subset
* doc: update example for Late Interaction
* get colbert running without errors
* fix: pass is_query to pylate
* fix: max_sim add pad_sequence
* feat: integrate Jinja templates for ColBERTv2 and add model prompt handling
* feat: add revision & prompt_name
* doc: pad_sequence
* rm TODO jina colbert v2
* doc: warning: higher resource usage for MaxSim
---------
Co-authored-by: sam021313 <40773225+sam021313@users.noreply.github.com>
Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>
* 1.25.0 Automatically generated by python-semantic-release
* doc: colbert add score_function & doc section (#1592)
* doc: colbert add score_function & doc section
* doc: Update README.md Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com>
* doc: Update README.md Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>
---------
Co-authored-by: sam021313 <40773225+sam021313@users.noreply.github.com>
Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com>
Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>
* Feat: add support for scoring function (#1594)
* add support for scoring function
* lint
* move similarity to wrapper
* remove score function
* lint
* remove from InstructionRetrievalEvaluator
* Update mteb/evaluation/evaluators/RetrievalEvaluator.py Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com>
* remove score function from README.md
---------
Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com>
* Add new models nvidia, gte, linq (#1436)
* Add new models nvidia, gte, linq
* add warning for gte-Qwen and nvidia models re: instruction used in docs as well
---------
Co-authored-by: isaac-chung <chungisaac1217@gmail.com>
* Leaderboard: Refined plots (#1601)
* Added embedding size guide to performance-size plot, removed shading on radar chart
* Changed plot names to something more descriptive
* Made plots failsafe
* fix: Leaderboard refinements (#1603)
* Added explanation of aggregate measures
* Added download button to result tables
* Task info gets sorted by task name
* Added custom, shareable links for each benchmark
* Moved explanation of aggregate metrics to the summary tab
* 1.25.1 Automatically generated by python-semantic-release
* Feat: Use similarity scores if available (#1602)
* Use similarity scores if available
* lint
* Add NanoBEIR Datasets (#1588)
* add NanoClimateFeverRetrieval task, still requires some debugging
* move task to correct place in init file
* add all Nano datasets and results
* format code
* Update mteb/tasks/Retrieval/eng/tempCodeRunnerFile.py Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
* pin revision to commit and add datasets to benchmark.py
* create new benchmark for NanoBEIR
* add revision when loading datasets
* lint
---------
Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
Co-authored-by: isaac-chung <chungisaac1217@gmail.com>
* Update tasks table
* Feat: Evaluate missing languages (#1584)
* init
* fix tests
* update mock retrieval
* update tests
* use subsets instead of langs
* Apply suggestions from code review Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>
* fix tests
* add to readme
* rename subset in readme
---------
Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>
* Add IBM Granite Embedding Models (#1613)
* add IBM granite embedding models
* lint formatting
* add adapted_from and superseded_by to ModelMeta
* fix: disable co2_tracker for API models (#1614)
* 1.25.2 Automatically generated by python-semantic-release
* fix: set `use_instructions` to True in models using prompts (#1616) feat: set `use_instructions` to True in models using prompts
* 1.25.3 Automatically generated by python-semantic-release
* update RetrievalEvaluator.py
* update imports
* update imports and metadata
* fix tests
* fix tests
* fix output path for retrieval
* fix similarity function
---------
Co-authored-by: Daniel Buades Marcos <daniel.buades@clinia.com>
Co-authored-by: github-actions <github-actions@github.com>
Co-authored-by: Omar Elshehy <41394057+omarelshehy@users.noreply.github.com>
Co-authored-by: Omar Elshehy <omarelshehy@Omars-MacBook-Pro.local>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sam <40773225+sam-hey@users.noreply.github.com>
Co-authored-by: sam021313 <40773225+sam021313@users.noreply.github.com>
Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>
Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com>
Co-authored-by: Alexey Vatolin <vatolinalex@gmail.com>
Co-authored-by: Márton Kardos <power.up1163@gmail.com>
Co-authored-by: KGupta10 <92774828+KGupta10@users.noreply.github.com>
Co-authored-by: Aashka Trivedi <aashka.trivedi@gmail.com>

sam-hey added 2 commits December 7, 2024 19:37

feat: add max_sim operator for IR tasks to support multi-vector models

9784541

docs: add doc for Model2VecWrapper.__init__(...)

f77a554

feat: add ColBERTWrapper to models & add ColBERTv2

db75000

Samoed requested changes Dec 7, 2024

mteb/models/colbert_models.py (outdated, resolved)

mteb/models/colbert_models.py (outdated, resolved)

mteb/models/colbert_models.py (outdated, resolved)

sam-hey added 2 commits December 7, 2024 22:15

fix: resolve issues

8882130

fix: resolve issues

10dfd2f

Samoed approved these changes Dec 7, 2024

README.md (outdated, resolved)

Samoed requested a review from KennethEnevoldsen December 7, 2024 21:43

sam-hey and others added 2 commits December 7, 2024 22:58

Merge branch 'main' into main

654f6a0

Update README.md

eb3e832

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

isaac-chung reviewed Dec 7, 2024

README.md (outdated, resolved)

README.md (outdated, resolved)

README.md (outdated, resolved)

README.md (outdated, resolved)

mteb/evaluation/evaluators/RetrievalEvaluator.py (outdated, resolved)

KennethEnevoldsen removed their request for review December 8, 2024 02:11

sam-hey and others added 5 commits December 8, 2024 07:19

Update README.md

fcc0d68

Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>

Update README.md

e121aad

Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>

Update mteb/evaluation/evaluators/RetrievalEvaluator.py

cbde7d1

Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>

Update README.md

bebdf22

Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>

README.md: rm subset

16167d0

isaac-chung reviewed Dec 8, 2024

mteb/evaluation/evaluators/utils.py (outdated, resolved)

doc: update example for Late Interaction

babed93

sam-hey marked this pull request as draft December 8, 2024 12:59

isaac-chung added 2 commits December 8, 2024 13:58

get colbert running without errors

112b484

Merge branch 'main' of github.com:sam-hey/mteb into sam-hey/main

14d63db

Samoed reviewed Dec 8, 2024

mteb/models/colbert_models.py (outdated, resolved)

fix: pass is_query to pylate

1167517

fix: max_sim add pad_sequence

216d3f8

Samoed reviewed Dec 11, 2024

mteb/models/colbert_models.py (resolved)

sam-hey added 4 commits December 12, 2024 09:03

feat: integrate Jinja templates for ColBERTv2 and add model prompt handling

8b64f4c

feat: add revision & prompt_name

3f57c1c

doc: pad_sequence

4dc5f69

rm TODO jina colbert v2

e5ce5e0

sam-hey marked this pull request as ready for review December 12, 2024 08:21

sam-hey and others added 2 commits December 12, 2024 11:13

doc: warning: higher resource usage for MaxSim

67e5200

Merge branch 'embeddings-benchmark:main' into main

a3a126f

isaac-chung approved these changes Dec 13, 2024

Samoed merged commit fdfdaef into embeddings-benchmark:main Dec 14, 2024
10 checks passed

isaac-chung reviewed Dec 14, 2024

README.md (resolved)
