Evaluating RepCONC on different datasets in a zero-shot fashion #1

Open
thakur-nandan opened this issue Nov 9, 2021 · 9 comments

Comments

@thakur-nandan

Hi @jingtaozhan,

Thanks for releasing this great repository and interesting paper. I'm interested in evaluating how well the model generalizes across datasets, for example those in the BEIR Benchmark (https://github.com/UKPLab/beir).

It would really help if sample code were available for evaluating an already-trained RepCONC model on a dataset from the BEIR Benchmark.

Thanks!

Kind Regards,
Nandan Thakur

@jingtaozhan
Owner

Hi Nandan,

Thanks for your interest in our work. I planned to update you on this repo after I released the training code :)

BEIR is a very fascinating benchmark, and it will be great to evaluate RepCONC on it. I will try testing RepCONC on one of the selected datasets (e.g., TREC-COVID) and then share the code. I will update here when the code is ready.

Best,
Jingtao

@thakur-nandan
Author

Hi @jingtaozhan,

Thank you so much! I look forward to the code when it's ready!

Kind Regards,
Nandan Thakur

@thakur-nandan
Author

Hi @jingtaozhan,

I understand that you are busy with other, higher-priority work.
Would it be possible to give an approximate timeline for this?

Thanks!

Kind Regards,
Nandan Thakur

@jingtaozhan
Owner

Hi @NThakur20

I'm currently working on it and it is almost done. The repo will be updated today. RepCONC will utilize the JPQ package to perform zero-shot retrieval for BEIR. Here is how I evaluate JPQ on BEIR.

I wrote the code by following the model examples in the BEIR repo, so I think both JPQ and RepCONC can be added to the BEIR examples. What do you think?

Best,
Jingtao

@thakur-nandan
Author

Awesome, thank you so much @jingtaozhan!

Yes, I believe both will be interesting to evaluate. I'm currently working on a paper in which we evaluate several memory compression strategies. I can add these examples to the BEIR examples folder as well :)

Kind Regards,
Nandan Thakur

@jingtaozhan
Owner

The code is released now. Happy to help if you have any other questions.

I do think it's very meaningful to include JPQ and RepCONC in the BEIR examples. Compared with many existing DR models, they follow a different paradigm: joint optimization with a compact index. It wouldn't be hard to add them since the code is ready. I can open a PR if you think that is OK.

Best,
Jingtao

@thakur-nandan
Author

Thanks, @jingtaozhan, for providing the scores and the script so soon.

I think it will definitely be interesting to add the models to the repository. I would be happy if you could open a PR in the repository.

Thanks,
Nandan Thakur

@jingtaozhan
Owner

Hi @NThakur20

I updated the code so that it is now very easy to apply RepCONC to different dense retrieval models (Pull Request). I thought you might be interested.

Best,
Jingtao

@thakur-nandan
Author

Hi @jingtaozhan,

Thank you for the PR, and apologies for the delay. I will have a look.

Meanwhile, in a recent preprint of ours, we found that our work on GPL (https://aclanthology.org/2022.naacl-main.168/) helps improve zero-shot JPQ performance. We experimented with JPQ on all BEIR datasets (with TAS-B as the backbone instead of STAR). GPL, which involves cross-encoder distillation with the MarginMSE loss function, improved the JPQ models across all BEIR datasets and even outperformed the original uncompressed TAS-B model. For more details, you can have a look below. We would love to get some feedback :)

Paper: https://arxiv.org/abs/2205.11498
Code: https://github.com/thakur-nandan/income
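
For readers unfamiliar with MarginMSE, here is a minimal sketch of the loss in plain Python. It is an illustration only, not the GPL or JPQ implementation: the student model is pushed to reproduce the cross-encoder teacher's score margin between a positive and a negative passage for each query. The function name `margin_mse` and the example scores are hypothetical.

```python
from typing import Sequence


def margin_mse(
    student_pos: Sequence[float],
    student_neg: Sequence[float],
    teacher_pos: Sequence[float],
    teacher_neg: Sequence[float],
) -> float:
    """Mean squared error between student and teacher score margins.

    Each sequence holds one score per (query, positive, negative) triple.
    The margin is score(query, positive) - score(query, negative).
    """
    n = len(student_pos)
    if n == 0 or not (n == len(student_neg) == len(teacher_pos) == len(teacher_neg)):
        raise ValueError("all score lists must be non-empty and of equal length")

    total = 0.0
    for sp, sn, tp, tn in zip(student_pos, student_neg, teacher_pos, teacher_neg):
        # Difference between the student's margin and the teacher's margin.
        diff = (sp - sn) - (tp - tn)
        total += diff * diff
    return total / n


# Hypothetical scores for two triples: the student's margin is off by 1.0
# on the first triple and matches the teacher exactly on the second.
loss = margin_mse([2.0, 0.5], [1.0, 0.0], [3.0, 0.5], [1.0, 0.0])
print(loss)  # 0.5
```

Because only the margins are compared, the student's scores do not need to be on the same absolute scale as the teacher's.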

Kind Regards,
Nandan Thakur
