
Update to v0.2: a more flexible framework for compressing the index of any dense retrieval model #6

Merged (3 commits, Jun 2, 2022)

Conversation

jingtaozhan
Owner

This is a major code update; the previous code is deprecated. Key features:

  • Flexible code framework. The previous code required a separate preprocessing step, which has been removed: tokenization now happens during training and inference. JPQ and RepCONC no longer depend on a specific dense retrieval model structure; they are two training procedures that take a dense retrieval model as input, so dense models of different architectures can all be used with JPQ and RepCONC.
  • Distributed training support for RepCONC.
  • Large-batch training for RepCONC via GradCache.
  • Examples of converting dense retrieval models into memory-efficient ones have been added, with more to come.
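The repository's actual training API isn't shown in this PR description, but the core compression idea behind JPQ and RepCONC, replacing each dense embedding with a few small product-quantization codes, can be sketched in plain NumPy. All function names and hyperparameters below are illustrative, not the repo's API, and the real methods train the codebooks jointly with the retriever rather than post hoc:

```python
import numpy as np

def train_pq(embeddings, num_subvectors=4, num_centroids=16, iters=10, seed=0):
    # Split each embedding into subvectors and learn a small k-means
    # codebook per subvector (post-hoc PQ; JPQ/RepCONC instead learn
    # the codebooks jointly with the dense retrieval model).
    rng = np.random.default_rng(seed)
    n, d = embeddings.shape
    sub_dim = d // num_subvectors
    codebooks = []
    for m in range(num_subvectors):
        sub = embeddings[:, m * sub_dim:(m + 1) * sub_dim]
        # Initialize centroids from random data points, then run Lloyd's steps.
        centroids = sub[rng.choice(n, num_centroids, replace=False)]
        for _ in range(iters):
            dists = ((sub[:, None, :] - centroids[None]) ** 2).sum(-1)
            assign = dists.argmin(1)
            for c in range(num_centroids):
                mask = assign == c
                if mask.any():
                    centroids[c] = sub[mask].mean(0)
        codebooks.append(centroids)
    return np.stack(codebooks)  # shape: (num_subvectors, num_centroids, sub_dim)

def encode(embeddings, codebooks):
    # Replace each subvector with the id of its nearest centroid, so
    # d floats shrink to num_subvectors one-byte codes per document.
    num_subvectors, num_centroids, sub_dim = codebooks.shape
    codes = np.empty((embeddings.shape[0], num_subvectors), dtype=np.uint8)
    for m in range(num_subvectors):
        sub = embeddings[:, m * sub_dim:(m + 1) * sub_dim]
        dists = ((sub[:, None, :] - codebooks[m][None]) ** 2).sum(-1)
        codes[:, m] = dists.argmin(1)
    return codes

def decode(codes, codebooks):
    # Reconstruct approximate embeddings by concatenating the looked-up centroids.
    return np.concatenate(
        [codebooks[m][codes[:, m]] for m in range(codebooks.shape[0])], axis=1
    )
```

For example, compressing 256 hypothetical 32-dim document embeddings with 4 subvectors and 16 centroids stores 4 bytes per document instead of 128, at the cost of some reconstruction error; the "model-agnostic" point of this PR is that the embeddings can come from any dense encoder.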
