MTEB: Massive Text Embedding Benchmark
Updated Nov 21, 2024 - Jupyter Notebook
A Heterogeneous Benchmark for Information Retrieval. Easy to use; evaluate your models across 15+ diverse IR datasets.
Generative Representational Instruction Tuning
Build and train state-of-the-art natural language processing models using BERT
Search with BERT vectors in Solr, Elasticsearch, OpenSearch and GSI APU
Rust port of sentence-transformers (https://github.com/UKPLab/sentence-transformers)
TextReducer - A Tool for Summarization and Information Extraction
Text similarity, semantic vectors, text vectors, text-similarity, similarity, sentence-similarity, BERT, SimCSE, BERT-Whitening, Sentence-BERT, PromCSE, SBERT
Using machine learning on your Anki collection to enhance scheduling via semantic clustering and semantic similarity
Building a model to recognize incentives for landscape restoration in environmental policies from Latin America, the US and India. Bringing NLP to the world of policy analysis through an extensible framework that includes scraping, preprocessing, active learning and text analysis pipelines.
Heterogeneous, Task- and Domain-Specific Benchmark for Unsupervised Sentence Embeddings used in the TSDAE paper: https://arxiv.org/abs/2104.06979.
Interactive tree-maps with SBERT & Hierarchical Clustering (HAC)
Backend code for GitHub Recommendation Extension
Run sentence-transformers (SBERT) compatible models in Node.js or browser.
emoji_finder
Classification pipeline based on sentence-transformers and Facebook's nearest-neighbor search library
Match celebrity users with their respective tweets using Semantic Textual Similarity on 2.5 million+ scraped tweets from 900+ celebrity users, built with SBERT, streamlit, tweepy and FastAPI