docs: add tiny model (#763)
* docs: add tiny model

* docs: add tiny model

* chore: update readme

* docs: add paper to pretrained models

* chore: add changelog

* chore: update readme
bwanglzu committed Jul 25, 2023
1 parent c387752 commit b232b3f
Showing 3 changed files with 50 additions and 5 deletions.
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -18,6 +18,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Docs

- Add tiny model and citation to Readme and docs. ([#763](https://github.com/jina-ai/finetuner/pull/763))

- Fix huggingface link of jina embeddings. ([#761](https://github.com/jina-ai/finetuner/pull/761))

- Remove redundant text in jina embedding page. ([#762](https://github.com/jina-ai/finetuner/pull/762))
25 changes: 25 additions & 0 deletions README.md
@@ -43,6 +43,15 @@ without worrying about resource availability, complex integration, or infrastructure

## [Documentation](https://finetuner.jina.ai/)

## Pretrained Text Embedding Models

| Name                   | Parameters | Dimension | Huggingface |
|------------------------|------------|-----------|-------------|
| jina-embedding-t-en-v1 | 14M        | 312       | [link](https://huggingface.co/jinaai/jina-embedding-t-en-v1) |
| jina-embedding-s-en-v1 | 35M        | 512       | [link](https://huggingface.co/jinaai/jina-embedding-s-en-v1) |
| jina-embedding-b-en-v1 | 110M       | 768       | [link](https://huggingface.co/jinaai/jina-embedding-b-en-v1) |
| jina-embedding-l-en-v1 | 330M       | 1024      | [link](https://huggingface.co/jinaai/jina-embedding-l-en-v1) |
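The name-to-dimension mapping in the table above can be sketched in code. This is a hedged illustration only: the `jinaai/<name>` model IDs follow the Huggingface links above, but compatibility with the `sentence-transformers` API is an assumption, so the download-requiring helper is defined without being called.

```python
# Output dimension of each model, taken from the table above.
MODEL_DIMS = {
    'jinaai/jina-embedding-t-en-v1': 312,
    'jinaai/jina-embedding-s-en-v1': 512,
    'jinaai/jina-embedding-b-en-v1': 768,
    'jinaai/jina-embedding-l-en-v1': 1024,
}


def encode_example(model_name: str = 'jinaai/jina-embedding-t-en-v1'):
    """Hypothetical usage sketch; assumes sentence-transformers compatibility
    and a network download, so it is not invoked in this snippet."""
    from sentence_transformers import SentenceTransformer  # pip install sentence-transformers
    model = SentenceTransformer(model_name)
    embeddings = model.encode(['How is the weather today?'])
    assert embeddings.shape[-1] == MODEL_DIMS[model_name]
    return embeddings


print(MODEL_DIMS['jinaai/jina-embedding-t-en-v1'])  # 312
```

Picking a model is then a trade-off between the parameter count (inference speed) and the output dimension (index size and downstream accuracy).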

## Benchmarks

<table>
@@ -172,6 +181,22 @@ Check out our published blog posts and tutorials to see Finetuner in action!

<!-- end finetuner-articles -->

<!-- start citations -->
If you find Jina Embeddings useful in your research, please cite the following paper:

```text
@misc{günther2023jina,
title={Jina Embeddings: A Novel Set of High-Performance Sentence Embedding Models},
author={Michael Günther and Louis Milliken and Jonathan Geuter and Georgios Mastrapas and Bo Wang and Han Xiao},
year={2023},
eprint={2307.11224},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```
<!-- end citations -->

<!-- start support-pitch -->
## Support

28 changes: 23 additions & 5 deletions docs/get-started/pretrained.md
@@ -6,6 +6,7 @@ we have introduced a suite of pre-trained text embedding models licensed under Apache 2.0
These models have a variety of use cases, including information retrieval, semantic textual similarity, text reranking, and more.
The suite consists of the following models:

- `jina-embedding-t-en-v1` [**[Huggingface](https://huggingface.co/jinaai/jina-embedding-t-en-v1)**]: Our fastest embedding model, with just 14 million parameters.
- `jina-embedding-s-en-v1` [**[Huggingface](https://huggingface.co/jinaai/jina-embedding-s-en-v1)**]: A compact model with 35 million parameters that performs very fast inference while delivering strong performance.
- `jina-embedding-b-en-v1` [**[Huggingface](https://huggingface.co/jinaai/jina-embedding-b-en-v1)**]: A model with 110 million parameters that performs fast inference and delivers better performance than our smaller models.
- `jina-embedding-l-en-v1` [**[Huggingface](https://huggingface.co/jinaai/jina-embedding-l-en-v1)**]: A relatively large model with 330 million parameters that runs inference on a single GPU and delivers the best performance of the suite.
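For the semantic textual similarity use case mentioned above, embeddings from any of these models are typically compared with cosine similarity. A minimal numpy sketch, using random stand-in vectors in place of real model outputs:

```python
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Dot product of the two vectors divided by the product of their norms.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Stand-ins for two 312-dimensional jina-embedding-t-en-v1 outputs.
rng = np.random.default_rng(0)
a = rng.normal(size=312)
b = rng.normal(size=312)

print(round(cosine_similarity(a, a), 6))  # identical vectors score 1.0
print(cosine_similarity(a, b))            # unrelated random vectors score near 0
```

The same score can rank documents against a query embedding for the information-retrieval and reranking use cases.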
@@ -36,12 +37,29 @@ Each Jina embedding model can encode up to 512 tokens,
with any further tokens being truncated.
The models have different output dimensionalities, as shown in the table below:

| Name                   | Parameters | Context | Dimension |
|------------------------|------------|---------|-----------|
| jina-embedding-t-en-v1 | 14M        | 512     | 312       |
| jina-embedding-s-en-v1 | 35M        | 512     | 512       |
| jina-embedding-b-en-v1 | 110M       | 512     | 768       |
| jina-embedding-l-en-v1 | 330M       | 512     | 1024      |
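The 512-token context window and truncation behaviour described above can be illustrated with a toy sketch. Plain numpy arrays stand in for token embeddings, and mean pooling stands in for the models' internal pooling, so this is schematic only, not the actual model code:

```python
import numpy as np

MAX_TOKENS = 512  # context length shared by all four models


def truncate_and_pool(token_embeddings: np.ndarray) -> np.ndarray:
    """Drop tokens beyond the context window, then mean-pool the rest
    into a single fixed-size sentence vector."""
    truncated = token_embeddings[:MAX_TOKENS]
    return truncated.mean(axis=0)


# 600 fake token embeddings at dimension 312 (jina-embedding-t-en-v1's size);
# the last 88 tokens fall outside the window and are ignored.
tokens = np.random.rand(600, 312)
sentence_vector = truncate_and_pool(tokens)
print(sentence_vector.shape)  # (312,)
```

Whatever the input length, the output dimension is fixed by the model, as listed in the table above.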

## Performance

Please refer to the [Huggingface](https://huggingface.co/jinaai/jina-embedding-s-en-v1) page.

## Citations

If you find Jina Embeddings useful in your research, please cite the following paper:

```text
@misc{günther2023jina,
title={Jina Embeddings: A Novel Set of High-Performance Sentence Embedding Models},
author={Michael Günther and Louis Milliken and Jonathan Geuter and Georgios Mastrapas and Bo Wang and Han Xiao},
year={2023},
eprint={2307.11224},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```
