Merge pull request #327 from EricLBuehler/fix_chat_template_link

Improve chat templates docs

EricLBuehler authored May 18, 2024
2 parents ca9bf7d + 76b4a44 commit 7da5b71

Showing 4 changed files with 13 additions and 7 deletions.
8 changes: 7 additions & 1 deletion .github/workflows/ci.yml
@@ -111,4 +111,10 @@ jobs:
- name: Typos check with custom config file
uses: crate-ci/typos@master
with:
-config: .typos.toml
+config: .typos.toml
+
+# markdown-link-check:
+# runs-on: ubuntu-latest
+# steps:
+# - uses: actions/checkout@master
+# - uses: gaurav-nelson/github-action-markdown-link-check@v1
2 changes: 1 addition & 1 deletion README.md
@@ -218,7 +218,7 @@ To install mistral.rs, one should ensure they have Rust installed by following [

7) Installing Python support

-You can install Python support by following the guide [here](/mistralrs-pyo3/README.md).
+You can install Python support by following the guide [here](mistralrs-pyo3/README.md).

### Getting models from HF Hub

4 changes: 1 addition & 3 deletions docs/ADDING_MODELS.md
@@ -80,9 +80,7 @@ if q.rank() == 3 {
## 6) Implement a `Pipeline` and `Loader` in mistralrs-core/src/pipeline
The `Loader` is in charge of downloading and loading the model. The `download_model` method is pretty general and can be copied from an existing implementation

-The `_setup_model` method instantiates the `Pipeline`. It handles loading the different model kinds. The `Pipeline` is responsible for running and sampling the model. For example, please see the [`mistral pipeline`](mistralrs-core/src/pipeline/mistral.rs).
-
-A quantized model should be handled by the `quantized_llama.rs` implementation. For example the [`llama pipeline`](mistralrs-core/src/pipeline/llama.rs) loads GGUF and GGML model with `quantized_llama`.rs
+The `_setup_model` method instantiates the `Pipeline`. The `Pipeline` is responsible for running and sampling the model. For example, please see the [`normal model pipeline`](../mistralrs-core/src/pipeline/normal.rs).
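
To make this division of labour concrete, here is a minimal Rust sketch of the two roles; the trait names and signatures are hypothetical stand-ins for illustration, not the actual `mistralrs-core` API.

```rust
use std::error::Error;
use std::path::PathBuf;

/// Downloads model artifacts and turns them into a ready-to-run Pipeline.
/// (Illustrative only; the real Loader trait has more methods.)
pub trait Loader {
    /// Fetch weights, tokenizer, and config files (e.g. from the HF Hub)
    /// and return their local paths.
    fn download_model(&self, model_id: &str) -> Result<Vec<PathBuf>, Box<dyn Error>>;

    /// Instantiate the Pipeline from the downloaded files.
    fn _setup_model(&self, paths: &[PathBuf]) -> Result<Box<dyn Pipeline>, Box<dyn Error>>;
}

/// Runs the model and samples the next token.
pub trait Pipeline {
    /// One forward pass: token ids in, logits for the next position out.
    fn forward(&mut self, input_ids: &[u32]) -> Result<Vec<f32>, Box<dyn Error>>;

    /// Pick the next token id from the logits (greedy, top-k, etc.).
    fn sample(&self, logits: &[f32]) -> Result<u32, Box<dyn Error>>;
}
```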


## 7) Adding an X-LoRA counterpart
6 changes: 4 additions & 2 deletions docs/CHAT_TOK.md
@@ -1,7 +1,9 @@
# Chat templates and tokenizer customization

## Chat templates
-Mistral.rs attempts to automatically load a chat template from the `tokenizer_config.json` file. This enables high flexibility across instruction-tuned models and ensures accurate chat templating. However, if the `chat_template` field is missing, then a JINJA chat template should be provided. The JINJA chat template may use `messages`, `add_generation_prompt`, `bos_token`, `eos_token`, and `unk_token` as inputs. Some chat templates are provided [here](chat_templates), and it is easy to modify or create others.
+Mistral.rs attempts to automatically load a chat template from the `tokenizer_config.json` file. This enables high flexibility across instruction-tuned models and ensures accurate chat templating. However, if the `chat_template` field is missing, then a JINJA chat template should be provided. The JINJA chat template may use `messages`, `add_generation_prompt`, `bos_token`, `eos_token`, and `unk_token` as inputs.
+
+We provide some chat templates [here](../chat_templates/), and it is easy to modify or create others to customize chat template behavior.
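
As an illustration of those inputs, the sketch below renders a ChatML-style template with the `minijinja` crate; the template string, message values, and the use of `minijinja` itself are assumptions for demonstration, not necessarily how mistral.rs evaluates templates internally.

```rust
// Requires the `minijinja` crate as a dependency.
use minijinja::{context, Environment};

fn main() -> Result<(), minijinja::Error> {
    // A ChatML-style template; the templates shipped in chat_templates/ may differ.
    let template = "\
        {% for m in messages %}<|im_start|>{{ m.role }}\n{{ m.content }}<|im_end|>\n{% endfor %}\
        {% if add_generation_prompt %}<|im_start|>assistant\n{% endif %}";

    let mut env = Environment::new();
    env.add_template("chatml", template)?;

    // These are the variables a chat template may reference.
    let prompt = env.get_template("chatml")?.render(context! {
        messages => vec![
            context! { role => "user", content => "Hello!" },
        ],
        add_generation_prompt => true,
        bos_token => "<s>",
        eos_token => "</s>",
        unk_token => "<unk>",
    })?;
    println!("{prompt}");
    Ok(())
}
```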

For example, to use the `chatml` template, `--chat-template` is specified *before* the model architecture. For example:

@@ -11,7 +13,7 @@ For example, to use the `chatml` template, `--chat-template` is specified *befor

## Tokenizer

-Some models do not provide a `tokenizer.json` file although mistral.rs expects one. To solve this, please run [this](scripts/get_tokenizers_json.py) script. It will output the `tokenizer.json` file for your specific model. This may be used by passing the `--tokenizer-json` flag *after* the model architecture. For example:
+Some models do not provide a `tokenizer.json` file although mistral.rs expects one. To solve this, please run [this](../scripts/get_tokenizers_json.py) script. It will output the `tokenizer.json` file for your specific model. This may be used by passing the `--tokenizer-json` flag *after* the model architecture. For example:

```bash
$ python3 scripts/get_tokenizers_json.py
