Merge pull request #327 from EricLBuehler/fix_chat_template_link

Improve chat templates docs

EricLBuehler authored May 18, 2024
2 parents ca9bf7d + 76b4a44 commit 7da5b71

Showing 4 changed files with 13 additions and 7 deletions.
8 changes: 7 additions & 1 deletion .github/workflows/ci.yml
@@ -111,4 +111,10 @@ jobs:
- name: Typos check with custom config file
uses: crate-ci/typos@master
with:
-config: .typos.toml
+config: .typos.toml
+
+# markdown-link-check:
+# runs-on: ubuntu-latest
+# steps:
+# - uses: actions/checkout@master
+# - uses: gaurav-nelson/github-action-markdown-link-check@v1
2 changes: 1 addition & 1 deletion README.md
@@ -218,7 +218,7 @@ To install mistral.rs, one should ensure they have Rust installed by following [

7) Installing Python support

-You can install Python support by following the guide [here](/mistralrs-pyo3/README.md).
+You can install Python support by following the guide [here](mistralrs-pyo3/README.md).

### Getting models from HF Hub

4 changes: 1 addition & 3 deletions docs/ADDING_MODELS.md
@@ -80,9 +80,7 @@ if q.rank() == 3 {
## 6) Implement a `Pipeline` and `Loader` in mistralrs-core/src/pipeline
The `Loader` is in charge of downloading and loading the model. The `download_model` method is pretty general and can be copied from an existing implementation

-The `_setup_model` method instantiates the `Pipeline`. It handles loading the different model kinds. The `Pipeline` is responsible for running and sampling the model. For example, please see the [`mistral pipeline`](mistralrs-core/src/pipeline/mistral.rs).
-
-A quantized model should be handled by the `quantized_llama.rs` implementation. For example the [`llama pipeline`](mistralrs-core/src/pipeline/llama.rs) loads GGUF and GGML model with `quantized_llama`.rs
+The `_setup_model` method instantiates the `Pipeline`. The `Pipeline` is responsible for running and sampling the model. For example, please see the [`normal model pipeline`](../mistralrs-core/src/pipeline/normal.rs).
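
To make this division of labour concrete, here is a minimal Rust sketch of the two roles; the trait names and signatures are hypothetical stand-ins for illustration, not the actual `mistralrs-core` API.

```rust
use std::error::Error;
use std::path::PathBuf;

/// Downloads model artifacts and turns them into a ready-to-run Pipeline.
/// (Illustrative only; the real Loader trait has more methods.)
pub trait Loader {
    /// Fetch weights, tokenizer, and config files (e.g. from the HF Hub)
    /// and return their local paths.
    fn download_model(&self, model_id: &str) -> Result<Vec<PathBuf>, Box<dyn Error>>;

    /// Instantiate the Pipeline from the downloaded files.
    fn _setup_model(&self, paths: &[PathBuf]) -> Result<Box<dyn Pipeline>, Box<dyn Error>>;
}

/// Runs the model and samples the next token.
pub trait Pipeline {
    /// One forward pass: token ids in, logits for the next position out.
    fn forward(&mut self, input_ids: &[u32]) -> Result<Vec<f32>, Box<dyn Error>>;

    /// Pick the next token id from the logits (greedy, top-k, etc.).
    fn sample(&self, logits: &[f32]) -> Result<u32, Box<dyn Error>>;
}
```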


## 7) Adding an X-LoRA counterpart
6 changes: 4 additions & 2 deletions docs/CHAT_TOK.md
@@ -1,7 +1,9 @@
# Chat templates and tokenizer customization

## Chat templates
-Mistral.rs attempts to automatically load a chat template from the `tokenizer_config.json` file. This enables high flexibility across instruction-tuned models and ensures accurate chat templating. However, if the `chat_template` field is missing, then a JINJA chat template should be provided. The JINJA chat template may use `messages`, `add_generation_prompt`, `bos_token`, `eos_token`, and `unk_token` as inputs. Some chat templates are provided [here](chat_templates), and it is easy to modify or create others.
+Mistral.rs attempts to automatically load a chat template from the `tokenizer_config.json` file. This enables high flexibility across instruction-tuned models and ensures accurate chat templating. However, if the `chat_template` field is missing, then a JINJA chat template should be provided. The JINJA chat template may use `messages`, `add_generation_prompt`, `bos_token`, `eos_token`, and `unk_token` as inputs.
+
+We provide some chat templates [here](../chat_templates/), and it is easy to modify or create others to customize chat template behavior.
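
As an illustration of those inputs, the sketch below renders a ChatML-style template with the `minijinja` crate; the template string, message values, and the use of `minijinja` itself are assumptions for demonstration, not necessarily how mistral.rs evaluates templates internally.

```rust
// Requires the `minijinja` crate as a dependency.
use minijinja::{context, Environment};

fn main() -> Result<(), minijinja::Error> {
    // A ChatML-style template; the templates shipped in chat_templates/ may differ.
    let template = "\
        {% for m in messages %}<|im_start|>{{ m.role }}\n{{ m.content }}<|im_end|>\n{% endfor %}\
        {% if add_generation_prompt %}<|im_start|>assistant\n{% endif %}";

    let mut env = Environment::new();
    env.add_template("chatml", template)?;

    // These are the variables a chat template may reference.
    let prompt = env.get_template("chatml")?.render(context! {
        messages => vec![
            context! { role => "user", content => "Hello!" },
        ],
        add_generation_prompt => true,
        bos_token => "<s>",
        eos_token => "</s>",
        unk_token => "<unk>",
    })?;
    println!("{prompt}");
    Ok(())
}
```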

For example, to use the `chatml` template, `--chat-template` is specified *before* the model architecture. For example:

@@ -11,7 +13,7 @@ For example, to use the `chatml` template, `--chat-template` is specified *befor

## Tokenizer

-Some models do not provide a `tokenizer.json` file although mistral.rs expects one. To solve this, please run [this](scripts/get_tokenizers_json.py) script. It will output the `tokenizer.json` file for your specific model. This may be used by passing the `--tokenizer-json` flag *after* the model architecture. For example:
+Some models do not provide a `tokenizer.json` file although mistral.rs expects one. To solve this, please run [this](../scripts/get_tokenizers_json.py) script. It will output the `tokenizer.json` file for your specific model. This may be used by passing the `--tokenizer-json` flag *after* the model architecture. For example:

```bash
$ python3 scripts/get_tokenizers_json.py
