model_conversion_en

ymcui edited this page Jan 28, 2024 · 2 revisions

Model Conversion

This section describes how to manually merge the LoRA weights with the original Mixtral-8x7B-v0.1 model to obtain a complete model. If you have sufficient network bandwidth, it is recommended to download the complete model directly instead.

Preparation

  1. Before running, ensure that the latest version of the repository is pulled: git pull
  2. Ensure that your machine has enough memory to load the full model for the model merge operation
  3. Install dependencies (requirements.txt in the root of this project):
$ pip install -r requirements.txt
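To gauge whether your machine has enough memory (point 2 above), a rough back-of-envelope estimate is parameter count times bytes per parameter. The sketch below assumes the commonly cited total of roughly 46.7B parameters for Mixtral-8x7B (this figure is an assumption, not stated on this page):

```python
# Rough memory estimate for holding the full model weights in RAM.
# 46.7e9 is the approximate total parameter count of Mixtral-8x7B
# (assumption, not taken from this page).
def estimate_load_gib(n_params: float, bytes_per_param: int) -> float:
    """Approximate GiB needed just to hold the weights, ignoring overhead."""
    return n_params * bytes_per_param / 2**30

fp16_gib = estimate_load_gib(46.7e9, 2)  # half precision, 2 bytes/param
print(f"fp16 weights alone: ~{fp16_gib:.0f} GiB")
```

Actual peak usage during a merge is higher, since activations, the LoRA weights, and serialization buffers add overhead on top of this lower bound.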

Step 1: Obtain the Original Mixtral-8x7B-v0.1 Model

Original Mixtral-8x7B-v0.1: https://huggingface.co/mistralai/Mixtral-8x7B-v0.1

Related files (you only need to download the safetensors format model weights):

config.json
generation_config.json
model-00001-of-00019.safetensors
model-00002-of-00019.safetensors
...
model-00019-of-00019.safetensors
model.safetensors.index.json
special_tokens_map.json
tokenizer_config.json
tokenizer.json
tokenizer.model
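Before merging, it is worth verifying that all 19 safetensors shards downloaded completely. A small sketch (a hypothetical helper, not part of this project's scripts) can cross-check the files on disk against `model.safetensors.index.json`, whose `weight_map` maps each tensor name to the shard file that contains it:

```python
import json
from pathlib import Path

def missing_shards(model_dir: str) -> list[str]:
    """Return safetensors shard files listed in the index but absent on disk."""
    root = Path(model_dir)
    index = json.loads((root / "model.safetensors.index.json").read_text())
    # weight_map: {tensor_name: shard_filename}; deduplicate shard names.
    expected = sorted(set(index["weight_map"].values()))
    return [name for name in expected if not (root / name).exists()]

# Example: missing_shards("path_to_original_mixtral_dir") should return []
# when the download is complete.
```

This only checks file presence, not integrity; comparing file sizes or checksums against the Hub listing would be a stricter test.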

Step 2: Merge LoRA Weights to Generate Full Model Weights

This step merges LoRA weights to generate full model weights (safetensors format). Execute the following command:

$ python scripts/merge_mixtral_with_chinese_lora_low_mem.py \
    --base_model path_to_original_mixtral_dir \
    --lora_model path_to_chinese_mixtral_lora \
    --output_dir path_to_output_dir 

Parameter Description:

  • --base_model: The directory where the original Mixtral-8x7B-v0.1 model weights and configuration files are stored
  • --lora_model: Directory where the decompressed files of Chinese Mixtral/Mixtral-Instruct LoRA are located, or the model call name on the HuggingFace Model Hub (will be downloaded automatically)
  • --output_dir: Directory in which to save the full model weights (default: ./)
  • (Optional) --verbose: Display detailed information during the merge process
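Conceptually, the merge applies the standard LoRA update to each adapted weight matrix: W' = W + (alpha / r) · B·A, where A is r×n, B is m×r, and r is the LoRA rank. The sketch below illustrates this arithmetic with plain Python lists; it is a toy illustration of the math, not the project's merge script (which streams the real safetensors shards to keep memory low):

```python
# Toy illustration of a LoRA merge on one weight matrix:
#   W' = W + (alpha / r) * B @ A
# W: m x n base weight, A: r x n, B: m x r, alpha: LoRA scaling numerator.
def merge_lora(W, A, B, alpha):
    r = len(A)                      # LoRA rank = number of rows of A
    scale = alpha / r
    m, n = len(W), len(W[0])
    # Low-rank product B @ A, computed naively for clarity.
    BA = [[sum(B[i][k] * A[k][j] for k in range(r)) for j in range(n)]
          for i in range(m)]
    return [[W[i][j] + scale * BA[i][j] for j in range(n)] for i in range(m)]
```

With alpha equal to r the update is added at unit scale, which is why the merged model needs no extra configuration at inference time: the adapter has been folded into the base weights.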