From 540a42e4da1a497b5f8f5d0b6e178e7e6e9c8a6a Mon Sep 17 00:00:00 2001
From: Louie Tsai
Date: Fri, 20 Oct 2023 02:40:10 -0700
Subject: [PATCH] Add docker setup session for neuralchat finetuning sample (#496)

* Update README.md to new added docker setup session

Signed-off-by: Louie Tsai
---
 .../examples/instruction_tuning/README.md | 25 ++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/intel_extension_for_transformers/neural_chat/examples/instruction_tuning/README.md b/intel_extension_for_transformers/neural_chat/examples/instruction_tuning/README.md
index 200c39843e5..6816dfa3cf6 100644
--- a/intel_extension_for_transformers/neural_chat/examples/instruction_tuning/README.md
+++ b/intel_extension_for_transformers/neural_chat/examples/instruction_tuning/README.md
@@ -14,12 +14,35 @@ This example demonstrates how to finetune the pretrained large language model (L
 # Prerequisite​
 
 ## 1. Environment​
+### Bare Metal
 Recommend python 3.9 or higher version.
 ```shell
 pip install -r requirements.txt
 # To use ccl as the distributed backend in distributed training on CPU requires to install below requirement.
 python -m pip install oneccl_bind_pt==2.1.0 -f https://developer.intel.com/ipex-whl-stable-cpu
 ```
+### Docker
+Pick either one of below options to setup docker environment.
+#### Option 1 : Build Docker image from scratch
+Please refer to this section : [How to build docker images for NeuralChat FineTuning](https://github.com/intel/intel-extension-for-transformers/tree/main/intel_extension_for_transformers/neural_chat/docker/finetuning#4-build-docker-image) to build docker image from scratch.
+Once you have the docker image ready, please follow [run docker image](https://github.com/intel/intel-extension-for-transformers/tree/main/intel_extension_for_transformers/neural_chat/docker/finetuning#5-create-docker-container) session to launch a docker instance from the image.
+
+#### Option 2: Pull existing Docker image
+Please follow the session [itrex docker setup](https://github.com/intel/intel-extension-for-transformers/tree/main/docker#set-up-docker-image) and use the docker pull command to pull itrex docker image.
+Once you have itrex docker image, follow below section to update itrex docker instance for this finetuning example.
+```shell
+wget https://raw.githubusercontent.com/oneapi-src/oneAPI-samples/master/AI-and-Analytics/Getting-Started-Samples/IntelAIKitContainer_GettingStarted/run_oneapi_docker.sh
+cp requirement.txt /tmp
+# change intel/ai-tools:itrex-0.1.1 according to itrex docker setup session
+./run_oneapi_docker.sh intel/ai-tools:itrex-0.1.1
+# don't need ipex in this sample
+pip uninstall intel_extension_for_pytorch
+# update ITREX pip package in the docker instance
+pip install intel-extension-for-transformers --upgrade
+```
+After those instructions, you should be able to run below steps inside the docker instance.
+
+
 
 ## 2. Prepare the Model
 
@@ -480,4 +503,4 @@ For finetuning on SPR, add `--bf16` argument will speedup the finetuning process
 You could also indicate `--peft` to switch peft method in P-tuning, Prefix tuning, Prompt tuning, LLama Adapter, LoRA,
 see https://github.com/huggingface/peft. Note for MPT, only LoRA is supported.
 
-Add option **"--use_fast_tokenizer False"** when using latest transformers if you met failure in llama fast tokenizer for llama, The `tokenizer_class` in `tokenizer_config.json` should be changed from `LLaMATokenizer` to `LlamaTokenizer`.
\ No newline at end of file
+Add option **"--use_fast_tokenizer False"** when using latest transformers if you met failure in llama fast tokenizer for llama, The `tokenizer_class` in `tokenizer_config.json` should be changed from `LLaMATokenizer` to `LlamaTokenizer`.