Add MLCube implementation for llama2 #749

Open: davidjurado wants to merge 5 commits into base: master

@davidjurado (Contributor) commented Jun 14, 2024

MLCube for Llama 2

MLCube™ GitHub repository. MLCube™ wiki.

Project setup

Docker must be installed before you begin.
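A quick preflight check can confirm that Docker is available before any MLCube task is attempted. The sketch below is only an illustration: `require_cmds` is a hypothetical helper, not part of MLCube, and it relies only on the standard `command -v` builtin and the `docker info` subcommand.

```shell
# Hypothetical helper: return 0 only if every named command is found on PATH.
require_cmds() {
  rc=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing required command: $cmd" >&2
      rc=1
    fi
  done
  return "$rc"
}

if require_cmds docker; then
  # `docker info` fails when the Docker daemon is not reachable.
  docker info >/dev/null 2>&1 || echo "Docker is installed but the daemon is not reachable." >&2
else
  echo "Install Docker before running any MLCube task." >&2
fi
```

The same helper can be reused later, for example `require_cmds rclone`, once the extra requirements are installed.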

# Create Python environment and install MLCube Docker runner 
virtualenv -p python3 ./env && source ./env/bin/activate && pip install pip==24.0
pip install mlcube-docker
# Fetch the implementation from GitHub
git clone https://github.com/mlcommons/training && cd ./training
git fetch origin pull/749/head:feature/mlcube_llama2 && git checkout feature/mlcube_llama2
cd ./llama2_70b_lora/mlcube

Inside the mlcube directory, run the following command to list the implemented tasks.

mlcube describe

Extra requirements

Install Rclone on your system by following these instructions.

MLCommons hosts the model for download by MLCommons Members only. You must first agree to the confidentiality notice.

After completing the form, you will be redirected to a Drive folder containing a file called CLI Download Instructions. Follow the instructions in that file up to step #3, Authenticate Rclone with Google Drive.

After this step, the Rclone configuration file will contain the data needed to download the dataset and models. To find where this file is located, run:

rclone config file

The default location is ~/.config/rclone/rclone.conf

Finally, copy that file into the workspace folder located in the same directory as this README. The copy must be named rclone.conf.
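The copy step can be scripted. This is a minimal sketch under two assumptions: that `rclone config file` prints the configuration path on the last line of its output, and that the workspace folder is `./workspace` relative to the mlcube directory. Both helper names are hypothetical.

```shell
# Hypothetical helper: print the path reported by `rclone config file`
# (assumed to be the last line of its output).
rclone_conf_path() {
  rclone config file | tail -n 1
}

# Hypothetical helper: copy a config file into a workspace directory
# under the required name rclone.conf.
install_rclone_conf() {
  src=$1
  workspace=$2
  [ -f "$src" ] || { echo "config not found: $src" >&2; return 1; }
  mkdir -p "$workspace" && cp "$src" "$workspace/rclone.conf"
}

# Usage, from the mlcube directory (workspace location is an assumption):
#   install_rclone_conf "$(rclone_conf_path)" ./workspace
```

Because the file holds Google Drive credentials, it is worth keeping the copy out of version control.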

MLCube tasks

  • Core tasks:

Download dataset.

mlcube run --task=download_data -Pdocker.build_strategy=always

Train.

mlcube run --task=train -Pdocker.build_strategy=always
  • Demo tasks:

Here is a video explaining the demo steps:

[Video: demo walkthrough]

Download demo dataset.

mlcube run --task=download_demo -Pdocker.build_strategy=always

Train demo.

mlcube run --task=demo -Pdocker.build_strategy=always

Execute the complete pipeline

You can execute the complete pipeline with a single command.

  • Core pipeline:
mlcube run --task=download_data,train -Pdocker.build_strategy=always
  • Demo pipeline:
mlcube run --task=download_demo,demo -Pdocker.build_strategy=always
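The two pipeline commands above differ only in their task list, so a small wrapper can select between them. The sketch below is illustrative: `pipeline_tasks` and `run_pipeline` are hypothetical names, and the check for `./workspace/rclone.conf` assumes that both pipelines need the Rclone configuration described under Extra requirements.

```shell
# Hypothetical helper: map a pipeline name to the comma-separated task list
# used in the commands above.
pipeline_tasks() {
  case "$1" in
    core) echo "download_data,train" ;;
    demo) echo "download_demo,demo" ;;
    *) echo "unknown pipeline: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: run the chosen pipeline, refusing to start if
# workspace/rclone.conf is absent (assumed to be required by both pipelines).
run_pipeline() {
  tasks=$(pipeline_tasks "$1") || return 1
  [ -f ./workspace/rclone.conf ] || { echo "workspace/rclone.conf not found" >&2; return 1; }
  mlcube run --task="$tasks" -Pdocker.build_strategy=always
}

# Usage, from the mlcube directory:
#   run_pipeline demo
```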

@davidjurado davidjurado requested a review from a team as a code owner June 14, 2024 16:27

github-actions bot commented Jun 14, 2024

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅
