Helibrunna

A Hugging Face compatible Small Language Models trainer by Dr. Tristan Behrens

Helibrunna is an open-source framework for training and experimenting with Small Language Models. Developed by Dr. Tristan Behrens, it aims to explore the potential of small models. It is tailored for use with datasets hosted on Hugging Face, which makes it a versatile tool for developers and researchers in the AI community.

In addition to the xLSTM models, the project also supports:

  • Pharia: An experimental architecture aimed at pushing the boundaries of efficiency in sequential tasks.
  • Vanilla Transformer: The original transformer architecture, known for its multi-head self-attention mechanism.
  • Mini Llama: A lightweight version of Llama, optimized for smaller devices and faster inference without sacrificing too much performance.
  • Llama Three: The latest iteration in the Llama family, bringing enhanced capabilities and performance improvements over its predecessors.

This expanded functionality makes Helibrunna a versatile tool not just for xLSTM exploration, but also for working with a variety of other state-of-the-art models, providing developers and researchers with a comprehensive suite of tools for AI experimentation.

License Overview

This project is licensed under the GNU Affero General Public License (AGPL) version 3. We have chosen this license to maintain consistency with the xLSTM project, which is also licensed under the AGPL.

The AGPL is specifically designed to ensure that any modifications to the code, especially when deployed in a networked environment, are shared with the community. This aligns with the principles of the xLSTM project, promoting open collaboration and ensuring that improvements remain freely accessible to all users.

For more details on the license, please see the LICENSE file.

Acknowledgements

This repository is dedicated to my second hometown Heilbronn, which has become one of the most creative AI hubs in Germany.

This work is sponsored by KI Salon, which is a strong supporter of open-source AI.

We have built the functionality on top of the official xLSTM project.

Shoutout to experimenta, Bildungscampus, 42 coding school, IPAI, STACKIT and Dieter Schwarz Stiftung, which, among others, make Heilbronn a high-tech place.

Get in touch and get involved

Do not hesitate to report any issues that you might find here. And please connect on LinkedIn. We are happy about everyone who says "hello".

If you want to contribute, please fork the repository and send pull requests. Looking forward! Since this is an open-source project we are most eager for you to participate.

And if you want to join as a developer, let us know!

How to use, how to credit, and README

We would be very happy if you would go wild and train xLSTMs with Helibrunna. If you publish your work, please be so kind and give credit to Helibrunna and link the project.

You can use this banner:

Trained with Helibrunna

And this is an example of how to credit:

Trained with Helibrunna

Note that every time you train, a template README (also known as a model card) file is generated. You can and should edit it before uploading your models anywhere. Here is an example: musicxlstm on Hugging Face.

And please, if you have published anything, let us know. We would love to promote your work.

Features

Note that, as of now, this implementation is quite basic. Our goal is to accelerate the adoption of xLSTM and to find out whether it is superior to self-attention based transformers, and if so, by how much. This goal requires thorough experimentation.

In other words: This repo is currently in an early stage, and thus we cannot guarantee that it works.

These features are currently implemented:

  • Training xLSTM with datasets hosted on Hugging Face.
  • Support for Weights & Biases.
  • Distributed training with Accelerate (untested).
  • Basic model inference with temperature sampling.
  • Uploading to Hugging Face.
  • Downloading from Hugging Face.
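To illustrate what temperature sampling means in the feature list above, here is a minimal sketch in plain Python. It is not the code used by Helibrunna; the function names softmax_with_temperature and sample_with_temperature are made up for this example, and the logits are just a list of floats standing in for a model's output.

```python
import math
import random


def softmax_with_temperature(logits, temperature=0.5):
    """Turn raw logits into probabilities after temperature scaling."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Dividing by the temperature sharpens the distribution for values
    # below 1.0 and flattens it for values above 1.0.
    scaled = [value / temperature for value in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    weights = [math.exp(value - peak) for value in scaled]
    total = sum(weights)
    return [weight / total for weight in weights]


def sample_with_temperature(logits, temperature=0.5, rng=None):
    """Draw one token index according to the temperature-scaled probabilities."""
    rng = rng or random.Random()
    probabilities = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probabilities, k=1)[0]


# Hypothetical logits for a vocabulary of three tokens.
example_logits = [1.0, 2.0, 3.0]
print(softmax_with_temperature(example_logits, temperature=0.5))
print(sample_with_temperature(example_logits, temperature=0.5))
```

A lower temperature makes the most likely token even more likely, which is why values such as 0.5 tend to give more conservative generations.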

These features are planned or would be great to work on:

  • Exposing the model as an OpenAI compatible API.
  • Support for TensorBoard.
  • Fine-tuning.
  • Fine-tuning with LoRA adapters.
  • Quantization.
  • Training on a GPT2-size dataset, such as openwebtext.
  • More sophisticated sampling.
  • Porting to MLX.

Setting up things

So far, when it comes to compatibility, we have these configurations:

Apple (no silicon) Apple (silicon) Unix (NVIDIA) Unix (no NVIDIA) Raspberry Pi
xLSTM 🧐 🧐
Mamba 🧐 🧐
Pharia 🤞 🧐
Transformer 🤞 🤞
xLSTM ONNX 🧐 🧐
Mamba ONNX 🧐 🧐
Pharia ONNX 🧐 🧐
Transformer ONNX 🤞 🤞

✅ = tested and working, ❌ = tested and not working, ❔ = not tested, 🤞 = not tested but very likely, 🧐 = not tested but very unlikely

Note that ONNX support is rather rudimentary.

Unix (NVIDIA)

First, be so kind and install xLSTM following the instructions here: https://github.com/NX-AI/xlstm

This should be a walk in the park. Do not skip the step with the conda environment and please make sure this environment is active.

Then, please install additional dependencies using requirements.txt:

conda activate xlstm
pip install -r requirements.txt

Then you should be ready to go!

Apple (silicon)

Support is not fully implemented yet. We are working on it. Currently we believe that you should only use this platform for inference and not for training.

It is advised to create a new conda environment and install the dependencies from requirements-mac.txt:

conda create -n "helibrunna" python=3.10.13
conda activate helibrunna
pip install -r requirements-mac.txt

Raspberry Pi

Support is not fully implemented yet. We are working on it. Currently we believe that you should only use this platform for inference and not for training.

It is advised to create a new conda environment and install the dependencies from requirements-raspberry.txt:

sudo apt-get install cmake
conda create -n "helibrunna" python=3.12.4
conda activate helibrunna
pip install -r requirements-raspberry.txt

Note: Installing the dependencies might take a while, because ONNX is compiled from source using cmake.

Dataset Preprocessing

Usually the dataset preprocessing happens very early when you start a training run. Some datasets might require you to preprocess in a separate step. This is how you can do it:

python train.py preprocess configs/musicxlstm.yaml

Training xLSTM and other models.

Here, we will collect a few examples. Make sure that the conda environment is active.

Note: Training is so far only tested on Unix (NVIDIA).

Training a music-xLSTM on Johann Sebastian Bach chorales

We have included a config file that will train xLSTM on symbolic music. You can run it like this:

python train.py configs/musicxlstm.yaml

Training an xLSTM on the written works by H.P. Lovecraft

There is also another config file that upcycles the GPT2 tokenizer and trains an xLSTM on the Lovecraft corpus:

python train.py configs/lovecraft.yaml

Training with Accelerate

accelerate config
accelerate launch train.py configs/musicxlstm.yaml

Train on openwebtext

This one is a little more complex. Because of the dataset size, it is advised to preprocess it first. Here the model config is standalone and must be passed separately as a second config file.

This is preprocessing with the separate model config:

python train.py preprocess configs/openwebtext.yaml configs/xlstm_7to1_01_125M.yaml

And this is how to train:

accelerate launch train.py configs/openwebtext.yaml configs/xlstm_7to1_01_125M.yaml

Running inference

This is how you can run inference with a trained model:

python generate.py --model_path_or_repo MODEL_PATH_OR_REPO --temperature 0.5 --max_length 100 --prompt "PROMPT"

Set MODEL_PATH_OR_REPO and PROMPT properly. MODEL_PATH_OR_REPO is usually a directory that starts with run_, or an xLSTM model that lives on Hugging Face.

Here is an example that will download and run musicxlstm.

python generate.py --model_path_or_repo TristanBehrens/musicxlstm --temperature 0.5 --max_length 100 --prompt "PIECE_START"
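If you want to call generate.py from your own scripts, for example to try several prompts, the command line can be assembled in Python. The following is only a sketch: the helper build_generate_command is hypothetical and not part of Helibrunna, and it uses nothing but the flags shown above.

```python
import shlex
import sys


def build_generate_command(model_path_or_repo, prompt, temperature=0.5, max_length=100):
    """Assemble the argument list for generate.py (hypothetical helper)."""
    return [
        sys.executable,  # the Python interpreter of the active environment
        "generate.py",
        "--model_path_or_repo", model_path_or_repo,
        "--temperature", str(temperature),
        "--max_length", str(max_length),
        "--prompt", prompt,
    ]


# Print the command for the musicxlstm example instead of executing it.
command = build_generate_command("TristanBehrens/musicxlstm", "PIECE_START")
print(shlex.join(command))
```

To actually run the command, the list could be passed to subprocess.run(command, check=True) from the repository root with the conda environment active.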

Uploading a model to Hugging Face.

Make sure that you are logged into Hugging Face. If you are not, do this:

huggingface-cli login

Make sure you use an access token that allows for writing.

This is how you can push a model. It will use the latest checkpoint:

python pushtohuggingface.py --model_path MODEL_PATH --username_or_orga USERNAME_OR_ORGA --repo_name REPO_NAME --private true

Make sure to fill in MODEL_PATH, USERNAME_OR_ORGA, and REPO_NAME. MODEL_PATH is usually a directory that starts with run_.
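If you have several training runs and are unsure which directory to use as MODEL_PATH, a small helper can pick the most recent one. This is a sketch under assumptions: find_latest_run is a hypothetical function, it assumes the run_ directories sit directly inside the directory you pass in, and it treats the most recently modified directory as the latest run.

```python
from pathlib import Path


def find_latest_run(base_dir="."):
    """Return the most recently modified run_* directory, or None if there is none."""
    candidates = [
        path
        for path in Path(base_dir).iterdir()
        if path.is_dir() and path.name.startswith("run_")
    ]
    if not candidates:
        return None
    # Assumption: modification time is a good proxy for "latest run".
    return max(candidates, key=lambda path: path.stat().st_mtime)


print(find_latest_run("."))
```

The printed path can then be passed as --model_path to pushtohuggingface.py.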

You might want to edit the README.md file.

Inference speeds

Here are some inference speeds for the models that we have trained. This is just a simple test for generating 128 tokens. Unit of measurement is tokens per second:

Apple (no silicon) Apple (silicon) Unix (NVIDIA) Unix (no NVIDIA) Raspberry Pi
xLSTM 230
Mamba 237
Pharia 688 364 51
Transformer 980 528 64
xLSTM ONNX ?
Mamba ONNX 876
Pharia ONNX ? ? ?
Transformer ONNX 1796 1881 400

A question mark means that the model has not been tested on this platform or that the experiment did not work.
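For reference, a tokens-per-second figure of this kind can be obtained by timing a single generation call. The sketch below is not the benchmark script behind the numbers above; measure_tokens_per_second and dummy_generate are hypothetical names, and dummy_generate merely stands in for a real model.

```python
import time


def measure_tokens_per_second(generate_fn, num_tokens=128):
    """Time one call to generate_fn and return generated tokens per second."""
    start = time.perf_counter()
    tokens = generate_fn(num_tokens)
    elapsed = time.perf_counter() - start
    if elapsed <= 0:
        raise RuntimeError("generation finished too quickly to measure")
    return len(tokens) / elapsed


def dummy_generate(num_tokens):
    """Hypothetical stand-in for a model: sleeps briefly, then returns fake tokens."""
    time.sleep(0.01)
    return list(range(num_tokens))


print(f"{measure_tokens_per_second(dummy_generate):.1f} tokens per second")
```

A single run like this is noisy, so averaging over several runs after a warm-up call would give more stable numbers.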

These are the platforms that we have tested on:

Apple (Silicon):

  • Platform: Darwin
  • Platform Version: Darwin Kernel Version 23.3.0: Wed Dec 20 21:31:10 PST 2023; root:xnu-10002.81.5~7/RELEASE_ARM64_T6031
  • Architecture: arm64
  • Processor: arm
  • Python Version: 3.10.13
  • CPU Cores (Logical): 16
  • CPU Cores (Physical): 16
  • Total Memory (GB): 48.0
  • Total Disk Space (GB): 926.3517189025879
  • GPU: No GPU detected

Unix (NVIDIA):

  • Platform: Linux
  • Platform Version: #129~20.04.1-Ubuntu SMP Wed Aug 7 13:07:13 UTC 2024
  • Architecture: x86_64
  • Processor: x86_64
  • Python Version: 3.10.9
  • CPU Cores (Logical): 24
  • CPU Cores (Physical): 12
  • Total Memory (GB): 31.250316619873047
  • Total Disk Space (GB): 915.3232879638672
  • GPU 0 Name: NVIDIA GeForce RTX 3090
  • GPU 0 Memory Total (GB): 24.0
  • GPU 0 Driver Version: 535.161.08
  • GPU 1 Name: NVIDIA GeForce RTX 3090
  • GPU 1 Memory Total (GB): 24.0
  • GPU 1 Driver Version: 535.161.08

Raspberry Pi:

  • Platform: Linux
  • Platform Version: #1 SMP PREEMPT Debian 1:6.6.31-1+rpt1 (2024-05-29)
  • Architecture: aarch64
  • Processor:
  • Python Version: 3.12.4
  • CPU Cores (Logical): 4
  • CPU Cores (Physical): 4
  • Total Memory (GB): 7.8636627197265625
  • Total Disk Space (GB): 116.66678619384766
  • GPU: GPUtil not installed or no GPU detected
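If you want to report a comparable description of your own machine, most of the fields above can be read with the Python standard library. This is a sketch, not the script that produced the listings; collect_system_info is a hypothetical name. Physical core count, total memory and GPU details are left out because they need third-party packages.

```python
import os
import platform
import shutil


def collect_system_info(disk_path=None):
    """Collect basic platform facts using only the standard library."""
    # Default to the root of the current drive.
    disk_path = disk_path or os.path.abspath(os.sep)
    total_bytes = shutil.disk_usage(disk_path).total
    return {
        "Platform": platform.system(),
        "Platform Version": platform.version(),
        "Architecture": platform.machine(),
        "Processor": platform.processor(),
        "Python Version": platform.python_version(),
        "CPU Cores (Logical)": os.cpu_count(),
        "Total Disk Space (GB)": total_bytes / 1024**3,
    }


for key, value in collect_system_info().items():
    print(f"{key}: {value}")
```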

THANKS!