VectorLM 0.1.2

@jacobthebanana released this 04 Jun 01:17
9045f08

This release implements low-rank adaptation (LoRA) parameter-efficient fine-tuning (PEFT) for FSDP-sharded models and adds utilities for reporting the training throughput of the fine-tuning pipeline.

  • LoRA fine-tuning of LLMs such as Mixtral 8x7B on 4x A100 80GB through FSDP model parallelism: PyTorch FSDP shards the model weights across GPUs because the model does not fit on a single GPU, while LoRA PEFT greatly reduces the GPU memory needed to store optimizer states (see the sketch after this list). To enable FSDP-LoRA, uncomment the LoRA-PEFT section of config.yaml. Refer to the Memory & Compute documentation for more details.
  • Benchmarking tools and reference throughput table: To help researchers estimate the resources required for their experiments, we provide reference LLM fine-tuning token throughput for a number of models on the Vector Vaughan cluster, covering both LoRA PEFT and full-rank fine-tuning. We also provide benchmarking tools for measuring throughput on other models and environments (a minimal measurement sketch follows below).
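
For illustration only, the sketch below shows the LoRA PEFT technique itself using the Hugging Face `peft` library rather than VectorLM's own pipeline. The small `facebook/opt-125m` model, the rank and target-module choices, and the optimizer setup are assumptions chosen so the example runs on a single device; the point is that only the low-rank adapter weights require gradients, so the optimizer states that dominate memory in full-rank fine-tuning shrink accordingly.

```python
# Minimal illustration (not VectorLM's own code) of why LoRA PEFT shrinks
# optimizer state: only the small low-rank adapter matrices are trainable,
# so the optimizer tracks a tiny fraction of the parameters.
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# A small model stands in for Mixtral 8x7B so the sketch runs on one device.
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

lora_config = LoraConfig(
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=16,                        # scaling factor for the adapter output
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable params: {trainable:,} / {total:,} "
      f"({100 * trainable / total:.2f}%)")

# The optimizer only needs states (e.g. Adam moments) for the trainable
# adapter weights; under FSDP those states are sharded across ranks as well.
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)
```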
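
The benchmarking utilities shipped with this release have their own interface; the function below is only a hypothetical sketch of the underlying measurement, counting tokens processed per wall-clock second after a few warmup steps. The name `measure_token_throughput` and its parameters are illustrative and not part of VectorLM.

```python
import time

def measure_token_throughput(train_step, num_steps, tokens_per_step, warmup=5):
    """Run `train_step` repeatedly and return tokens processed per second.

    `tokens_per_step` = per-device batch size * sequence length * world size.
    A few warmup steps are skipped so compilation and allocator effects do not
    distort the measurement.
    """
    for _ in range(warmup):
        train_step()
    start = time.perf_counter()
    for _ in range(num_steps):
        train_step()
    elapsed = time.perf_counter() - start
    return num_steps * tokens_per_step / elapsed
```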