Ways to reproduce this approach #24

Open
Yuyz0112 opened this issue Oct 26, 2023 · 2 comments

Comments

@Yuyz0112

Hi @loubnabnl, thanks for this great repo.

I came across a blog post from VMware OCTO describing their work on fine-tuning StarCoder. They say they modified the code provided by the [SantaCoder](https://github.com/loubnabnl/santacoder-finetuning) git repository for fine-tuning, as it is focused on the code generation task.

There are some more details like:

  1. Accelerate and DeepSpeed are used to improve fine-tuning performance.
  2. Fine-tuning generates a small PEFT model.

This may not be the best place to discuss their approach, but since you are the expert on fine-tuning SantaCoder/StarCoder, do you have any hints on how we could reproduce the approach from the blog on top of the current open-source code? I also checked the StarCoder fine-tuning repo, but it seems to suggest instruction-based fine-tuning.

@loubnabnl
Owner

Hi, you can use the StarCoder repo if you want PEFT or DeepSpeed. You just need to change how you build the dataset samples: e.g. `prepare_sample_text` can just return your code file instead of a question and answer, as is done for instruction tuning. The same applies to the DeepSpeed code.
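
A minimal sketch of that change, under stated assumptions: the dataset column name `content` and the `content_column` parameter are hypothetical, and `prepare_instruction_sample_text` is only an illustrative stand-in for a question/answer template, not a copy of the repo's code. Only the name `prepare_sample_text` comes from the comment above.

```python
def prepare_instruction_sample_text(example, input_column="question", output_column="answer"):
    """Illustrative instruction-tuning style sample: a question followed by its answer.

    The column names and the template here are assumptions for comparison only.
    """
    return f"Question: {example[input_column]}\n\nAnswer: {example[output_column]}"


def prepare_sample_text(example, content_column="content"):
    """Code fine-tuning style sample: the raw source file is the training text.

    `content_column` is a hypothetical name; use whichever column holds
    the file contents in your own dataset.
    """
    return example[content_column]


if __name__ == "__main__":
    code_example = {"content": "def add(a, b):\n    return a + b\n"}
    # The training text is the code file itself, with no prompt template around it.
    print(prepare_sample_text(code_example))
```

With this shape, the rest of the sample-building code can stay as it is, since it only consumes the string returned for each example.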

@Yuyz0112
Author

Yuyz0112 commented Nov 3, 2023

@loubnabnl Thank you for the help! I've started following your suggestion. BTW, could you give some hints on the hardware requirements for fine-tuning StarCoder? The issues in the StarCoder repo don't seem to have a clear answer. I have 4 A16 GPUs for fine-tuning.


2 participants