Fine Tuning #55

Closed
miolini opened this issue Mar 12, 2023 · 3 comments
Labels: model (Model specific)

miolini commented Mar 12, 2023

Hey!

Thank you for your amazing job!

I'm curious: is it possible to use RLHF-style feedback after a response to make small incremental adjustments as part of a tuning process? For example, if the user decides to fine-tune after an incorrect answer, can the model spend 60 seconds in a fine-tuning phase, save a checkpoint to disk, and then move on to the next question?
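
For concreteness, here is a minimal sketch of the loop I have in mind, written as plain PyTorch. The model, optimizer, loss function, and checkpoint path are placeholders, not llama.cpp APIs:

```python
# Hypothetical sketch: after an incorrect answer, run gradient steps until
# a 60-second budget expires, save a checkpoint, then resume serving.
import time
import torch

TIME_BUDGET_S = 60  # per-correction budget from the question above

def budgeted_finetune(model: torch.nn.Module,
                      optimizer: torch.optim.Optimizer,
                      loss_fn,
                      batch) -> None:
    deadline = time.monotonic() + TIME_BUDGET_S
    model.train()
    while time.monotonic() < deadline:
        optimizer.zero_grad()
        loss = loss_fn(model, batch)  # loss on the corrected example(s)
        loss.backward()
        optimizer.step()
    # Persist the adjusted weights so the next session resumes from them.
    torch.save(model.state_dict(), "incremental_checkpoint.pt")
```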

miolini changed the title from "Inplace Fine Tuning" to "Fine Tuning" on Mar 12, 2023
gjmulder (Collaborator) commented

I believe llama.cpp is only for inference, not training. Check out chatllama, but you will likely need some high-end GPUs to do RLHF. Alternatively, look at trl (with accelerate) for performing RLHF on models that fit on consumer GPUs.
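
For reference, a rough sketch of a single PPO update with Hugging Face's trl, assuming its early-2023 API. gpt2 here is a stand-in for a model that fits in memory, and the ±1 reward mimicking user feedback is hypothetical:

```python
import torch
from transformers import AutoTokenizer
from trl import AutoModelForCausalLMWithValueHead, PPOConfig, PPOTrainer

# Policy model with a value head, plus a frozen reference copy for the KL term.
model = AutoModelForCausalLMWithValueHead.from_pretrained("gpt2")
ref_model = AutoModelForCausalLMWithValueHead.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token

ppo_trainer = PPOTrainer(PPOConfig(batch_size=1, mini_batch_size=1),
                         model, ref_model, tokenizer)

query = tokenizer.encode("The capital of France is", return_tensors="pt")[0]
output = model.generate(query.unsqueeze(0), max_new_tokens=16,
                        pad_token_id=tokenizer.eos_token_id)[0]
response = output[len(query):]  # strip the prompt, keep the generated part

# Hypothetical reward: +1.0 if the user accepted the answer, -1.0 otherwise.
rewards = [torch.tensor(1.0)]
stats = ppo_trainer.step([query], [response], rewards)

model.save_pretrained("rlhf_checkpoint")  # persist the updated weights
```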

miolini (Author) commented Mar 12, 2023

@gjmulder I would like to continue this line of thought and run such processes on CPU only. Even if it's super slow, I think it's feasible to spend a small time budget (60 seconds) to improve the weights a bit and close the loop of self-improvement, as in Gödel machines.

gjmulder (Collaborator) commented

Check out thread #23. That would allow you to have ChatGPT-style narrative conversations with the model, but it is not RLHF.

gjmulder added the model (Model specific) label on Mar 15, 2023
gjmulder closed this as not planned on Mar 15, 2023