Axolotl supports falcon + qlora #132
Conversation
Thanks! Would you mind adding an Errata section to the bottom of the README, specifically noting that falcon/qlora/xformers doesn't work? Someone will ultimately attempt that combination, and just having it documented somewhere would be a great help.
OK, I'll test the combination today based on the new xformers patch that landed in the Docker image, and add the section. Hopefully I'll also get an idea of why it doesn't work. I'll additionally test flash attention and change max_packed_sequence_len to empty, as caseus suggested on Discord, to see if it helps with the VRAM usage; both changes are sketched below.
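A minimal sketch of the two config changes under test, assuming axolotl's existing flag names (`max_packed_sequence_len`, `flash_attention`, `xformers_attention`); the final merged config may differ:

```yaml
# Leave max_packed_sequence_len empty to disable sequence packing,
# as suggested on Discord, to ease VRAM spikes.
max_packed_sequence_len:
# Swap xformers attention for flash attention to test that path.
xformers_attention: false
flash_attention: true
```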
By an Errata section, do you mean that for each combination axolotl doesn't support yet, there's a short description of why it's unsupported, with the errors and a tracking issue? It would also be nice to link the check marks to example configs; I attempted to do so in the PR 😉
Just some points I saw
I just tried your changes, and they don't work. Here's what I get:
```yaml
micro_batch_size: 40
gradient_accumulation_steps: 2
num_epochs: 3
optimizer: paged_adamw_32bit
```
Does `paged_adamw_32bit` converge? I remember seeing some tests where this optimizer was problematic.
It seems fine in my tests, and it seems to help survive VRAM spikes. If it did turn out to be problematic, swapping it out would be a one-line change, as sketched below.
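A hypothetical fallback, assuming `adamw_bnb_8bit` is among the optimizer names axolotl configs accept:

```yaml
# Hypothetical fallback: bitsandbytes' 8-bit AdamW
# in place of the paged 32-bit variant.
optimizer: adamw_bnb_8bit
```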
@FarisHijazi Hi, this happened before with outdated deps. In my tests, I updated the deps to master. I'll try to reproduce this in a clean setting and work my way back to determine which dep needs updating ASAP. For now, please check https://github.com/utensil/llm-playground/blob/main/scripts/prepare_qlora.sh .
I've just created a minimal Colab notebook so anyone can jump-start and try it out. It uses a free T4 GPU instance. The notebook works nice and clean as is, but I'll break down the issues I encountered while creating it, since others might hit similar issues when deviating from the notebook:

- `ModuleNotFoundError: No module named 'peft'` — fixed by installing peft manually. It should have been installed by axolotl, but it's still not found, while other packages from the same install are. UPDATE: found the root cause of this and raised #151.
- `RuntimeError: self and mat2 must have the same dtype` — this appears if one hits the error above, works around it, and then reruns training; it traces back to the same root cause as #151.
- `ValueError: Your setup doesn't support bf16/gpu. You need torch>=1.10, using Ampere GPU with cuda>=11.0` — a T4-specific issue (T4 is pre-Ampere); fixed by turning bf16 off in the config.
- `ValueError: --tf32 requires Ampere or a newer GPU arch, cuda>=11 and torch>=1.7` — also a T4-specific issue; fixed by turning tf32 off in the config.

Both T4-specific fixes are sketched below.
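A minimal sketch of the two T4-specific overrides, assuming axolotl's standard precision flags; the Colab notebook is the authoritative version:

```yaml
# T4 is a pre-Ampere GPU: bf16 and tf32 are unsupported on it,
# so disable both and fall back to fp16 mixed precision.
bf16: false
tf32: false
fp16: true
```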
I think most of these float-type errors can be solved by changing the GPU (Colab Pro, or switching to another provider).
@FarisHijazi Thank you! The root cause of the issue you encountered has been found; it's described in #151. It's easy to fix by upgrading the dependency flagged there.
Damn... a shape error getting solved by upgrading a dependency... will try to run it now.
Getting this error on my machine, will try some other things and update you.
@FarisHijazi Possible fix for that: bitsandbytes-foundation/bitsandbytes#425 (comment). Can you elaborate on the environment or setup? (Maybe in a separate issue, for better follow-up.)
Will try to get on that when I finish work. BTW, I do get a bitsandbytes warning when I fine-tune the other models, but they do train. For falcon, I get the warning and then later an error.
Hello @utensil, thank you for the extensive work and report. Since you have tested falcon+qlora and it works, would you be interested in merging this first? I also saw that you included an ipynb. Do you want to upload the notebook to the same folder as the config before merging? Regarding winglian's initial comment, he might've meant adding a short description for that combination here: https://github.com/OpenAccess-AI-Collective/axolotl#common-errors- . Lastly, could you follow up with a separate issue for your other tasks (multi-GPU etc.)?
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
Hi @NanoCode012, thank you for the review. I'm OK to merge it as is, and leave the other tasks, including the notebook and the errata section (I'm not so sure if they're still relevant now), to future PRs.
Would it be possible to fix the failing pre-commit checks?

```bash
pre-commit install  # if you haven't before
pre-commit run --all-files
```
Thank you for the amazing work!
Nice, I gotta try this with Samantha.
Axolotl supports falcon + qlora
This PR:

- confirms that axolotl supports falcon + qlora
- adds `example/falcon/config-7b-qlora.yml`, tested with falcon 1b & 7b on an A6000 (with `xformers_attention` turned off); training report: https://api.wandb.ai/links/utensil/y0iazatw
- changes `max_packed_sequence_len` to empty
- uses `gradient_accumulation_steps`

Task list:

- test falcon + qlora + xformers (training report updated)
- test falcon 40b + qlora + xformers (training report updated)
- update the qlora configs to the latest tested ones; for now, please see the notebooks below or this gist verified by @fearnworks (now merged)
- isolate the qlora dependencies and produce a clean setup so everyone can use it
- add templates and notebooks to jump-start
- try multi-GPU and compare training time to full fine-tuning
- evaluate inference results
- Errata section
- test falcon + qlora + flash attention 🧵
To reproduce falcon + qlora, use the Docker image `winglian/axolotl-runpod:main-cu118-2.0.0`.

Disclaimer: the config works, but might not be optimal. Improvements welcome!
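For orientation, an illustrative sketch of the qlora-specific knobs such a config typically carries, using axolotl's config keys; the values are placeholders, not the tested ones from `example/falcon/config-7b-qlora.yml`:

```yaml
# Illustrative only: load the base model in 4-bit and attach a QLoRA adapter.
base_model: tiiuae/falcon-7b
trust_remote_code: true    # falcon's modeling code lives in the model repo
load_in_4bit: true
adapter: qlora
lora_r: 64                 # placeholder rank
lora_alpha: 16             # placeholder scaling
lora_dropout: 0.05
lora_target_linear: true   # target all linear layers, as QLoRA suggests
optimizer: paged_adamw_32bit
```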