[Feature] Add config for RoPE scaling #342
Related #245
What do the inputs mean? I want to try this, but I'm a little unsure what they mean.
@NanoCode012 Like rope_scaling_type and rope_scaling_factor.
Please see #343 for more details, @ashercn97. I added a README sample. I chose to keep the default names.
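For anyone following along: those default names line up with the rope_scaling dict that transformers' LlamaConfig accepts, so the two options presumably end up there. A minimal sketch of that mapping, not axolotl's actual wiring, with the model name as a placeholder:

```python
from transformers import AutoConfig, AutoModelForCausalLM

# Sketch only: pass the two scaling options through to the model
# config using transformers' default key names ("type" and "factor").
config = AutoConfig.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # placeholder model
    rope_scaling={"type": "linear", "factor": 2.0},
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf", config=config
)
```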
@NanoCode012 Okay, checking it out right now!
@ashercn97, please let me know how it goes. I don't have the compute or the time to run it right now.
@NanoCode012 What are the options for the two settings? I think the factor should be greater than 1, but the only value I saw elsewhere was something like 0.5, so I'm a little confused. Also, what are the options for the type? Sorry, one more thing: where does it go in the config file?
Oh wait, I think I got it. Let me try setting up a RunPod and running it!
Okay, I got it working, but what does the float value do? Is there a resource you can point me to that explains what the number means?
@NanoCode012 Do I change sequence_len when I'm doing the RoPE scaling? I saw something somewhere about 0.5, and it said it doubles the context or something, so do I make the sequence length bigger or keep it the same?
@ashercn97, an example from LLongMA-2 is linear with factor 2 or 4. I would experiment with both seq_len settings (keep the default 4k, or increase it to your expected max). This is something I am also unsure about.
Regarding a reference for the factor, please see the link in the first post or the one in the PR.
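To make the factor concrete: with linear scaling, position indices are divided by the factor before the rotary embedding is computed, so a factor of 2 stretches a model trained at 4k positions across roughly 8k tokens. The 0.5 mentioned above is most likely the same idea written as a multiplier rather than a divisor; transformers expects a factor greater than 1. A rough sketch of the idea, not the library's exact code:

```python
import torch

orig_max_pos = 4096  # context length the base model was trained with
factor = 2.0         # the rope_scaling factor

# Linear RoPE scaling: divide position indices by the factor so that
# positions up to orig_max_pos * factor fall back inside the trained range.
positions = torch.arange(int(orig_max_pos * factor), dtype=torch.float32)
scaled = positions / factor
assert scaled.max() < orig_max_pos  # still within what the model saw in training
```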
Okay, thanks so much, @NanoCode012!
Seems like we can expose the two config options. Validation is done on the transformers side.
Ref: https://github.com/gante/transformers/blob/30409af6e1b2b5efb6d9932b3e3b4ce20cfdb30e/src/transformers/models/llama/configuration_llama.py#L155-L174
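For reference, the linked validation boils down to roughly the following (condensed from the permalink above; exact error wording may differ):

```python
def _rope_scaling_validation(self):
    # Condensed sketch of the checks in the linked configuration_llama.py.
    if self.rope_scaling is None:
        return
    if not isinstance(self.rope_scaling, dict) or len(self.rope_scaling) != 2:
        raise ValueError(
            f"`rope_scaling` must be a dict with fields `type` and `factor`, got {self.rope_scaling}"
        )
    scaling_type = self.rope_scaling.get("type", None)
    scaling_factor = self.rope_scaling.get("factor", None)
    if scaling_type not in ("linear", "dynamic"):
        raise ValueError(f"`rope_scaling`'s type must be 'linear' or 'dynamic', got {scaling_type}")
    if not isinstance(scaling_factor, float) or scaling_factor <= 1.0:
        raise ValueError(f"`rope_scaling`'s factor must be a float > 1, got {scaling_factor}")
```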