[Feature] Add config for RoPE scaling #342

Closed
NanoCode012 opened this issue Aug 5, 2023 · 13 comments · Fixed by #343
Labels
enhancement New feature or request

Comments

@NanoCode012
Collaborator

It seems we can expose two config options:

  • rope_scaling_type:
  • rope_scaling_factor:

Validation is done on the transformers side.

Ref: https://github.com/gante/transformers/blob/30409af6e1b2b5efb6d9932b3e3b4ce20cfdb30e/src/transformers/models/llama/configuration_llama.py#L155-L174
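For illustration, the check in the linked ref can be paraphrased roughly as below. This is a standalone sketch, not the transformers code itself: the function name `validate_rope_scaling` is hypothetical, and the accepted values (`linear` or `dynamic` for the type, a float greater than 1 for the factor) are my reading of the referenced lines.

```python
from typing import Optional


def validate_rope_scaling(rope_scaling: Optional[dict]) -> None:
    """Hypothetical paraphrase of the rope scaling check in the linked ref."""
    if rope_scaling is None:
        return  # scaling is disabled, nothing to validate

    # Assumption: the setting is a dict with exactly two fields, type and factor.
    if not isinstance(rope_scaling, dict) or len(rope_scaling) != 2:
        raise ValueError(
            f"rope_scaling must be a dict with two fields, type and factor, got {rope_scaling!r}"
        )

    scaling_type = rope_scaling.get("type")
    factor = rope_scaling.get("factor")

    # Assumption: only these two strategies are accepted.
    if scaling_type not in ("linear", "dynamic"):
        raise ValueError(f"type must be 'linear' or 'dynamic', got {scaling_type!r}")

    # Assumption: the factor must be a float strictly greater than 1.
    # In this sketch an int such as 2 is rejected, so write 2.0.
    if not isinstance(factor, float) or factor <= 1.0:
        raise ValueError(f"factor must be a float > 1, got {factor!r}")
```

With this sketch, `validate_rope_scaling({"type": "linear", "factor": 2.0})` returns without error, while a factor of `0.5` raises `ValueError`.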

@NanoCode012 added the enhancement label on Aug 5, 2023
@NanoCode012
Collaborator Author

Related #245

@ashercn97

What do the inputs mean? I want to try this, but I'm a little unsure what they mean.

@ashercn97

@NanoCode012 Like rope_scaling_type and rope_scaling_factor

@NanoCode012
Collaborator Author

Please see #343 for more details, @ashercn97. I added a README sample. I chose to use the default names.

@ashercn97

@NanoCode012 Okay, checking it out right now!

@NanoCode012
Collaborator Author

NanoCode012 commented Aug 6, 2023

@ashercn97, please let me know how it goes. I don't have the compute or the time to run it right now.

@ashercn97

@NanoCode012 What are the options for the two settings? I think the factor has to be greater than 1, but the only value I saw anywhere else was something like .5, so I'm a little confused. Also, what are the two options for the type? Sorry, one more thing: where does it go in the config file?

@ashercn97

Oh wait, I think I got it! Let me try setting up a RunPod and running it!

@ashercn97

Okay, I got it, but what do the float numbers do? Is there a resource you can point me to that explains what the number means?

@ashercn97

@NanoCode012 Do I change sequence_len when I'm doing the RoPE scaling? I saw something somewhere about .5, so I'm using that, and it says it doubles the context or something like that. Should I make the sequence length bigger or keep it the same?

@NanoCode012
Collaborator Author

@ashercn97, an example from LLongMA 2 is linear scaling with a factor of 2 or 4. I would suggest experimenting with both options for seq_len (keep the default 4k, or increase it to the expected max).

This is something I am also unsure about.
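To make the factor and sequence-length question a bit more concrete, here is a minimal sketch of how linear scaling is commonly described: position indices are divided by the factor before the rotary angles are computed. The helper name `rope_angles`, the tiny head dimension, and the base of 10000 are illustrative assumptions, not taken from this thread or the PR. Under that assumption, a factor of 2 makes position 8192 produce the same angles as position 4096 does without scaling, which is why a factor of 2 tends to be paired with roughly double the original 4k context. Some implementations express the same idea as a multiplier (.5 rather than 2), which may be where the .5 comes from.

```python
def rope_angles(position, dim=8, base=10000.0, factor=1.0):
    """Rotary angles for one position, with linear position scaling.

    Hypothetical helper: dim and base are illustrative defaults.
    """
    scaled_position = position / factor  # linear scaling compresses positions
    # One angle per pair of dimensions, with geometrically decreasing frequency.
    return [scaled_position / (base ** (2 * i / dim)) for i in range(dim // 2)]


original_ctx = 4096  # the "default 4k" mentioned above
factor = 2.0
extended_ctx = int(original_ctx * factor)  # 8192

# The last position of the extended window maps onto the last position
# of the original window, so the model sees angles in a familiar range.
assert rope_angles(extended_ctx, factor=factor) == rope_angles(original_ctx)
```

Whether to also raise the training sequence length to the extended value is the open question above; the sketch only shows what the factor does to positions.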

@NanoCode012
Collaborator Author

For a reference on the factor, please see the ref linked in the first post or the one linked in the PR.

@ashercn97

Okay, thank you so much @NanoCode012!
