
Support Yi-Coder #208

Closed
ryankert01 opened this issue Sep 4, 2024 · 7 comments

@ryankert01
Contributor

ryankert01 commented Sep 4, 2024

🚀 The feature, motivation and pitch

to-dos:

  1. Implement an API for Yi-Coder
  2. Test Yi-Coder with the Llama lce_forward

Alternatives

No response

Additional context

From a Discord discussion.

@ryankert01
Contributor Author

#take @ByronHsu

@ByronHsu
Collaborator

ByronHsu commented Sep 6, 2024

Any progress?

@ryankert01
Contributor Author

I'll open a PR by the weekend.

@ryankert01
Contributor Author

Hi @ByronHsu, I just noticed that Hugging Face's auto-model mapping registers llama against its base model, and Yi-Coder's config.json already points at that base model. So I think we may not need a code change at all. I'll test shortly whether it works (I may be wrong about this).

ref:

  1. https://github.com/huggingface/transformers/blob/main/src/transformers/models/auto/modeling_auto.py#L34
  2. https://huggingface.co/01-ai/Yi-Coder-9B-Chat/blob/main/config.json#L3
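
To illustrate the reasoning above, here is a minimal sketch of how a registry keyed on the model type would resolve a Llama-based checkpoint without any model-specific code. The mapping table, the `resolve_base_model` helper, and the config excerpt are all illustrative assumptions, not copies of the transformers source or of the actual Yi-Coder config.json.

```python
# Toy stand-in for an auto-model registry: model type -> base model class name.
# Illustrative only; this is not the real table from modeling_auto.py.
MODEL_MAPPING = {
    "llama": "LlamaModel",
    "mistral": "MistralModel",
}

# Assumed excerpt of a Llama-based config.json (values are an assumption
# made for this sketch, not read from the Yi-Coder repository).
yi_coder_config = {
    "architectures": ["LlamaForCausalLM"],
    "model_type": "llama",
}


def resolve_base_model(config: dict) -> str:
    """Hypothetical helper: look up the base model class for a config."""
    model_type = config["model_type"]
    if model_type not in MODEL_MAPPING:
        raise KeyError(f"unsupported model type: {model_type}")
    return MODEL_MAPPING[model_type]


# A checkpoint that declares itself as llama resolves to the Llama entry,
# so anything already wired up for Llama would apply to it as well.
resolved = resolve_base_model(yi_coder_config)
print(resolved)
```

If the real config does declare a Llama model type, this is why existing Llama support could cover Yi-Coder with no extra API.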

@ryankert01
Contributor Author

ryankert01 commented Sep 8, 2024

UPDATE: got it, looks like this will soon be solved by #199.

Hi @ByronHsu, I looked into this and found something odd: when I only set use_liger=True in SFTConfig, GPU usage is the same as without Liger, but if I load the model with

model = AutoLigerKernelForCausalLM.from_pretrained(model_name)

it's significantly better. This doesn't align with our SFTTrainer docs on Hugging Face.

Could you help me look into it? See the research notebook.

@shimizust
Collaborator

@ryankert01 Thanks for the comment. #199 is ready and should be merged soon. Right now, SFTConfig only acts on the use_liger flag when you pass in a model path, in which case it loads the model using AutoLigerKernelForCausalLM. If you pass an already instantiated model, the flag has no effect. After #199 lands, SFTTrainer will need to be updated to call the new API.
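
The behavior described above can be sketched in plain Python. This is a simplified model of the dispatch, not the actual TRL or Liger Kernel code: `FakeModel`, `load_plain`, `load_with_liger`, and `resolve_model` are hypothetical stand-ins, with `load_with_liger` playing the role of AutoLigerKernelForCausalLM.from_pretrained.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class FakeModel:
    """Hypothetical stand-in for a loaded model."""
    name: str
    liger_patched: bool


def load_plain(path: str) -> FakeModel:
    # Stand-in for a regular from_pretrained call.
    return FakeModel(name=path, liger_patched=False)


def load_with_liger(path: str) -> FakeModel:
    # Stand-in for AutoLigerKernelForCausalLM.from_pretrained.
    return FakeModel(name=path, liger_patched=True)


def resolve_model(model: Union[str, FakeModel], use_liger: bool) -> FakeModel:
    """Sketch of the dispatch: the flag matters only for model paths."""
    if isinstance(model, str):
        loader = load_with_liger if use_liger else load_plain
        return loader(model)
    # Already instantiated: returned untouched, so use_liger is ignored.
    return model


# Passing a path: the flag selects the Liger loader.
from_path = resolve_model("01-ai/Yi-Coder-9B-Chat", use_liger=True)

# Passing an instance: the flag has no effect on the model.
prebuilt = resolve_model(load_plain("01-ai/Yi-Coder-9B-Chat"), use_liger=True)

print(from_path.liger_patched, prebuilt.liger_patched)
```

This matches the observation earlier in the thread: setting the flag alongside an already loaded model leaves GPU usage unchanged, while loading through the Liger auto class applies the kernels.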

@ryankert01
Contributor Author

Closed by huggingface/transformers#33502
