
SDXL lora on inf2 #359

Closed
MrD005 opened this issue Nov 29, 2023 · 10 comments

MrD005 commented Nov 29, 2023

I want to implement LoRA on the SDXL model on inf2. Is there any support or process for implementing this?

JingyaHuang self-assigned this Dec 3, 2023
JingyaHuang (Collaborator) commented

Hi @MrD005,

Thanks for opening the issue, and a contribution would be very welcome!

To support LoRA, the first thing we need to check is whether the Neuron compiler supports compiling the text encoder / UNet with LoRA weights loaded (theoretically yes).

To do so, we can do a quick hack: load the LoRA weights into the pipeline before it is compiled, for example with

pipeline.load_lora_weights("ostris/super-cereal-sdxl-lora", weight_name="cereal_box_sdxl_v1.safetensors")
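As a rough sketch of that hack (assuming a plain diffusers pipeline is patched before being handed to the Neuron export path; the fuse_lora() call and the hand-off step are assumptions, not the existing exporter code):

    # Sketch only: patch the diffusers SDXL pipeline with LoRA weights
    # before its submodels are traced by the Neuron compiler.
    from diffusers import StableDiffusionXLPipeline

    pipeline = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0"
    )

    # Load and fuse the LoRA weights so the patched UNet / text encoders are
    # exactly what gets compiled (fusing avoids tracing adapter layers).
    pipeline.load_lora_weights(
        "ostris/super-cereal-sdxl-lora", weight_name="cereal_box_sdxl_v1.safetensors"
    )
    pipeline.fuse_lora()

    # From here, pipeline.unet / pipeline.text_encoder / pipeline.text_encoder_2
    # would be handed to the existing optimum-neuron export path.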

If the model compiles successfully and doesn't hit any issues during inference, then we can add official support to the exporter:

  • Add a flag like "--lora_weight" in optimum/commands/export/neuronx.py to support it in the Optimum CLI
  • Add the LoRA info to main_export through the submodels argument
  • Load the LoRA weights through the function replace_stable_diffusion_submodels (a rough sketch follows below)
    def replace_stable_diffusion_submodels(pipeline, submodels):
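For illustration, here is a minimal sketch of what that loading step could look like; the dictionary keys lora_model_id / lora_weight_name are hypothetical and not part of the existing signature:

    # Rough sketch, not the actual optimum-neuron implementation: let the
    # `submodels` argument carry the LoRA info into the exporter.
    def replace_stable_diffusion_submodels(pipeline, submodels):
        if not submodels:
            return pipeline
        # Hypothetical keys; the real argument layout is up to the implementer.
        lora_id = submodels.get("lora_model_id")
        lora_weight_name = submodels.get("lora_weight_name")
        if lora_id is not None:
            pipeline.load_lora_weights(lora_id, weight_name=lora_weight_name)
            pipeline.fuse_lora()
        return pipeline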

Then we should be all set: the compiled artifacts will contain the LoRA weights, and there is nothing else to add during inference.

Much appreciated if you would like to work on it 🙏. Feel free to ping me for review or any further questions!

MrD005 (Author) commented Dec 4, 2023

Thanks @JingyaHuang, I will implement this and report back on whether it works.

Dev-hestabit commented

@JingyaHuang I have one more query, if you can help with that as well: when I run the SDXL model, it truncates my prompt to 77 tokens. I found several solutions for GPU and tried to port them to inf2, but I am stuck on the tensor calculations. Any help here would be much appreciated.
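The GPU-side approaches I found encode the prompt in 77-token chunks and concatenate the embeddings, roughly like the simplified sketch below (I assume the fixed 77-token input shape of the compiled text encoder on inf2 is where my tensor issue comes from, and SDXL's second text encoder with its pooled embedding would need extra handling):

    # Simplified sketch of the chunk-and-concatenate workaround for prompts
    # longer than CLIP's 77-token limit (per-chunk BOS/EOS handling omitted).
    import torch

    def encode_long_prompt(tokenizer, text_encoder, prompt, chunk_size=77):
        ids = tokenizer(prompt, truncation=False, return_tensors="pt").input_ids[0]
        chunks = [ids[i:i + chunk_size] for i in range(0, len(ids), chunk_size)]
        embeds = []
        for chunk in chunks:
            # Pad the last chunk so every forward pass matches the compiled
            # (fixed) sequence length expected on inf2.
            if len(chunk) < chunk_size:
                pad = torch.full(
                    (chunk_size - len(chunk),), tokenizer.pad_token_id, dtype=chunk.dtype
                )
                chunk = torch.cat([chunk, pad])
            embeds.append(text_encoder(chunk.unsqueeze(0))[0])
        return torch.cat(embeds, dim=1)  # shape: (1, n_chunks * 77, hidden_dim)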

HuggingFaceDocBuilderDev commented

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Thank you!

3 similar comments

khurramkhalil commented

Hi there,
Just wondering whether there is any update on adding LoRA support on inf2?
Thanks

JingyaHuang (Collaborator) commented

Hi @khurramkhalil, it's already supported through this PR: #483.

You can also find an example here: https://huggingface.co/docs/optimum-neuron/en/tutorials/stable_diffusion#load-adapters
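For quick reference, a usage sketch along the lines of that tutorial (argument names such as lora_model_ids / lora_weight_names / lora_scales follow recent optimum-neuron versions and may differ in yours):

    # Sketch following the linked tutorial: compile SDXL with the LoRA fused in.
    from optimum.neuron import NeuronStableDiffusionXLPipeline

    pipeline = NeuronStableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",
        export=True,
        lora_model_ids="ostris/super-cereal-sdxl-lora",
        lora_weight_names="cereal_box_sdxl_v1.safetensors",
        lora_scales=0.9,
        batch_size=1,
        height=1024,
        width=1024,
    )
    pipeline.save_pretrained("sdxl_neuron_lora/")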

khurramkhalil commented

Thank you very much.
