
Improve config support for transformers with accelerate #630

Merged — 4 commits, Aug 4, 2024

Conversation

touchwolf
Contributor

This pull request improves configuration support for transformers when using the accelerate library. The changes address cases where the transformer's config was not fully taken into account in the original implementation.

Changes made:

  • Modified train.py to enhance config support.

Please review the changes and let me know if there are any issues or further modifications needed.

@bghira
Owner

bghira commented Aug 4, 2024

oh, i forgot to pull this in from my local dev branch. i manually swapped these when testing schnell. thank you for fixing that - there's another one, where the sequence length is cut to 256 for dev and schnell, but i'm not sure they actually have the proper model config for that to work.

@bghira bghira merged commit f58a5db into bghira:main Aug 4, 2024
1 check passed
@facok

facok commented Aug 4, 2024

oh, i forgot to pull this in from my local dev branch. i manually swapped these when testing schnell. thank you for fixing that - there's another one, where the sequence length is cut to 256 for dev and schnell, but i'm not sure they actually have the proper model config for that to work.

I saw these instructions in the diffusers docs; what impact will they have here?

https://github.com/huggingface/diffusers/blob/c370b90ff184a61bcbd58d486975ad4de095275e/docs/source/en/api/pipelines/flux.md

Flux comes in two variants:

  • Timestep-distilled (black-forest-labs/FLUX.1-schnell)
  • Guidance-distilled (black-forest-labs/FLUX.1-dev)

Timestep-distilled:

  • max_sequence_length cannot be more than 256.
  • guidance_scale needs to be 0.
  • As this is a timestep-distilled model, it benefits from fewer sampling steps.

Guidance-distilled:

  • The guidance-distilled variant takes about 50 sampling steps for good-quality generation.
  • It doesn't have any limitations around the max_sequence_length.
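To illustrate the variant-dependent limits quoted above, here is a minimal sketch of a helper that selects generation settings by variant. This helper and its default values are assumptions for illustration, not part of this PR or the diffusers API:

```python
# Hypothetical helper (not part of this PR) mapping a FLUX.1 variant
# to the sampling constraints quoted from the diffusers docs.
def flux_sampling_config(model_id: str) -> dict:
    """Return generation kwargs appropriate for the given FLUX.1 variant."""
    if "schnell" in model_id:
        # Timestep-distilled: hard 256-token cap, guidance must be 0,
        # and it benefits from few sampling steps.
        return {
            "max_sequence_length": 256,
            "guidance_scale": 0.0,
            "num_inference_steps": 4,  # assumed low step count
        }
    # Guidance-distilled (dev): no max_sequence_length limit,
    # ~50 steps for good-quality generation.
    return {
        "guidance_scale": 3.5,  # assumed default
        "num_inference_steps": 50,
    }
```

These kwargs could then be passed straight to the pipeline call for the matching checkpoint.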

3 participants