
Flux latents fix #9929

Open · wants to merge 8 commits into main

Conversation

DN6
Collaborator

@DN6 DN6 commented Nov 14, 2024

What does this PR do?

PR #9711 changed the latent preparation step so that the vae_scale_factor and the default sample size are clearer.

However, we overlooked the fact that this operation (with scale_factor == 16)

        height = 2 * (int(height) // self.vae_scale_factor)
        width = 2 * (int(width) // self.vae_scale_factor)

is not equivalent to this operation with scale_factor == 8:

        height = int(height) // self.vae_scale_factor
        width = int(width) // self.vae_scale_factor

For resolutions that are divisible by 8 but not by 16, the current pipeline on main errors out, because the implementation does not account for the fact that the latent width and height must be divisible by 2 for the packing step to work.

The old implementation floor-divided the height and width by 16 and then multiplied by 2. This resulted in slightly resized images if the height or width weren't multiples of 16.
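To make the difference concrete, here is a small sketch of the two rounding schemes described above (the function names are illustrative; the formulas are taken from the snippets in this PR description):

```python
# Sketch: compare the two latent-size computations described above.
# Assumes vae_scale_factor == 16 for the old formula and 8 for the
# formula currently on main, as in the PR description.

def old_latent_size(pixels, vae_scale_factor=16):
    # Old formula: floor-divide by 16, then multiply by 2.
    # Always yields an even latent dimension.
    return 2 * (int(pixels) // vae_scale_factor)

def new_latent_size(pixels, vae_scale_factor=8):
    # Formula on main: plain floor division by 8.
    # Can yield an odd latent dimension, which breaks the 2x2 packing step.
    return int(pixels) // vae_scale_factor

height = 1016  # divisible by 8 but not by 16
print(old_latent_size(height))  # 126 (even, packable)
print(new_latent_size(height))  # 127 (odd, packing fails)
```

For sizes that are multiples of 16 the two formulas agree; the divergence only shows up for the 8-but-not-16 resolutions this PR is about.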

This PR:

  1. Adds steps to account for latent packing, which results in resized output images (matching the previous behavior).
  2. Adds a warning that the image will be resized if an incompatible image size is provided.
  3. Adds fast tests that run Flux with both expected and unexpected image shapes, to verify that the scaling is properly applied.
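A hedged sketch of what fixes 1 and 2 could look like together (the helper name and warning text here are illustrative, not the exact diff):

```python
import warnings

def adjusted_height_width(height, width, vae_scale_factor=8):
    # Round the latent dimensions down to the nearest even value so the
    # 2x2 packing step works, then map back to pixel space.
    # Name and warning text are illustrative, not the exact PR diff.
    latent_h = 2 * (int(height) // (vae_scale_factor * 2))
    latent_w = 2 * (int(width) // (vae_scale_factor * 2))
    new_h = latent_h * vae_scale_factor
    new_w = latent_w * vae_scale_factor
    if (new_h, new_w) != (height, width):
        warnings.warn(
            f"Height and width should be divisible by {vae_scale_factor * 2}. "
            f"Resizing from ({height}, {width}) to ({new_h}, {new_w})."
        )
    return new_h, new_w
```

For example, a 1016x1024 input would be resized to 1008x1024 with a warning, while a 1024x1024 input passes through unchanged.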

Fixes # (issue)

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@DN6 DN6 requested a review from yiyixuxu November 14, 2024 11:23
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@bghira
Contributor

bghira commented Nov 16, 2024

this seems really important, but considering how slowly things move around here, I'm not so sure it will be merged in a timely manner.


@yiyixuxu yiyixuxu left a comment


oh thanks for the fix!
I made a comment

width = int(width) // self.vae_scale_factor
# VAE applies 8x compression on images but we must also account for packing which requires
# latent height and width to be divisible by 2.
height = int(height) // self.vae_scale_factor - ((int(height) // self.vae_scale_factor) % 2)

is this the same as this? (basically revert back to the original code, but make it divisible by self.vae_scale_factor * 2)
a little bit easier to understand, no?

height = 2 * (int(height) // (self.vae_scale_factor*2))
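For what it's worth, the two suggested formulas do agree for non-negative sizes; a quick brute-force check (illustrative, not part of the PR):

```python
# Quick check that the two formulas suggested in the review agree
# for all non-negative heights in a reasonable range.
vae_scale_factor = 8

for height in range(0, 4096):
    # Formula from the review comment above.
    a = int(height) // vae_scale_factor - ((int(height) // vae_scale_factor) % 2)
    # Simpler equivalent: round down to a multiple of vae_scale_factor * 2.
    b = 2 * (int(height) // (vae_scale_factor * 2))
    assert a == b, (height, a, b)

print("formulas agree for all heights in range")
```

Both expressions round the latent dimension down to the nearest even value; the second just does it in one step.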
