[DC-AE] Add the official Deep Compression Autoencoder code (32x, 64x, 128x compression ratios) #9708
Conversation
Looking forward to this @lawrence-cj!
Some minor comments. But to make progress on this PR:
- We need to try to eliminate unnecessary dependencies like `omegaconf`.
- Follow how we implement Autoencoders in `diffusers`. Example: https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/autoencoders/autoencoder_kl.py
- Try to reuse the blocks as done in https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/autoencoders/autoencoder_kl.py.
- If we cannot reuse blocks, it's okay to define them in the modeling file. However, we need to keep things to native `torch` only, for now.
- All major model classes like `AutoencoderKL` should inherit from `ModelMixin`.
Left two comments.
# Conflicts:
#   src/diffusers/models/normalization.py
Thanks for your work.
I have left some comments, but let's wait for @yiyixuxu's comments as well before making any changes.
Yiyi, this autoencoder is going to be crucial to support efficient models like SANA: https://arxiv.org/abs/2410.10629 (which will land after this PR).
thanks for the PR!
I left some comments to start with, let me know if you have any questions:)
@lawrence-cj @a-r-r-o-w @yiyixuxu @sayakpaul I have double-checked this PR and made some minor modifications, and I will upload the converted weights soon. Could you please check whether the modifications still meet the requirements of diffusers? If so, I think this PR is ready to merge. Thank you all for the efforts!
@chenjy2003 Thanks, the changes look great and the outputs are still the same! I simplified those branches since none of the current checkpoints seemed to use them, but still good to have. Will merge this PR once you give us the go regarding the diffusers-format checkpoints.
Hi @a-r-r-o-w, all the converted weights are uploaded. Thanks!
@@ -92,6 +97,7 @@
        "double_blocks.0.img_attn.norm.key_norm.scale",
        "model.diffusion_model.double_blocks.0.img_attn.norm.key_norm.scale",
    ],
    "autoencoder_dc": "decoder.stages.0.op_list.0.main.conv.conv.weight",
We would need to infer the model repo type using this key, right? That still has to be added.
Oh sorry, missed it. Adding now, but not sure how this worked before then 🤔
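A minimal sketch of what that inference looks like (hypothetical helper name, and a trimmed-down mapping; the real single-file utilities hold many more entries):

```python
# Hypothetical, trimmed-down version of the checkpoint-key table used to
# infer the model repo type during single-file loading. The DC-AE entry
# matches the key added in the diff above.
CHECKPOINT_KEY_NAMES = {
    "autoencoder_dc": "decoder.stages.0.op_list.0.main.conv.conv.weight",
}

def infer_model_type(state_dict: dict):
    """Return the model type whose marker key is present in the checkpoint."""
    for model_type, key in CHECKPOINT_KEY_NAMES.items():
        if key in state_dict:
            return model_type
    return None
```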
@@ -2198,3 +2204,250 @@ def swap_scale_shift(weight):
    )

    return converted_state_dict


def create_autoencoder_dc_config_from_original(original_config, checkpoint, **kwargs):
I think for new single file models let's not rely on the original configs anymore. This was for legacy support for the SD1.5/XL models with yaml configs. It's better to infer the diffusers config from the checkpoint and use that for loading.
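As a rough illustration of that approach (a hypothetical sketch, not the actual implementation): config fields can be read off tensor shapes in the checkpoint itself, here using the DC-AE decoder conv key that appears elsewhere in this PR and assuming its input channels equal the latent width.

```python
def infer_autoencoder_dc_config(state_dict: dict) -> dict:
    # Hypothetical sketch: derive config fields from tensor shapes rather
    # than from an original-format config file. We assume the decoder's
    # first conv takes the latent as input, so its in_channels (dim 1 of
    # the conv weight) is the number of latent channels.
    conv_in_weight = state_dict["decoder.stages.0.op_list.0.main.conv.conv.weight"]
    return {"latent_channels": conv_in_weight.shape[1]}
```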
This might be a little difficult here, so please lmk if you have any suggestions on what to do.
Some DC-AE checkpoints have the exact same structure and configuration, except for `scaling_factor`. For example, `dc-ae-f128c512-in-1.0-diffusers` and `dc-ae-f128c512-mix-1.0-diffusers` only differ in their scaling factor.
I'm unsure how we would determine this just by the model structure. Do we rely on the user passing it as a config correctly, and document this info somewhere?
I think that's fine, since in the snippet in the docs we're doing the same thing, just with `original_config` instead of `config`, right?
Updated usage to `config` now and verified that it works. Thank you for the fixes and suggestions!
Thanks for adding docs!
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
@lawrence-cj @chenjy2003 We have removed support for loading the original-format autoencoder because of some complications in this PR. @DN6 will take it up soon to add support correctly. Sorry for the delay! Just doing some final cleanup and will merge after.
What does this PR do?
This PR adds the official DC-AE (Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models) to the `diffusers` library. DC-AE is the first autoencoder able to compress images into a 32x, 64x, or even 128x latent space without performance degradation. It is also the autoencoder used by the powerful T2I base model SANA.

Paper: https://arxiv.org/abs/2410.10733v1
Original code repo: https://github.com/mit-han-lab/efficientvit/tree/master/applications/dc_ae
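For reference, the latent shapes implied by those compression ratios can be worked out from the `f<ratio>c<channels>` checkpoint naming (e.g. `dc-ae-f128c512` above). The helper below is ours, not part of the library; the (f, c) pairs are read off the released checkpoint names.

```python
# f = spatial compression factor per side, c = latent channels, taken from
# the checkpoint names: dc-ae-f32c32, dc-ae-f64c128, dc-ae-f128c512.
VARIANTS = {"f32c32": (32, 32), "f64c128": (64, 128), "f128c512": (128, 512)}

def latent_shape(height: int, width: int, variant: str) -> tuple:
    """Latent (channels, H, W) for an RGB image of the given size."""
    f, c = VARIANTS[variant]
    assert height % f == 0 and width % f == 0, "input must be divisible by f"
    return (c, height // f, width // f)

# A 1024x1024 image under 128x spatial compression becomes an 8x8 latent.
print(latent_shape(1024, 1024, "f128c512"))  # (512, 8, 8)
```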
Core contributor of DC-AE: working with @chenjy2003
Core library:
We want to collaborate on this PR together with friends from HF. Feel free to contact me here. Cc: @sayakpaul