Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

minor bug fix of sd model memory management #15350

Merged
merged 1 commit into from
Mar 30, 2024
Merged

minor bug fix of sd model memory management #15350

merged 1 commit into from
Mar 30, 2024

Conversation

gopricy
Copy link

@gopricy gopricy commented Mar 21, 2024

Description

  • fixed a bug that could cause sd_model not being sent to CPU as expected

Screenshots/videos:

Checklist:

@gopricy gopricy requested a review from AUTOMATIC1111 as a code owner March 21, 2024 22:44
@AUTOMATIC1111
Copy link
Owner

how to reproduce

@gopricy
Copy link
Author

gopricy commented Mar 24, 2024

Let me know if I am wrong here but the original code looks like a typo to me, we are sending sd_model to cpu multiple times in this for loop(sd_model is the current loaded model).
This pr applies the setting of sd_checkpoints_keep_in_cpu that all loaded models in gpu will be sent to cpu

def reuse_model_from_already_loaded(sd_model, checkpoint_info, timer):
    already_loaded = None
    for i in reversed(range(len(model_data.loaded_sd_models))):
        loaded_model = model_data.loaded_sd_models[i]
        ...
        if shared.opts.sd_checkpoints_keep_in_cpu:
            send_model_to_cpu(sd_model)
            timer.record("send model to cpu")```

@AUTOMATIC1111
Copy link
Owner

Well, while I agree that this shouldn't be done in a loop, the original logic was to send the current model, not all of them. I guess your situation is that we have like 5 models loaded, then we enable the sd_checkpoints_keep_in_cpu setting, and after that only one of them is sent to CPU?

In any case I'd rather not change the arguments of the function because there may be someone using it, and it kind of looks like things will break if send_model_to_cpu is called on the model that send_model_to_trash has been called on.

@gopricy
Copy link
Author

gopricy commented Mar 25, 2024

I guess your situation is that we have like 5 models loaded, then we enable the sd_checkpoints_keep_in_cpu setting, and after that only one of them is sent to CPU?

yea this is what I have seen, but it doesn't cause oom and will gradually put all models in CPU after more model switches so I think it is fine for most people

In any case I'd rather not change the arguments of the function because there may be someone using it, and it kind of looks like things will break if send_model_to_cpu is called on the model that send_model_to_trash has been called on.

good point, I reverted the argument change and moved send_model_to_cpu out of the loop

@AUTOMATIC1111
Copy link
Owner

This presents a problem of needlessly sending the model to CPU when it's the one we want to use.

@gopricy
Copy link
Author

gopricy commented Mar 25, 2024

updated again, let me know if it meets the requirement now

@AUTOMATIC1111 AUTOMATIC1111 merged commit 8bebfde into AUTOMATIC1111:dev Mar 30, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants