
[Bug]: Colab error while training hypernetwork #13080

Closed
1 task done
andrewontheb opened this issue Sep 5, 2023 · 5 comments · Fixed by #13084
Labels: bug (Report of a confirmed bug)

@andrewontheb

andrewontheb commented Sep 5, 2023

Is there an existing issue for this?

  • I have searched the existing issues and checked the recent builds/commits

What happened?

I've been using the fast-stable-diffusion notebook in Google Colab, and the training process stops at the step where an image should be saved to the textual_inversion log. I've already tried changing checkpoint models, training settings, and training materials.

Traceback (most recent call last):
File "/sd/stable-diffusion-webui/modules/hypernetworks/hypernetwork.py", line 701, in train_hypernetwork
p.sampler_name = sd_samplers.samplers[preview_sampler_index].name
TypeError: list indices must be integers or slices, not str
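
(The preview-sampler dropdown appears to pass the sampler's display name as a string, while this line still treats the value as a list index. A minimal stand-alone repro of the failure; SamplerData below is a hypothetical stand-in for the webui's sampler entries, of which only the .name attribute is taken from the traceback:)

    from collections import namedtuple

    # Hypothetical stand-in: the real sd_samplers.samplers is a list of
    # entries exposing a .name attribute.
    SamplerData = namedtuple("SamplerData", ["name"])
    samplers = [SamplerData("Euler a"), SamplerData("DDIM")]

    preview_sampler_index = "Euler a"  # the UI passes the name, not an int
    samplers[preview_sampler_index]    # TypeError: list indices must be integers or slices, not str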

Steps to reproduce the problem

  1. Go to ....
  2. Press ....
  3. ...
  4. ...

What should have happened?

A training progress image should have been saved to the log directory every N steps.

Sysinfo

What browsers do you use to access the UI?

Apple Safari

Console logs

Startup time: 38.3s (import torch: 15.0s, import gradio: 2.0s, setup paths: 4.0s, initialize shared: 0.5s, other imports: 3.3s, setup codeformer: 0.6s, setup gfpgan: 0.1s, list SD models: 0.2s, load scripts: 4.2s, create ui: 3.2s, gradio launch: 4.9s, add APIs: 0.2s).
9aba26abdfcd46073e0a1d42027a3a3bcc969f562d58a03637bf0a0ded6586c9
Loading weights [9aba26abdf] from /content/gdrive/MyDrive/sd/stable-diffusion-webui/models/Stable-diffusion/deliberate_v2.safetensors
Creating model from config: /content/gdrive/MyDrive/sd/stable-diffusion-webui/configs/v1-inference.yaml
Downloading (…)olve/main/vocab.json100% 961k/961k [00:00<00:00, 15.4MB/s]
Downloading (…)olve/main/merges.txt100% 525k/525k [00:00<00:00, 7.23MB/s]
Downloading (…)cial_tokens_map.json100% 389/389 [00:00<00:00, 1.92MB/s]
Downloading (…)okenizer_config.json100% 905/905 [00:00<00:00, 6.89MB/s]
Downloading (…)lve/main/config.json100% 4.52k/4.52k [00:00<00:00, 27.0MB/s]
Applying attention optimization: xformers... done.
Model loaded in 47.5s (calculate hash: 24.9s, load weights from disk: 1.3s, create model: 1.8s, apply weights to model: 15.5s, calculate empty prompt: 3.9s).
Preprocessing [Image 10/11]: 100% 11/11 [00:04<00:00,  2.63it/s]
Applying attention optimization: xformers... done.
*** Error completing request
*** Arguments: ('task(egtmx3f2ofk35e0)', '', '2e-5:800, 8e-6:1600, 5e-6', 1, 1, '/content/gdrive/MyDrive/sd/Processed', 'textual_inversion', 512, 512, False, 10000, 'disabled', '0.1', False, 0, 'once', False, 100, 100, 'custom.txt', True, 'a photo of a young woman ', '', 20, 'Euler a', 7, -1, 512, 512) {}
    Traceback (most recent call last):
      File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/modules/call_queue.py", line 57, in f
        res = list(func(*args, **kwargs))
      File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/modules/call_queue.py", line 36, in f
        res = func(*args, **kwargs)
      File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/modules/hypernetworks/ui.py", line 25, in train_hypernetwork
        hypernetwork, filename = modules.hypernetworks.hypernetwork.train_hypernetwork(*args)
      File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/modules/hypernetworks/hypernetwork.py", line 477, in train_hypernetwork
        textual_inversion.validate_train_inputs(hypernetwork_name, learn_rate, batch_size, gradient_step, data_root, template_file, template_filename, steps, save_hypernetwork_every, create_image_every, log_directory, name="hypernetwork")
      File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/modules/textual_inversion/textual_inversion.py", line 366, in validate_train_inputs
        assert model_name, f"{name} not selected"
    AssertionError: hypernetwork not selected
Calculating sha256 for /content/gdrive/MyDrive/sd/stable-diffusion-webui/models/hypernetworks/Joki .pt: ee9e9198f019b24aad90f2d54ee7b78f2e9f557763f594e1bb9cd1521fbf6dc7
Training at rate of 2e-05 until step 800
Preparing dataset...
100% 11/11 [00:02<00:00,  4.34it/s]
Training hypernetwork [Epoch 9: 1/11]loss: 0.0650765:   1% 99/10000 [01:14<1:52:48,  1.46it/s] *** Exception in training hypernetwork
    Traceback (most recent call last):
      File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/modules/hypernetworks/hypernetwork.py", line 701, in train_hypernetwork
        p.sampler_name = sd_samplers.samplers[preview_sampler_index].name
    TypeError: list indices must be integers or slices, not str

Additional information

No response

@andrewontheb andrewontheb added the bug-report Report of a bug, yet to be confirmed label Sep 5, 2023
@andrewontheb andrewontheb changed the title [Bug]: error while training hypernetwork [Bug]: Colab error while training hypernetwork Sep 5, 2023
@aria1th aria1th self-assigned this Sep 5, 2023
aria1th added a commit that referenced this issue Sep 5, 2023
Fixes sampler name reference

Same patch will be done for TI.
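
(The commit title points at the likely root cause: the dropdown now returns the sampler's display name, e.g. 'Euler a' from the arguments list above, while the training code still uses the value as a list index. A hedged sketch of the name-tolerant lookup such a fix presumably needs; names here are illustrative, not the actual patch:)

    # Illustrative sketch, not the merged patch: accept either the old integer
    # index or the name string the dropdown now returns.
    def resolve_preview_sampler_name(samplers, preview_sampler):
        if isinstance(preview_sampler, int):        # legacy behaviour: an index
            return samplers[preview_sampler].name
        names = [s.name for s in samplers]          # new behaviour: a name string
        return preview_sampler if preview_sampler in names else names[0]

(With something like this, train_hypernetwork would set p.sampler_name via the lookup instead of indexing sd_samplers.samplers directly; per the commit message, the same change would apply to the textual inversion path.)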
@aria1th
Collaborator

aria1th commented Sep 5, 2023

@andrewontheb Thank you for the bug report! The bug should be fixed by the patch.

Since we can't find the commit 9aba26abdfcd46073e0a1d42027a3a3bcc969f562d58a03637bf0a0ded6586c9, unfortunately I can't offer you a separate branch matched to your exact revision for testing.

You can try the fix on the branch
https://github.com/AUTOMATIC1111/stable-diffusion-webui/tree/fix-preview-while-generation

by running git switch fix-preview-while-generation in the repository path.

@catboxanon catboxanon added bug Report of a confirmed bug and removed bug-report Report of a bug, yet to be confirmed labels Sep 5, 2023
@catboxanon
Collaborator

9aba26abdfcd46073e0a1d42027a3a3bcc969f562d58a03637bf0a0ded6586c9 is a model hash, not a commit hash. Since no sysinfo was provided, it's probably safe to assume they're running v1.6.0.
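
(For context: hashes like this come from hashing the checkpoint file itself, as the "Calculating sha256 for ... .pt" log line above shows, so they can't be resolved as git commits. A minimal sketch of such a file hash, assuming plain sha256 over the file contents:)

    import hashlib

    def file_sha256(path):
        h = hashlib.sha256()
        with open(path, "rb") as f:
            # hash the file in 1 MiB chunks to keep memory use flat
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()  # 64 hex chars; the log shows the first 10 as [9aba26abdf]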

@illuculent

This is claimed to be fixed and closed, but I still get the issue. I have the latest update to auto1111. AUTOMATIC1111 deleted the fix-preview-while-generation branch 2 weeks ago; it appears the fix was merged and the branch then deleted, so I cannot see what the fix was.

@B-Hard

B-Hard commented Sep 24, 2023

Same here. Before an image gets saved to the log directory, it aborts the embedding or hypernetwork training:

*** Error training embedding
    Traceback (most recent call last):
      File "C:\AI\stable-diffusion-webui\modules\textual_inversion\textual_inversion.py", line 593, in train_embedding
        p.sampler_name = sd_samplers.samplers[preview_sampler_index].name
    TypeError: list indices must be integers or slices, not str

@B-Hard

B-Hard commented Sep 24, 2023

This fix solved the issue for me: de5bb4c. But the results are really bad, not usable, and barely recognizable as anything close to the reference material.

BUT when using the "Read parameters" (txt2img prompt) checkbox, it still aborts the whole operation when an image should be generated... :(
