Training datasets #93
Thanks for these models! ControlNet results with my 1.5 models were awesome, but I have trained so many 2.1 embeddings I'd love to use with this. |
I'm sure we could pool funds to train on some A100s, but the training data is the real problem. Can the data even be released? Are there legal issues? |
Training seems fast and not really expensive. |
Given the current complicated situation outside the research community, we refrain from disclosing more details about the data. Nevertheless, researchers may take a look at that dataset project everyone knows. |
Thanks @lllyasviel for your reply. Do you plan to train with SD 2.1? |
If I can help with funds, I'd be happy to. I'm disappointed in the current open landscape.
|
@notrydo the first step is having a dataset for the training. If you have 100-300K quality and varied images (512x512), it could be useful. If not, we will need to find a prompt dataset and generate them. (It takes roughly 24 hours to generate 40K images, so around 10 days to have the images; after that it will take a few days to BLIP-caption them and to generate the preprocessed versions.) |
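For reference, a minimal sketch of what that generate-then-caption pipeline could look like, assuming the source/target/prompt.json layout described in docs/train.md; the BLIP checkpoint name, paths, and Canny thresholds are illustrative choices, not settings from this thread:

```python
# Hypothetical dataset-building sketch: caption already-generated 512x512 images
# with BLIP and produce Canny control maps, writing one prompt.json entry per image.
import json
from pathlib import Path

import cv2
from PIL import Image
from transformers import BlipForConditionalGeneration, BlipProcessor

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
captioner = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

root = Path("training/my_dataset")  # illustrative path
(root / "source").mkdir(parents=True, exist_ok=True)
(root / "target").mkdir(parents=True, exist_ok=True)

with open(root / "prompt.json", "w") as f:
    for img_path in sorted((root / "raw").glob("*.png")):
        image = Image.open(img_path).convert("RGB")

        # BLIP caption becomes the text prompt.
        inputs = processor(image, return_tensors="pt")
        caption = processor.decode(captioner.generate(**inputs)[0], skip_special_tokens=True)

        # Canny edges become the control ("source") image; thresholds are a guess.
        gray = cv2.imread(str(img_path), cv2.IMREAD_GRAYSCALE)
        cv2.imwrite(str(root / "source" / img_path.name), cv2.Canny(gray, 100, 200))
        image.save(root / "target" / img_path.name)

        f.write(json.dumps({
            "source": f"source/{img_path.name}",
            "target": f"target/{img_path.name}",
            "prompt": caption,
        }) + "\n")
```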
Any additional specifications?
|
If it's known where we can purchase one, drop a link.
|
See also the updated last section of https://github.com/lllyasviel/ControlNet/blob/main/docs/train.md |
Don’t know.
|
@lllyasviel if we paid you for the training, could you do it for 2.1?
On Mon, Feb 20, 2023 at 19:58, lllyasviel ***@***.***> wrote:
I am currently training a sketch-to-image model on Waifu Diffusion 1.5 (which uses SD 2.1 v-prediction). I made a dataset of 1 million sketch-image pairs, and I'm training with a 50% unconditional chance (as in the paper). Here are the results so far at 150k samples seen:
https://user-images.githubusercontent.com/18043686/220213337-21c349b1-c39b-4095-94df-f032ec3c3e0d.png
https://user-images.githubusercontent.com/18043686/220213345-61279016-3d6c-4220-8227-f013728b6004.png
https://user-images.githubusercontent.com/18043686/220213350-9cf593cc-d9fa-4777-92bc-3e70b0c0f909.png
https://user-images.githubusercontent.com/18043686/220213353-0780274f-2bdf-44fd-a3ff-3a76b6d8c0d8.png
Anime models need a larger batch size and lower (or disabled) text dropping because their tags are dense.
|
@lllyasviel I'll make the changes to the unconditional dropping. I might copy over the "partial dropout" code from Waifu Diffusion training, where we train with a variable percentage of the prompt (a 50% chance to keep anywhere from 0% to 100% of the tags, and a 50% chance to keep 100%), except perhaps shifting the percentages so that there is only around a 30% chance of partial dropout. Very interesting about the "sudden converge" phenomenon; I've noticed it with normal Waifu Diffusion 1.5 training as well. I don't quite see how changing the gradient accumulation steps helps with this, though. Could you explain that part further? I'd love to talk about this more with you; is there a better way of contacting you (email, Discord)? |
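For illustration, a minimal sketch of that kind of partial tag dropout in Python (the function name, integration point, and default probabilities are assumptions; the percentages mirror the ones described in the comment above):

```python
import random

def partial_tag_dropout(prompt: str, partial_prob: float = 0.5) -> str:
    """Hypothetical 'partial dropout' for comma-separated tag prompts.

    With probability (1 - partial_prob) the prompt is kept intact; otherwise a
    random keep-fraction in [0, 1] is drawn and each tag survives with that
    probability, so anywhere from 0% to 100% of the tags remain.
    """
    if random.random() >= partial_prob:
        return prompt
    tags = [t.strip() for t in prompt.split(",") if t.strip()]
    keep_fraction = random.random()
    kept = [tag for tag in tags if random.random() < keep_fraction]
    return ", ".join(kept)

# Example: applied per sample inside a dataset's __getitem__, e.g.
#     prompt = partial_tag_dropout(prompt, partial_prob=0.3)
```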
Because that "sudden converge" always happens, let's say the "sudden converge" will happen at 3k steps and our money can pay for 90k optimization steps. Then we have two options: (1) train 3k steps, hit the sudden converge, then train 87k more steps; or (2) use 30x gradient accumulation, train 3k steps (90k real computation steps), then hit the sudden converge. In my experiments, (2) is usually better than (1). However, in real cases you may need to balance the steps before and after the "sudden converge" on your own. The training after the "sudden converge" is also important. |
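For context, a minimal sketch of how gradient accumulation raises the effective batch size, assuming the pytorch_lightning trainer used by the repo's tutorial scripts (the specific argument values are illustrative, not settings reported in this thread):

```python
# With a per-step batch size of 4 and accumulate_grad_batches=32, the effective
# batch size is 4 * 32 = 128, while GPU memory use stays at that of a batch of 4.
import pytorch_lightning as pl

trainer = pl.Trainer(
    gpus=1,
    precision=32,
    accumulate_grad_batches=32,  # illustrative value; tune so the effective batch
                                 # size is large enough around the "sudden converge"
)
# trainer.fit(model, dataloader)  # model and dataloader as in the training tutorial
```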
@lllyasviel Just read your edit, do you mean that after the "sudden converge", I should reduce my gradient accumulation steps? |
No. The batch size should not be reduced under any circumstances. In addition, we should always remember that we are not training layers from scratch; we are optimizing projections between existing layers. We are still fine-tuning an SD model. Any bad training practice that can make SD fine-tuning fail will also make ControlNet training fail. |
Just for the sake of reference ... is this the correct approach for grad_acc? |
When modifying the batch size or gradient accumulation, should I modify the learning rate? |
@lllyasviel can you share the hyperparameters you used for training, e.g. batch size, effective batch size, number of GPUs, number of worker nodes, learning rate, number of training steps, etc.? By effective batch size, I refer to this value |
@whydna it means that there is a 50% chance for the text prompt to be dropped (set to an empty string) during training, so that only the control image (the sketch in this case) is used. This forces the model not to rely too much on the text and to try to generate the entire image from the control image alone. |
@off99555 thanks for the explanation - makes sense. Is this achieved by just omitting prompts for 50% of the data set in prompts.json? Or is there some param to do it in the training function? |
It should be done dynamically in the code. I found example code in another repository that drops the text prompt 10% of the time, but I'm not sure where this piece of code exists in the ControlNet repo. |
The code doesn't exist in the ControlNet repo; you have to write it yourself. Also, I talked to the author, and he said that 50% is too high for sketch; it should be more like 0-10%. |
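Since the repo leaves this step to the user, here is a minimal sketch of dynamic prompt dropping inside a dataset's __getitem__ (this is not the snippet referenced above; the probability, function name, and jpg/txt/hint keys are assumptions modeled on the training tutorial's conventions):

```python
import random

PROMPT_DROP_PROB = 0.1  # placeholder; around 0-10% is suggested above for sketch

def maybe_drop_prompt(prompt: str, p: float = PROMPT_DROP_PROB) -> str:
    """With probability p, replace the text prompt with an empty string so the
    model has to reconstruct the image from the control input alone."""
    return "" if random.random() < p else prompt

# e.g. inside MyDataset.__getitem__:
#     prompt = maybe_drop_prompt(prompt)
#     return dict(jpg=target, txt=prompt, hint=source)
```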
@lllyasviel could you please share the details of each feature extractor, such as the thresholds used by canny(), mlsd(), and midas? |
Could you share the hyperparameters you used? What are the learning rate and effective batch size? |
First of all, thanks a lot for your work on this amazing tool!
Could you share the datasets used for training? With them, we could do the training on SD 2.1!