I am very interested in training a new ControlNet model. After studying the kandinsky-2-2-controlnet-depth model on Hugging Face, I found that its architecture seems to differ from the ControlNet used with traditional Stable Diffusion models.
As I understand it, the UNet in kandinsky-2-2-controlnet-depth is a modified version of the UNet in kandinsky-2-2-decoder: the "in_channels" parameter of conv_in has been changed to 8, and an additional module called "input_hint_block" has been added.
The weights and biases also differ completely from those of the kandinsky-2-2-decoder UNet.
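For reference, here is roughly how I compared the two configurations (a minimal sketch, assuming both checkpoints load as a diffusers UNet2DConditionModel from the kandinsky-community repos; the config values in the comments are what I expect to see):

```python
from diffusers import UNet2DConditionModel

# Load both UNets from the Hugging Face Hub.
decoder_unet = UNet2DConditionModel.from_pretrained(
    "kandinsky-community/kandinsky-2-2-decoder", subfolder="unet"
)
controlnet_unet = UNet2DConditionModel.from_pretrained(
    "kandinsky-community/kandinsky-2-2-controlnet-depth", subfolder="unet"
)

print(decoder_unet.config.in_channels)     # 4
print(controlnet_unet.config.in_channels)  # 8

# The extra hint pathway appears to come from the addition embedding:
# "image_hint" selects an ImageHintTimeEmbedding, whose input_hint_block
# downsamples the conditioning image to the latent resolution before it
# is concatenated with the noisy latents (hence in_channels = 8).
print(decoder_unet.config.addition_embed_type)     # "image"
print(controlnet_unet.config.addition_embed_type)  # "image_hint"
```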
My training approach is as follows: first, download the UNets from both kandinsky-2-2-controlnet-depth and kandinsky-2-2-decoder; then copy the parameters of the kandinsky-2-2-decoder UNet into the corresponding parameters of the kandinsky-2-2-controlnet-depth UNet, except for the parts whose structure differs.
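Concretely, I was planning to do the copying roughly like this (a sketch only; parameters whose names or shapes differ, i.e. conv_in and the input_hint_block, keep their original controlnet-depth values):

```python
# Continuing from the snippet above:
src_sd = decoder_unet.state_dict()
dst_sd = controlnet_unet.state_dict()

copied, skipped = 0, []
for name, tensor in src_sd.items():
    # Copy only parameters that exist in both UNets with identical shapes;
    # conv_in (8 vs. 4 input channels) and the hint embedding are skipped.
    if name in dst_sd and dst_sd[name].shape == tensor.shape:
        dst_sd[name] = tensor.clone()
        copied += 1
    else:
        skipped.append(name)

# Optionally, the first four input channels of conv_in could still be
# initialized from the decoder weights:
# dst_sd["conv_in.weight"][:, :4] = src_sd["conv_in.weight"]

controlnet_unet.load_state_dict(dst_sd)
print(f"copied {copied} tensors, skipped {len(skipped)}: {skipped[:3]} ...")
```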
Afterward, train this new UNet on the fill50k dataset.
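For the training loop itself, I imagine a single step would look roughly like this (again only a sketch: `latents`, `image_embeds`, and `hint` are placeholders that would come from the decoder's MoVQ encoder, the prior's CLIP image encoder, and the fill50k conditioning image, respectively):

```python
import torch
import torch.nn.functional as F
from diffusers import DDPMScheduler

scheduler = DDPMScheduler.from_pretrained(
    "kandinsky-community/kandinsky-2-2-decoder", subfolder="scheduler"
)

def training_step(unet, latents, image_embeds, hint):
    noise = torch.randn_like(latents)
    timesteps = torch.randint(
        0, scheduler.config.num_train_timesteps,
        (latents.shape[0],), device=latents.device,
    )
    noisy_latents = scheduler.add_noise(latents, noise, timesteps)
    # Kandinsky 2.2 routes both the image embedding and the hint through
    # added_cond_kwargs; encoder_hidden_states is not used directly.
    pred = unet(
        noisy_latents,
        timesteps,
        encoder_hidden_states=None,
        added_cond_kwargs={"image_embeds": image_embeds, "hint": hint},
    ).sample
    # The decoder UNet outputs noise plus a learned variance; only the
    # noise half enters the loss (assuming epsilon prediction).
    pred, _ = pred.split(latents.shape[1], dim=1)
    return F.mse_loss(pred.float(), noise.float())
```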
Are there any issues with this approach? I would greatly appreciate any help or suggestions.
Additionally, I was unable to find training code specifically for kandinsky-2-2-controlnet-depth. I would greatly appreciate a pointer to where it can be found, if it exists.