How ControlNet works
In this section, we will drill down into the ControlNet structure and see how it works internally.
ControlNet works by injecting additional conditions into the blocks of a neural network. As shown in Figure 13.6, the trainable copy is the ControlNet block that adds additional guidance to the original SD UNet block:
Figure 13.6: Adding ControlNet components
During the training stage, we take a copy of the target layer block as the ControlNet block. In Figure 13.6, it is denoted as a trainable copy. Unlike typical neural network initialization, where all parameters are drawn from Gaussian distributions, the ControlNet copy starts from the pre-trained weights of the Stable Diffusion base model. Most of the base model parameters are frozen (with the option to unfreeze them later), and only the additional ControlNet components are trained.
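The copy-and-freeze idea can be sketched in a few lines of PyTorch. This is a minimal illustration, not the actual ControlNet or Diffusers implementation. The `ControlledBlock` wrapper, the small convolutional stand-in for an SD UNet block, and the tensor shapes are all hypothetical. The zero-initialized 1x1 convolutions that connect the two branches are an assumption taken from the ControlNet paper and have not been introduced in the text so far.

```python
import copy

import torch
from torch import nn


class ControlledBlock(nn.Module):
    """Hypothetical wrapper: a frozen block plus a trainable copy of it."""

    def __init__(self, block: nn.Module, channels: int):
        super().__init__()
        # Copy first, so the copy starts from the same pre-trained weights
        # and its parameters still require gradients.
        self.trainable_copy = copy.deepcopy(block)

        # Freeze the original block (the "locked" base model weights).
        self.locked = block
        for param in self.locked.parameters():
            param.requires_grad_(False)

        # Assumption (from the ControlNet paper): zero-initialized 1x1
        # convolutions connect the condition and the copy to the frozen path.
        self.zero_in = self._zero_conv(channels)
        self.zero_out = self._zero_conv(channels)

    @staticmethod
    def _zero_conv(channels: int) -> nn.Conv2d:
        conv = nn.Conv2d(channels, channels, kernel_size=1)
        nn.init.zeros_(conv.weight)
        nn.init.zeros_(conv.bias)
        return conv

    def forward(self, x: torch.Tensor, c: torch.Tensor) -> torch.Tensor:
        # Frozen path: the original block sees only the input x.
        y = self.locked(x)
        # Control path: the trainable copy sees x plus the projected condition c.
        control = self.trainable_copy(x + self.zero_in(c))
        # Inject the control signal into the frozen block's output.
        return y + self.zero_out(control)


torch.manual_seed(0)

# Stand-in for a pre-trained SD UNet block (hypothetical, for illustration).
block = nn.Sequential(nn.Conv2d(4, 4, kernel_size=3, padding=1), nn.SiLU())
controlled = ControlledBlock(block, channels=4)

x = torch.randn(1, 4, 8, 8)  # feature map with a leading batch dimension
c = torch.randn(1, 4, 8, 8)  # condition already encoded to the same shape

out = controlled(x, c)
trainable = [name for name, p in controlled.named_parameters() if p.requires_grad]
print(trainable)
```

Because the connecting convolutions start at zero, the wrapped block initially returns exactly what the frozen block would return on its own, so training begins from the unmodified base model's behavior. Only the parameters of the trainable copy and the connecting convolutions appear in `trainable`.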
During training and inference, the input x is usually a three-dimensional tensor, x ∈ ℝ...