#controlnet

waynerad@diasp.org

"Funky AI-generated spiraling medieval village captivates social media."

Reminds me of M.C. Escher.

And check out the slideshow in the middle of the checkerboard medieval village.

The images are made with ControlNet. ControlNet allows you to provide, in addition to the text prompt, an additional image which will "condition" the generated image. If the condition is a simple black-and-white pattern, like a spiral or checkerboard, it turns out that the pattern shows up in the composition of the output.
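To give a sense of how simple the condition image can be, here is a minimal sketch in plain Python that builds a checkerboard as a grid of black and white pixels. The function names (checkerboard, to_pgm) and the sizes are my own illustrative choices, not anything from ControlNet itself, and I'm not claiming this is how the images in the article were made.

```python
def checkerboard(size=512, squares=8):
    """Return a size x size grid of pixel values: 255 (white) or 0 (black)."""
    cell = size // squares  # side length of one square, in pixels
    return [
        [255 if ((x // cell) + (y // cell)) % 2 == 0 else 0 for x in range(size)]
        for y in range(size)
    ]


def to_pgm(pixels):
    """Serialize the grid as a plain-text PGM file (a simple grayscale format)."""
    height, width = len(pixels), len(pixels[0])
    rows = "\n".join(" ".join(str(v) for v in row) for row in pixels)
    return f"P2\n{width} {height}\n255\n{rows}\n"


# A tiny 8x8 example with 4 squares per side, small enough to inspect by eye.
pattern = checkerboard(size=8, squares=4)
pgm_text = to_pgm(pattern)
```

Writing pgm_text to a file gives an image you could convert to PNG and hand to whatever ControlNet front end you use as the condition.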

How exactly ControlNet works I can't really explain to you. Looking through the paper, I can tell you that what it does is take a diffusion network and "lock down" the parameters. Then, on a block-by-block basis, it copies the blocks and makes the parameters on the copy changeable again. Diffusion networks are built largely from ResNet (residual network) "blocks", where a "block" is a group of layers of the same dimensions that function together.
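The "lock down, then copy" idea can be sketched in a few lines of plain Python. This is a toy under my own assumptions, not the paper's code: the Block class, its scale/shift parameters, the sgd_step helper, and the made-up gradients are all hypothetical stand-ins for real network blocks and a real optimizer.

```python
import copy


class Block:
    """Toy stand-in for one network block: just a scale and a shift."""

    def __init__(self, scale, shift):
        self.params = {"scale": scale, "shift": shift}
        self.trainable = True

    def __call__(self, x):
        return [self.params["scale"] * v + self.params["shift"] for v in x]


# Pretend these are the pretrained blocks of the diffusion network.
pretrained = [Block(2.0, 0.5), Block(-1.0, 1.0), Block(0.5, 0.0)]

# Copy block by block, so each copy starts from the pretrained values
# and is still trainable...
control_copy = [copy.deepcopy(b) for b in pretrained]

# ...then lock the originals.
for b in pretrained:
    b.trainable = False


def sgd_step(blocks, grads, lr=0.1):
    """Apply a gradient step, skipping any block that is locked."""
    for block, grad in zip(blocks, grads):
        if block.trainable:
            for name in block.params:
                block.params[name] -= lr * grad[name]


# Made-up gradients, only to show which set of parameters moves.
fake_grads = [{"scale": 1.0, "shift": 1.0} for _ in range(3)]
sgd_step(pretrained, fake_grads)    # no effect: originals are locked
sgd_step(control_copy, fake_grads)  # the copies change
```

After the two calls, the original blocks still hold their pretrained values while the copies have drifted away from them, which is the whole point: training can't damage what the original network already knows.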

While the original blocks are connected together in a sequence, the copied blocks are treated as "nested", with the output from the first blocks going back into the original network near the end, and the output from later blocks feeding back into the original network near the center. When the output is fed back in, it first goes through a convolution the inventors call a "zero convolution". This doesn't mean the area used for convolution has zero size, like you might think (it's a 1×1 convolution). No, the "zero" refers to the idea that the convolution weights and biases are initialized to all zeros, and grow from there to some optimized values over the course of training.
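Here is a small sketch of why starting at zero is not a dead end, written in plain Python with made-up numbers. It assumes the zero convolution is a 1x1 convolution (a per-pixel linear mix of channels) whose output is added to the locked block's output; the feature values, the upstream gradient, and the helper names are hypothetical.

```python
def conv1x1(x, weight, bias):
    """1x1 convolution over x[channel][pixel]: mixes channels at each pixel."""
    n_pixels = len(x[0])
    return [
        [sum(w * x[c][p] for c, w in enumerate(w_row)) + b for p in range(n_pixels)]
        for w_row, b in zip(weight, bias)
    ]


def zero_conv_params(channels):
    """Weights and biases all start at exactly zero."""
    return [[0.0] * channels for _ in range(channels)], [0.0] * channels


def add(a, b):
    """Element-wise sum of two [channel][pixel] feature maps."""
    return [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]


# Hypothetical features: 2 channels, 3 pixels each.
locked_out = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]   # from a locked block
copy_out = [[0.5, -1.0, 2.0], [1.5, 0.0, -2.0]]   # from the trainable copy

weight, bias = zero_conv_params(2)

# At initialization the zero convolution outputs all zeros, so feeding it
# back in leaves the locked network's features untouched.
combined = add(locked_out, conv1x1(copy_out, weight, bias))

# The gradient with respect to weight[o][c] is the sum over pixels of
# upstream[o][p] * copy_out[c][p]. It depends on the input, not on the
# current weight, so it is nonzero even though the weights are zero.
upstream = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]  # made-up loss gradient
grad_w = [
    [sum(upstream[o][p] * copy_out[c][p] for p in range(3)) for c in range(2)]
    for o in range(2)
]

lr = 0.1
weight = [[w - lr * g for w, g in zip(w_row, g_row)]
          for w_row, g_row in zip(weight, grad_w)]
```

So before any training the combined network behaves exactly like the original one, and after a single gradient step the weights have already moved off zero and the copy starts to have a say.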

Anyway, somehow this creates a tension between the original image and the additional "condition" image such that the locked-down network pushes to preserve the style and structure of the original image while the trainable copy pushes to match the pattern in the condition image, and the end result is a combination of both.

#solidstatelife #ai #computervision #genai #diffusionnetworks #controlnet