[memo]OminiControl: Minimal and Universal Control for Diffusion Transformer



This content originally appeared on DEV Community and was authored by Takara Taniguchi

Zhenxiong Tan, from Xinchao Wang's group at NUS

OminiControl is a novel approach to how image conditions are integrated into DiT.

Subjects200k, a large-scale dataset of image pairs.

Effective image control

Contribution

  • They propose OminiControl, a control framework for DiT models
  • Two key technical innovations
    • unified sequence processing
    • adaptive position encoding
  • Flexible attention bias mechanism → conditioning
  • Subjects200k, a large-scale dataset
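To make the position encoding and attention bias items a bit more concrete, here is a rough sketch in plain Python. It reflects my own reading rather than the authors' code: the function names, the sideways offset of one image width for non-aligned conditions, and the use of log(gamma) as an additive bias on the image/condition attention logits are all assumptions for illustration. Text tokens are left out to keep it short.

```python
import math

def position_indices(height, width, offset=(0, 0)):
    """Return (row, col) indices for a height x width token grid, shifted by offset."""
    dr, dc = offset
    return [(r + dr, c + dc) for r in range(height) for c in range(width)]

def condition_positions(height, width, spatially_aligned):
    """Reuse the image grid for aligned conditions; shift it sideways otherwise."""
    # Assumption: the shift equals one image width so the two grids do not overlap.
    offset = (0, 0) if spatially_aligned else (0, width)
    return position_indices(height, width, offset)

def attention_bias(n_image, n_cond, gamma):
    """Additive bias for attention logits over [image tokens, condition tokens].

    Pairs that cross between the image and condition groups get log(gamma);
    pairs inside the same group get 0. gamma = 1 leaves attention unchanged,
    gamma = 0 blocks the condition entirely.
    """
    n = n_image + n_cond
    log_gamma = math.log(gamma) if gamma > 0 else float("-inf")
    bias = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if (i < n_image) != (j < n_image):  # one index in each group
                bias[i][j] = log_gamma
    return bias

aligned = condition_positions(2, 2, spatially_aligned=True)
shifted = condition_positions(2, 2, spatially_aligned=False)
bias = attention_bias(n_image=2, n_cond=1, gamma=0.5)
```

With an aligned condition (for example an edge map) the condition tokens sit at the same positions as the image tokens, while a shifted grid keeps a subject image from being tied to fixed pixel locations. The bias matrix would be added to the attention logits before the softmax.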

Related works

Diffusion models

  • DiT
  • FLUX

Controllable generation

  • ControlNet
  • T2I-Adapter: lightweight adapter
  • UniControl
  • Mixture of experts

Methods

Preliminary

The Diffusion Transformer (DiT) architecture is used in FLUX.1, SD3, and PixArt.

Minimal architecture

Unified sequence processing
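As a note to myself on what "unified sequence processing" means, here is a minimal sketch: the condition image tokens are appended to the same sequence as the text tokens and noisy image tokens, and a single attention pass runs over all of them. This is a toy under stated assumptions, not the paper's implementation: the helper names are made up, the attention is single-head, and the query/key/value projections are replaced by the identity so no learned weights are needed.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def joint_attention(tokens):
    """Single-head self-attention with identity projections (toy assumption)."""
    dim = len(tokens[0])
    out = []
    for q in tokens:
        # Scaled dot product of this query against every token in the sequence.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, tokens)) for d in range(dim)])
    return out

def unified_sequence(text_tokens, image_tokens, condition_tokens):
    """Concatenate the three token groups into one sequence."""
    return text_tokens + image_tokens + condition_tokens

text = [[1.0, 0.0]]               # one text token
image = [[0.0, 1.0], [0.5, 0.5]]  # two noisy image tokens
cond = [[1.0, 1.0]]               # one condition image token

seq = unified_sequence(text, image, cond)
out = joint_attention(seq)

# Only the image slice would be decoded into the final picture.
image_out = out[len(text):len(text) + len(image)]
```

Because every token attends to every other token, the image tokens can read from the condition tokens directly, with no separate control branch.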

Experiment

OminiControl outperformed ControlNet.

Conclusion

OminiControl offers parameter-efficient image-conditioned control for DiT.

It requires only about 0.1% additional parameters, far fewer than traditional control methods.

Thoughts

It feels like this does for DiT what ControlNet does, I suppose.

