Stable Diffusion 3 has been released. What highlights are worth watching?


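Author: 小小将
Link: https://www.zhihu.com/question/645441220/answer/3405687724
Source: Zhihu
Copyright belongs to the author. For commercial reprints, contact the author for authorization; for non-commercial reprints, credit the source.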

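The biggest highlight of Stable Diffusion 3 is probably that it is built on the Diffusion Transformer. What a coincidence: Sora, the text-to-video model OpenAI just released, is also based on the Diffusion Transformer. Truly, Transformer is all you need!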

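In fact, the SDXL technical report also mentioned trying Transformers. The finding at the time was that Transformers brought no improvement, but the authors firmly believed that a larger Transformer with well-tuned hyperparameters would eventually work. So the move to a Transformer in Stable Diffusion 3 was foreshadowed, and it is not an attempt to ride the Sora hype. I suspect this is also why SDXL was not called SD3: SD3 was meant to use a brand-new architecture.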

Architecture: During the exploration stage of this work, we briefly experimented with transformer-based architectures such as UViT [16] and DiT [33], but found no immediate benefit. We remain, however, optimistic that a careful hyperparameter study will eventually enable scaling to much larger transformer-dominated architectures.

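In addition, Stable Diffusion 3 incorporates Flow Matching, proposed by Meta in 2022, to improve the diffusion model. Flow Matching makes training more efficient and stable, and it also supports faster sampling.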
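To make the Flow Matching idea a bit more concrete, here is a minimal sketch in plain Python. It assumes the straight-line (linear interpolation) path between noise and data, which is one common choice in the flow matching literature; it is not taken from the Stable Diffusion 3 code, and the function names (flow_matching_pair, flow_matching_loss, euler_sample) are made up for illustration. A real model would replace the oracle velocity with a trained network.

```python
import random


def flow_matching_pair(x_data, rng):
    """Build one training example, assuming a straight-line noise-to-data path."""
    noise = [rng.gauss(0.0, 1.0) for _ in x_data]  # x_0 drawn from N(0, 1)
    t = rng.random()                                # time in [0, 1)
    # Point on the straight line between noise (t=0) and data (t=1).
    x_t = [(1.0 - t) * n + t * d for n, d in zip(noise, x_data)]
    # Along a straight line the velocity is constant: data minus noise.
    target_velocity = [d - n for n, d in zip(noise, x_data)]
    return x_t, t, target_velocity, noise


def flow_matching_loss(predicted_velocity, target_velocity):
    """Mean squared error between predicted and target velocity."""
    pairs = zip(predicted_velocity, target_velocity)
    return sum((p - q) ** 2 for p, q in pairs) / len(target_velocity)


def euler_sample(velocity_fn, x_start, num_steps):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with Euler steps."""
    x = list(x_start)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        v = velocity_fn(x, i * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


rng = random.Random(0)
x_data = [1.0, -2.0, 0.5]  # toy "data" vector, purely illustrative
x_t, t, target_v, noise = flow_matching_pair(x_data, rng)

# Oracle velocity standing in for a trained network (an assumption for the demo).
sample = euler_sample(lambda x, t: target_v, noise, num_steps=4)
```

Because the target velocity is constant along a straight path, a handful of Euler steps is enough to travel from noise to data in this toy setting, which gives some intuition for why such paths lend themselves to faster sampling.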
