Simpler Diffusion: 1.5 FID on ImageNet512 with Pixel-space Diffusion

Abstract

Simpler Diffusion shows that pixel-space diffusion can compete with latent diffusion at high resolution. A compact recipe combining loss weighting, architecture simplification, and high-resolution scaling reaches an FID of 1.5 on ImageNet512.

Publication
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2025