
DiT: Replacing U-Net with Transformer Finally Made Scaling Laws Work (Sora Foundation)

U-Net shows diminishing returns when scaled up; DiT improves consistently with model size. A complete analysis of the architecture behind Sora.


DiT: Diffusion Transformer - A New Paradigm Beyond U-Net

TL;DR: DiT replaces the diffusion model's U-Net backbone with a Vision Transformer. Because scaling laws hold for this architecture, performance improves consistently as the model grows. It is the foundational technology behind Sora.
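To make the backbone swap concrete, here is a minimal sketch of a single DiT block with adaLN-Zero conditioning, the variant Peebles & Xie found to work best. This is an illustrative PyTorch implementation, not the official code; all names and sizes are assumptions for the example.

```python
# Minimal sketch of a DiT Transformer block with adaLN-Zero conditioning.
# The LayerNorm shift/scale and residual gates are regressed from the
# conditioning vector c (timestep + class embedding) instead of being
# learned as fixed affine parameters.
import torch
import torch.nn as nn

class DiTBlock(nn.Module):
    def __init__(self, dim: int, n_heads: int, mlp_ratio: int = 4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim, elementwise_affine=False)
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim, elementwise_affine=False)
        self.mlp = nn.Sequential(
            nn.Linear(dim, mlp_ratio * dim), nn.GELU(),
            nn.Linear(mlp_ratio * dim, dim),
        )
        # adaLN-Zero: 6 modulation vectors (shift/scale/gate for attention
        # and MLP), zero-initialized so the block starts as the identity.
        self.adaLN = nn.Sequential(nn.SiLU(), nn.Linear(dim, 6 * dim))
        nn.init.zeros_(self.adaLN[1].weight)
        nn.init.zeros_(self.adaLN[1].bias)

    def forward(self, x: torch.Tensor, c: torch.Tensor) -> torch.Tensor:
        # x: (B, N, D) patch tokens, c: (B, D) conditioning embedding
        shift1, scale1, gate1, shift2, scale2, gate2 = \
            self.adaLN(c).chunk(6, dim=-1)
        h = self.norm1(x) * (1 + scale1.unsqueeze(1)) + shift1.unsqueeze(1)
        x = x + gate1.unsqueeze(1) * self.attn(h, h, h, need_weights=False)[0]
        h = self.norm2(x) * (1 + scale2.unsqueeze(1)) + shift2.unsqueeze(1)
        x = x + gate2.unsqueeze(1) * self.mlp(h)
        return x

block = DiTBlock(dim=64, n_heads=4)
x = torch.randn(2, 16, 64)   # 16 latent patch tokens per image
c = torch.randn(2, 64)       # timestep/class conditioning
out = block(x, c)
print(out.shape)             # torch.Size([2, 16, 64])
# With zero-initialized adaLN, the block is the identity at initialization:
print(torch.allclose(out, x))  # True
```

Note the design choice the zero-init encodes: because every gate starts at zero, a freshly initialized deep DiT computes the identity function, which stabilizes training as depth grows and is part of why the architecture scales cleanly.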

1. Limitations of U-Net

1.1 Why U-Net?
