Video walkthrough · Talk

OmniDreams: a real-time generative world model for driving

A walkthrough of a talk on NVIDIA's OmniDreams, a causal autoregressive DiT (diffusion transformer) world model that closes the loop between policy and simulation and uses three-stage training to suppress error accumulation.

18:29 · 18 slides · Jun 27, 2026 · Watch on Bilibili

What the talk covers

Closed-loop core: the policy outputs actions, the world model renders the matching frames, and those frames feed back to the policy, round after round.
A causal autoregressive DiT with a KV cache makes generation real-time: each new frame reuses the cached attention state of earlier frames instead of re-denoising the whole sequence from scratch at every step.
Three-stage training (diffusion forcing → self-forcing → DMD distillation) suppresses the error accumulation that breaks long rollouts.
It supports scene editing and out-of-distribution object insertion, and is presented as a reliable replacement for reconstruction-based simulators in policy evaluation.
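
The closed loop and the role of the cache can be illustrated with a toy sketch. Nothing here comes from OmniDreams itself: `ToyWorldModel`, `toy_policy`, and `closed_loop_rollout` are hypothetical names, frames are single floats, and a plain list stands in for the KV cache. The only point is the control flow: the policy acts on the latest frame, the model generates the next frame from that action, and only the newest frame is encoded at each step.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ToyWorldModel:
    """Hypothetical stand-in for a causal autoregressive world model."""
    cache: List[float] = field(default_factory=list)  # plays the role of a KV cache
    encoded: int = 0  # how many frames have been encoded so far

    def step(self, frame: float, action: float) -> float:
        # Encode only the newest frame; earlier frames are read from the cache.
        self.cache.append(frame)
        self.encoded += 1
        context = sum(self.cache) / len(self.cache)
        # Toy dynamics (an assumption): blend frame and context, then apply the action.
        return 0.5 * frame + 0.5 * context + action


def toy_policy(frame: float) -> float:
    """Hypothetical policy: push the observed value toward zero."""
    return -0.1 * frame


def closed_loop_rollout(first_frame: float, steps: int) -> Tuple[List[float], int]:
    model = ToyWorldModel()
    frames = [first_frame]
    for _ in range(steps):
        action = toy_policy(frames[-1])                 # policy sees the latest frame
        frames.append(model.step(frames[-1], action))   # model generates the next one
    return frames, model.encoded


frames, encoded = closed_loop_rollout(1.0, steps=5)
print(len(frames), encoded)
```

In this toy, five steps encode five frames in total. A variant without the cache that re-encoded the full history at every step would encode 1 + 2 + 3 + 4 + 5 = 15, which is the kind of saving the KV cache is meant to provide.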