Video walkthrough · Talk

OmniDreams: a real-time generative world model for driving

18:29 · 18 slides · Jun 27, 2026

What the talk covers

  • Closed-loop core: the policy outputs actions, the world model renders the matching frames, and those frames feed back to the policy — round after round.
  • A causal autoregressive diffusion transformer (DiT) with a KV cache makes generation real-time: each step reuses cached context instead of re-denoising from scratch.
  • Three-stage training (diffusion forcing → self-forcing → distribution matching distillation, DMD) suppresses the error accumulation that breaks long rollouts.
  • Supports scene editing and out-of-distribution object insertion, and reliably replaces reconstruction-based simulators for policy evaluation.
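
The closed-loop structure in the first point can be sketched in a few lines of Python. This is only an illustration under loud assumptions: `ToyWorldModel`, `toy_policy`, and `closed_loop_rollout` are hypothetical names, frames are short lists of numbers rather than images, and the arithmetic is a placeholder for the real policy and world model. The `cache` list only mimics the role of a KV cache (one entry appended per new frame, nothing recomputed); it is not how OmniDreams implements it.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Frame = List[float]  # toy stand-in for an image: a short list of numbers


@dataclass
class ToyWorldModel:
    """Hypothetical stand-in for the world model: one new frame per action.

    The cache holds one entry per past frame, so each step only processes
    the newest frame. This mirrors the role of a KV cache in a causal model.
    """
    cache: List[float] = field(default_factory=list)
    frames_processed: int = 0

    def step(self, frame: Frame, action: float) -> Frame:
        # Encode only the newest frame and append it to the cache.
        self.cache.append(sum(frame) / len(frame))
        self.frames_processed += 1
        # The next frame depends on the cached history and on the action.
        context = sum(self.cache) / len(self.cache)
        return [0.5 * x + 0.5 * context + action for x in frame]


def toy_policy(frame: Frame) -> float:
    """Hypothetical policy: push the mean frame value toward zero."""
    return -0.1 * sum(frame) / len(frame)


def closed_loop_rollout(
    first_frame: Frame, num_steps: int
) -> Tuple[List[Frame], List[float], ToyWorldModel]:
    """Alternate policy and world model, feeding each new frame back in."""
    model = ToyWorldModel()
    frame = first_frame
    frames, actions = [first_frame], []
    for _ in range(num_steps):
        action = toy_policy(frame)         # policy acts on the latest frame
        frame = model.step(frame, action)  # world model renders the result
        actions.append(action)
        frames.append(frame)
    return frames, actions, model


frames, actions, model = closed_loop_rollout([1.0, 2.0, 3.0], num_steps=5)
```

Each generated frame becomes the policy's next input, so any error the world model makes is fed back into later steps. That feedback is why the error accumulation mentioned in the third point matters for long rollouts.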