Video walkthrough · Talk
OmniDreams: a real-time generative world model for driving
What the talk covers
- Closed-loop core: the policy outputs actions, the world model renders the matching frames, and those frames feed back to the policy — round after round.
- A causal autoregressive DiT with a KV cache makes generation real-time: each new frame reuses cached attention keys and values rather than re-denoising the whole sequence from scratch at every step.
- Three-stage training (diffusion forcing → self-forcing → DMD distillation) suppresses the error accumulation that breaks long rollouts.
- Supports scene editing and out-of-distribution object insertion, and is presented as a reliable replacement for reconstruction-based simulators in policy evaluation.
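
The closed loop in the first bullet can be sketched in a few lines. This is a toy illustration under stated assumptions, not the talk's implementation: `ToyWorldModel`, `toy_policy`, and `closed_loop_rollout` are hypothetical names, frames are two-element lists rather than images, and the `cache` list only mimics the role of a KV cache (state kept from earlier steps so they are never recomputed).

```python
from dataclasses import dataclass, field
from typing import Callable, List

Frame = List[float]  # stand-in for an image tensor (assumption)


@dataclass
class ToyWorldModel:
    """Hypothetical stand-in for a generative world model."""

    # Plays the role of the KV cache: past frames are stored once and reused.
    cache: List[Frame] = field(default_factory=list)

    def step(self, action: float) -> Frame:
        # Condition only on cached state; earlier frames are not regenerated.
        prev = self.cache[-1] if self.cache else [0.0, 0.0]
        frame = [prev[0] + action, prev[1] + 1.0]
        self.cache.append(frame)
        return frame


def toy_policy(frame: Frame) -> float:
    """Hypothetical policy: move the first component toward a target of 1.0."""
    return 0.5 * (1.0 - frame[0])


def closed_loop_rollout(
    model: ToyWorldModel,
    policy: Callable[[Frame], float],
    first_frame: Frame,
    steps: int,
) -> List[Frame]:
    """Alternate policy and world model: action -> frame -> action -> ..."""
    model.cache.append(first_frame)
    frame = first_frame
    frames = [frame]
    for _ in range(steps):
        action = policy(frame)      # the policy reacts to the latest frame
        frame = model.step(action)  # the world model produces the matching frame
        frames.append(frame)        # that frame feeds the next round
    return frames


model = ToyWorldModel()
frames = closed_loop_rollout(model, toy_policy, [0.0, 0.0], steps=3)
```

The point of the sketch is the data flow: every generated frame is both an output and the next policy input, so any error in a frame propagates forward. That is the failure mode the three-stage training in the third bullet is said to suppress.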