Video walkthrough · Talk

GE-Sim 2.0: a closed-loop video world simulator for manipulation

12:52 · 11 slides · Jul 1, 2026

What the talk covers

  • Body-state expert — reads joint angles and gripper state straight out of the video latent space.
  • World judge — a VLM that automatically scores task success as reward.
  • Pixel-aligned action conditioning — draws end-effector trajectories into the image for cross-robot control.
  • Together they form a policy-act → world-sim → reward-score flywheel, so policies can be evaluated and improved at scale before they touch real hardware.
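
The loop described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the GE-Sim 2.0 implementation. Every name here (`toy_world_sim`, `toy_judge`, `make_policy`, `rollout`, `evaluate`) is hypothetical, the "frame" is a single number standing in for a video latent, and the judge is a threshold check standing in for a VLM. The point is only the shape of the loop: the policy acts, the simulator predicts the next frame, and the judge turns the finished rollout into a reward used to compare policies.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Frame = List[float]   # toy stand-in for a video frame or latent (assumption)
Action = float        # toy stand-in for an end-effector command (assumption)


@dataclass
class Rollout:
    frames: List[Frame]
    actions: List[Action]
    reward: float


def toy_world_sim(frame: Frame, action: Action) -> Frame:
    """Hypothetical simulator step: predict the next frame from frame + action."""
    return [frame[0] + action]


def toy_judge(frames: List[Frame], goal: float = 1.0, tol: float = 0.1) -> float:
    """Hypothetical judge: reward 1.0 if the final frame is within tol of the goal."""
    return 1.0 if abs(frames[-1][0] - goal) <= tol else 0.0


def make_policy(gain: float) -> Callable[[Frame], Action]:
    """Toy proportional policy that moves toward the goal state 1.0."""
    def policy(frame: Frame) -> Action:
        return gain * (1.0 - frame[0])
    return policy


def rollout(policy, world_sim, judge, start: Frame, steps: int) -> Rollout:
    """One closed-loop episode: policy-act -> world-sim, then reward-score."""
    frames, actions = [start], []
    for _ in range(steps):
        action = policy(frames[-1])                       # policy acts on latest frame
        actions.append(action)
        frames.append(world_sim(frames[-1], action))      # simulator predicts next frame
    return Rollout(frames, actions, judge(frames))        # judge scores the whole episode


def evaluate(gains: List[float], steps: int = 5) -> Dict[float, float]:
    """Score several candidate policies entirely in simulation."""
    return {
        g: rollout(make_policy(g), toy_world_sim, toy_judge, [0.0], steps).reward
        for g in gains
    }


scores = evaluate([0.1, 0.5])
best = max(scores, key=scores.get)   # candidate to keep or improve further
```

In this toy setup the low-gain policy does not reach the goal within five steps and the higher-gain one does, so the judge's reward alone is enough to rank them without any real-hardware trial.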