
WorldEval: World Model as Real-World Robot Policies Evaluator

25 May 2025
Yaxuan Li
Yichen Zhu
Junjie Wen
Chaomin Shen
Yi Xu
Main: 9 pages, 10 figures, 4 tables
Bibliography: 7 pages
Appendix: 4 pages
Abstract

The field of robotics has made significant strides toward developing generalist robot manipulation policies. However, evaluating these policies in real-world scenarios remains time-consuming and challenging, particularly as the number of tasks scales and environmental conditions change. In this work, we demonstrate that world models can serve as a scalable, reproducible, and reliable proxy for real-world robot policy evaluation. A key challenge is getting world models to generate policy videos that faithfully reflect the robot's actions. We observe that directly inputting robot actions or using high-dimensional encoding methods often fails to produce action-following videos. To address this, we propose Policy2Vec, a simple yet effective approach that turns a video generation model into a world simulator that follows latent actions to generate robot videos. We then introduce WorldEval, an automated pipeline designed to evaluate real-world robot policies entirely online. WorldEval effectively ranks different robot policies, as well as individual checkpoints within a single policy, and functions as a safety detector to prevent dangerous actions by newly developed robot models. Through comprehensive paired evaluations of manipulation policies in real-world environments, we demonstrate a strong correlation between policy performance in WorldEval and in real-world scenarios. Furthermore, our method significantly outperforms popular alternatives such as real-to-sim approaches.
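
The abstract's claim of a strong correlation between WorldEval and real-world performance could be checked along the following lines. This is a minimal sketch and not the authors' evaluation code: it assumes each policy has one success rate measured in the world model and one measured on the real robot, and it uses Pearson and Spearman coefficients as plausible choices of metric. The function names and all numbers below are hypothetical placeholders, not results from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(xs), ranks(ys))


# HYPOTHETICAL placeholder success rates for four policies (not from the paper).
world_model_success = [0.20, 0.45, 0.60, 0.85]
real_world_success = [0.25, 0.40, 0.70, 0.80]

if __name__ == "__main__":
    print("Pearson: ", pearson(world_model_success, real_world_success))
    print("Spearman:", spearman(world_model_success, real_world_success))
```

Pearson measures how closely the two sets of success rates follow a linear relationship, while Spearman only asks whether the proxy orders the policies the same way the real robot does, which is the property that matters when the goal is ranking policies or checkpoints.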

@article{li2025_2505.19017,
  title={WorldEval: World Model as Real-World Robot Policies Evaluator},
  author={Yaxuan Li and Yichen Zhu and Junjie Wen and Chaomin Shen and Yi Xu},
  journal={arXiv preprint arXiv:2505.19017},
  year={2025}
}