Learning to See and Act: Task-Aware View Planning for Robotic Manipulation

7 August 2025
Yongjie Bai
Zhouxia Wang
Rahul Gupta
Weixing Chen
Ziliang Chen
Mingtong Dai
Yongsen Zheng
Lingbo Liu
Guanbin Li
Liang Lin
arXiv: 2508.05186 (abs) · PDF · HTML · GitHub (5,049★)
Main: 8 pages · 8 figures · 9 tables · Bibliography: 7 pages · Appendix: 3 pages
Abstract

Recent vision-language-action (VLA) models for multi-task robotic manipulation commonly rely on static viewpoints and shared visual encoders, which limit 3D perception and cause task interference, hindering robustness and generalization. In this work, we propose Task-Aware View Planning (TAVP), a framework designed to overcome these challenges by integrating active view planning with task-specific representation learning. TAVP employs an efficient exploration policy, accelerated by a novel pseudo-environment, to actively acquire informative views. Furthermore, we introduce a Mixture-of-Experts (MoE) visual encoder to disentangle features across different tasks, boosting both representation fidelity and task generalization. By learning to see the world in a task-aware way, TAVP generates more complete and discriminative visual representations, demonstrating significantly enhanced action prediction across a wide array of manipulation challenges. Extensive experiments on RLBench tasks show that our proposed TAVP model achieves superior performance over state-of-the-art fixed-view approaches. Visual results and code are provided at: this https URL.
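To make the task-aware encoding idea concrete, below is a minimal, hypothetical sketch of a task-conditioned Mixture-of-Experts visual encoder in PyTorch. It is not the authors' implementation: all class names, feature sizes, and the top-k routing scheme are illustrative assumptions, and the actively planned views are represented only as pre-pooled feature vectors.

```python
# Minimal sketch (not the paper's code) of a task-conditioned MoE visual encoder:
# a router scores experts from a task embedding so that features for different
# manipulation tasks are routed to different experts. Sizes are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TaskAwareMoEEncoder(nn.Module):
    def __init__(self, feat_dim=256, task_dim=64, num_experts=4, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Each expert is a small MLP over pooled per-view visual features.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(feat_dim, feat_dim), nn.GELU(),
                          nn.Linear(feat_dim, feat_dim))
            for _ in range(num_experts)
        ])
        # Router produces expert logits from the task/instruction embedding.
        self.router = nn.Linear(task_dim, num_experts)

    def forward(self, visual_feats, task_emb):
        # visual_feats: (B, feat_dim) features pooled from the planned views
        # task_emb:     (B, task_dim) embedding of the task instruction
        gate_logits = self.router(task_emb)                       # (B, E)
        topk_vals, topk_idx = gate_logits.topk(self.top_k, dim=-1)
        weights = F.softmax(topk_vals, dim=-1)                    # (B, K)
        out = torch.zeros_like(visual_feats)
        for k in range(self.top_k):
            idx = topk_idx[:, k]                                  # expert id per sample
            w = weights[:, k].unsqueeze(-1)                       # (B, 1)
            # Dispatch each sample to its k-th selected expert.
            expert_out = torch.stack(
                [self.experts[int(e)](f) for e, f in zip(idx, visual_feats)]
            )
            out = out + w * expert_out
        return out

# Usage sketch: encode fused multi-view features per task, then feed an action head.
enc = TaskAwareMoEEncoder()
feats = torch.randn(8, 256)          # assumed pooled multi-view features
task = torch.randn(8, 64)            # assumed task/instruction embedding
task_aware_feats = enc(feats, task)  # (8, 256) task-disentangled representation
```

The sparse top-k routing keeps per-task specialization cheap: only a few experts run per sample, while the task-dependent gate is what separates feature subspaces across tasks.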
