2206.11795
Cited By
Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos
23 June 2022
Bowen Baker
Ilge Akkaya
Peter Zhokhov
Joost Huizinga
Jie Tang
Adrien Ecoffet
Brandon Houghton
Raul Sampedro
Jeff Clune
OffRL
Papers citing "Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos"
50 / 77 papers shown
Title
Video-Enhanced Offline Reinforcement Learning: A Model-Based Approach
Minting Pan
Yitao Zheng
Jiajian Li
Yunbo Wang
Xiaokang Yang
OffRL
48
0
0
10 May 2025
CLAM: Continuous Latent Action Models for Robot Learning from Unlabeled Demonstrations
Anthony Liang
Pavel Czempin
Matthew Hong
Yutai Zhou
Erdem Biyik
Stephen Tu
47
0
0
08 May 2025
Learning to Drive Anywhere with Model-Based Reannotation
Noriaki Hirose
Lydia Ignatova
Kyle Stachowicz
Catherine Glossop
Sergey Levine
Dhruv Shah
24
0
0
08 May 2025
ViSA-Flow: Accelerating Robot Skill Learning via Large-Scale Video Semantic Action Flow
Changhe Chen
Quantao Yang
Xiaohao Xu
Nima Fazeli
Olov Andersson
26
0
0
02 May 2025
Learning to Drive from a World Model
Mitchell Goff
Greg Hogan
George Hotz
Armand du Parc Locmaria
Kacper Raczy
Harald Schäfer
Adeeb Shihadeh
Weixing Zhang
Yassine Yousfi
39
0
0
27 Apr 2025
Generative AI in Embodied Systems: System-Level Analysis of Performance, Efficiency and Scalability
Zishen Wan
Jiayi Qian
Yuhang Du
Jason J. Jabbour
Yilun Du
Yang Katie Zhao
A. Raychowdhury
Tushar Krishna
Vijay Janapa Reddi
LM&Ro
91
0
0
26 Apr 2025
Collaborating Action by Action: A Multi-agent LLM Framework for Embodied Reasoning
Isadora White
Kolby Nottingham
Ayush Maniar
Max Robinson
Hansen Lillemark
Mehul Maheshwari
Lianhui Qin
Prithviraj Ammanabrolu
LLMAG
LM&Ro
115
0
0
24 Apr 2025
Scaling Video-Language Models to 10K Frames via Hierarchical Differential Distillation
Chuanqi Cheng
Jian Guan
Wei Wu
Rui Yan
VLM
52
0
0
03 Apr 2025
AdaWorld: Learning Adaptable World Models with Latent Actions
Shenyuan Gao
Siyuan Zhou
Yilun Du
Jun Zhang
Chuang Gan
VGen
62
4
0
24 Mar 2025
GR00T N1: An Open Foundation Model for Generalist Humanoid Robots
Nvidia
Johan Bjorck
Fernando Castañeda
Nikita Cherniadev
Xingye Da
...
Ao Zhang
Hao Zhang
Yizhou Zhao
Ruijie Zheng
Yuke Zhu
VLM
68
25
0
18 Mar 2025
VaViM and VaVAM: Autonomous Driving through Video Generative Modeling
Florent Bartoccioni
Elias Ramzi
Victor Besnier
Shashanka Venkataramanan
Tuan-Hung Vu
...
Mickael Chen
Éloi Zablocki
Andrei Bursuc
Eduardo Valle
Matthieu Cord
VGen
86
1
0
24 Feb 2025
AgentStudio: A Toolkit for Building General Virtual Agents
Longtao Zheng
Zhiyuan Huang
Zhenghai Xue
Xinrun Wang
Bo An
Shuicheng Yan
88
14
0
17 Feb 2025
Motion Tracks: A Unified Representation for Human-Robot Transfer in Few-Shot Imitation Learning
Juntao Ren
Priya Sundaresan
Dorsa Sadigh
Sanjiban Choudhury
Jeannette Bohg
37
15
0
13 Jan 2025
Sample-efficient Unsupervised Policy Cloning from Ensemble Self-supervised Labeled Videos
Xin Liu
Yaran Chen
Haoran Li
SSL
94
0
0
14 Dec 2024
Grounding Video Models to Actions through Goal Conditioned Exploration
Yunhao Luo
Yilun Du
LM&Ro
VGen
85
1
0
11 Nov 2024
SPOT: SE(3) Pose Trajectory Diffusion for Object-Centric Manipulation
Cheng-Chun Hsu
Bowen Wen
Jie Xu
Yashraj S. Narang
Xiaolong Wang
Yuke Zhu
Joydeep Biswas
Stan Birchfield
DiffM
41
8
0
01 Nov 2024
Latent Action Pretraining from Videos
Seonghyeon Ye
Joel Jang
Byeongguk Jeon
Sejune Joo
Jianwei Yang
...
Kimin Lee
J. Gao
Luke Zettlemoyer
Dieter Fox
Minjoon Seo
35
28
0
15 Oct 2024
VideoAgent: Self-Improving Video Generation
Achint Soni
Sreyas Venkataraman
Abhranil Chandra
Sebastian Fischmeister
Percy Liang
Bo Dai
Sherry Yang
LM&Ro
VGen
58
7
0
14 Oct 2024
Imitation Learning with Limited Actions via Diffusion Planners and Deep Koopman Controllers
Jianxin Bi
Kelvin Lim
Kaiqi Chen
Yifei Huang
Harold Soh
37
0
0
10 Oct 2024
Open-World Reinforcement Learning over Long Short-Term Imagination
Jiajian Li
Q. Wang
Yunbo Wang
Xin Jin
Yang Li
Wenjun Zeng
Xiaokang Yang
OCL
VLM
62
1
0
04 Oct 2024
Game On: Towards Language Models as RL Experimenters
Jingwei Zhang
Thomas Lampe
A. Abdolmaleki
Jost Tobias Springenberg
Martin Riedmiller
LM&Ro
36
0
0
05 Sep 2024
MAPF-GPT: Imitation Learning for Multi-Agent Pathfinding at Scale
Anton Andreychuk
Konstantin Yakovlev
Aleksandr I. Panov
A. Skrynnik
AI4CE
65
3
0
29 Aug 2024
Agent-E: From Autonomous Web Navigation to Foundational Design Principles in Agentic Systems
Tamer Abuelsaad
Deepak Akkil
Prasenjit Dey
Ashish Jagmohan
Aditya Vempaty
Ravi Kokku
46
23
0
17 Jul 2024
Aligning Agents like Large Language Models
Adam Jelley
Yuhan Cao
Dave Bignell
Sam Devlin
Tabish Rashid
LM&Ro
44
1
0
06 Jun 2024
Reward Machines for Deep RL in Noisy and Uncertain Environments
Andrew C. Li
Zizhao Chen
Toryn Q. Klassen
Pashootan Vaezipoor
Rodrigo Toro Icarte
Sheila A. McIlraith
48
6
0
31 May 2024
Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability
Shenyuan Gao
Jiazhi Yang
Li Chen
Kashyap Chitta
Yihang Qiu
Andreas Geiger
Jun Zhang
Hongyang Li
71
75
0
27 May 2024
LARM: Large Auto-Regressive Model for Long-Horizon Embodied Intelligence
Zhuoling Li
Xiaogang Xu
Zhenhua Xu
Sernam Lim
Hengshuang Zhao
LM&Ro
51
2
0
27 May 2024
A Survey on Vision-Language-Action Models for Embodied AI
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
LM&Ro
82
43
0
23 May 2024
Rank2Reward: Learning Shaped Reward Functions from Passive Video
Daniel Yang
Davin Tjia
Jacob Berg
Dima Damen
Pulkit Agrawal
Abhishek Gupta
OffRL
40
5
0
23 Apr 2024
OPEx: A Component-Wise Analysis of LLM-Centric Agents in Embodied Instruction Following
Haochen Shi
Zhiyuan Sun
Xingdi Yuan
Marc-Alexandre Côté
Bang Liu
LLMAG
40
10
0
05 Mar 2024
ELA: Exploited Level Augmentation for Offline Learning in Zero-Sum Games
Shiqi Lei
Kanghoon Lee
Linjing Li
Jinkyoo Park
Jiachen Li
OffRL
31
1
0
28 Feb 2024
VLN-Video: Utilizing Driving Videos for Outdoor Vision-and-Language Navigation
Jialu Li
Aishwarya Padmakumar
Gaurav Sukhatme
Mohit Bansal
29
6
0
05 Feb 2024
QUAR-VLA: Vision-Language-Action Model for Quadruped Robots
Pengxiang Ding
Han Zhao
Wenxuan Song
Zhitao Wang
Zhenyu Wei
Shangke Lyu
Ningxi Yang
Donglin Wang
32
19
0
22 Dec 2023
Learning to Act without Actions
Dominik Schmidt
Minqi Jiang
OffRL
34
31
0
17 Dec 2023
MP5: A Multi-modal Open-ended Embodied System in Minecraft via Active Perception
Yiran Qin
Enshen Zhou
Qichang Liu
Zhen-fei Yin
Lu Sheng
Ruimao Zhang
Yu Qiao
Jing Shao
LM&Ro
32
39
0
12 Dec 2023
BEDD: The MineRL BASALT Evaluation and Demonstrations Dataset for Training and Benchmarking Agents that Solve Fuzzy Tasks
Stephanie Milani
Anssi Kanervisto
Karolis Ramanauskas
Sander Schulhoff
Brandon Houghton
Rohin Shah
23
7
0
05 Dec 2023
Visual Encoders for Data-Efficient Imitation Learning in Modern Video Games
Lukas Schäfer
Logan Jones
Anssi Kanervisto
Yuhan Cao
Tabish Rashid
Raluca Georgescu
David Bignell
Siddhartha Sen
Andrea Trevino Gavito
Sam Devlin
90
3
0
04 Dec 2023
MM-VID: Advancing Video Understanding with GPT-4V(ision)
Kevin Qinghong Lin
Faisal Ahmed
Linjie Li
Chung-Ching Lin
E. Azarnasab
...
Lin Liang
Zicheng Liu
Yumao Lu
Ce Liu
Lijuan Wang
MLLM
28
63
0
30 Oct 2023
TD-MPC2: Scalable, Robust World Models for Continuous Control
Nicklas Hansen
Hao Su
Xiaolong Wang
MU
32
127
0
25 Oct 2023
Mastering Robot Manipulation with Multimodal Prompts through Pretraining and Multi-task Fine-tuning
Jiachen Li
Qiaozi Gao
Michael Johnston
Xiaofeng Gao
Xuehai He
Suhaila Shakiah
Hangjie Shi
R. Ghanadan
William Y. Wang
LM&Ro
27
12
0
14 Oct 2023
Quilt-1M: One Million Image-Text Pairs for Histopathology
Wisdom O. Ikezogwo
M. S. Seyfioglu
Fatemeh Ghezloo
Dylan Stefan Chan Geva
Fatwir Sheikh Mohammed
Pavan Kumar Anand
Ranjay Krishna
Linda G. Shapiro
CLIP
VLM
139
114
0
20 Jun 2023
Behavioral Cloning via Search in Embedded Demonstration Dataset
Federico Malato
Florian Leopold
Ville Hautamaki
Andrew Melnik
OffRL
27
3
0
15 Jun 2023
Thought Cloning: Learning to Think while Acting by Imitating Human Thinking
Shengran Hu
Jeff Clune
LM&Ro
OffRL
LRM
AI4CE
35
27
0
01 Jun 2023
Pre-training Contextualized World Models with In-the-wild Videos for Reinforcement Learning
Jialong Wu
Haoyu Ma
Chao Deng
Mingsheng Long
OffRL
31
25
0
29 May 2023
Inverse Dynamics Pretraining Learns Good Representations for Multitask Imitation
David Brandfonbrener
Ofir Nachum
Joan Bruna
AI4CE
26
21
0
26 May 2023
Adaptive Policy Learning to Additional Tasks
Wenjian Hao
Zehui Lu
Zihao Liang
Tianyu Zhou
Shaoshuai Mou
29
0
0
24 May 2023
Policy Learning based on Deep Koopman Representation
Wenjian Hao
Paulo Heredia
Bowen Huang
Zehui Lu
Zihao Liang
Shaoshuai Mou
36
1
0
24 May 2023
Barkour: Benchmarking Animal-level Agility with Quadruped Robots
Ken Caluwaerts
Atil Iscen
J. Kew
Wenhao Yu
Tingnan Zhang
...
J. Seto
Carolina Parada
Vikas Sindhwani
Vincent Vanhoucke
Jie Tan
27
59
0
24 May 2023
Think Before You Act: Unified Policy for Interleaving Language Reasoning with Actions
Lina Mezghani
Piotr Bojanowski
Alahari Karteek
Sainbayar Sukhbaatar
LM&Ro
OffRL
LRM
21
8
0
18 Apr 2023
Reinforcement Learning from Passive Data via Latent Intentions
Dibya Ghosh
Chethan Bhateja
Sergey Levine
OffRL
26
43
0
10 Apr 2023