Latent Action Pretraining from Videos

15 October 2024
Seonghyeon Ye
Joel Jang
Byeongguk Jeon
Sejune Joo
Jianwei Yang
Baolin Peng
Ajay Mandlekar
Reuben Tan
Yu-Wei Chao
Bill Yuchen Lin
Lars Liden
Kimin Lee
J. Gao
Luke Zettlemoyer
Dieter Fox
Minjoon Seo

Papers citing "Latent Action Pretraining from Videos"

50 / 80 papers shown
Bigger, Regularized, Categorical: High-Capacity Value Functions are Efficient Multi-Task Learners
Michal Nauman
Marek Cygan
Carmelo Sferrazza
Aviral Kumar
Pieter Abbeel
OffRL
57
0
0
29 May 2025
Learning Generalizable Robot Policy with Human Demonstration Video as a Prompt
Xiang Zhu
Yichen Liu
Hezhong Li
Jianyu Chen
99
0
0
27 May 2025
DreamGen: Unlocking Generalization in Robot Learning through Video World Models
Joel Jang
Seonghyeon Ye
Zongyu Lin
Jiannan Xiang
Johan Bjorck
...
Dieter Fox
Jan Kautz
Scott Reed
Yuke Zhu
Linxi Fan
VGen, OffRL, AI4TS
80
0
0
19 May 2025
UniSkill: Imitating Human Videos via Cross-Embodiment Skill Representations
Hanjung Kim
Jaehyun Kang
Hyolim Kang
Meedeum Cho
Seon Joo Kim
Youngwoon Lee
81
0
0
13 May 2025
DexWild: Dexterous Human Interactions for In-the-Wild Robot Policies
Tony Tao
Mohan Kumar Srirama
Jason Jingzhou Liu
Kenneth Shaw
Deepak Pathak
71
1
0
12 May 2025
UniVLA: Learning to Act Anywhere with Task-centric Latent Actions
Qingwen Bu
Yanting Yang
Jisong Cai
Shenyuan Gao
Guanghui Ren
Maoqing Yao
Ping Luo
Hongyang Li
373
7
0
09 May 2025
Benchmarking Vision, Language, & Action Models in Procedurally Generated, Open Ended Action Environments
Pranav Guruprasad
Yangyue Wang
Sudipta Chowdhury
Harshvardhan Sikka
Paul Pu Liang
LM&Ro, VLM
407
1
0
08 May 2025
CLAM: Continuous Latent Action Models for Robot Learning from Unlabeled Demonstrations
Anthony Liang
Pavel Czempin
Matthew Hong
Yutai Zhou
Erdem Biyik
Stephen Tu
138
1
0
08 May 2025
Crossing the Human-Robot Embodiment Gap with Sim-to-Real RL using One Human Demonstration
Tyler Ga Wei Lum
Olivia Y. Lee
C. Karen Liu
Jeannette Bohg
86
1
0
17 Apr 2025
Unified World Models: Coupling Video and Action Diffusion for Pretraining on Large Robotic Datasets
Chuning Zhu
Raymond Yu
S. Feng
Benjamin Burchfiel
Paarth Shah
Abhishek Gupta
VGen
115
6
0
03 Apr 2025
Slot-Level Robotic Placement via Visual Imitation from Single Human Video
Dandan Shan
Kaichun Mo
Wei Yang
Yu-Wei Chao
David Fouhey
Dieter Fox
Arsalan Mousavian
83
0
0
02 Apr 2025
ZeroMimic: Distilling Robotic Manipulation Skills from Web Videos
Junyao Shi
Zhuolun Zhao
Tianyou Wang
Ian Pedroza
Amy Luo
Jie Wang
Jason Ma
Dinesh Jayaraman
LM&Ro
93
0
0
31 Mar 2025
AdaWorld: Learning Adaptable World Models with Latent Actions
Shenyuan Gao
Siyuan Zhou
Yilun Du
Jun Zhang
Chuang Gan
VGen
138
5
0
24 Mar 2025
GR00T N1: An Open Foundation Model for Generalist Humanoid Robots
Nvidia
Johan Bjorck
Fernando Castañeda
Nikita Cherniadev
Xingye Da
...
Ao Zhang
Hao Zhang
Yizhou Zhao
Ruijie Zheng
Yuke Zhu
VLM
152
52
0
18 Mar 2025
AirExo-2: Scaling up Generalizable Robotic Imitation Learning with Low-Cost Exoskeletons
Hongjie Fang
Chenxi Wang
Yiming Wang
J. Chen
Shangning Xia
...
Xinyu Zhan
Lixin Yang
Weiming Wang
Cewu Lu
Hao-Shu Fang
152
2
0
05 Mar 2025
Magma: A Foundation Model for Multimodal AI Agents
Jianwei Yang
Reuben Tan
Qianhui Wu
Ruijie Zheng
Baolin Peng
...
Seonghyeon Ye
Joel Jang
Yuquan Deng
Lars Liden
Jianfeng Gao
VLM, AI4TS
146
16
0
18 Feb 2025
Pre-training Auto-regressive Robotic Models with 4D Representations
Dantong Niu
Yuvan Sharma
Haoru Xue
Giscard Biamby
Junyi Zhang
Ziteng Ji
Trevor Darrell
Roei Herzig
139
1
0
18 Feb 2025
Object-Centric Latent Action Learning
Albina Klepach
Alexander Nikulin
Ilya Zisman
Denis Tarasov
Alexander Derevyagin
Andrei Polubarov
Nikita Lyubaykin
Vladislav Kurenkov
113
0
0
13 Feb 2025
PlaySlot: Learning Inverse Latent Dynamics for Controllable Object-Centric Video Prediction and Planning
Angel Villar-Corrales
Sven Behnke
200
4
0
11 Feb 2025
Future Research Avenues for Artificial Intelligence in Digital Gaming: An Exploratory Report
Markus Dablander
144
0
0
18 Dec 2024
Towards Generalist Robot Policies: What Matters in Building Vision-Language-Action Models
Xinghang Li
Peiyan Li
Minghuan Liu
Dong Wang
Jirong Liu
Bingyi Kang
Xiao Ma
Tao Kong
Hanbo Zhang
Huaping Liu
LM&Ro
141
25
0
18 Dec 2024
Gen2Act: Human Video Generation in Novel Scenarios enables Generalizable Robot Manipulation
Homanga Bharadhwaj
Debidatta Dwibedi
Abhinav Gupta
Shubham Tulsiani
Carl Doersch
Ted Xiao
Dhruv Shah
Fei Xia
Dorsa Sadigh
Sean Kirmani
VGen, LM&Ro
90
37
0
24 Sep 2024
MotIF: Motion Instruction Fine-tuning
Minyoung Hwang
Joey Hejna
Dorsa Sadigh
Yonatan Bisk
82
1
0
16 Sep 2024
Robot Learning as an Empirical Science: Best Practices for Policy Evaluation
H. Kress-Gazit
Kunimatsu Hashimoto
Naveen Kuppuswamy
Paarth Shah
Phoebe Horgan
Gordon Richardson
Siyuan Feng
Benjamin Burchfiel
52
5
0
14 Sep 2024
Diffusion Models Are Real-Time Game Engines
Dani Valevski
Yaniv Leviathan
Moab Arar
Shlomi Fruchter
DiffM, VGen, AI4CE
96
79
0
27 Aug 2024
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
Charlie Snell
Jaehoon Lee
Kelvin Xu
Aviral Kumar
LRM
182
681
0
06 Aug 2024
HRP: Human Affordances for Robotic Pre-Training
Mohan Kumar Srirama
Sudeep Dasari
Shikhar Bahl
Abhinav Gupta
81
16
0
26 Jul 2024
QueST: Self-Supervised Skill Abstractions for Learning Continuous Control
Atharva Mete
Haotian Xue
Albert Wilcox
Yongxin Chen
Animesh Garg
SSL
104
22
0
22 Jul 2024
Robotic Control via Embodied Chain-of-Thought Reasoning
Michał Zawalski
William Chen
Karl Pertsch
Oier Mees
Chelsea Finn
Sergey Levine
LRM, LM&Ro
123
76
0
11 Jul 2024
EquiBot: SIM(3)-Equivariant Diffusion Policy for Generalizable and Data Efficient Learning
Jingyun Yang
Zi-ang Cao
Congyue Deng
Rika Antonova
Shuran Song
Jeannette Bohg
DiffM
92
36
0
01 Jul 2024
LLaRA: Supercharging Robot Learning Data for Vision-Language Policy
Xiang Li
Cristina Mata
J. Park
Kumara Kahatapitiya
Yoo Sung Jang
...
Kanchana Ranasinghe
R. Burgert
Mu Cai
Yong Jae Lee
Michael S. Ryoo
LM&Ro
112
29
0
28 Jun 2024
Dreamitate: Real-World Visuomotor Policy Learning via Video Generation
Junbang Liang
Ruoshi Liu
Ege Ozguroglu
Sruthi Sudhakar
Achal Dave
P. Tokmakov
Shuran Song
Carl Vondrick
VGen
76
29
0
24 Jun 2024
LLARVA: Vision-Action Instruction Tuning Enhances Robot Learning
Dantong Niu
Yuvan Sharma
Giscard Biamby
Jerome Quenum
Yutong Bai
Baifeng Shi
Trevor Darrell
Roei Herzig
LM&Ro, VLM
91
27
0
17 Jun 2024
OpenVLA: An Open-Source Vision-Language-Action Model
Moo Jin Kim
Karl Pertsch
Siddharth Karamcheti
Ted Xiao
Ashwin Balakrishna
...
Russ Tedrake
Dorsa Sadigh
Sergey Levine
Percy Liang
Chelsea Finn
LM&Ro, VLM
235
517
0
13 Jun 2024
Learning Manipulation by Predicting Interaction
Jia Zeng
Qingwen Bu
Bangjun Wang
Wenke Xia
Li Chen
...
Heming Cui
Bin Zhao
Xuelong Li
Yu Qiao
Hongyang Li
107
26
0
01 Jun 2024
Vision-based Manipulation from Single Human Video with Open-World Object Graphs
Yifeng Zhu
Arisrei Lim
Peter Stone
Yuke Zhu
85
38
0
30 May 2024
Octo: An Open-Source Generalist Robot Policy
Octo Model Team
Dibya Ghosh
Homer Walke
Karl Pertsch
Kevin Black
...
Quan Vuong
Ted Xiao
Dorsa Sadigh
Chelsea Finn
Sergey Levine
189
424
0
20 May 2024
Chameleon: Mixed-Modal Early-Fusion Foundation Models
Chameleon Team
MLLM
186
309
0
16 May 2024
Evaluating Real-World Robot Manipulation Policies in Simulation
Xuanlin Li
Kyle Hsu
Jiayuan Gu
Karl Pertsch
Oier Mees
...
Jiajun Wu
Chelsea Finn
Hao Su
Q. Vuong
Ted Xiao
OffRL
90
77
0
09 May 2024
Track2Act: Predicting Point Tracks from Internet Videos enables Diverse Zero-shot Robot Manipulation
Homanga Bharadhwaj
Roozbeh Mottaghi
Abhinav Gupta
Shubham Tulsiani
3DPC
120
19
0
02 May 2024
What Foundation Models can Bring for Robot Learning in Manipulation : A Survey
Dingzhe Li
Yixiang Jin
A. Yong
Hongze Yu
Jun Shi
Xiaoshuai Hao
Peng Hao
Huaping Liu
Gang Hua
Bin Fang
AI4CE, LM&Ro
152
14
0
28 Apr 2024
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
Marah Abdin
Sam Ade Jacobs
A. A. Awan
J. Aneja
Ahmed Hassan Awadallah
...
Li Zhang
Yi Zhang
Yue Zhang
Yunan Zhang
Xiren Zhou
LRM, ALM
139
1,249
0
22 Apr 2024
Behavior Generation with Latent Actions
Seungjae Lee
Yibin Wang
Haritheja Etukuru
H. J. Kim
Mahi Shafiullah
Lerrel Pinto
VGen, OffRL
97
79
0
05 Mar 2024
RT-H: Action Hierarchies Using Language
Suneel Belkhale
Tianli Ding
Ted Xiao
P. Sermanet
Quan Vuong
Jonathan Tompson
Yevgen Chebotar
Debidatta Dwibedi
Dorsa Sadigh
LM&Ro
91
87
0
04 Mar 2024
Video as the New Language for Real-World Decision Making
Sherry Yang
Jacob Walker
Jack Parker-Holder
Yilun Du
Jake Bruce
Andre Barreto
Pieter Abbeel
Dale Schuurmans
VGen
109
55
0
27 Feb 2024
Genie: Generative Interactive Environments
Jake Bruce
Michael Dennis
Ashley D. Edwards
Jack Parker-Holder
Yuge Shi
...
Konrad Zolna
Jeff Clune
Nando de Freitas
Satinder Singh
Tim Rocktäschel
VGen, VLM
142
171
0
23 Feb 2024
Any-point Trajectory Modeling for Policy Learning
Chuan Wen
Xingyu Lin
John So
Kai-xiang Chen
Qi Dou
Yang Gao
Pieter Abbeel
PINN, VGen
105
98
0
28 Dec 2023
Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation
Hongtao Wu
Ya Jing
Chi-Hou Cheang
Guangzeng Chen
Jiafeng Xu
Xinghang Li
Minghuan Liu
Hang Li
Tao Kong
111
108
0
20 Dec 2023
Learning to Act without Actions
Dominik Schmidt
Minqi Jiang
OffRL
101
37
0
17 Dec 2023
H-GAP: Humanoid Control with a Generalist Planner
Zhengyao Jiang
Yingchen Xu
Nolan Wagener
Yicheng Luo
Michael Janner
Edward Grefenstette
Tim Rocktäschel
Yuandong Tian
AI4CE
84
6
0
05 Dec 2023