ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2411.19650
  4. Cited By
CogACT: A Foundational Vision-Language-Action Model for Synergizing
  Cognition and Action in Robotic Manipulation

CogACT: A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation

29 November 2024
Qixiu Li
Yaobo Liang
Zeyu Wang
Lin Luo
Xi Chen
Mozheng Liao
Fangyun Wei
Yu Deng
Sicheng Xu
Yize Zhang
Xiaofan Wang
Bei Liu
Jianlong Fu
Jianmin Bao
Dong Chen
Yuanchun Shi
Jiaolong Yang
B. Guo
    LM&Ro
ArXiv (abs)PDFHTML

Papers citing "CogACT: A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation"

26 / 26 papers shown
Title
From Grounding to Manipulation: Case Studies of Foundation Model Integration in Embodied Robotic Systems
From Grounding to Manipulation: Case Studies of Foundation Model Integration in Embodied Robotic Systems
Xiuchao Sui
Daiying Tian
Qi Sun
Ruirui Chen
Dongkyu Choi
Kenneth Kwok
Soujanya Poria
LM&Ro
104
0
0
21 May 2025
A0: An Affordance-Aware Hierarchical Model for General Robotic Manipulation
A0: An Affordance-Aware Hierarchical Model for General Robotic Manipulation
Rongtao Xu
Junxuan Zhang
Minghao Guo
Youpeng Wen
H. Yang
...
Liqiong Wang
Yuxuan Kuang
Meng Cao
Feng Zheng
Xiaodan Liang
125
4
0
17 Apr 2025
MoLe-VLA: Dynamic Layer-skipping Vision Language Action Model via Mixture-of-Layers for Efficient Robot Manipulation
MoLe-VLA: Dynamic Layer-skipping Vision Language Action Model via Mixture-of-Layers for Efficient Robot Manipulation
Rongyu Zhang
Menghang Dong
Yuan Zhang
Liang Heng
Xiaowei Chi
Gaole Dai
Li Du
Dan Wang
Yuan Du
MoE
146
4
0
26 Mar 2025
HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model
HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model
Jiaming Liu
Hao Chen
Pengju An
Zhuoyang Liu
Renrui Zhang
...
Chengkai Hou
Mengdi Zhao
KC alex Zhou
Pheng-Ann Heng
Shanghang Zhang
165
19
0
13 Mar 2025
RoboMIND: Benchmark on Multi-embodiment Intelligence Normative Data for Robot Manipulation
RoboMIND: Benchmark on Multi-embodiment Intelligence Normative Data for Robot Manipulation
Kun Wu
Chengkai Hou
Jiaming Liu
Zhengping Che
Xiaozhu Ju
...
Zhenyu Wang
Pengju An
Siyuan Qian
Shanghang Zhang
Jian Tang
LM&Ro
210
23
0
18 Dec 2024
RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation
RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation
Songming Liu
Lingxuan Wu
Bangguo Li
Hengkai Tan
Huayu Chen
Zhengyi Wang
Ke Xu
Hang Su
Jun Zhu
125
122
0
10 Oct 2024
TinyVLA: Towards Fast, Data-Efficient Vision-Language-Action Models for Robotic Manipulation
TinyVLA: Towards Fast, Data-Efficient Vision-Language-Action Models for Robotic Manipulation
Junjie Wen
Yinlin Zhu
Jinming Li
Minjie Zhu
Kun Wu
...
Ran Cheng
Yaxin Peng
Chaomin Shen
Feifei Feng
Jian Tang
LM&Ro
137
68
0
19 Sep 2024
Multi-Stage Cable Routing through Hierarchical Imitation Learning
Multi-Stage Cable Routing through Hierarchical Imitation Learning
Jianlan Luo
Charles Xu
Xinyang Geng
Gilbert Feng
Kuan Fang
L. Tan
S. Schaal
Sergey Levine
108
58
0
18 Jul 2023
Train Offline, Test Online: A Real Robot Learning Benchmark
Train Offline, Test Online: A Real Robot Learning Benchmark
G. Zhou
Victoria Dean
Mohan Kumar Srirama
Aravind Rajeswaran
Jyothish Pari
...
Tianhe Yu
Pieter Abbeel
Lerrel Pinto
Chelsea Finn
Abhi Gupta
OffRL
119
40
0
01 Jun 2023
FurnitureBench: Reproducible Real-World Benchmark for Long-Horizon
  Complex Manipulation
FurnitureBench: Reproducible Real-World Benchmark for Long-Horizon Complex Manipulation
Minho Heo
Youngwoon Lee
Doohyun Lee
Joseph J. Lim
93
96
0
22 May 2023
Visual Instruction Tuning
Visual Instruction Tuning
Haotian Liu
Chunyuan Li
Qingyang Wu
Yong Jae Lee
SyDaVLMMLLM
569
4,910
0
17 Apr 2023
Sigmoid Loss for Language Image Pre-Training
Sigmoid Loss for Language Image Pre-Training
Xiaohua Zhai
Basil Mustafa
Alexander Kolesnikov
Lucas Beyer
CLIPVLM
251
1,200
0
27 Mar 2023
GPT-4 Technical Report
GPT-4 Technical Report
OpenAI OpenAI
OpenAI Josh Achiam
Steven Adler
Sandhini Agarwal
Lama Ahmad
...
Shengjia Zhao
Tianhao Zheng
Juntang Zhuang
William Zhuk
Barret Zoph
LLMAGMLLM
1.5K
14,748
0
15 Mar 2023
Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
Cheng Chi
Zhenjia Xu
S. Feng
Eric A. Cousineau
Yilun Du
Benjamin Burchfiel
Russ Tedrake
Shuran Song
349
1,231
0
07 Mar 2023
Scalable Diffusion Models with Transformers
Scalable Diffusion Models with Transformers
William S. Peebles
Saining Xie
GNN
118
2,434
0
19 Dec 2022
Learning and Retrieval from Prior Data for Skill-based Imitation
  Learning
Learning and Retrieval from Prior Data for Skill-based Imitation Learning
Soroush Nasiriany
Tian Gao
Ajay Mandlekar
Yuke Zhu
SSL
93
50
0
20 Oct 2022
Interactive Language: Talking to Robots in Real Time
Interactive Language: Talking to Robots in Real Time
Corey Lynch
Ayzaan Wahid
Jonathan Tompson
Tianli Ding
James Betker
Robert Baruch
Travis Armstrong
Peter R. Florence
LM&Ro
96
229
0
12 Oct 2022
Latent Plans for Task-Agnostic Offline Reinforcement Learning
Latent Plans for Task-Agnostic Offline Reinforcement Learning
Erick Rosete-Beas
Oier Mees
Gabriel Kalweit
Joschka Boedecker
Wolfram Burgard
OffRL
106
85
0
19 Sep 2022
Classifier-Free Diffusion Guidance
Classifier-Free Diffusion Guidance
Jonathan Ho
Tim Salimans
FaML
196
3,963
0
26 Jul 2022
R3M: A Universal Visual Representation for Robot Manipulation
R3M: A Universal Visual Representation for Robot Manipulation
Suraj Nair
Aravind Rajeswaran
Vikash Kumar
Chelsea Finn
Abhi Gupta
LM&Ro
101
587
0
23 Mar 2022
BC-Z: Zero-Shot Task Generalization with Robotic Imitation Learning
BC-Z: Zero-Shot Task Generalization with Robotic Imitation Learning
Eric Jang
A. Irpan
Mohi Khansari
Daniel Kappler
F. Ebert
Corey Lynch
Sergey Levine
Chelsea Finn
LM&Ro
263
549
0
04 Feb 2022
Bottom-Up Skill Discovery from Unsegmented Demonstrations for
  Long-Horizon Robot Manipulation
Bottom-Up Skill Discovery from Unsegmented Demonstrations for Long-Horizon Robot Manipulation
Yifeng Zhu
Peter Stone
Yuke Zhu
112
63
0
28 Sep 2021
Improved Denoising Diffusion Probabilistic Models
Improved Denoising Diffusion Probabilistic Models
Alex Nichol
Prafulla Dhariwal
DiffM
352
3,716
0
18 Feb 2021
Denoising Diffusion Implicit Models
Denoising Diffusion Implicit Models
Jiaming Song
Chenlin Meng
Stefano Ermon
VLMDiffM
292
7,492
0
06 Oct 2020
Language Models are Few-Shot Learners
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
880
42,379
0
28 May 2020
Neural Discrete Representation Learning
Neural Discrete Representation Learning
Aaron van den Oord
Oriol Vinyals
Koray Kavukcuoglu
BDLSSLOCL
230
5,071
0
02 Nov 2017
1