Play to the Score: Stage-Guided Dynamic Multi-Sensory Fusion for Robotic Manipulation

2 August 2024
Runze Yuan
Tao Liu
Wenke Ma
Xuelong Li
arXiv: 2408.01366

Papers citing "Play to the Score: Stage-Guided Dynamic Multi-Sensory Fusion for Robotic Manipulation"

34 / 34 papers shown
AnyTouch: Learning Unified Static-Dynamic Representation across Multiple Visuo-tactile Sensors
Ruoxuan Feng
Jiangyu Hu
Wenke Xia
Tianci Gao
Ao Shen
Yuhao Sun
Bin Fang
Di Hu
97
9
0
15 Feb 2025
Visual-auditory Extrinsic Contact Estimation
Xili Yi
Jayjun Lee
Nima Fazeli
86
3
0
22 Sep 2024
Depth Helps: Improving Pre-trained RGB-based Policy with Depth Information Injection
Xincheng Pang
Wenke Xia
Zhigang Wang
Bin Zhao
Di Hu
Dong Wang
Xuelong Li
101
4
0
09 Aug 2024
RT-H: Action Hierarchies Using Language
Suneel Belkhale
Tianli Ding
Ted Xiao
P. Sermanet
Quon Vuong
Jonathan Tompson
Yevgen Chebotar
Debidatta Dwibedi
Dorsa Sadigh
LM&Ro
106
89
0
04 Mar 2024
Kinematic-aware Prompting for Generalizable Articulated Object Manipulation with LLMs
Wenke Xia
Dong Wang
Xincheng Pang
Zhigang Wang
Bin Zhao
Di Hu
Xuelong Li
LM&Ro
83
21
0
06 Nov 2023
RoboVQA: Multimodal Long-Horizon Reasoning for Robotics
P. Sermanet
Tianli Ding
Jeffrey Zhao
Fei Xia
Debidatta Dwibedi
...
Pannag R Sanketi
Karol Hausman
Izhak Shafran
Brian Ichter
Yuan Cao
LM&Ro
107
54
0
01 Nov 2023
Enhancing multimodal cooperation via sample-level modality valuation
Yake Wei
Ruoxuan Feng
Zihe Wang
Di Hu
44
16
0
12 Sep 2023
Multi-Stage Cable Routing through Hierarchical Imitation Learning
Jianlan Luo
Charles Xu
Xinyang Geng
Gilbert Feng
Kuan Fang
L. Tan
S. Schaal
Sergey Levine
127
58
0
18 Jul 2023
Provable Dynamic Fusion for Low-Quality Multimodal Data
Qingyang Zhang
Haitao Wu
Changqing Zhang
Qinghua Hu
Huazhu Fu
Qiufeng Wang
Xi Peng
103
62
0
03 Jun 2023
Chain-of-Thought Predictive Control
Zhiwei Jia
Vineet Thumuluri
Fangchen Liu
Ling-Hao Chen
Zhiao Huang
H. Su
LM&Ro
106
20
0
03 Apr 2023
Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
Cheng Chi
Zhenjia Xu
S. Feng
Eric A. Cousineau
Yilun Du
Benjamin Burchfiel
Russ Tedrake
Shuran Song
349
1,242
0
07 Mar 2023
Visual Language Maps for Robot Navigation
Chen Huang
Oier Mees
Andy Zeng
Wolfram Burgard
LM&Ro
247
369
0
11 Oct 2022
Instruction-driven history-aware policies for robotic manipulations
Pierre-Louis Guhur
Shizhe Chen
Ricardo Garcia Pinel
Makarand Tapaswi
Ivan Laptev
Cordelia Schmid
LM&Ro
168
108
0
11 Sep 2022
Inner Monologue: Embodied Reasoning through Planning with Language Models
Wenlong Huang
F. Xia
Ted Xiao
Harris Chan
Jacky Liang
...
Tomas Jackson
Linda Luu
Sergey Levine
Karol Hausman
Brian Ichter
LLMAG, LM&Ro, LRM
134
922
0
12 Jul 2022
M2FNet: Multi-modal Fusion Network for Emotion Recognition in Conversation
Vishal M. Chudasama
P. Kar
Ashish Gudmalwar
Nirmesh J. Shah
Pankaj Wasnik
N. Onoe
56
113
0
05 Jun 2022
Play it by Ear: Learning Skills amidst Occlusion through Audio-Visual Imitation Learning
Maximilian Du
Olivia Y. Lee
Suraj Nair
Chelsea Finn
OffRL
95
33
0
30 May 2022
ObjectFolder 2.0: A Multisensory Object Dataset for Sim2Real Transfer
Ruohan Gao
Zilin Si
Yen-Yu Chang
Samuel Clarke
Jeannette Bohg
Li Fei-Fei
Wenzhen Yuan
Jiajun Wu
68
88
0
05 Apr 2022
Learning to Answer Questions in Dynamic Audio-Visual Scenarios
Guangyao Li
Yake Wei
Yapeng Tian
Chenliang Xu
Ji-Rong Wen
Di Hu
123
153
0
26 Mar 2022
Learning Generalizable Vision-Tactile Robotic Grasping Strategy for Deformable Objects via Transformer
Yunhai Han
Kelin Yu
Rahul Batra
Nathan Boyd
Chaitanya Mehta
T. Zhao
Y. She
S. Hutchinson
Ye Zhao
ViT
101
50
0
13 Dec 2021
Catch Me If You Hear Me: Audio-Visual Navigation in Complex Unmapped Environments with Moving Sounds
Abdelrahman Younes
Daniel Honerkamp
Tim Welschehold
Abhinav Valada
84
42
0
29 Nov 2021
ObjectFolder: A Dataset of Objects with Implicit Visual, Auditory, and Tactile Representations
Ruohan Gao
Yen-Yu Chang
Shivani Mall
Li Fei-Fei
Jiajun Wu
96
84
0
16 Sep 2021
Hierarchical Few-Shot Imitation with Skill Transition Models
Kourosh Hakhamaneshi
Ruihan Zhao
Albert Zhan
Pieter Abbeel
Michael Laskin
OffRL
80
42
0
19 Jul 2021
Attention Bottlenecks for Multimodal Fusion
Arsha Nagrani
Shan Yang
Anurag Arnab
A. Jansen
Cordelia Schmid
Chen Sun
106
569
0
30 Jun 2021
FFB6D: A Full Flow Bidirectional Fusion Network for 6D Pose Estimation
Yisheng He
Haibin Huang
Haoqiang Fan
Qifeng Chen
Jian Sun
3DPC, 3DH
83
273
0
03 Mar 2021
3D Shape Reconstruction from Vision and Touch
Edward James Smith
Roberto Calandra
Adriana Romero
Georgia Gkioxari
David Meger
Jitendra Malik
M. Drozdzal
79
72
0
07 Jul 2020
Grasp State Assessment of Deformable Objects Using Visual-Tactile Fusion Perception
Shaowei Cui
Rui Wang
Junhang Wei
Fanrong Li
Shuo Wang
43
42
0
23 Jun 2020
Robust Robotic Pouring using Audition and Haptics
Hongzhuo Liang
Chuangchuang Zhou
Shuang Li
Xiaojian Ma
Norman Hendrich
Timo Gerkmann
F. Sun
Marcus Stoffel
Jianwei Zhang
62
20
0
29 Feb 2020
Robust 6D Object Pose Estimation by Learning RGB-D Features
Meng Tian
Liang Pan
M. Ang
Gim Hee Lee
3DPC
90
50
0
29 Feb 2020
Safe Robot Navigation via Multi-Modal Anomaly Detection
Lorenz Wellhausen
René Ranftl
Marco Hutter
78
77
0
22 Jan 2020
Third-Person Visual Imitation Learning via Decoupled Hierarchical Controller
Pratyusha Sharma
Deepak Pathak
Abhinav Gupta
SSL
80
120
0
21 Nov 2019
Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks
Michelle A. Lee
Yuke Zhu
K. Srinivasan
Parth Shah
Silvio Savarese
Li Fei-Fei
Animesh Garg
Jeannette Bohg
SSL
95
370
0
24 Oct 2018
Data-Efficient Hierarchical Reinforcement Learning
Ofir Nachum
S. Gu
Honglak Lee
Sergey Levine
OffRL
102
812
0
21 May 2018
Learning Multimodal Word Representation via Dynamic Fusion Methods
Shaonan Wang
Jiajun Zhang
Chengqing Zong
61
33
0
02 Jan 2018
Gated Multimodal Units for Information Fusion
John Arevalo
Thamar Solorio
Manuel Montes-y-Gómez
Fabio Gonzalez
95
382
0
07 Feb 2017