ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2301.09749
  4. Cited By
A Data-Efficient Visual-Audio Representation with Intuitive Fine-tuning
  for Voice-Controlled Robots

A Data-Efficient Visual-Audio Representation with Intuitive Fine-tuning for Voice-Controlled Robots

23 January 2023
Peixin Chang
Shuijing Liu
Tianchen Ji
Neeloy Chakraborty
Kaiwen Hong
Katherine Driggs-Campbell
ArXivPDFHTML

Papers citing "A Data-Efficient Visual-Audio Representation with Intuitive Fine-tuning for Voice-Controlled Robots"

9 / 9 papers shown
Title
HEIGHT: Heterogeneous Interaction Graph Transformer for Robot Navigation in Crowded and Constrained Environments
HEIGHT: Heterogeneous Interaction Graph Transformer for Robot Navigation in Crowded and Constrained Environments
Shuijing Liu
Haochen Xia
Fatemeh Cheraghi Pouria
Kaiwen Hong
Neeloy Chakraborty
Katherine Rose Driggs-Campbell
Joydeep Biswas
Katherine Driggs-Campbell
111
1
0
19 Nov 2024
BUMBLE: Unifying Reasoning and Acting with Vision-Language Models for
  Building-wide Mobile Manipulation
BUMBLE: Unifying Reasoning and Acting with Vision-Language Models for Building-wide Mobile Manipulation
Rutav Shah
Albert Yu
Yifeng Zhu
Yuke Zhu
Roberto Martín-Martín
LM&Ro
37
6
0
08 Oct 2024
Visual Language Maps for Robot Navigation
Visual Language Maps for Robot Navigation
Chen Huang
Oier Mees
Andy Zeng
Wolfram Burgard
LM&Ro
156
344
0
11 Oct 2022
Grounding Language with Visual Affordances over Unstructured Data
Grounding Language with Visual Affordances over Unstructured Data
Oier Mees
Jessica Borja-Diaz
Wolfram Burgard
LM&Ro
121
108
0
04 Oct 2022
Perceiver-Actor: A Multi-Task Transformer for Robotic Manipulation
Perceiver-Actor: A Multi-Task Transformer for Robotic Manipulation
Mohit Shridhar
Lucas Manuelli
D. Fox
LM&Ro
161
457
0
12 Sep 2022
LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language,
  Vision, and Action
LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action
Dhruv Shah
B. Osinski
Brian Ichter
Sergey Levine
LM&Ro
158
436
0
10 Jul 2022
Play it by Ear: Learning Skills amidst Occlusion through Audio-Visual
  Imitation Learning
Play it by Ear: Learning Skills amidst Occlusion through Audio-Visual Imitation Learning
Maximilian Du
Olivia Y. Lee
Suraj Nair
Chelsea Finn
OffRL
54
32
0
30 May 2022
FILM: Following Instructions in Language with Modular Methods
FILM: Following Instructions in Language with Modular Methods
So Yeon Min
Devendra Singh Chaplot
Pradeep Ravikumar
Yonatan Bisk
Ruslan Salakhutdinov
LM&Ro
214
159
0
12 Oct 2021
A Persistent Spatial Semantic Representation for High-level Natural
  Language Instruction Execution
A Persistent Spatial Semantic Representation for High-level Natural Language Instruction Execution
Valts Blukis
Chris Paxton
D. Fox
Animesh Garg
Yoav Artzi
LM&Ro
212
134
0
12 Jul 2021
1