ResearchTrend.AI
Core Challenges in Embodied Vision-Language Planning


26 June 2021
Jonathan M Francis
Nariaki Kitamura
Felix Labelle
Xiaopeng Lu
Ingrid Navarro
Jean Oh
    LM&Ro

Papers citing "Core Challenges in Embodied Vision-Language Planning"

42 / 42 papers shown
Online Reasoning Video Segmentation with Just-in-Time Digital Twins
Yiqing Shen
Bohan Liu
Chenjia Li
Lalithkumar Seenivasan
Mathias Unberath
VOS
83
2
0
27 Mar 2025
Synergistic Dual Spatial-aware Generation of Image-to-Text and Text-to-Image
Yu Zhao
Hao Fei
Xiangtai Li
L. Qin
Jiayi Ji
Hongyuan Zhu
Meishan Zhang
M. Zhang
Jianguo Wei
DiffM
26
1
0
20 Oct 2024
EmbodiedCity: A Benchmark Platform for Embodied Agent in Real-world City Environment
Chen Gao
Baining Zhao
Weichen Zhang
Jinzhu Mao
Jun Zhang
...
Jianjie Fang
Zile Zhou
Jinqiang Cui
X. Chen
Yong Li
LM&Ro
37
10
0
12 Oct 2024
Can-Do! A Dataset and Neuro-Symbolic Grounded Framework for Embodied Planning with Large Multimodal Models
Yew Ken Chia
Qi Sun
Lidong Bing
Soujanya Poria
LM&Ro
29
1
0
22 Sep 2024
Cognitive LLMs: Towards Integrating Cognitive Architectures and Large Language Models for Manufacturing Decision-making
Siyu Wu
A. Oltramari
Jonathan M Francis
C. L. Giles
Frank E. Ritter
40
0
0
17 Aug 2024
RoPotter: Toward Robotic Pottery and Deformable Object Manipulation with Structural Priors
Uksang Yoo
Adam Hung
Jonathan M Francis
Jean Oh
Jeffrey Ichnowski
36
2
0
05 Aug 2024
NavHint: Vision and Language Navigation Agent with a Hint Generator
Yue Zhang
Quan Guo
Parisa Kordjamshidi
LLMAG
28
9
0
04 Feb 2024
LLM-SAP: Large Language Models Situational Awareness Based Planning
Liman Wang
Hanyang Zhong
LLMAG
27
2
0
26 Dec 2023
Toward General-Purpose Robots via Foundation Models: A Survey and Meta-Analysis
Yafei Hu
Quanting Xie
Vidhi Jain
Jonathan M Francis
Jay Patrikar
...
Xiaolong Wang
Sebastian A. Scherer
Z. Kira
Fei Xia
Yonatan Bisk
LM&Ro
AI4CE
32
63
0
14 Dec 2023
SafeShift: Safety-Informed Distribution Shifts for Robust Trajectory Prediction in Autonomous Driving
Ben Stoler
Ingrid Navarro
Meghdeep Jana
Soonmin Hwang
Jonathan M Francis
Jean Oh
16
8
0
16 Sep 2023
MOSAIC: Learning Unified Multi-Sensory Object Property Representations for Robot Learning via Interactive Perception
Gyan Tatiya
Jonathan M Francis
Ho-Hsiang Wu
Yonatan Bisk
Jivko Sinapov
29
1
0
15 Sep 2023
Planning with Logical Graph-based Language Model for Instruction Generation
Fan Zhang
Kebing Jin
H. Zhuo
LRM
32
3
0
26 Aug 2023
What Went Wrong? Closing the Sim-to-Real Gap via Differentiable Causal Discovery
Peide Huang
Xilun Zhang
Ziang Cao
Shiqi Liu
Mengdi Xu
Wenhao Ding
Jonathan M Francis
Bingqing Chen
Ding Zhao
36
24
0
28 Jun 2023
Knowledge-enhanced Agents for Interactive Text Games
P. Chhikara
Jiarui Zhang
Filip Ilievski
Jonathan M Francis
Kaixin Ma
LLMAG
29
8
0
08 May 2023
Cross-Tool and Cross-Behavior Perceptual Knowledge Transfer for Grounded Object Recognition
Gyan Tatiya
Jonathan M Francis
Jivko Sinapov
20
4
0
07 Mar 2023
VLN-Trans: Translator for the Vision and Language Navigation Agent
Yue Zhang
Parisa Kordjamshidi
32
16
0
18 Feb 2023
Learning by Asking for Embodied Visual Navigation and Task Completion
Ying Shen
Ismini Lourentzou
26
2
0
09 Feb 2023
Knowledge-driven Scene Priors for Semantic Audio-Visual Embodied Navigation
Gyan Tatiya
Jonathan M Francis
Luca Bondi
Ingrid Navarro
Eric Nyberg
Jivko Sinapov
Jean Oh
22
8
0
21 Dec 2022
Distribution-aware Goal Prediction and Conformant Model-based Planning for Safe Autonomous Driving
Jonathan M Francis
Bingqing Chen
Weiran Yao
Eric Nyberg
Jean Oh
OOD
26
5
0
16 Dec 2022
Automaton-Based Representations of Task Knowledge from Generative Language Models
Yunhao Yang
Jean-Raphael Gaglione
Cyrus Neary
Ufuk Topcu
27
11
0
04 Dec 2022
ViLPAct: A Benchmark for Compositional Generalization on Multimodal Human Activities
Terry Yue Zhuo
Yaqing Liao
Yuecheng Lei
Lizhen Qu
Gerard de Melo
Xiaojun Chang
Yazhou Ren
Zenglin Xu
34
2
0
11 Oct 2022
Transferring Implicit Knowledge of Non-Visual Object Properties Across Heterogeneous Robot Morphologies
Gyan Tatiya
Jonathan M Francis
Jivko Sinapov
25
13
0
14 Sep 2022
A Simple Approach for Visual Rearrangement: 3D Mapping and Semantic Search
Brandon Trabucco
Gunnar A. Sigurdsson
Robinson Piramuthu
Gaurav Sukhatme
Ruslan Salakhutdinov
OCL
28
7
0
21 Jun 2022
Learn-to-Race Challenge 2022: Benchmarking Safe Learning and Cross-domain Generalisation in Autonomous Racing
Jonathan M Francis
Bingqing Chen
Siddha Ganju
Sidharth Kathpal
Jyotish Poonganam
...
Ivan Zhukov
Max Kumskoy
Anirudh Koul
Jean Oh
Eric Nyberg
11
11
0
05 May 2022
Generalizable Neuro-symbolic Systems for Commonsense Question Answering
A. Oltramari
Jonathan M Francis
Filip Ilievski
Kaixin Ma
Roshanak Mirzaee
NAI
18
8
0
17 Jan 2022
Safe Autonomous Racing via Approximate Reachability on Ego-vision
Bingqing Chen
Jonathan M Francis
Jean Oh
Eric Nyberg
Sylvia L. Herbert
46
14
0
14 Oct 2021
Waypoint Models for Instruction-guided Navigation in Continuous Environments
Jacob Krantz
Aaron Gokaslan
Dhruv Batra
Stefan Lee
Oleksandr Maksymets
LM&Ro
134
76
0
05 Oct 2021
Skill Induction and Planning with Latent Language
Pratyusha Sharma
Antonio Torralba
Jacob Andreas
LM&Ro
194
108
0
04 Oct 2021
TEACh: Task-driven Embodied Agents that Chat
Aishwarya Padmakumar
Jesse Thomason
Ayush Shrivastava
P. Lange
Anjali Narayan-Chen
Spandana Gella
Robinson Piramuthu
Gökhan Tür
Dilek Z. Hakkani-Tür
LM&Ro
163
180
0
01 Oct 2021
Reference-Centric Models for Grounded Collaborative Dialogue
Daniel Fried
Justin T. Chiu
Dan Klein
44
21
0
10 Sep 2021
iGibson 2.0: Object-Centric Simulation for Robot Learning of Everyday Household Tasks
Chengshu Li
Fei Xia
Roberto Martín-Martín
Michael Lingelbach
S. Srivastava
...
Karen Liu
H. Gweon
Jiajun Wu
Li Fei-Fei
Silvio Savarese
LM&Ro
153
221
0
06 Aug 2021
How Much Can CLIP Benefit Vision-and-Language Tasks?
Sheng Shen
Liunian Harold Li
Hao Tan
Mohit Bansal
Anna Rohrbach
Kai-Wei Chang
Z. Yao
Kurt Keutzer
CLIP
VLM
MLLM
196
405
0
13 Jul 2021
ManipulaTHOR: A Framework for Visual Object Manipulation
Kiana Ehsani
Winson Han
Alvaro Herrasti
Eli VanderBilt
Luca Weihs
Eric Kolve
Aniruddha Kembhavi
Roozbeh Mottaghi
LM&Ro
171
124
0
22 Apr 2021
Learn-to-Race: A Multimodal Control Environment for Autonomous Racing
James Herman
Jonathan M Francis
Siddha Ganju
Bingqing Chen
Anirudh Koul
Abhinav Gupta
Alexey Skabelkin
Ivan Zhukov
Max Kumskoy
Eric Nyberg
23
35
0
22 Mar 2021
On the Evaluation of Vision-and-Language Navigation Instructions
Mingde Zhao
Peter Anderson
Vihan Jain
Su Wang
Alexander Ku
Jason Baldridge
Eugene Ie
231
51
0
26 Jan 2021
Transformers in Vision: A Survey
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
F. Khan
M. Shah
ViT
227
2,428
0
04 Jan 2021
Multimodal Research in Vision and Language: A Review of Current and Emerging Trends
Shagun Uppal
Sarthak Bhagat
Devamanyu Hazarika
Navonil Majumder
Soujanya Poria
Roger Zimmermann
Amir Zadeh
20
6
0
19 Oct 2020
SAPIEN: A SimulAted Part-based Interactive ENvironment
Fanbo Xiang
Yuzhe Qin
Kaichun Mo
Yikuan Xia
Hao Zhu
...
He-Nan Wang
Li Yi
Angel X. Chang
Leonidas J. Guibas
Hao Su
218
485
0
19 Mar 2020
Diverse and Admissible Trajectory Forecasting through Multimodal Context Understanding
Seonguk Park
Gyubok Lee
Manoj Bhat
Jimin Seo
Minseok Kang
Jonathan M Francis
Ashwin R. Jadhav
Paul Pu Liang
Louis-Philippe Morency
136
119
0
06 Mar 2020
Help, Anna! Visual Navigation with Natural Multimodal Assistance via Retrospective Curiosity-Encouraging Imitation Learning
Khanh Nguyen
Hal Daumé
LM&Ro
EgoV
178
150
0
04 Sep 2019
Neural Modular Control for Embodied Question Answering
Abhishek Das
Georgia Gkioxari
Stefan Lee
Devi Parikh
Dhruv Batra
LM&Ro
132
127
0
26 Oct 2018
Speaker-Follower Models for Vision-and-Language Navigation
Daniel Fried
Ronghang Hu
Volkan Cirik
Anna Rohrbach
Jacob Andreas
Louis-Philippe Morency
Taylor Berg-Kirkpatrick
Kate Saenko
Dan Klein
Trevor Darrell
LM&Ro
LRM
257
496
0
07 Jun 2018