Mapping Instructions to Actions in 3D Environments with Visual Goal Prediction

4 September 2018
Dipendra Kumar Misra
Andrew Bennett
Valts Blukis
Eyvind Niklasson
Max Shatkhin
Yoav Artzi
    LM&Ro

Papers citing "Mapping Instructions to Actions in 3D Environments with Visual Goal Prediction"

50 / 133 papers shown
CityNavAgent: Aerial Vision-and-Language Navigation with Hierarchical Semantic Planning and Global Memory
Weichen Zhang
Chen Gao
Shiquan Yu
Ruiying Peng
Baining Zhao
Qian Zhang
Jinqiang Cui
Xinlei Chen
Yong Li
LLMAG
LM&Ro
49
0
0
08 May 2025
UAV-VLN: End-to-End Vision Language guided Navigation for UAVs
Pranav Saxena
Nishant Raghuvanshi
Neena Goveas
77
0
0
30 Apr 2025
Aerial Vision-and-Language Navigation with Grid-based View Selection and Map Construction
Ganlong Zhao
Guanbin Li
Jia-Yu Pan
Yizhou Yu
45
1
0
14 Mar 2025
OpenFly: A Comprehensive Platform for Aerial Vision-Language Navigation
Yunpeng Gao
C. Li
Zhongrui You
Jun Liu
Zhen Li
...
Yan Ding
Dong Wang
Zihan Wang
Bin Zhao
Xuelong Li
47
4
0
25 Feb 2025
Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models
Yue Zhang
Ziqiao Ma
Jialu Li
Yanyuan Qiao
Zun Wang
J. Chai
Qi Wu
Joey Tianyi Zhou
Parisa Kordjamshidi
LRM
63
19
0
31 Dec 2024
NavAgent: Multi-scale Urban Street View Fusion For UAV Embodied Vision-and-Language Navigation
Youzhi Liu
Fanglong Yao
Yuanchang Yue
Guangluan Xu
Xian Sun
Kun Fu
LM&Ro
39
3
0
13 Nov 2024
Constrained Human-AI Cooperation: An Inclusive Embodied Social Intelligence Challenge
Weihua Du
Qiushi Lyu
Jiaming Shan
Zhenting Qi
Hongxin Zhang
...
Andi Peng
Tianmin Shu
Kwonjoon Lee
Behzad Dariush
Chuang Gan
45
1
0
04 Nov 2024
Aerial Vision-and-Language Navigation via Semantic-Topo-Metric Representation Guided LLM Reasoning
Yunpeng Gao
Zhigang Wang
Linglin Jing
Dong Wang
Xuelong Li
Bin Zhao
58
14
0
11 Oct 2024
Towards Realistic UAV Vision-Language Navigation: Platform, Benchmark, and Methodology
Xueliang Wang
Donglin Yang
Ziqin Wang
Hohin Kwan
Jinyu Chen
Wenjun Wu
Hongsheng Li
Yue Liao
Si Liu
29
14
0
09 Oct 2024
PREDICT: Preference Reasoning by Evaluating Decomposed preferences Inferred from Candidate Trajectories
Stephane Aroca-Ouellette
Natalie Mackraz
B. Theobald
Katherine Metcalf
33
0
0
08 Oct 2024
ReALFRED: An Embodied Instruction Following Benchmark in Photo-Realistic Environments
Taewoong Kim
Cheolhong Min
Byeonghwi Kim
Jinyeon Kim
Wonje Jeung
Jonghyun Choi
LM&Ro
42
4
0
26 Jul 2024
WorldAPIs: The World Is Worth How Many APIs? A Thought Experiment
Jiefu Ou
Arda Uzunoglu
Benjamin Van Durme
Daniel Khashabi
LM&Ro
VGen
30
3
0
10 Jul 2024
Into the Unknown: Generating Geospatial Descriptions for New Environments
Tzuf Paz-Argaman
John Palowitch
Sayali Kulkarni
Reut Tsarfaty
Jason Baldridge
34
1
0
28 Jun 2024
CityNav: Language-Goal Aerial Navigation Dataset with Geographic Information
Jungdae Lee
Taiki Miyanishi
Shuhei Kurita
Koya Sakamoto
Daichi Azuma
Yutaka Matsuo
Nakamasa Inoue
47
14
0
20 Jun 2024
A Survey on Vision-Language-Action Models for Embodied AI
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
LM&Ro
82
43
0
23 May 2024
COMBO: Compositional World Models for Embodied Multi-Agent Cooperation
Hongxin Zhang
Zeyuan Wang
Qiushi Lyu
Zheyuan Zhang
Sunli Chen
Tianmin Shu
Kwonjoon Lee
Yilun Du
Chuang Gan
48
12
0
16 Apr 2024
Provable Interactive Learning with Hindsight Instruction Feedback
Dipendra Kumar Misra
Aldo Pacchiano
Rob Schapire
44
1
0
14 Apr 2024
Semantic Map-based Generation of Navigation Instructions
Chengzu Li
Chao Zhang
Simone Teufel
R. Doddipatla
Svetlana Stoyanchev
34
2
0
28 Mar 2024
DiaLoc: An Iterative Approach to Embodied Dialog Localization
Chao Zhang
Mohan Li
Ignas Budvytis
Stephan Liwicki
52
2
0
11 Mar 2024
Learning with Language-Guided State Abstractions
Andi Peng
Ilia Sucholutsky
Belinda Z. Li
T. Sumers
Thomas Griffiths
Jacob Andreas
Julie A. Shah
LM&Ro
49
13
0
28 Feb 2024
Where Do We Go from Here? Multi-scale Allocentric Relational Inference from Natural Spatial Descriptions
Tzuf Paz-Argaman
Sayali Kulkarni
John Palowitch
Jason Baldridge
Reut Tsarfaty
29
3
0
26 Feb 2024
RoboEXP: Action-Conditioned Scene Graph via Interactive Exploration for Robotic Manipulation
Hanxiao Jiang
Binghao Huang
Ruihai Wu
Zhuoran Li
Shubham Garg
H. Nayyeri
Shenlong Wang
Yunzhu Li
42
17
0
23 Feb 2024
Vision-Language Navigation with Embodied Intelligence: A Survey
Peng Gao
Peng Wang
Feng Gao
Fei Wang
Ruyue Yuan
LM&Ro
43
2
0
22 Feb 2024
VLN-Video: Utilizing Driving Videos for Outdoor Vision-and-Language Navigation
Jialu Li
Aishwarya Padmakumar
Gaurav Sukhatme
Mohit Bansal
29
6
0
05 Feb 2024
Language-Guided World Models: A Model-Based Approach to AI Control
Alex Zhang
Khanh Nguyen
Jens Tuyls
Albert Lin
Karthik R. Narasimhan
LLMAG
37
5
0
24 Jan 2024
Seeing the Unseen: Visual Common Sense for Semantic Placement
Ram Ramrakhya
Aniruddha Kembhavi
Dhruv Batra
Z. Kira
Kuo-Hao Zeng
Luca Weihs
VLM
41
5
0
15 Jan 2024
LLF-Bench: Benchmark for Interactive Learning from Language Feedback
Ching-An Cheng
Andrey Kolobov
Dipendra Kumar Misra
Allen Nie
Adith Swaminathan
32
19
0
11 Dec 2023
Interaction is all You Need? A Study of Robots Ability to Understand and Execute
Kushal Koshti
Nidhir Bhavsar
55
1
0
13 Nov 2023
tagE: Enabling an Embodied Agent to Understand Human Instructions
Chayan Sarkar
Avik Mitra
Pradip Pramanick
Tapas Nayak
LM&Ro
49
1
0
24 Oct 2023
LACMA: Language-Aligning Contrastive Learning with Meta-Actions for Embodied Instruction Following
Cheng Yang
Yen-Chun Chen
Jianwei Yang
Xiyang Dai
Lu Yuan
Yu-Chiang Frank Wang
Kai-Wei Chang
LM&Ro
28
10
0
18 Oct 2023
Multi-model fusion for Aerial Vision and Dialog Navigation based on human attention aids
Xinyi Wang
Xuan Cui
Danxu Li
Fang Liu
Licheng Jiao
18
0
0
27 Aug 2023
Grounded Entity-Landmark Adaptive Pre-training for Vision-and-Language Navigation
Yibo Cui
Liang Xie
Yakun Zhang
Meishan Zhang
Ye Yan
Erwei Yin
LM&Ro
34
16
0
24 Aug 2023
AerialVLN: Vision-and-Language Navigation for UAVs
Shubo Liu
Hongsheng Zhang
Yuankai Qi
Peifeng Wang
Yaning Zhang
Qi Wu
CoGe
34
41
0
13 Aug 2023
Breaking Down the Task: A Unit-Grained Hybrid Training Framework for Vision and Language Decision Making
Ruipu Luo
Jiwen Zhang
Zhongyu Wei
VLM
16
0
0
16 Jul 2023
Building Cooperative Embodied Agents Modularly with Large Language Models
Hongxin Zhang
Weihua Du
Jiaming Shan
Qinhong Zhou
Yilun Du
J. Tenenbaum
Tianmin Shu
Chuang Gan
LLMAG
LM&Ro
59
157
0
05 Jul 2023
CorNav: Autonomous Agent with Self-Corrected Planning for Zero-Shot Vision-and-Language Navigation
Xiwen Liang
Liang Ma
Shanshan Guo
Jianhua Han
Hang Xu
Shikui Ma
Xiaodan Liang
LM&Ro
LLMAG
90
4
0
17 Jun 2023
Emergent Incident Response for Unmanned Warehouses with Multi-agent Systems*
Yibo Guo
Mingxin Li
Jingting Zong
Mingliang Xu
24
0
0
29 May 2023
Demo2Code: From Summarizing Demonstrations to Synthesizing Code via Extended Chain-of-Thought
Huaxiaoyue Wang
Gonzalo Gonzalez-Pumariega
Yash Sharma
Sanjiban Choudhury
LM&Ro
34
33
0
26 May 2023
Improving Vision-and-Language Navigation by Generating Future-View Image Semantics
Jialu Li
Joey Tianyi Zhou
29
34
0
11 Apr 2023
MOPA: Modular Object Navigation with PointGoal Agents
Sonia Raychaudhuri
Tommaso Campari
Unnat Jain
Manolis Savva
Angel X. Chang
3DPC
29
8
0
07 Apr 2023
Alexa Arena: A User-Centric Interactive Platform for Embodied AI
Qiaozi Gao
Govind Thattai
Suhaila Shakiah
Xiaofeng Gao
Shreyas Pansare
...
Michael Johnston
R. Ghanadan
Arindam Mandal
Dilek Z. Hakkani-Tür
Premkumar Natarajan
6
27
0
02 Mar 2023
Natural Language-conditioned Reinforcement Learning with Inside-out Task Language Development and Translation
Jing-Cheng Pang
Xinyi Yang
Sibei Yang
Yang Yu
29
8
0
18 Feb 2023
Controllable Text Generation with Language Constraints
Howard Chen
Huihan Li
Danqi Chen
Karthik R. Narasimhan
14
16
0
20 Dec 2022
Continual Learning for Instruction Following from Realtime Feedback
Alane Suhr
Yoav Artzi
29
17
0
19 Dec 2022
ViLPAct: A Benchmark for Compositional Generalization on Multimodal Human Activities
Terry Yue Zhuo
Yaqing Liao
Yuecheng Lei
Lizhen Qu
Gerard de Melo
Xiaojun Chang
Yazhou Ren
Zenglin Xu
42
2
0
11 Oct 2022
Grounding Language with Visual Affordances over Unstructured Data
Oier Mees
Jessica Borja-Diaz
Wolfram Burgard
LM&Ro
121
108
0
04 Oct 2022
Perceiver-Actor: A Multi-Task Transformer for Robotic Manipulation
Mohit Shridhar
Lucas Manuelli
Dieter Fox
LM&Ro
166
460
0
12 Sep 2022
JARVIS: A Neuro-Symbolic Commonsense Reasoning Framework for Conversational Embodied Agents
Kai Zheng
KAI-QING Zhou
Jing Gu
Yue Fan
Jialu Wang
Zong-xiao Li
Xuehai He
Qing Guo
LM&Ro
33
39
0
28 Aug 2022
A Computational Interface to Translate Strategic Intent from Unstructured Language in a Low-Data Setting
Pradyumna Tambwekar
Lakshita Dodeja
Nathan Vaska
Wei Xu
Matthew C. Gombolay
33
0
0
17 Aug 2022
LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action
Dhruv Shah
B. Osinski
Brian Ichter
Sergey Levine
LM&Ro
158
437
0
10 Jul 2022