
Visually Grounding Language Instruction for History-Dependent Manipulation

16 December 2020
Hyemin Ahn
Obin Kwon
Kyungdo Kim
Jaeyeon Jeong
Howoong Jun
Hongjung Lee
Dongheui Lee
Songhwai Oh
    LM&Ro

Papers citing "Visually Grounding Language Instruction for History-Dependent Manipulation"

46 / 46 papers shown
A Parameter-Efficient Tuning Framework for Language-guided Object Grounding and Robot Grasping
Houjian Yu
Mingen Li
Alireza Rezazadeh
Yang Yang
Changhyun Choi
67
1
0
28 Sep 2024
ArraMon: A Joint Navigation-Assembly Instruction Interpretation Task in Dynamic Environments
Hyounghun Kim
Abhaysinh Zala
Graham Burri
Hao Tan
Mohit Bansal
LM&Ro
35
16
0
15 Nov 2020
The RobotSlang Benchmark: Dialog-guided Robot Localization and Navigation
Shurjo Banerjee
Jesse Thomason
Jason J. Corso
LM&Ro
111
30
0
23 Oct 2020
Language-Conditioned Imitation Learning for Robot Manipulation Tasks
Simon Stepputtis
Joseph Campbell
Mariano Phielipp
Stefan Lee
Chitta Baral
H. B. Amor
LM&Ro
161
198
0
22 Oct 2020
Room-Across-Room: Multilingual Vision-and-Language Navigation with Dense Spatiotemporal Grounding
Alexander Ku
Peter Anderson
Roma Patel
Eugene Ie
Jason Baldridge
62
305
0
15 Oct 2020
Language Conditioned Imitation Learning over Unstructured Data
Corey Lynch
P. Sermanet
LM&Ro
55
244
0
15 May 2020
BabyWalk: Going Farther in Vision-and-Language Navigation by Taking Baby Steps
Wang Zhu
Hexiang Hu
Jiacheng Chen
Zhiwei Deng
Vihan Jain
Eugene Ie
Fei Sha
LM&Ro
33
71
0
10 May 2020
Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation
Gen Luo
Yiyi Zhou
Xiaoshuai Sun
Liujuan Cao
Chenglin Wu
Cheng Deng
Rongrong Ji
ObjD
222
288
0
19 Mar 2020
Vision-Dialog Navigation by Exploring Cross-modal Memory
Yi Zhu
Fengda Zhu
Zhaohuan Zhan
Bingqian Lin
Jianbin Jiao
Xiaojun Chang
Xiaodan Liang
VLM
51
49
0
15 Mar 2020
ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks
Mohit Shridhar
Jesse Thomason
Daniel Gordon
Yonatan Bisk
Winson Han
Roozbeh Mottaghi
Luke Zettlemoyer
Dieter Fox
LM&Ro
75
758
0
03 Dec 2019
Just Ask: An Interactive Learning Framework for Vision and Language Navigation
Ta-Chung Chi
Mihail Eric
Seokhwan Kim
Minmin Shen
Dilek Z. Hakkani-Tür
21
71
0
02 Dec 2019
Learning to Map Natural Language Instructions to Physical Quadcopter Control using Simulated Flight
Valts Blukis
Yannick Terme
Eyvind Niklasson
Ross A. Knepper
Yoav Artzi
31
74
0
21 Oct 2019
Executing Instructions in Situated Collaborative Interactions
Alane Suhr
Claudia Yan
Jack Schluger
Stanley Yu
Hadi Khader
Marwa Mouallem
Iris Zhang
Yoav Artzi
75
89
0
08 Oct 2019
RUN through the Streets: A New Dataset and Baseline Models for Realistic Urban Navigation
Tzuf Paz-Argaman
Reut Tsarfaty
37
18
0
19 Sep 2019
Help, Anna! Visual Navigation with Natural Multimodal Assistance via Retrospective Curiosity-Encouraging Imitation Learning
Khanh Nguyen
Hal Daumé
LM&Ro
EgoV
195
151
0
04 Sep 2019
Vision-and-Dialog Navigation
Jesse Thomason
Michael Murray
Maya Cakmak
Luke Zettlemoyer
LM&Ro
86
325
0
10 Jul 2019
Grounding Language Attributes to Objects using Bayesian Eigenobjects
Vanya Cohen
Benjamin Burchfiel
Thao Nguyen
N. Gopalan
Stefanie Tellex
George Konidaris
10
20
0
30 May 2019
Stay on the Path: Instruction Fidelity in Vision-and-Language Navigation
Vihan Jain
Gabriel Ilharco
Alexander Ku
Ashish Vaswani
Eugene Ie
Jason Baldridge
LM&Ro
47
179
0
29 May 2019
REVERIE: Remote Embodied Visual Referring Expression in Real Indoor Environments
Yuankai Qi
Qi Wu
Peter Anderson
Xinze Wang
Wenjie Wang
Chunhua Shen
Anton Van Den Hengel
LM&Ro
71
320
0
23 Apr 2019
Cross-Modal Self-Attention Network for Referring Image Segmentation
Linwei Ye
Mrigank Rochan
Zhi Liu
Yang Wang
EgoV
23
472
0
09 Apr 2019
Prospection: Interpretable Plans From Language By Predicting the Future
Chris Paxton
Yonatan Bisk
Jesse Thomason
Arunkumar Byravan
Dieter Fox
LM&Ro
51
47
0
20 Mar 2019
Learning To Follow Directions in Street View
Karl Moritz Hermann
Mateusz Malinowski
Piotr Wojciech Mirowski
Andras Banki-Horvath
Keith Anderson
R. Hadsell
SSL
51
67
0
01 Mar 2019
Touchdown: Natural Language Navigation and Spatial Reasoning in Visual Street Environments
Howard Chen
Alane Suhr
Dipendra Kumar Misra
Noah Snavely
Yoav Artzi
68
384
0
29 Nov 2018
Temporal Grounding Graphs for Language Understanding with Accrued Visual-Linguistic Context
Rohan Paul
Andrei Barbu
Sue Felshin
Boris Katz
Nicholas Roy
LM&Ro
34
40
0
16 Nov 2018
Mapping Navigation Instructions to Continuous Control Actions with Position-Visitation Prediction
Valts Blukis
Dipendra Kumar Misra
Ross A. Knepper
Yoav Artzi
52
82
0
10 Nov 2018
Translating Navigation Instructions in Natural Language to a High-Level Plan for Behavioral Robot Navigation
Xiaoxue Zang
Ashwini Pokle
Nathan Tsoi
Kevin Chen
Juan Carlos Niebles
Á. Soto
Silvio Savarese
LM&Ro
26
30
0
24 Sep 2018
Mapping Instructions to Actions in 3D Environments with Visual Goal Prediction
Dipendra Kumar Misra
Andrew Bennett
Valts Blukis
Eyvind Niklasson
Max Shatkhin
Yoav Artzi
LM&Ro
60
186
0
04 Sep 2018
Interactive Visual Grounding of Referring Expressions for Human-Robot Interaction
Mohit Shridhar
David Hsu
44
144
0
11 Jun 2018
Following High-level Navigation Instructions on a Simulated Quadcopter with Imitation Learning
Valts Blukis
Nataly Brukhim
Andrew Bennett
Ross A. Knepper
Yoav Artzi
64
62
0
31 May 2018
Interactive Text2Pickup Network for Natural Language based Human-Robot Collaboration
Hyemin Ahn
Sungjoon Choi
Nuri Kim
Geonho Cha
Songhwai Oh
15
7
0
28 May 2018
Guided Feature Transformation (GFT): A Neural Language Grounding Module for Embodied Agents
Haonan Yu
Xiaochen Lian
Haichao Zhang
Wenyuan Xu
LM&Ro
33
21
0
22 May 2018
Interactive Grounded Language Acquisition and Generalization in a 2D World
Haonan Yu
Haichao Zhang
Wenyuan Xu
LLMAG
LM&Ro
103
79
0
31 Jan 2018
MAttNet: Modular Attention Network for Referring Expression Comprehension
Licheng Yu
Zhe Lin
Xiaohui Shen
Jimei Yang
Xin Lu
Mohit Bansal
Tamara L. Berg
ObjD
90
822
0
24 Jan 2018
Learning Interpretable Spatial Operations in a Rich 3D Blocks World
Yonatan Bisk
Kevin J. Shih
Yejin Choi
D. Marcu
27
63
0
10 Dec 2017
Embodied Question Answering
Abhishek Das
Samyak Datta
Georgia Gkioxari
Stefan Lee
Devi Parikh
Dhruv Batra
LM&Ro
65
642
0
30 Nov 2017
Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments
Peter Anderson
Qi Wu
Damien Teney
Jake Bruce
Mark Johnson
Niko Sünderhauf
Ian Reid
Stephen Gould
Anton Van Den Hengel
LM&Ro
74
1,299
0
20 Nov 2017
Interactively Picking Real-World Objects with Unconstrained Spoken Language Instructions
Jun Hatori
Yuta Kikuchi
Sosuke Kobayashi
K. Takahashi
Yuta Tsuboi
Y. Unno
W. Ko
Jethro Tan
46
160
0
17 Oct 2017
Gated-Attention Architectures for Task-Oriented Language Grounding
Devendra Singh Chaplot
Kanthashree Mysore Sathyendra
Rama Kumar Pasumarthi
Dheeraj Rajagopal
Ruslan Salakhutdinov
LM&Ro
30
277
0
22 Jun 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
430
129,831
0
12 Jun 2017
A Joint Speaker-Listener-Reinforcer Model for Referring Expressions
Licheng Yu
Hao Tan
Mohit Bansal
Tamara L. Berg
ObjD
75
275
0
30 Dec 2016
CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning
Justin Johnson
B. Hariharan
Laurens van der Maaten
Li Fei-Fei
C. L. Zitnick
Ross B. Girshick
CoGe
271
2,346
0
20 Dec 2016
Stacked Hourglass Networks for Human Pose Estimation
Alejandro Newell
Kaiyu Yang
Jia Deng
3DH
88
5,008
0
22 Mar 2016
Segmentation from Natural Language Expressions
Ronghang Hu
Marcus Rohrbach
Trevor Darrell
VLM
EgoV
58
430
0
20 Mar 2016
Deep Residual Learning for Image Recognition
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
MedIm
1.3K
192,638
0
10 Dec 2015
Natural Language Object Retrieval
Ronghang Hu
Huazhe Xu
Marcus Rohrbach
Jiashi Feng
Kate Saenko
Trevor Darrell
ObjD
65
552
0
13 Nov 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
776
149,474
0
22 Dec 2014