ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1912.01734
  4. Cited By
ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday
  Tasks

ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks

3 December 2019
Mohit Shridhar
Jesse Thomason
Daniel Gordon
Yonatan Bisk
Winson Han
Roozbeh Mottaghi
Luke Zettlemoyer
Dieter Fox
    LM&Ro
ArXivPDFHTML

Papers citing "ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks"

50 / 186 papers shown
Title
ULN: Towards Underspecified Vision-and-Language Navigation
ULN: Towards Underspecified Vision-and-Language Navigation
Weixi Feng
Tsu-jui Fu
Yujie Lu
William Yang Wang
49
5
0
18 Oct 2022
See, Plan, Predict: Language-guided Cognitive Planning with Video
  Prediction
See, Plan, Predict: Language-guided Cognitive Planning with Video Prediction
Maria Attarian
Advaya Gupta
Ziyi Zhou
Wei Yu
Igor Gilitschenski
Animesh Garg
LM&Ro
29
7
0
07 Oct 2022
VIMA: General Robot Manipulation with Multimodal Prompts
VIMA: General Robot Manipulation with Multimodal Prompts
Yunfan Jiang
Agrim Gupta
Zichen Zhang
Guanzhi Wang
Yongqiang Dou
Yanjun Chen
Li Fei-Fei
Anima Anandkumar
Yuke Zhu
Linxi Fan
LM&Ro
28
336
0
06 Oct 2022
Iterative Vision-and-Language Navigation
Iterative Vision-and-Language Navigation
Jacob Krantz
Shurjo Banerjee
Wang Zhu
Jason J. Corso
Peter Anderson
Stefan Lee
Jesse Thomason
LM&Ro
40
18
0
06 Oct 2022
Zero-shot Active Visual Search (ZAVIS): Intelligent Object Search for
  Robotic Assistants
Zero-shot Active Visual Search (ZAVIS): Intelligent Object Search for Robotic Assistants
Jeongeun Park
Taerim Yoon
Jejoon Hong
Youngjae Yu
Matthew K. X. J. Pan
Sungjoon Choi
43
13
0
19 Sep 2022
On Grounded Planning for Embodied Tasks with Language Models
On Grounded Planning for Embodied Tasks with Language Models
Bill Yuchen Lin
Chengsong Huang
Qian Liu
Wenda Gu
Sam Sommerer
Xiang Ren
LM&Ro
34
39
0
29 Aug 2022
TextWorldExpress: Simulating Text Games at One Million Steps Per Second
TextWorldExpress: Simulating Text Games at One Million Steps Per Second
Peter Alexander Jansen
Marc-Alexandre Côté
VLM
LRM
29
6
0
01 Aug 2022
TIDEE: Tidying Up Novel Rooms using Visuo-Semantic Commonsense Priors
TIDEE: Tidying Up Novel Rooms using Visuo-Semantic Commonsense Priors
Gabriel H. Sarch
Zhaoyuan Fang
Adam W. Harley
Paul Schydlo
Michael J. Tarr
Saurabh Gupta
Katerina Fragkiadaki
LM&Ro
29
45
0
21 Jul 2022
Target-Driven Structured Transformer Planner for Vision-Language
  Navigation
Target-Driven Structured Transformer Planner for Vision-Language Navigation
Yusheng Zhao
Jinyu Chen
Chen Gao
Wenguan Wang
Lirong Yang
Haibing Ren
Huaxia Xia
Si Liu
LM&Ro
27
57
0
19 Jul 2022
Reasoning about Actions over Visual and Linguistic Modalities: A Survey
Reasoning about Actions over Visual and Linguistic Modalities: A Survey
Shailaja Keyur Sampat
Maitreya Patel
Subhasish Das
Yezhou Yang
Chitta Baral
ReLM
LM&Ro
LRM
27
12
0
15 Jul 2022
LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language,
  Vision, and Action
LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action
Dhruv Shah
B. Osinski
Brian Ichter
Sergey Levine
LM&Ro
158
437
0
10 Jul 2022
Finding Fallen Objects Via Asynchronous Audio-Visual Integration
Finding Fallen Objects Via Asynchronous Audio-Visual Integration
Chuang Gan
Yi Gu
Siyuan Zhou
Jeremy Schwartz
S. Alter
James Traer
Dan Gutfreund
J. Tenenbaum
Josh H. McDermott
Antonio Torralba
52
19
0
07 Jul 2022
Good Time to Ask: A Learning Framework for Asking for Help in Embodied
  Visual Navigation
Good Time to Ask: A Learning Framework for Asking for Help in Embodied Visual Navigation
Jenny Zhang
Samson Yu
Jiafei Duan
Cheston Tan
36
4
0
20 Jun 2022
ToolTango: Common sense Generalization in Predicting Sequential Tool
  Interactions for Robot Plan Synthesis
ToolTango: Common sense Generalization in Predicting Sequential Tool Interactions for Robot Plan Synthesis
Shreshth Tuli
Rajas Bansal
Rohan Paul
Mausam
LM&Ro
20
4
0
18 Jun 2022
MineDojo: Building Open-Ended Embodied Agents with Internet-Scale
  Knowledge
MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge
Linxi Fan
Guanzhi Wang
Yunfan Jiang
Ajay Mandlekar
Yuncong Yang
Haoyi Zhu
Andrew Tang
De-An Huang
Yuke Zhu
Anima Anandkumar
LM&Ro
51
352
0
17 Jun 2022
VLMbench: A Compositional Benchmark for Vision-and-Language Manipulation
VLMbench: A Compositional Benchmark for Vision-and-Language Manipulation
Kai Zheng
Xiaotong Chen
Odest Chadwicke Jenkins
Qing Guo
LM&Ro
CoGe
21
54
0
17 Jun 2022
ProcTHOR: Large-Scale Embodied AI Using Procedural Generation
ProcTHOR: Large-Scale Embodied AI Using Procedural Generation
Matt Deitke
Eli VanderBilt
Alvaro Herrasti
Luca Weihs
Jordi Salvador
...
Winson Han
Eric Kolve
Ali Farhadi
Aniruddha Kembhavi
Roozbeh Mottaghi
LM&Ro
44
237
0
14 Jun 2022
ABCDE: An Agent-Based Cognitive Development Environment
ABCDE: An Agent-Based Cognitive Development Environment
Jieyi Ye
Jiafei Duan
Samson Yu
B. Wen
Cheston Tan
13
1
0
10 Jun 2022
FOAM: A Follower-aware Speaker Model For Vision-and-Language Navigation
FOAM: A Follower-aware Speaker Model For Vision-and-Language Navigation
Zi-Yi Dou
Nanyun Peng
26
22
0
09 Jun 2022
Multimodal Conversational AI: A Survey of Datasets and Approaches
Multimodal Conversational AI: A Survey of Datasets and Approaches
Anirudh S. Sundar
Larry Heck
38
29
0
13 May 2022
Learning to Fold Real Garments with One Arm: A Case Study in Cloud-Based
  Robotics Research
Learning to Fold Real Garments with One Arm: A Case Study in Cloud-Based Robotics Research
Ryan Hoque
K. Shivakumar
Shrey Aeron
Gabriel Deza
Aditya Ganapathi
Adrian S. Wong
Johnny Lee
Andy Zeng
Vincent Vanhoucke
Ken Goldberg
31
21
0
21 Apr 2022
Super-NaturalInstructions: Generalization via Declarative Instructions
  on 1600+ NLP Tasks
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
Yizhong Wang
Swaroop Mishra
Pegah Alipoormolabashi
Yeganeh Kordi
Amirreza Mirzaei
...
Chitta Baral
Yejin Choi
Noah A. Smith
Hannaneh Hajishirzi
Daniel Khashabi
ELM
47
785
0
16 Apr 2022
On the Importance of Karaka Framework in Multi-modal Grounding
On the Importance of Karaka Framework in Multi-modal Grounding
Sai Kiran Gorthi
R. Mamidi
22
1
0
09 Apr 2022
Habitat-Web: Learning Embodied Object-Search Strategies from Human
  Demonstrations at Scale
Habitat-Web: Learning Embodied Object-Search Strategies from Human Demonstrations at Scale
Ram Ramrakhya
Eric Undersander
Dhruv Batra
Abhishek Das
LM&Ro
35
109
0
07 Apr 2022
Inferring Rewards from Language in Context
Inferring Rewards from Language in Context
Jessy Lin
Daniel Fried
Dan Klein
Anca Dragan
LM&Ro
29
54
0
05 Apr 2022
Moment-based Adversarial Training for Embodied Language Comprehension
Moment-based Adversarial Training for Embodied Language Comprehension
Shintaro Ishikawa
K. Sugiura
LM&Ro
46
8
0
02 Apr 2022
Continuous Scene Representations for Embodied AI
Continuous Scene Representations for Embodied AI
S. Gadre
Kiana Ehsani
Shuran Song
Roozbeh Mottaghi
33
47
0
31 Mar 2022
EnvEdit: Environment Editing for Vision-and-Language Navigation
EnvEdit: Environment Editing for Vision-and-Language Navigation
Jialu Li
Hao Tan
Joey Tianyi Zhou
31
80
0
29 Mar 2022
Text2Pos: Text-to-Point-Cloud Cross-Modal Localization
Text2Pos: Text-to-Point-Cloud Cross-Modal Localization
Manuel Kolmet
Qunjie Zhou
Aljosa Osep
Laura Leal-Taixe
27
23
0
28 Mar 2022
CoWs on Pasture: Baselines and Benchmarks for Language-Driven Zero-Shot
  Object Navigation
CoWs on Pasture: Baselines and Benchmarks for Language-Driven Zero-Shot Object Navigation
S. Gadre
Mitchell Wortsman
Gabriel Ilharco
Ludwig Schmidt
Shuran Song
CLIP
LM&Ro
44
142
0
20 Mar 2022
Object Manipulation via Visual Target Localization
Object Manipulation via Visual Target Localization
Kiana Ehsani
Ali Farhadi
Aniruddha Kembhavi
Roozbeh Mottaghi
LM&Ro
26
9
0
15 Mar 2022
Summarizing a virtual robot's past actions in natural language
Summarizing a virtual robot's past actions in natural language
Chad DeChant
Daniel Bauer
LM&Ro
31
4
0
13 Mar 2022
One-Shot Learning from a Demonstration with Hierarchical Latent Language
One-Shot Learning from a Demonstration with Hierarchical Latent Language
Nathaniel Weir
Xingdi Yuan
Marc-Alexandre Côté
Matthew J. Hausknecht
Romain Laroche
Ida Momennejad
H. V. Seijen
Benjamin Van Durme
BDL
24
6
0
09 Mar 2022
Bridging the Gap Between Learning in Discrete and Continuous
  Environments for Vision-and-Language Navigation
Bridging the Gap Between Learning in Discrete and Continuous Environments for Vision-and-Language Navigation
Yicong Hong
Zun Wang
Qi Wu
Stephen Gould
3DV
32
64
0
05 Mar 2022
GraphWorld: Fake Graphs Bring Real Insights for GNNs
GraphWorld: Fake Graphs Bring Real Insights for GNNs
John Palowitch
Anton Tsitsulin
Brandon Mayer
Bryan Perozzi
GNN
195
68
0
28 Feb 2022
DialFRED: Dialogue-Enabled Agents for Embodied Instruction Following
DialFRED: Dialogue-Enabled Agents for Embodied Instruction Following
Xiaofeng Gao
Qiaozi Gao
Ran Gong
Kaixiang Lin
Govind Thattai
Gaurav Sukhatme
LM&Ro
89
70
0
27 Feb 2022
Think Global, Act Local: Dual-scale Graph Transformer for
  Vision-and-Language Navigation
Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation
Shizhe Chen
Pierre-Louis Guhur
Makarand Tapaswi
Cordelia Schmid
Ivan Laptev
LM&Ro
33
139
0
23 Feb 2022
Pre-Trained Language Models for Interactive Decision-Making
Pre-Trained Language Models for Interactive Decision-Making
Shuang Li
Xavier Puig
Chris Paxton
Yilun Du
Clinton Jia Wang
...
Anima Anandkumar
Jacob Andreas
Igor Mordatch
Antonio Torralba
Yuke Zhu
LM&Ro
39
247
0
03 Feb 2022
IFOR: Iterative Flow Minimization for Robotic Object Rearrangement
IFOR: Iterative Flow Minimization for Robotic Object Rearrangement
Ankit Goyal
Arsalan Mousavian
Chris Paxton
Yu-Wei Chao
Brian Okorn
Jia Deng
Dieter Fox
35
55
0
01 Feb 2022
Learning to Act with Affordance-Aware Multimodal Neural SLAM
Learning to Act with Affordance-Aware Multimodal Neural SLAM
Zhiwei Jia
Kaixiang Lin
Yizhou Zhao
Qiaozi Gao
Govind Thattai
Gaurav Sukhatme
LM&Ro
31
15
0
24 Jan 2022
Language Models as Zero-Shot Planners: Extracting Actionable Knowledge
  for Embodied Agents
Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents
Wenlong Huang
Pieter Abbeel
Deepak Pathak
Igor Mordatch
LM&Ro
42
1,062
0
18 Jan 2022
Less is More: Generating Grounded Navigation Instructions from Landmarks
Less is More: Generating Grounded Navigation Instructions from Landmarks
Su Wang
Ceslee Montgomery
Jordi Orbay
Vighnesh Birodkar
Aleksandra Faust
Izzeddin Gur
Natasha Jaques
Austin Waters
Jason Baldridge
Peter Anderson
20
63
0
25 Nov 2021
Simple but Effective: CLIP Embeddings for Embodied AI
Simple but Effective: CLIP Embeddings for Embodied AI
Apoorv Khandelwal
Luca Weihs
Roozbeh Mottaghi
Aniruddha Kembhavi
VLM
LM&Ro
47
217
0
18 Nov 2021
LILA: Language-Informed Latent Actions
LILA: Language-Informed Latent Actions
Siddharth Karamcheti
Megha Srivastava
Percy Liang
Dorsa Sadigh
LM&Ro
30
31
0
05 Nov 2021
History Aware Multimodal Transformer for Vision-and-Language Navigation
History Aware Multimodal Transformer for Vision-and-Language Navigation
Shizhe Chen
Pierre-Louis Guhur
Cordelia Schmid
Ivan Laptev
LM&Ro
28
226
0
25 Oct 2021
Goal-Aware Cross-Entropy for Multi-Target Reinforcement Learning
Goal-Aware Cross-Entropy for Multi-Target Reinforcement Learning
Kibeom Kim
Min Whoo Lee
Yoonsung Kim
Je-hwan Ryu
Minsu Lee
Byoung-Tak Zhang
24
8
0
25 Oct 2021
CORA: Benchmarks, Baselines, and Metrics as a Platform for Continual
  Reinforcement Learning Agents
CORA: Benchmarks, Baselines, and Metrics as a Platform for Continual Reinforcement Learning Agents
Sam Powers
Eliot Xing
Eric Kolve
Roozbeh Mottaghi
Abhinav Gupta
OffRL
31
38
0
19 Oct 2021
Shaping embodied agent behavior with activity-context priors from
  egocentric video
Shaping embodied agent behavior with activity-context priors from egocentric video
Tushar Nagarajan
Kristen Grauman
EgoV
LM&Ro
46
13
0
14 Oct 2021
A Framework for Learning to Request Rich and Contextually Useful
  Information from Humans
A Framework for Learning to Request Rich and Contextually Useful Information from Humans
Khanh Nguyen
Yonatan Bisk
Hal Daumé
47
16
0
14 Oct 2021
Are you doing what I say? On modalities alignment in ALFRED
Are you doing what I say? On modalities alignment in ALFRED
Ting-Rui Chiang
Yi-Ting Yeh
Ta-Chung Chi
Yau-Shian Wang
27
1
0
12 Oct 2021
Previous
1234
Next