arXiv: 1904.03461
Embodied Question Answering in Photorealistic Environments with Point Cloud Perception

6 April 2019
Erik Wijmans
Samyak Datta
Oleksandr Maksymets
Abhishek Das
Georgia Gkioxari
Stefan Lee
Irfan Essa
Devi Parikh
Dhruv Batra
    3DPC
    LM&Ro

Papers citing "Embodied Question Answering in Photorealistic Environments with Point Cloud Perception"

46 / 46 papers shown
Visual Environment-Interactive Planning for Embodied Complex-Question Answering
Ning Lan
Baoshan Ou
Xuemei Xie
G. Shi
LM&Ro
77
1
0
01 Apr 2025
Beyond the Destination: A Novel Benchmark for Exploration-Aware Embodied Question Answering
Kaixuan Jiang
Yong Liu
Weixing Chen
Jingzhou Luo
Ziliang Chen
Ling Pan
G. Li
Liang Lin
70
3
0
14 Mar 2025
EMMOE: A Comprehensive Benchmark for Embodied Mobile Manipulation in Open Environments
Dongping Li
Tielong Cai
Tianci Tang
Wenhao Chai
Katherine Rose Driggs-Campbell
Gaoang Wang
LM&Ro
71
0
0
11 Mar 2025
SPARTUN3D: Situated Spatial Understanding of 3D World in Large Language Models
Yue Zhang
Zhiyang Xu
Ying Shen
Parisa Kordjamshidi
Lifu Huang
39
6
0
04 Oct 2024
Answerability Fields: Answerable Location Estimation via Diffusion Models
Daichi Azuma
Taiki Miyanishi
Shuhei Kurita
Koya Sakamoto
M. Kawanabe
DiffM
48
0
0
26 Jul 2024
A Survey on Vision-Language-Action Models for Embodied AI
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
LM&Ro
82
45
0
23 May 2024
Following the Human Thread in Social Navigation
Luca Scofano
Alessio Sampieri
Tommaso Campari
Valentino Sacco
Indro Spinelli
Lamberto Ballan
Fabio Galasso
45
0
0
17 Apr 2024
Talk2BEV: Language-enhanced Bird's-eye View Maps for Autonomous Driving
Tushar Choudhary
Vikrant Dewangan
Shivam Chandhok
Shubham Priyadarshan
Anushka Jain
A. K. Singh
Siddharth Srivastava
Krishna Murthy Jatavallabhula
K. M. Krishna
50
59
0
03 Oct 2023
An Outlook into the Future of Egocentric Vision
Chiara Plizzari
Gabriele Goletto
Antonino Furnari
Siddhant Bansal
Francesco Ragusa
G. Farinella
Dima Damen
Tatiana Tommasi
EgoV
45
38
0
14 Aug 2023
MLANet: Multi-Level Attention Network with Sub-instruction for Continuous Vision-and-Language Navigation
Zongtao He
Liuyi Wang
Shu Li
Qingqing Yan
Chengju Liu
Qi Chen
29
7
0
02 Mar 2023
PIRLNav: Pretraining with Imitation and RL Finetuning for ObjectNav
Ram Ramrakhya
Dhruv Batra
Erik Wijmans
Abhishek Das
OffRL
33
53
0
18 Jan 2023
ScanEnts3D: Exploiting Phrase-to-3D-Object Correspondences for Improved Visio-Linguistic Models in 3D Scenes
Ahmed Abdelreheem
Kyle Olszewski
Hsin-Ying Lee
Peter Wonka
Panos Achlioptas
3DPC
24
28
0
12 Dec 2022
A General Purpose Supervisory Signal for Embodied Agents
Kunal Pratap Singh
Jordi Salvador
Luca Weihs
Aniruddha Kembhavi
SSL
31
3
0
01 Dec 2022
AVLEN: Audio-Visual-Language Embodied Navigation in 3D Environments
Sudipta Paul
Amit K. Roy-Chowdhury
A. Cherian
33
23
0
14 Oct 2022
Learning a Visually Grounded Memory Assistant
Meera Hahn
Kevin Carlberg
Ruta Desai
James M. Hillis
33
1
0
07 Oct 2022
Iterative Vision-and-Language Navigation
Jacob Krantz
Shurjo Banerjee
Wang Zhu
Jason J. Corso
Peter Anderson
Stefan Lee
Jesse Thomason
LM&Ro
50
18
0
06 Oct 2022
Episodic Memory Question Answering
Samyak Datta
Sameer Dharur
Vincent Cartillier
Ruta Desai
Mukul Khanna
Dhruv Batra
Devi Parikh
EgoV
19
31
0
03 May 2022
Habitat-Web: Learning Embodied Object-Search Strategies from Human Demonstrations at Scale
Ram Ramrakhya
Eric Undersander
Dhruv Batra
Abhishek Das
LM&Ro
44
109
0
07 Apr 2022
Vision-and-Language Navigation: A Survey of Tasks, Methods, and Future Directions
Jing Gu
Eliana Stefani
Qi Wu
Jesse Thomason
Junfeng Fang
LM&Ro
32
105
0
22 Mar 2022
PONI: Potential Functions for ObjectGoal Navigation with Interaction-free Learning
Santhosh Kumar Ramakrishnan
Devendra Singh Chaplot
Ziad Al-Halah
Jitendra Malik
Kristen Grauman
36
152
0
25 Jan 2022
Towards Disturbance-Free Visual Mobile Manipulation
Tianwei Ni
Kiana Ehsani
Luca Weihs
Jordi Salvador
28
9
0
17 Dec 2021
3D Question Answering
Shuquan Ye
Dongdong Chen
Songfang Han
Jing Liao
ViT
31
47
0
15 Dec 2021
Catch Me If You Hear Me: Audio-Visual Navigation in Complex Unmapped Environments with Moving Sounds
Abdelrahman Younes
Daniel Honerkamp
Tim Welschehold
Abhinav Valada
30
40
0
29 Nov 2021
Pano-AVQA: Grounded Audio-Visual Question Answering on 360° Videos
Heeseung Yun
Youngjae Yu
Wonsuk Yang
Kangil Lee
Gunhee Kim
27
79
0
11 Oct 2021
Knowledge-based Embodied Question Answering
Sinan Tan
Mengmeng Ge
Di Guo
Huaping Liu
F. Sun
30
20
0
16 Sep 2021
The Surprising Effectiveness of Visual Odometry Techniques for Embodied PointGoal Navigation
Xiaoming Zhao
Harsh Agrawal
Dhruv Batra
Alex Schwing
36
40
0
26 Aug 2021
Core Challenges in Embodied Vision-Language Planning
Jonathan M Francis
Nariaki Kitamura
Felix Labelle
Xiaopeng Lu
Ingrid Navarro
Jean Oh
LM&Ro
54
45
0
26 Jun 2021
A Survey on Human-aware Robot Navigation
Ronja Möller
Antonino Furnari
Sebastiano Battiato
Aki Härmä
G. Farinella
44
87
0
22 Jun 2021
Refer-it-in-RGBD: A Bottom-up Approach for 3D Visual Grounding in RGBD Images
Haolin Liu
Anran Lin
Xiaoguang Han
Lei Yang
Yizhou Yu
Shuguang Cui
27
40
0
14 Mar 2021
DVD: A Diagnostic Dataset for Multi-step Reasoning in Video Grounded Dialogue
Hung Le
Chinnadhurai Sankar
Seungwhan Moon
Ahmad Beirami
A. Geramifard
Satwik Kottur
VGen
41
18
0
01 Jan 2021
Embodied Visual Active Learning for Semantic Segmentation
David Nilsson
Aleksis Pirinen
Erik Gartner
C. Sminchisescu
42
35
0
17 Dec 2020
How to Train PointGoal Navigation Agents on a (Sample and Compute) Budget
Erik Wijmans
Irfan Essa
Dhruv Batra
3DPC
30
10
0
11 Dec 2020
MultiON: Benchmarking Semantic Map Memory using Multi-Object Navigation
Saim Wani
Shivansh Patel
Unnat Jain
Angel X. Chang
Manolis Savva
34
104
0
07 Dec 2020
Efficient Robotic Object Search via HIEM: Hierarchical Policy Learning with Intrinsic-Extrinsic Modeling
Xin Ye
Yezhou Yang
29
14
0
16 Oct 2020
Semantic MapNet: Building Allocentric Semantic Maps and Representations from Egocentric Views
Vincent Cartillier
Zhile Ren
Neha Jain
Stefan Lee
Irfan Essa
Dhruv Batra
3DPC
29
74
0
02 Oct 2020
Generative Language-Grounded Policy in Vision-and-Language Navigation with Bayes' Rule
Shuhei Kurita
Kyunghyun Cho
LM&Ro
17
23
0
16 Sep 2020
The Robotic Vision Scene Understanding Challenge
David Hall
Ben Talbot
S. Bista
Haoyang Zhang
Rohan Smith
Feras Dayoub
Niko Sünderhauf
23
13
0
11 Sep 2020
Auxiliary Tasks Speed Up Learning PointGoal Navigation
Joel Ye
Dhruv Batra
Erik Wijmans
Abhishek Das
3DPC
EgoV
17
79
0
09 Jul 2020
Beyond the Nav-Graph: Vision-and-Language Navigation in Continuous Environments
Jacob Krantz
Erik Wijmans
Arjun Majumdar
Dhruv Batra
Stefan Lee
24
266
0
06 Apr 2020
Analyzing Visual Representations in Embodied Navigation Tasks
Erik Wijmans
Julian Straub
Dhruv Batra
Irfan Essa
Judy Hoffman
Ari S. Morcos
19
2
0
12 Mar 2020
An Exploration of Embodied Visual Exploration
Santhosh Kumar Ramakrishnan
Dinesh Jayaraman
Kristen Grauman
LM&Ro
37
98
0
07 Jan 2020
Simultaneous Mapping and Target Driven Navigation
G. Georgakis
Yimeng Li
Jana Kosecka
17
16
0
18 Nov 2019
VideoNavQA: Bridging the Gap between Visual and Embodied Question Answering
Cătălina Cangea
Eugene Belilovsky
Pietro Lio
Aaron Courville
16
17
0
14 Aug 2019
Neural Modular Control for Embodied Question Answering
Abhishek Das
Georgia Gkioxari
Stefan Lee
Devi Parikh
Dhruv Batra
LM&Ro
135
128
0
26 Oct 2018
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering
Yash Goyal
Tejas Khot
D. Summers-Stay
Dhruv Batra
Devi Parikh
CoGe
155
3,136
0
02 Dec 2016
CAD2RL: Real Single-Image Flight without a Single Real Image
Fereshteh Sadeghi
Sergey Levine
SSL
246
812
0
13 Nov 2016