arXiv: 1705.03633
Inferring and Executing Programs for Visual Reasoning

10 May 2017
Justin Johnson
B. Hariharan
L. van der Maaten
Judy Hoffman
Li Fei-Fei
C. L. Zitnick
Ross B. Girshick
    NAI

Papers citing "Inferring and Executing Programs for Visual Reasoning"

50 / 128 papers shown
Neuro Symbolic Knowledge Reasoning for Procedural Video Question Answering
Thanh-Son Nguyen
Hong Yang
Tzeh Yuan Neoh
Hao Zhang
Ee Yeo Keat
Basura Fernando
NAI
64
0
0
19 Mar 2025
MoVer: Motion Verification for Motion Graphics Animations
Jiaju Ma
Maneesh Agrawala
VGen
51
0
0
19 Feb 2025
DiSciPLE: Learning Interpretable Programs for Scientific Visual Discovery
Utkarsh Mall
Cheng Perng Phoo
Mia Chiquier
Bharath Hariharan
Kavita Bala
Carl Vondrick
79
1
0
17 Feb 2025
Learning to Reason Iteratively and Parallelly for Complex Visual Reasoning Scenarios
Shantanu Jaiswal
Debaditya Roy
Basura Fernando
Cheston Tan
ReLM
LRM
79
2
0
20 Nov 2024
Discovering Object Attributes by Prompting Large Language Models with Perception-Action APIs
A. Mavrogiannis
Dehao Yuan
Yiannis Aloimonos
LM&Ro
43
0
0
23 Sep 2024
Hydra-SGG: Hybrid Relation Assignment for One-stage Scene Graph Generation
Minghan Chen
Guikun Chen
Wenguan Wang
Yi Yang
58
3
0
16 Sep 2024
SOK-Bench: A Situated Video Reasoning Benchmark with Aligned Open-World Knowledge
Andong Wang
Bo Wu
Sunli Chen
Zhenfang Chen
Haotian Guan
Wei-Ning Lee
Li Erran Li
Chuang Gan
LRM
RALM
37
16
0
15 May 2024
STAR: A Benchmark for Situated Reasoning in Real-World Videos
Bo Wu
Shoubin Yu
Zhenfang Chen
Joshua B Tenenbaum
Chuang Gan
46
178
0
15 May 2024
Closed Loop Interactive Embodied Reasoning for Robot Manipulation
Michal Nazarczuk
Jan Kristof Behrens
Karla Stepanova
Matej Hoffmann
K. Mikolajczyk
LM&Ro
LRM
57
1
0
23 Apr 2024
MoReVQA: Exploring Modular Reasoning Models for Video Question Answering
Juhong Min
Shyamal Buch
Arsha Nagrani
Minsu Cho
Cordelia Schmid
LRM
44
20
0
09 Apr 2024
PhotoScout: Synthesis-Powered Multi-Modal Image Search
Celeste Barnaby
Qiaochu Chen
Chenglong Wang
Işıl Dillig
48
2
0
19 Jan 2024
What's Left? Concept Grounding with Logic-Enhanced Foundation Models
Joy Hsu
Jiayuan Mao
Joshua B. Tenenbaum
Jiajun Wu
VLM
ReLM
LRM
40
21
0
24 Oct 2023
D3: Data Diversity Design for Systematic Generalization in Visual Question Answering
Amir Rahimi
Vanessa D’Amario
Moyuru Yamada
Kentaro Takemoto
Tomotake Sasaki
Xavier Boix
41
1
0
15 Sep 2023
Image Transformation Sequence Retrieval with General Reinforcement Learning
Enrique Mas-Candela
Antonio Ríos-Vila
Jorge Calvo-Zaragoza
27
0
0
13 Jul 2023
Towards AGI in Computer Vision: Lessons Learned from GPT and Large Language Models
Lingxi Xie
Longhui Wei
Xiaopeng Zhang
Kaifeng Bi
Xiaotao Gu
Jianlong Chang
Qi Tian
41
7
0
14 Jun 2023
Visual Reasoning: from State to Transformation
Xin Hong
Yanyan Lan
Liang Pang
J. Guo
Xueqi Cheng
LRM
27
4
0
02 May 2023
Curriculum Learning for Compositional Visual Reasoning
Wafa Aissa
Marin Ferecatu
M. Crucianu
LRM
34
3
0
27 Mar 2023
ViperGPT: Visual Inference via Python Execution for Reasoning
Dídac Surís
Sachit Menon
Carl Vondrick
MLLM
LRM
ReLM
45
435
0
14 Mar 2023
Dissociating language and thought in large language models
Kyle Mahowald
Anna A. Ivanova
I. Blank
Nancy Kanwisher
J. Tenenbaum
Evelina Fedorenko
ELM
ReLM
29
209
0
16 Jan 2023
SrTR: Self-reasoning Transformer with Visual-linguistic Knowledge for Scene Graph Generation
Yuxiang Zhang
Zhenbo Liu
Shuai Wang
ReLM
LRM
34
1
0
19 Dec 2022
A Short Survey of Systematic Generalization
Yuanpeng Li
AI4CE
43
1
0
22 Nov 2022
Visual Programming: Compositional visual reasoning without training
Tanmay Gupta
Aniruddha Kembhavi
ReLM
VLM
LRM
94
406
0
18 Nov 2022
lilGym: Natural Language Visual Reasoning with Reinforcement Learning
Anne Wu
Kianté Brantley
Noriyuki Kojima
Yoav Artzi
ReLM
OffRL
LRM
27
3
0
03 Nov 2022
Challenges in Applying Robotics to Retail Store Management
Vartika Sengar
Aditya Kapoor
Nijil George
Vighnesh Vatsal
J. Gubbi
P. Balamuralidhar
Arpan Pal
30
4
0
18 Aug 2022
How to Reuse and Compose Knowledge for a Lifetime of Tasks: A Survey on Continual Learning and Functional Composition
Jorge Armando Mendez Mendez
Eric Eaton
KELM
CLL
37
27
0
15 Jul 2022
Node Graph Optimization Using Differentiable Proxies
Yiwei Hu
Paul Guerrero
Miloš Hašan
Holly Rushmeier
Valentin Deschaintre
22
25
0
15 Jul 2022
CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning
Hung Le
Yue Wang
Akhilesh Deepak Gotmare
Silvio Savarese
Guosheng Lin
SyDa
ALM
135
241
0
05 Jul 2022
Beyond RGB: Scene-Property Synthesis with Neural Radiance Fields
Mingtong Zhang
Shuhong Zheng
Zhipeng Bao
M. Hebert
Yu-xiong Wang
29
12
0
09 Jun 2022
What is Right for Me is Not Yet Right for You: A Dataset for Grounding Relative Directions via Multi-Task Learning
Jae Hee Lee
Matthias Kerzel
Kyra Ahrens
C. Weber
S. Wermter
40
9
0
05 May 2022
Theory of Graph Neural Networks: Representation and Learning
Stefanie Jegelka
GNN
AI4CE
33
68
0
16 Apr 2022
CLEVR-X: A Visual Reasoning Dataset for Natural Language Explanations
Leonard Salewski
A. Sophia Koepke
Hendrik P. A. Lensch
Zeynep Akata
LRM
NAI
33
20
0
05 Apr 2022
FALCON: Fast Visual Concept Learning by Integrating Images, Linguistic descriptions, and Conceptual Relations
Lingjie Mei
Jiayuan Mao
Ziqi Wang
Chuang Gan
J. Tenenbaum
VLM
29
21
0
30 Mar 2022
Measuring CLEVRness: Blackbox testing of Visual Reasoning Models
Spyridon Mouselinos
Henryk Michalewski
Mateusz Malinowski
21
3
0
24 Feb 2022
RelTR: Relation Transformer for Scene Graph Generation
Yuren Cong
M. Yang
Bodo Rosenhahn
ViT
100
136
0
27 Jan 2022
Visual Question Answering based on Formal Logic
Muralikrishnna G. Sethuraman
Ali Payani
Faramarz Fekri
J. C. Kerce
NAI
21
3
0
08 Nov 2021
Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language
Mingyu Ding
Zhenfang Chen
Tao Du
Ping Luo
J. Tenenbaum
Chuang Gan
VGen
PINN
OCL
38
74
0
28 Oct 2021
Understanding Interlocking Dynamics of Cooperative Rationalization
Mo Yu
Yang Zhang
Shiyu Chang
Tommi Jaakkola
22
41
0
26 Oct 2021
Calibrating Concepts and Operations: Towards Symbolic Reasoning on Real Images
Zhuowan Li
Elias Stengel-Eskin
Yixiao Zhang
Cihang Xie
Q. Tran
Benjamin Van Durme
Alan Yuille
VLM
24
15
0
01 Oct 2021
Systematic Generalization on gSCAN: What is Nearly Solved and What is Next?
Linlu Qiu
Hexiang Hu
Bowen Zhang
Peter Shaw
Fei Sha
33
21
0
25 Sep 2021
ReaSCAN: Compositional Reasoning in Language Grounding
Zhengxuan Wu
Elisa Kreiss
Desmond C. Ong
Christopher Potts
CoGe
LRM
34
22
0
18 Sep 2021
Discovering the Unknown Knowns: Turning Implicit Knowledge in the Dataset into Explicit Training Examples for Visual Question Answering
Jihyung Kil
Cheng Zhang
D. Xuan
Wei-Lun Chao
61
20
0
13 Sep 2021
Spatial-Temporal Transformer for Dynamic Scene Graph Generation
Yuren Cong
Wentong Liao
H. Ackermann
Bodo Rosenhahn
M. Yang
ViT
22
122
0
26 Jul 2021
Adventurer's Treasure Hunt: A Transparent System for Visually Grounded Compositional Visual Question Answering based on Scene Graphs
Daniel Reich
F. Putze
Tanja Schultz
30
2
0
28 Jun 2021
Leveraging Language to Learn Program Abstractions and Search Heuristics
Catherine Wong
Kevin Ellis
J. Tenenbaum
Jacob Andreas
27
54
0
18 Jun 2021
Supervising the Transfer of Reasoning Patterns in VQA
Corentin Kervadec
Christian Wolf
G. Antipov
M. Baccouche
Madiha Nadri Wolf
30
10
0
10 Jun 2021
Linguistic Structures as Weak Supervision for Visual Scene Graph Generation
Keren Ye
Adriana Kovashka
29
52
0
28 May 2021
A Review on Explainability in Multimodal Deep Neural Nets
Gargi Joshi
Rahee Walambe
K. Kotecha
29
140
0
17 May 2021
MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding
Aishwarya Kamath
Mannat Singh
Yann LeCun
Gabriel Synnaeve
Ishan Misra
Nicolas Carion
ObjD
VLM
90
863
0
26 Apr 2021
VGNMN: Video-grounded Neural Module Network to Video-Grounded Language Tasks
Hung Le
Nancy F. Chen
Guosheng Lin
MLLM
28
19
0
16 Apr 2021
CLEVR_HYP: A Challenge Dataset and Baselines for Visual Question Answering with Hypothetical Actions over Images
Shailaja Keyur Sampat
Akshay Kumar
Yezhou Yang
Chitta Baral
29
26
0
13 Apr 2021