1810.02338
Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding
4 October 2018
Kexin Yi
Jiajun Wu
Chuang Gan
Antonio Torralba
Pushmeet Kohli
J. Tenenbaum
NAI
Papers citing "Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding"
50 / 107 papers shown
Title
Modular Machine Learning: An Indispensable Path towards New-Generation Large Language Models
X. Wang
Haoyang Li
Zeyang Zhang
H. Chen
Wenwu Zhu
LRM
84
0
0
28 Apr 2025
We'll Fix it in Post: Improving Text-to-Video Generation with Neuro-Symbolic Feedback
Minkyu Choi
Sundar Sripada V. S.
Harsh Goel
Sahil Shah
Sandeep P. Chinchali
DiffM
VGen
91
0
0
24 Apr 2025
Predicate Hierarchies Improve Few-Shot State Classification
Emily Jin
Joy Hsu
Jiajun Wu
OffRL
79
0
0
18 Feb 2025
Visual Graph Question Answering with ASP and LLMs for Language Parsing
Jakob Johannes Bauer
Thomas Eiter
Nelson Higuera Ruiz
J. Oetsch
GNN
64
0
0
13 Feb 2025
Learning to Reason Iteratively and Parallelly for Complex Visual Reasoning Scenarios
Shantanu Jaiswal
Debaditya Roy
Basura Fernando
Cheston Tan
ReLM
LRM
79
2
0
20 Nov 2024
Breaking Neural Network Scaling Laws with Modularity
Akhilan Boopathy
Sunshine Jiang
William Yue
Jaedong Hwang
Abhiram Iyer
Ila Fiete
OOD
41
2
0
09 Sep 2024
FlowLearn: Evaluating Large Vision-Language Models on Flowchart Understanding
Huitong Pan
Qi Zhang
Cornelia Caragea
Eduard Constantin Dragut
Longin Jan Latecki
33
4
0
06 Jul 2024
Automated Molecular Concept Generation and Labeling with Large Language Models
Shichang Zhang
Botao Xia
Zimin Zhang
Qianli Wu
Fang Sun
Ziniu Hu
Yizhou Sun
43
0
0
13 Jun 2024
STAR: A Benchmark for Situated Reasoning in Real-World Videos
Bo Wu
Shoubin Yu
Zhenfang Chen
Joshua B Tenenbaum
Chuang Gan
38
177
0
15 May 2024
Closed Loop Interactive Embodied Reasoning for Robot Manipulation
Michal Nazarczuk
Jan Kristof Behrens
Karla Stepanova
Matej Hoffmann
K. Mikolajczyk
LM&Ro
LRM
55
1
0
23 Apr 2024
Pre-trained Vision-Language Models Learn Discoverable Visual Concepts
Yuan Zang
Tian Yun
Hao Tan
Trung Bui
Chen Sun
VLM
CoGe
58
9
0
19 Apr 2024
Enhancing Visual Question Answering through Question-Driven Image Captions as Prompts
Övgü Özdemir
Erdem Akagündüz
41
10
0
12 Apr 2024
OLViT: Multi-Modal State Tracking via Attention-Based Embeddings for Video-Grounded Dialog
Adnen Abdessaied
Manuel von Hochmeister
Andreas Bulling
40
2
0
20 Feb 2024
ContPhy: Continuum Physical Concept Learning and Reasoning from Videos
Zhicheng Zheng
Xin Yan
Zhenfang Chen
Jingzhou Wang
Qin Zhi Eddie Lim
Joshua B. Tenenbaum
Chuang Gan
LRM
43
6
0
09 Feb 2024
Image Translation as Diffusion Visual Programmers
Cheng Han
James Liang
Qifan Wang
Majid Rabbani
S. Dianat
Raghuveer M. Rao
Ying Nian Wu
Dongfang Liu
29
8
0
18 Jan 2024
Neural-Logic Human-Object Interaction Detection
Liulei Li
Jianan Wei
Wenguan Wang
Yi Yang
46
16
0
16 Nov 2023
3D-Aware Visual Question Answering about Parts, Poses and Occlusions
Xingrui Wang
Wufei Ma
Zhuowan Li
Adam Kortylewski
Alan L. Yuille
CoGe
27
12
0
27 Oct 2023
What's Left? Concept Grounding with Logic-Enhanced Foundation Models
Joy Hsu
Jiayuan Mao
Joshua B. Tenenbaum
Jiajun Wu
VLM
ReLM
LRM
30
21
0
24 Oct 2023
D3: Data Diversity Design for Systematic Generalization in Visual Question Answering
Amir Rahimi
Vanessa D’Amario
Moyuru Yamada
Kentaro Takemoto
Tomotake Sasaki
Xavier Boix
33
1
0
15 Sep 2023
Does Visual Pretraining Help End-to-End Reasoning?
Chen Sun
Calvin Luo
Xingyi Zhou
Anurag Arnab
Cordelia Schmid
OCL
LRM
ViT
35
3
0
17 Jul 2023
Learning Differentiable Logic Programs for Abstract Visual Reasoning
Hikaru Shindo
Viktor Pfanschilling
Devendra Singh Dhami
Kristian Kersting
NAI
32
6
0
03 Jul 2023
Scalable Neural-Probabilistic Answer Set Programming
Arseny Skryagin
Daniel Ochs
Devendra Singh Dhami
Kristian Kersting
35
5
0
14 Jun 2023
AI Transparency in the Age of LLMs: A Human-Centered Research Roadmap
Q. V. Liao
J. Vaughan
38
158
0
02 Jun 2023
Scallop: A Language for Neurosymbolic Programming
Ziyang Li
Jiani Huang
Mayur Naik
ReLM
LRM
NAI
24
30
0
10 Apr 2023
ViperGPT: Visual Inference via Python Execution for Reasoning
Dídac Surís
Sachit Menon
Carl Vondrick
MLLM
LRM
ReLM
45
431
0
14 Mar 2023
Concept Learning for Interpretable Multi-Agent Reinforcement Learning
Renos Zabounidis
Joseph Campbell
Simon Stepputtis
Dana Hughes
Katia P. Sycara
36
15
0
23 Feb 2023
Learning from Noisy Crowd Labels with Logics
Zhijun Chen
Hailong Sun
Haoqian He
Pengpeng Chen
NoLa
NAI
32
7
0
13 Feb 2023
Dissociating language and thought in large language models
Kyle Mahowald
Anna A. Ivanova
I. Blank
Nancy Kanwisher
J. Tenenbaum
Evelina Fedorenko
ELM
ReLM
29
209
0
16 Jan 2023
See, Think, Confirm: Interactive Prompting Between Vision and Language Models for Knowledge-based Visual Reasoning
Zhenfang Chen
Qinhong Zhou
Yikang Shen
Yining Hong
Hao Zhang
Chuang Gan
LRM
VLM
33
35
0
12 Jan 2023
In Defense of Structural Symbolic Representation for Video Event-Relation Prediction
Andrew Lu
Xudong Lin
Yulei Niu
Shih-Fu Chang
19
2
0
06 Jan 2023
Learning Action-Effect Dynamics from Pairs of Scene-graphs
Shailaja Keyur Sampat
Pratyay Banerjee
Yezhou Yang
Chitta Baral
GNN
23
0
0
07 Dec 2022
Visual Question Answering From Another Perspective: CLEVR Mental Rotation Tests
Christopher Beckham
Martin Weiss
Florian Golemo
S. Honari
Derek Nowrouzezahrai
C. Pal
28
7
0
03 Dec 2022
Super-CLEVR: A Virtual Benchmark to Diagnose Domain Robustness in Visual Reasoning
Zhuowan Li
Xingrui Wang
Elias Stengel-Eskin
Adam Kortylewski
Wufei Ma
Benjamin Van Durme
Alan L. Yuille
OOD
LRM
26
57
0
01 Dec 2022
Versatile Diffusion: Text, Images and Variations All in One Diffusion Model
Xingqian Xu
Zhangyang Wang
Eric Zhang
Kai Wang
Humphrey Shi
DiffM
35
183
0
15 Nov 2022
Generalization Differences between End-to-End and Neuro-Symbolic Vision-Language Reasoning Systems
Wang Zhu
Jesse Thomason
Robin Jia
VLM
OOD
NAI
LRM
31
6
0
26 Oct 2022
RulE: Knowledge Graph Reasoning with Rule Embedding
Xiaojuan Tang
Song-Chun Zhu
Yitao Liang
Muhan Zhang
15
2
0
24 Oct 2022
Mind's Eye: Grounded Language Model Reasoning through Simulation
Ruibo Liu
Jason W. Wei
S. Gu
Te-Yen Wu
Soroush Vosoughi
Claire Cui
Denny Zhou
Andrew M. Dai
ReLM
LRM
118
79
0
11 Oct 2022
TCNL: Transparent and Controllable Network Learning Via Embedding Human-Guided Concepts
Zhihao Wang
Chuang Zhu
24
1
0
07 Oct 2022
Learning to Collocate Visual-Linguistic Neural Modules for Image Captioning
Xu Yang
Hanwang Zhang
Chongyang Gao
Jianfei Cai
MLLM
40
10
0
04 Oct 2022
On Grounded Planning for Embodied Tasks with Language Models
Bill Yuchen Lin
Chengsong Huang
Qian Liu
Wenda Gu
Sam Sommerer
Xiang Ren
LM&Ro
34
39
0
29 Aug 2022
Diagnose Like a Radiologist: Hybrid Neuro-Probabilistic Reasoning for Attribute-Based Medical Image Diagnosis
Gangming Zhao
Quanlong Feng
Chaoqi Chen
Zhen Zhou
Yizhou Yu
34
31
0
19 Aug 2022
Challenges in Applying Robotics to Retail Store Management
Vartika Sengar
Aditya Kapoor
Nijil George
Vighnesh Vatsal
J. Gubbi
P. Balamuralidhar
Arpan Pal
30
4
0
18 Aug 2022
Neuro-Symbolic Learning: Principles and Applications in Ophthalmology
Muhammad Hassan
Haifei Guan
Aikaterini Melliou
Yuqi Wang
Qianhui Sun
...
Qi Huang
Jiefu Tan
Qinwang Xing
Peiwu Qin
Dongmei Yu
NAI
39
14
0
31 Jul 2022
Leveraging Explanations in Interactive Machine Learning: An Overview
Stefano Teso
Öznur Alkan
Wolfgang Stammer
Elizabeth M. Daly
XAI
FAtt
LRM
26
62
0
29 Jul 2022
Hierarchical Symbolic Reasoning in Hyperbolic Space for Deep Discriminative Models
Ainkaran Santhirasekaram
Avinash Kori
A. Rockall
Mathias Winkler
Francesca Toni
Ben Glocker
FAtt
42
4
0
05 Jul 2022
PROTOtypical Logic Tensor Networks (PROTO-LTN) for Zero Shot Learning
Simone Martone
Francesco Manigrasso
Fabrizio Lamberti
Lia Morra
34
3
0
26 Jun 2022
A-OKVQA: A Benchmark for Visual Question Answering using World Knowledge
Dustin Schwenk
Apoorv Khandelwal
Christopher Clark
Kenneth Marino
Roozbeh Mottaghi
16
505
0
03 Jun 2022
Multimodal Conversational AI: A Survey of Datasets and Approaches
Anirudh S. Sundar
Larry Heck
38
29
0
13 May 2022
What is Right for Me is Not Yet Right for You: A Dataset for Grounding Relative Directions via Multi-Task Learning
Jae Hee Lee
Matthias Kerzel
Kyra Ahrens
C. Weber
S. Wermter
38
9
0
05 May 2022
RelViT: Concept-guided Vision Transformer for Visual Relational Reasoning
Xiaojian Ma
Weili Nie
Zhiding Yu
Huaizu Jiang
Chaowei Xiao
Yuke Zhu
Song-Chun Zhu
Anima Anandkumar
ViT
LRM
30
19
0
24 Apr 2022