VQA: Visual Question Answering

3 May 2015
Aishwarya Agrawal
Jiasen Lu
Stanislaw Antol
Margaret Mitchell
C. L. Zitnick
Dhruv Batra
Devi Parikh
    CoGe

Papers citing "VQA: Visual Question Answering"

50 / 2,957 papers shown
Title
Discover and Mitigate Unknown Biases with Debiasing Alternate Networks
Zhiheng Li
A. Hoogs
Chenliang Xu
83
56
0
20 Jul 2022
Tip-Adapter: Training-free Adaption of CLIP for Few-shot Classification
Renrui Zhang
Zhang Wei
Rongyao Fang
Peng Gao
Kunchang Li
Jifeng Dai
Yu Qiao
Hongsheng Li
VLM
145
321
0
19 Jul 2022
Exploiting Unlabeled Data with Vision and Language Models for Object Detection
Shiyu Zhao
Zhixing Zhang
S. Schulter
Long Zhao
Vijay Kumar B.G
Anastasis Stathopoulos
Manmohan Chandraker
Dimitris N. Metaxas
VLM, ObjD
89
102
0
18 Jul 2022
Rethinking Data Augmentation for Robust Visual Question Answering
Long Chen
Yuhang Zheng
Jun Xiao
OOD
88
43
0
18 Jul 2022
Towards the Human Global Context: Does the Vision-Language Model Really Judge Like a Human Being?
Sangmyeong Woh
Jaemin Lee
Hoki Kim
Jinsuk Lee
48
0
0
18 Jul 2022
Zero-Shot Temporal Action Detection via Vision-Language Prompting
Sauradip Nag
Xiatian Zhu
Yi-Zhe Song
Tao Xiang
VLM
79
68
0
17 Jul 2022
FashionViL: Fashion-Focused Vision-and-Language Representation Learning
Xiaoping Han
Licheng Yu
Xiatian Zhu
Li Zhang
Yi-Zhe Song
Tao Xiang
AI4TS
54
49
0
17 Jul 2022
Scene Graph for Embodied Exploration in Cluttered Scenario
Yuhong Deng
Qie Sima
Di Guo
Huaping Liu
Yi Wang
Gang Hua
LM&Ro
103
2
0
16 Jul 2022
Inner Monologue: Embodied Reasoning through Planning with Language Models
Wenlong Huang
F. Xia
Ted Xiao
Harris Chan
Jacky Liang
...
Tomas Jackson
Linda Luu
Sergey Levine
Karol Hausman
Brian Ichter
LLMAG, LM&Ro, LRM
211
927
0
12 Jul 2022
BOSS: Bottom-up Cross-modal Semantic Composition with Hybrid Counterfactual Training for Robust Content-based Image Retrieval
Wenqiao Zhang
Jiannan Guo
Meng Li
Haochen Shi
Shengyu Zhang
Juncheng Li
Siliang Tang
Yueting Zhuang
88
6
0
09 Jul 2022
CoSIm: Commonsense Reasoning for Counterfactual Scene Imagination
Hyounghun Kim
Abhaysinh Zala
Mohit Bansal
60
6
0
08 Jul 2022
Crake: Causal-Enhanced Table-Filler for Question Answering over Large Scale Knowledge Base
Minhao Zhang
Ruoyu Zhang
Yanzeng Li
Lei Zou
83
10
0
08 Jul 2022
Adversarial Robustness of Visual Dialog
Lu Yu
Verena Rieser
AAML
83
0
0
06 Jul 2022
Chairs Can be Stood on: Overcoming Object Bias in Human-Object Interaction Detection
Guangzhi Wang
Yangyang Guo
Yongkang Wong
Mohan S. Kankanhalli
90
11
0
06 Jul 2022
Distance Matters in Human-Object Interaction Detection
Guangzhi Wang
Yangyang Guo
Yongkang Wong
Mohan S. Kankanhalli
105
13
0
05 Jul 2022
Scene-Aware Prompt for Multi-modal Dialogue Understanding and Generation
Bin Li
Yixuan Weng
Ziyu Ma
Bin Sun
Shutao Li
VLM
36
2
0
05 Jul 2022
ViRel: Unsupervised Visual Relations Discovery with Graph-level Analogy
D. Zeng
Tailin Wu
J. Leskovec
GNN
110
1
0
04 Jul 2022
Enabling Harmonious Human-Machine Interaction with Visual-Context Augmented Dialogue System: A Review
Hao Wang
Bin Guo
Y. Zeng
Yasan Ding
Chen Qiu
Ying Zhang
Li Yao
Zhiwen Yu
79
2
0
02 Jul 2022
Contrastive Cross-Modal Knowledge Sharing Pre-training for Vision-Language Representation Learning and Retrieval
Keyu Wen
Zhenshan Tan
Qingrong Cheng
Cheng Chen
X. Gu
VLM
78
0
0
02 Jul 2022
Modern Question Answering Datasets and Benchmarks: A Survey
Zhen Wang
85
24
0
30 Jun 2022
A Unified End-to-End Retriever-Reader Framework for Knowledge-based VQA
Yangyang Guo
Liqiang Nie
Yongkang Wong
Yebin Liu
Zhiyong Cheng
Mohan S. Kankanhalli
121
40
0
30 Jun 2022
Technical Report for CVPR 2022 LOVEU AQTC Challenge
Hyeonyu Kim
Jongeun Kim
Jeonghun Kang
S. Park
Dongchan Park
Taehwan Kim
43
0
0
29 Jun 2022
Fair Machine Learning in Healthcare: A Review
Qizhang Feng
Mengnan Du
Na Zou
Helen Zhou
FaML
83
0
0
29 Jun 2022
EBMs vs. CL: Exploring Self-Supervised Visual Pretraining for Visual Question Answering
Violetta Shevchenko
Ehsan Abbasnejad
A. Dick
Anton Van Den Hengel
Damien Teney
73
0
0
29 Jun 2022
Consistency-preserving Visual Question Answering in Medical Imaging
Sergio Tascon-Morales
Pablo Márquez-Neila
Raphael Sznitman
MedIm
92
12
0
27 Jun 2022
Automatic Generation of Product-Image Sequence in E-commerce
Xiaochuan Fan
Chi Zhang
Yong-Jie Yang
Yue Shang
Xueying Zhang
Zhen He
Yun Xiao
Bo Long
Lingfei Wu
60
4
0
26 Jun 2022
CLAMP: Prompt-based Contrastive Learning for Connecting Language and Animal Pose
Xu Zhang
Wen Wang
Zhe Chen
Yufei Xu
Jing Zhang
Dacheng Tao
CLIP, VLM
73
28
0
23 Jun 2022
VisFIS: Visual Feature Importance Supervision with Right-for-the-Right-Reason Objectives
Zhuofan Ying
Peter Hase
Mohit Bansal
LRM
87
13
0
22 Jun 2022
Winning the CVPR'2022 AQTC Challenge: A Two-stage Function-centric Approach
Shiwei Wu
Weidong He
Tong Xu
Hao Wang
Enhong Chen
EgoV
64
3
0
20 Jun 2022
DALL-E for Detection: Language-driven Compositional Image Synthesis for Object Detection
Yunhao Ge
Lyne Tchapmi
Brian Nlong Zhao
Neel Joshi
Laurent Itti
Vibhav Vineet
DiffM, ObjD
107
18
0
20 Jun 2022
Interactive Visual Reasoning under Uncertainty
Manjie Xu
Guangyuan Jiang
Wei Liang
Song-Chun Zhu
Yixin Zhu
LRM
108
5
0
18 Jun 2022
VReBERT: A Simple and Flexible Transformer for Visual Relationship Detection
Yunbo Cui
M. Farazi
ViT
95
1
0
18 Jun 2022
Unified-IO: A Unified Model for Vision, Language, and Multi-Modal Tasks
Jiasen Lu
Christopher Clark
Rowan Zellers
Roozbeh Mottaghi
Aniruddha Kembhavi
ObjD, VLM, MLLM
173
412
0
17 Jun 2022
FD-CAM: Improving Faithfulness and Discriminability of Visual Explanation for CNNs
Hui Li
Zihao Li
Rui Ma
Tieru Wu
FAtt
47
9
0
17 Jun 2022
VLMbench: A Compositional Benchmark for Vision-and-Language Manipulation
Kai Zheng
Xiaotong Chen
Odest Chadwicke Jenkins
Xinze Wang
LM&Ro, CoGe
95
63
0
17 Jun 2022
Zero-Shot Video Question Answering via Frozen Bidirectional Language Models
Antoine Yang
Antoine Miech
Josef Sivic
Ivan Laptev
Cordelia Schmid
152
240
0
16 Jun 2022
Multimodal Dialogue State Tracking
Hung Le
Nancy F. Chen
Guosheng Lin
75
9
0
16 Jun 2022
Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone
Zi-Yi Dou
Aishwarya Kamath
Zhe Gan
Pengchuan Zhang
Jianfeng Wang
...
Ce Liu
Yann LeCun
Nanyun Peng
Jianfeng Gao
Lijuan Wang
VLM, ObjD
117
130
0
15 Jun 2022
Multimodal Learning with Transformers: A Survey
Peng Xu
Xiatian Zhu
David Clifton
ViT
241
579
0
13 Jun 2022
Bringing Image Scene Structure to Video via Frame-Clip Consistency of Object Tokens
Elad Ben-Avraham
Roei Herzig
K. Mangalam
Amir Bar
Anna Rohrbach
Leonid Karlinsky
Trevor Darrell
Amir Globerson
94
0
0
13 Jun 2022
GLIPv2: Unifying Localization and Vision-Language Understanding
Haotian Zhang
Pengchuan Zhang
Xiaowei Hu
Yen-Chun Chen
Liunian Harold Li
Xiyang Dai
Lijuan Wang
Lu Yuan
Lei Li
Jianfeng Gao
ObjD, VLM
106
304
0
12 Jun 2022
A Unified Continuous Learning Framework for Multi-modal Knowledge Discovery and Pre-training
Zhihao Fan
Zhongyu Wei
Jingjing Chen
Siyuan Wang
Zejun Li
Jiarong Xu
Xuanjing Huang
CLL
59
6
0
11 Jun 2022
Revealing Single Frame Bias for Video-and-Language Learning
Jie Lei
Tamara L. Berg
Mohit Bansal
96
115
0
07 Jun 2022
cViL: Cross-Lingual Training of Vision-Language Models using Knowledge Distillation
Kshitij Gupta
Devansh Gautam
R. Mamidi
VLM
74
4
0
07 Jun 2022
Towards Fast Adaptation of Pretrained Contrastive Models for Multi-channel Video-Language Retrieval
Xudong Lin
Simran Tiwari
Shiyuan Huang
Manling Li
Mike Zheng Shou
Heng Ji
Shih-Fu Chang
138
21
0
05 Jun 2022
From Pixels to Objects: Cubic Visual Attention for Visual Question Answering
Jingkuan Song
Pengpeng Zeng
Lianli Gao
Heng Tao Shen
72
62
0
04 Jun 2022
Revisiting the "Video" in Video-Language Understanding
S. Buch
Cristobal Eyzaguirre
Adrien Gaidon
Jiajun Wu
L. Fei-Fei
Juan Carlos Niebles
100
166
0
03 Jun 2022
A-OKVQA: A Benchmark for Visual Question Answering using World Knowledge
Dustin Schwenk
Apoorv Khandelwal
Christopher Clark
Kenneth Marino
Roozbeh Mottaghi
77
556
0
03 Jun 2022
Pruning for Feature-Preserving Circuits in CNNs
Christopher Hamblin
Talia Konkle
G. Alvarez
80
2
0
03 Jun 2022
REVIVE: Regional Visual Representation Matters in Knowledge-Based Visual Question Answering
Yuanze Lin
Yujia Xie
Dongdong Chen
Yichong Xu
Chenguang Zhu
Lu Yuan
93
75
0
02 Jun 2022