ResearchTrend.AI
Yin and Yang: Balancing and Answering Binary Visual Questions

16 November 2015
Peng Zhang
Yash Goyal
D. Summers-Stay
Dhruv Batra
Devi Parikh
    CoGe

Papers citing "Yin and Yang: Balancing and Answering Binary Visual Questions"

50 / 203 papers shown
BabyVLM: Data-Efficient Pretraining of VLMs Inspired by Infant Learning
Shengao Wang
Arjun Chandra
Aoming Liu
Venkatesh Saligrama
Boqing Gong
MLLM
VLM
47
0
0
13 Apr 2025
Resource-efficient Inference with Foundation Model Programs
Lunyiu Nie
Zhimin Ding
Kevin Yu
Marco Cheung
C. Jermaine
S. Chaudhuri
30
0
0
09 Apr 2025
Talking to the brain: Using Large Language Models as Proxies to Model Brain Semantic Representation
Xin Liu
Zhe Zhang
Jingxin Nie
67
0
0
26 Feb 2025
FilterRAG: Zero-Shot Informed Retrieval-Augmented Generation to Mitigate Hallucinations in VQA
S M Sarwar
80
1
0
25 Feb 2025
Beyond Benchmarks: On The False Promise of AI Regulation
Gabriel Stanovsky
Renana Keydar
Gadi Perl
Eliya Habba
41
1
0
28 Jan 2025
MASS: Overcoming Language Bias in Image-Text Matching
Jiwan Chung
Seungwon Lim
Sangkyu Lee
Youngjae Yu
VLM
32
0
0
20 Jan 2025
What makes a good metric? Evaluating automatic metrics for text-to-image consistency
Candace Ross
Melissa Hall
Adriana Romero Soriano
Adina Williams
95
3
0
18 Dec 2024
Task Progressive Curriculum Learning for Robust Visual Question Answering
Ahmed Akl
Abdelwahed Khamis
Zhe Wang
Ali Cheraghian
Sara Khalifa
Kewen Wang
OOD
83
0
0
26 Nov 2024
A Comprehensive Survey on Visual Question Answering Datasets and Algorithms
Raihan Kabir
Naznin Haque
Md. Saiful Islam
Marium-E. Jannat
CoGe
29
1
0
17 Nov 2024
Right this way: Can VLMs Guide Us to See More to Answer Questions?
Li Liu
Diji Yang
Sijia Zhong
Kalyana Suma Sree Tholeti
Lei Ding
Yi Zhang
Leilani H. Gilpin
39
2
0
01 Nov 2024
SimpsonsVQA: Enhancing Inquiry-Based Learning with a Tailored Dataset
Ngoc Dung Huynh
Mohamed Reda Bouadjenek
Sunil Aryal
Imran Razzak
Hakim Hacid
31
0
0
30 Oct 2024
Dreaming Out Loud: A Self-Synthesis Approach For Training Vision-Language Models With Developmentally Plausible Data
Badr AlKhamissi
Yingtian Tang
Abdülkadir Gökce
Johannes Mehrer
Martin Schrimpf
VLM
49
0
0
29 Oct 2024
MMAR: Towards Lossless Multi-Modal Auto-Regressive Probabilistic Modeling
Jian Yang
Dacheng Yin
Yizhou Zhou
Fengyun Rao
Wei-dong Zhai
Yang Cao
Zheng-jun Zha
DiffM
28
7
0
14 Oct 2024
Eliminating the Language Bias for Visual Question Answering with fine-grained Causal Intervention
Ying Liu
Ge Bai
Chenji Lu
Shilong Li
Zhang Zhang
Ruifang Liu
Wenbin Guo
21
0
0
14 Oct 2024
QTG-VQA: Question-Type-Guided Architectural for VideoQA Systems
Zhixian He
Pengcheng Zhao
Fuwei Zhang
Shujin Lin
41
0
0
14 Sep 2024
Evaluating Attribute Comprehension in Large Vision-Language Models
Haiwen Zhang
Zixi Yang
Yuanzhi Liu
Xinran Wang
Zheqi He
Kongming Liang
Zhanyu Ma
ELM
37
0
0
25 Aug 2024
Revisiting Multi-Modal LLM Evaluation
Jian Lu
Shikhar Srivastava
Junyu Chen
Robik Shrestha
Manoj Acharya
Kushal Kafle
Christopher Kanan
30
3
0
09 Aug 2024
ArtVLM: Attribute Recognition Through Vision-Based Prefix Language Modeling
William Y. Zhu
Keren Ye
Junjie Ke
Jiahui Yu
Leonidas J. Guibas
P. Milanfar
Feng Yang
48
2
0
07 Aug 2024
Fairness and Bias Mitigation in Computer Vision: A Survey
Sepehr Dehdashtian
Ruozhen He
Yi Li
Guha Balakrishnan
Nuno Vasconcelos
Vicente Ordonez
Vishnu Naresh Boddeti
40
4
0
05 Aug 2024
Causal Understanding For Video Question Answering
Bhanu Prakash Reddy Guda
Tanmay Kulkarni
Adithya Sampath
Swarnashree Mysore Sathyendra
CML
54
0
0
23 Jul 2024
On the Role of Visual Grounding in VQA
Daniel Reich
Tanja Schultz
21
1
0
26 Jun 2024
FoodieQA: A Multimodal Dataset for Fine-Grained Understanding of Chinese Food Culture
Wenyan Li
Xinyu Crystina Zhang
Jiaang Li
Qiwei Peng
Raphael Tang
...
Guimin Hu
Yifei Yuan
Anders Søgaard
Daniel Hershcovich
Desmond Elliott
CoGe
35
7
0
16 Jun 2024
3AM: An Ambiguity-Aware Multi-Modal Machine Translation Dataset
Xinyu Ma
Xuebo Liu
Derek F. Wong
Jun Rao
Bei Li
Liang Ding
Lidia S. Chao
Dacheng Tao
Min Zhang
41
2
0
29 Apr 2024
VideoDistill: Language-aware Vision Distillation for Video Question Answering
Bo Zou
Chao Yang
Yu Qiao
Chengbin Quan
Youjian Zhao
VGen
50
1
0
01 Apr 2024
Few-Shot VQA with Frozen LLMs: A Tale of Two Approaches
Igor Sterner
Weizhe Lin
Jinghong Chen
Bill Byrne
25
2
0
17 Mar 2024
Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training
David Wan
Jaemin Cho
Elias Stengel-Eskin
Mohit Bansal
VLM
ObjD
53
29
0
04 Mar 2024
Measuring Vision-Language STEM Skills of Neural Models
Jianhao Shen
Ye Yuan
Srbuhi Mirzoyan
Ming Zhang
Chenguang Wang
VLM
33
8
0
27 Feb 2024
Multimodal Transformer With a Low-Computational-Cost Guarantee
Sungjin Park
Edward Choi
52
1
0
23 Feb 2024
Proximity QA: Unleashing the Power of Multi-Modal Large Language Models for Spatial Proximity Analysis
Jianing Li
Xi Nan
Ming Lu
Li Du
Shanghang Zhang
50
1
0
31 Jan 2024
MIVC: Multiple Instance Visual Component for Visual-Language Models
Wenyi Wu
Qi Li
Leon Wenliang Zhong
Junzhou Huang
33
3
0
28 Dec 2023
Understanding Unimodal Bias in Multimodal Deep Linear Networks
Yedi Zhang
Peter E. Latham
Andrew Saxe
31
6
0
01 Dec 2023
Debiasing Multimodal Models via Causal Information Minimization
Vaidehi Patil
A. Maharana
Mohit Bansal
CML
38
2
0
28 Nov 2023
From Image to Language: A Critical Analysis of Visual Question Answering (VQA) Approaches, Challenges, and Opportunities
Md Farhan Ishmam
Md Sakib Hossain Shovon
M. F. Mridha
Nilanjan Dey
43
36
0
01 Nov 2023
CAD -- Contextual Multi-modal Alignment for Dynamic AVQA
Asmar Nadeem
Adrian Hilton
R. Dawes
Graham A. Thomas
A. Mustafa
30
9
0
25 Oct 2023
Dataset Bias Mitigation in Multiple-Choice Visual Question Answering and Beyond
Zhecan Wang
Long Chen
Haoxuan You
Keyang Xu
Yicheng He
Wenhao Li
Noel Codella
Kai-Wei Chang
Shih-Fu Chang
27
3
0
23 Oct 2023
UNK-VQA: A Dataset and a Probe into the Abstention Ability of Multi-modal Large Models
Yanyang Guo
Fangkai Jiao
Zhiqi Shen
Liqiang Nie
Mohan S. Kankanhalli
MLLM
30
5
0
17 Oct 2023
Tackling Data Bias in MUSIC-AVQA: Crafting a Balanced Dataset for Unbiased Question-Answering
Xiulong Liu
Zhikang Dong
Peng Zhang
24
21
0
10 Oct 2023
Learning the meanings of function words from grounded language using a visual question answering model
Eva Portelance
Michael C. Frank
Dan Jurafsky
NAI
33
7
0
16 Aug 2023
Robust Visual Question Answering: Datasets, Methods, and Future Challenges
Jie Ma
Pinghui Wang
Dechen Kong
Zewei Wang
Jun Liu
Hongbin Pei
Junzhou Zhao
OOD
29
18
0
21 Jul 2023
Read, Look or Listen? What's Needed for Solving a Multimodal Dataset
Netta Madvil
Yonatan Bitton
Roy Schwartz
30
2
0
06 Jul 2023
What Matters in Training a GPT4-Style Language Model with Multimodal Inputs?
Yan Zeng
Hanbo Zhang
Jiani Zheng
Jiangnan Xia
Guoqiang Wei
Yang Wei
Yuchen Zhang
Tao Kong
MLLM
27
71
0
05 Jul 2023
Learning to Imagine: Visually-Augmented Natural Language Generation
Tianyi Tang
Yushuo Chen
Yifan Du
Junyi Li
Wayne Xin Zhao
Ji-Rong Wen
DiffM
16
9
0
26 May 2023
Evaluating Object Hallucination in Large Vision-Language Models
Yifan Li
Yifan Du
Kun Zhou
Jinpeng Wang
Wayne Xin Zhao
Ji-Rong Wen
MLLM
LRM
119
699
0
17 May 2023
Fairness in AI Systems: Mitigating gender bias from language-vision models
Lavisha Aggarwal
Shruti Bhargava
19
4
0
03 May 2023
Visual Reasoning: from State to Transformation
Xin Hong
Yanyan Lan
Liang Pang
J. Guo
Xueqi Cheng
LRM
16
3
0
02 May 2023
RoCOCO: Robustness Benchmark of MS-COCO to Stress-test Image-Text Matching Models
Seulki Park
Daeho Um
Hajung Yoon
Sanghyuk Chun
Sangdoo Yun
Jin Young Choi
38
2
0
21 Apr 2023
Improving Visual Question Answering Models through Robustness Analysis and In-Context Learning with a Chain of Basic Questions
Jia-Hong Huang
Modar Alfadly
Guohao Li
M. Worring
OOD
AAML
44
5
0
06 Apr 2023
Breaking Common Sense: WHOOPS! A Vision-and-Language Benchmark of Synthetic and Compositional Images
Nitzan Bitton-Guetta
Yonatan Bitton
Jack Hessel
Ludwig Schmidt
Yuval Elovici
Gabriel Stanovsky
Roy Schwartz
VLM
121
66
0
13 Mar 2023
MAQA: A Multimodal QA Benchmark for Negation
Judith Yue Li
Aren Jansen
Qingqing Huang
Joonseok Lee
Ravi Ganti
Dima Kuzmin
33
5
0
09 Jan 2023
VQA and Visual Reasoning: An Overview of Recent Datasets, Methods and Challenges
R. Zakari
Jim Wilson Owusu
Hailin Wang
Ke Qin
Zaharaddeen Karami Lawal
Yue-hong Dong
LRM
33
16
0
26 Dec 2022