1605.09782
Adversarial Feature Learning
31 May 2016
Jeff Donahue
Philipp Krahenbuhl
Trevor Darrell
GAN
Papers citing
"Adversarial Feature Learning"
50 / 642 papers shown
Title
Question-Aware Gaussian Experts for Audio-Visual Question Answering
Hongyeob Kim
Inyoung Jung
Dayoon Suh
Youjia Zhang
Sangmin Lee
Sungeun Hong
61
0
0
06 Mar 2025
Towards Improved Text-Aligned Codebook Learning: Multi-Hierarchical Codebook-Text Alignment with Long Text
Guotao Liang
Baoquan Zhang
Zhiyuan Wen
Junteng Zhao
Yunming Ye
Kola Ye
Yao He
57
0
0
03 Mar 2025
R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts
Zhongyang Li
Ziyue Li
Dinesh Manocha
MoE
53
0
0
27 Feb 2025
LOVA3: Learning to Visual Question Answering, Asking and Assessment
Henry Hengyuan Zhao
Pan Zhou
Difei Gao
Zechen Bai
Mike Zheng Shou
82
8
0
21 Feb 2025
Using Large Language Models for education managements in Vietnamese with low resources
Duc Do Minh
Vinh Nguyen Van
Thang Dam Cong
43
0
0
28 Jan 2025
Cross-modal Context Fusion and Adaptive Graph Convolutional Network for Multimodal Conversational Emotion Recognition
Junwei Feng
Xueyan Fan
147
0
0
28 Jan 2025
Combining Knowledge Graph and LLMs for Enhanced Zero-shot Visual Question Answering
Qian Tao
Xiaoyang Fan
Yong Xu
Xingquan Zhu
Yufei Tang
50
0
0
22 Jan 2025
The Quest for Visual Understanding: A Journey Through the Evolution of Visual Question Answering
Anupam Pandey
Deepjyoti Bodo
Arpan Phukan
Asif Ekbal
38
0
0
13 Jan 2025
SAFE-MEME: Structured Reasoning Framework for Robust Hate Speech Detection in Memes
Palash Nandi
Shivam Sharma
Tanmoy Chakraborty
36
1
0
31 Dec 2024
A Review of Multimodal Explainable Artificial Intelligence: Past, Present and Future
Shilin Sun
Wenbin An
Feng Tian
Fang Nan
Qidong Liu
Xiaozhong Liu
N. Shah
Ping Chen
96
2
0
18 Dec 2024
A Comprehensive Survey on Visual Question Answering Datasets and Algorithms
Raihan Kabir
Naznin Haque
Md. Saiful Islam
Marium-E. Jannat
CoGe
29
1
0
17 Nov 2024
SaSR-Net: Source-Aware Semantic Representation Network for Enhancing Audio-Visual Question Answering
Tianyu Yang
Yiyang Nan
Lisen Dai
Zhenwen Liang
Yapeng Tian
Xuzhi Zhang
39
0
0
07 Nov 2024
Goal-Oriented Semantic Communication for Wireless Visual Question Answering
Sige Liu
Nan Li
Yansha Deng
Tony Q. S. Quek
34
0
0
03 Nov 2024
Efficient Bilinear Attention-based Fusion for Medical Visual Question Answering
Zhilin Zhang
Jie Wang
Zhanghao Qin
Ruiqi Zhu
Xiaoliang Gong
MedIm
48
0
0
28 Oct 2024
Visual Text Matters: Improving Text-KVQA with Visual Text Entity Knowledge-aware Large Multimodal Assistant
A. S. Penamakuri
Anand Mishra
26
1
0
24 Oct 2024
ViConsFormer: Constituting Meaningful Phrases of Scene Texts using Transformer-based Method in Vietnamese Text-based Visual Question Answering
Nghia Hieu Nguyen
Tho Thanh Quan
Ngan Luu-Thuy Nguyen
31
0
0
18 Oct 2024
Online Multi-modal Root Cause Analysis
Lecheng Zheng
Zhengzhang Chen
Haifeng Chen
Jingrui He
29
0
0
13 Oct 2024
A Social Context-aware Graph-based Multimodal Attentive Learning Framework for Disaster Content Classification during Emergencies
Shahid Shafi Dar
Mohammad Zia Ur Rehman
Karan Bais
Mohammed Abdul Haseeb
Nagendra Kumara
36
10
0
11 Oct 2024
FODA-PG for Enhanced Medical Imaging Narrative Generation: Adaptive Differentiation of Normal and Abnormal Attributes
Kai Shu
Yuzhuo Jia
Ziyang Zhang
Jiechao Gao
MedIm
32
0
0
06 Sep 2024
Ensemble Predicate Decoding for Unbiased Scene Graph Generation
Jiasong Feng
Lichun Wang
Hongbo Xu
Kai Xu
Baocai Yin
42
0
0
26 Aug 2024
A Survey on Integrated Sensing, Communication, and Computation
Dingzhu Wen
Yong Zhou
Xiaoyang Li
Yuanming Shi
Kaibin Huang
Khaled B. Letaief
37
24
0
15 Aug 2024
Modelling Visual Semantics via Image Captioning to extract Enhanced Multi-Level Cross-Modal Semantic Incongruity Representation with Attention for Multimodal Sarcasm Detection
Sajal Aggarwal
Ananya Pandey
Dinesh Kumar Vishwakarma
43
1
0
05 Aug 2024
Advancing Vietnamese Visual Question Answering with Transformer and Convolutional Integration
Ngoc Son Nguyen
Van Son Nguyen
Tung Le
ViT
43
0
0
30 Jul 2024
Boosting Audio Visual Question Answering via Key Semantic-Aware Cues
Guangyao Li
Henghui Du
Di Hu
26
4
0
30 Jul 2024
UniMEL: A Unified Framework for Multimodal Entity Linking with Large Language Models
Liu Qi
Yongyi He
Lian Defu
Zhi Zheng
Tong Xu
Liu Che
Chen Enhong
MLLM
33
1
0
23 Jul 2024
Attention Beats Linear for Fast Implicit Neural Representation Generation
Shuyi Zhang
Ke Liu
Jingjun Gu
Xiaoxu Cai
Zhihua Wang
Jiajun Bu
Haishuai Wang
48
2
0
22 Jul 2024
DIM: Dynamic Integration of Multimodal Entity Linking with Large Language Model
Shezheng Song
Shasha Li
Jie Yu
Shan Zhao
Xiaopeng Li
Jun Ma
Xiaodong Liu
Zhuo Li
Xiaoguang Mao
26
1
0
27 Jun 2024
SRC-Net: Bi-Temporal Spatial Relationship Concerned Network for Change Detection
Hongjia Chen
Xin Xu
Fangling Pu
51
6
0
09 Jun 2024
Enhancing Multimodal Large Language Models with Multi-instance Visual Prompt Generator for Visual Representation Enrichment
Wenliang Zhong
Wenyi Wu
Qi Li
Rob Barton
Boxin Du
Shioulin Sam
Karim Bouyarmane
Ismail B. Tutar
Junzhou Huang
33
3
0
05 Jun 2024
Two Tales of Persona in LLMs: A Survey of Role-Playing and Personalization
Yu-Min Tseng
Yu-Chao Huang
Teng-Yun Hsiao
Yu-Ching Hsu
Chao-Wei Huang
Jia-Yin Foo
Yun-Nung Chen
LLMAG
259
68
0
03 Jun 2024
Prompt-Aware Adapter: Towards Learning Adaptive Visual Tokens for Multimodal Large Language Models
Yue Zhang
Hehe Fan
Yi Yang
53
3
0
24 May 2024
MemeMQA: Multimodal Question Answering for Memes via Rationale-Based Inferencing
Siddhant Agarwal
Shivam Sharma
Preslav Nakov
Tanmoy Chakraborty
24
4
0
18 May 2024
RGB Guided ToF Imaging System: A Survey of Deep Learning-based Methods
Xin Qiao
Matteo Poggi
Pengchao Deng
Hao Wei
Chenyang Ge
S. Mattoccia
34
3
0
16 May 2024
Spatial Semantic Recurrent Mining for Referring Image Segmentation
Jiaxing Yang
Lihe Zhang
Jiayu Sun
Huchuan Lu
29
0
0
15 May 2024
ViOCRVQA: Novel Benchmark Dataset and Vision Reader for Visual Question Answering by Understanding Vietnamese Text in Images
Huy Quang Pham
Thang Kien-Bao Nguyen
Quan Van Nguyen
Dan Quang Tran
Nghia Hieu Nguyen
Kiet Van Nguyen
Ngan Luu-Thuy Nguyen
33
3
0
29 Apr 2024
Exploring Diverse Methods in Visual Question Answering
Panfeng Li
Qikai Yang
Xieming Geng
Wenjing Zhou
Zhicheng Ding
Yi Nian
42
54
0
21 Apr 2024
Look, Listen, and Answer: Overcoming Biases for Audio-Visual Question Answering
Jie Ma
Min Hu
Pinghui Wang
Wangchun Sun
Lingyun Song
Hongbin Pei
Jun Liu
Youtian Du
35
4
0
18 Apr 2024
Deep Learning for Video-Based Assessment of Endotracheal Intubation Skills
Jean-Paul Ainam
Erim Yanik
Rahul Rahul
Taylor Kunkes
Lora Cavuoto
Brian Clemency
Kaori Tanaka
Matthew Hackett
Jack Norfleet
S. De
21
0
0
17 Apr 2024
DWE+: Dual-Way Matching Enhanced Framework for Multimodal Entity Linking
Shezheng Song
Shasha Li
Shan Zhao
Xiaopeng Li
Chengyu Wang
Jie Yu
Jun Ma
Tianwei Yan
Bing Ji
Xiaoguang Mao
23
0
0
07 Apr 2024
Enhancing Efficiency in Vision Transformer Networks: Design Techniques and Insights
Moein Heidari
Reza Azad
Sina Ghorbani Kolahi
René Arimond
Leon Niggemeier
...
Afshin Bozorgpour
Ehsan Khodapanah Aghdam
A. Kazerouni
I. Hacihaliloglu
Dorit Merhof
51
7
0
28 Mar 2024
Centered Masking for Language-Image Pre-Training
Mingliang Liang
Martha Larson
VLM
CLIP
33
4
0
23 Mar 2024
Answering Diverse Questions via Text Attached with Key Audio-Visual Clues
Qilang Ye
Zitong Yu
Xin Liu
38
1
0
11 Mar 2024
Multi-modal Semantic Understanding with Contrastive Cross-modal Feature Alignment
Minghua Zhang
Ke Chang
Yunfang Wu
32
1
0
11 Mar 2024
CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenarios
Qilang Ye
Zitong Yu
Rui Shao
Xinyu Xie
Philip H. S. Torr
Xiaochun Cao
MLLM
47
24
0
07 Mar 2024
Modeling Collaborator: Enabling Subjective Vision Classification With Minimal Human Effort via LLM Tool-Use
Imad Eddine Toubal
Aditya Avinash
N. Alldrin
Jan Dlabal
Wenlei Zhou
...
Chun-Ta Lu
Howard Zhou
Ranjay Krishna
Ariel Fuxman
Tom Duerig
VLM
75
7
0
05 Mar 2024
MLIP: Enhancing Medical Visual Representation with Divergence Encoder and Knowledge-guided Contrastive Learning
Zhe Li
Laurence T. Yang
Bocheng Ren
Xin Nie
Zhangyang Gao
Cheng Tan
Stan Z. Li
VLM
15
12
0
03 Feb 2024
Free Form Medical Visual Question Answering in Radiology
Abhishek Narayanan
Rushabh Musthyala
Rahul Sankar
A. Nistala
P. Singh
Jacopo Cirrone
15
2
0
23 Jan 2024
MAST: Video Polyp Segmentation with a Mixture-Attention Siamese Transformer
Geng Chen
Junqing Yang
Xiaozhou Pu
Ge-Peng Ji
Huan Xiong
Yongsheng Pan
Hengfei Cui
Yong-quan Xia
MedIm
ViT
51
2
0
23 Jan 2024
Survey of Natural Language Processing for Education: Taxonomy, Systematic Review, and Future Trends
Yunshi Lan
Xinyuan Li
Hanyue Du
Xuesong Lu
Ming Gao
Weining Qian
Aoying Zhou
40
2
0
15 Jan 2024
Towards More Faithful Natural Language Explanation Using Multi-Level Contrastive Learning in VQA
Chengen Lai
Shengli Song
Shiqi Meng
Jingyang Li
Sitong Yan
Guangneng Hu
23
5
0
21 Dec 2023