2106.13033
A Transformer-based Cross-modal Fusion Model with Adversarial Training for VQA Challenge 2021
24 June 2021
Keda Lu
Bo Fang
Kuan-Yu Chen
ViT
Papers citing
"A Transformer-based Cross-modal Fusion Model with Adversarial Training for VQA Challenge 2021"
2 / 2 papers shown
Title
VinVL: Revisiting Visual Representations in Vision-Language Models
Pengchuan Zhang
Xiujun Li
Xiaowei Hu
Jianwei Yang
Lei Zhang
Lijuan Wang
Yejin Choi
Jianfeng Gao
ObjD
VLM
260
157
0
02 Jan 2021
Unified Vision-Language Pre-Training for Image Captioning and VQA
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
252
927
0
24 Sep 2019