ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2201.03965
  4. Cited By
On the Efficacy of Co-Attention Transformer Layers in Visual Question
  Answering

On the Efficacy of Co-Attention Transformer Layers in Visual Question Answering

11 January 2022
Ankur Sikarwar
Gabriel Kreiman
    ViT
ArXivPDFHTML

Papers citing "On the Efficacy of Co-Attention Transformer Layers in Visual Question Answering"

3 / 3 papers shown
Title
BAN-Cap: A Multi-Purpose English-Bangla Image Descriptions Dataset
BAN-Cap: A Multi-Purpose English-Bangla Image Descriptions Dataset
Mohammad Faiyaz Khan
S. M. S. Shifath
Md. Saiful Islam
16
6
0
28 May 2022
Multi-modal Transformer for Video Retrieval
Multi-modal Transformer for Video Retrieval
Valentin Gabeur
Chen Sun
Alahari Karteek
Cordelia Schmid
ViT
424
596
0
21 Jul 2020
Unified Vision-Language Pre-Training for Image Captioning and VQA
Unified Vision-Language Pre-Training for Image Captioning and VQA
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
252
927
0
24 Sep 2019
1