ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2004.08744
  4. Cited By
Are we pretraining it right? Digging deeper into visio-linguistic
  pretraining

Are we pretraining it right? Digging deeper into visio-linguistic pretraining

19 April 2020
Amanpreet Singh
Vedanuj Goswami
Devi Parikh
    VLM
ArXivPDFHTML

Papers citing "Are we pretraining it right? Digging deeper into visio-linguistic pretraining"

19 / 19 papers shown
Title
Training Vision-Language Models with Less Bimodal Supervision
Training Vision-Language Models with Less Bimodal Supervision
Elad Segal
Ben Bogin
Jonathan Berant
VLM
21
2
0
01 Nov 2022
Winoground: Probing Vision and Language Models for Visio-Linguistic
  Compositionality
Winoground: Probing Vision and Language Models for Visio-Linguistic Compositionality
Tristan Thrush
Ryan Jiang
Max Bartolo
Amanpreet Singh
Adina Williams
Douwe Kiela
Candace Ross
CoGe
34
404
0
07 Apr 2022
Self-Training Vision Language BERTs with a Unified Conditional Model
Self-Training Vision Language BERTs with a Unified Conditional Model
Xiaofeng Yang
Fengmao Lv
Fayao Liu
Guosheng Lin
SSL
VLM
54
13
0
06 Jan 2022
Hateful Memes Challenge: An Enhanced Multimodal Framework
Hateful Memes Challenge: An Enhanced Multimodal Framework
Aijing Gao
Bingjun Wang
Jiaqi Yin
Yating Tian
21
2
0
20 Dec 2021
FLAVA: A Foundational Language And Vision Alignment Model
FLAVA: A Foundational Language And Vision Alignment Model
Amanpreet Singh
Ronghang Hu
Vedanuj Goswami
Guillaume Couairon
Wojciech Galuba
Marcus Rohrbach
Douwe Kiela
CLIP
VLM
40
690
0
08 Dec 2021
Exceeding the Limits of Visual-Linguistic Multi-Task Learning
Exceeding the Limits of Visual-Linguistic Multi-Task Learning
Cameron R. Wolfe
Keld T. Lundgaard
VLM
45
2
0
27 Jul 2021
Probing Image-Language Transformers for Verb Understanding
Probing Image-Language Transformers for Verb Understanding
Lisa Anne Hendricks
Aida Nematzadeh
30
114
0
16 Jun 2021
Recent Advances in Deep Learning Based Dialogue Systems: A Systematic
  Survey
Recent Advances in Deep Learning Based Dialogue Systems: A Systematic Survey
Jinjie Ni
Tom Young
Vlad Pandelea
Fuzhao Xue
Min Zhang
54
268
0
10 May 2021
InfographicVQA
InfographicVQA
Minesh Mathew
Viraj Bagal
Rubèn Pérez Tito
Dimosthenis Karatzas
Ernest Valveny
C. V. Jawahar
42
206
0
26 Apr 2021
Cross-Modal Retrieval Augmentation for Multi-Modal Classification
Cross-Modal Retrieval Augmentation for Multi-Modal Classification
Shir Gur
Natalia Neverova
C. Stauffer
Ser-Nam Lim
Douwe Kiela
A. Reiter
19
26
0
16 Apr 2021
Decoupling the Role of Data, Attention, and Losses in Multimodal
  Transformers
Decoupling the Role of Data, Attention, and Losses in Multimodal Transformers
Lisa Anne Hendricks
John F. J. Mellor
R. Schneider
Jean-Baptiste Alayrac
Aida Nematzadeh
79
110
0
31 Jan 2021
The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes
The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes
Douwe Kiela
Hamed Firooz
Aravind Mohan
Vedanuj Goswami
Amanpreet Singh
Pratik Ringshia
Davide Testuggine
45
580
0
10 May 2020
Unified Vision-Language Pre-Training for Image Captioning and VQA
Unified Vision-Language Pre-Training for Image Captioning and VQA
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
252
927
0
24 Sep 2019
Supervised Multimodal Bitransformers for Classifying Images and Text
Supervised Multimodal Bitransformers for Classifying Images and Text
Douwe Kiela
Suvrat Bhooshan
Hamed Firooz
Ethan Perez
Davide Testuggine
59
242
0
06 Sep 2019
Boosting Self-Supervised Learning via Knowledge Transfer
Boosting Self-Supervised Learning via Knowledge Transfer
M. Noroozi
Ananth Vinjimoor
Paolo Favaro
Hamed Pirsiavash
SSL
224
292
0
01 May 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language
  Understanding
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
299
6,996
0
20 Apr 2018
Neural Baby Talk
Neural Baby Talk
Jiasen Lu
Jianwei Yang
Dhruv Batra
Devi Parikh
VLM
200
434
0
27 Mar 2018
Aggregated Residual Transformations for Deep Neural Networks
Aggregated Residual Transformations for Deep Neural Networks
Saining Xie
Ross B. Girshick
Piotr Dollár
Zhuowen Tu
Kaiming He
303
10,233
0
16 Nov 2016
Multimodal Compact Bilinear Pooling for Visual Question Answering and
  Visual Grounding
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Akira Fukui
Dong Huk Park
Daylen Yang
Anna Rohrbach
Trevor Darrell
Marcus Rohrbach
167
1,465
0
06 Jun 2016
1