1908.02265
Cited By
ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks
6 August 2019
Jiasen Lu
Dhruv Batra
Devi Parikh
Stefan Lee
SSL
VLM
Papers citing
"ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks"
50 / 2,119 papers shown
Title
Large-scale Multi-Modal Pre-trained Models: A Comprehensive Survey
Tianlin Li
Guangyao Chen
Guangwu Qian
Pengcheng Gao
Xiaoyong Wei
Yaowei Wang
Yonghong Tian
Wen Gao
AI4CE
VLM
198
216
0
20 Feb 2023
Knowledge-aware Bayesian Co-attention for Multimodal Emotion Recognition
Zihan Zhao
Yu Wang
Yanfeng Wang
75
18
0
20 Feb 2023
Interactive Video Corpus Moment Retrieval using Reinforcement Learning
Zhixin Ma
Chong-Wah Ngo
73
3
0
19 Feb 2023
Hyneter: Hybrid Network Transformer for Object Detection
Dong Chen
Duoqian Miao
Xuepeng Zhao
ViT
80
4
0
18 Feb 2023
Transformadores: Fundamentos teoricos y Aplicaciones
J. D. L. Torre
175
0
0
18 Feb 2023
Towards Unifying Medical Vision-and-Language Pre-training via Soft Prompts
Zhihong Chen
Shizhe Diao
Benyou Wang
Guanbin Li
Xiang Wan
MedIm
129
33
0
17 Feb 2023
Multimodal Propaganda Processing
Vincent Ng
Shengjie Li
107
2
0
17 Feb 2023
MINOTAUR: Multi-task Video Grounding From Multimodal Queries
Raghav Goyal
E. Mavroudi
Xitong Yang
Sainbayar Sukhbaatar
Leonid Sigal
Matt Feiszli
Lorenzo Torresani
Du Tran
95
7
0
16 Feb 2023
GraphPrompt: Unifying Pre-Training and Downstream Tasks for Graph Neural Networks
Zemin Liu
Xingtong Yu
Yuan Fang
Xinming Zhang
LLMAG
AI4CE
102
148
0
16 Feb 2023
PolyFormer: Referring Image Segmentation as Sequential Polygon Generation
Jiang Liu
Hui Ding
Zhaowei Cai
Yuting Zhang
R. Satzoda
Vijay Mahadevan
R. Manmatha
ObjD
128
133
0
14 Feb 2023
Multi-modal Machine Learning in Engineering Design: A Review and Future Directions
Binyang Song
Ruilin Zhou
Faez Ahmed
AI4CE
146
47
0
14 Feb 2023
Large Scale Multi-Lingual Multi-Modal Summarization Dataset
Yash Verma
Anubhav Jangra
Raghvendra Kumar
S. Saha
40
14
0
13 Feb 2023
Understanding Multimodal Contrastive Learning and Incorporating Unpaired Data
Ryumei Nakada
Halil Ibrahim Gulluk
Zhun Deng
Wenlong Ji
James Zou
Linjun Zhang
SSL
VLM
110
41
0
13 Feb 2023
Actional Atomic-Concept Learning for Demystifying Vision-Language Navigation
Bingqian Lin
Yi Zhu
Xiaodan Liang
Liang Lin
Jian-zhuo Liu
CoGe
LM&Ro
110
3
0
13 Feb 2023
Unified Vision-Language Representation Modeling for E-Commerce Same-Style Products Retrieval
Ben Chen
Linbo Jin
Xinxin Wang
D. Gao
Wen Jiang
Wei Ning
72
3
0
10 Feb 2023
Learning by Asking for Embodied Visual Navigation and Task Completion
Ying Shen
Ismini Lourentzou
86
1
0
09 Feb 2023
Prompting for Multimodal Hateful Meme Classification
Rui Cao
Roy Ka-wei Lee
Wen-Haw Chong
Jing Jiang
VLM
87
83
0
08 Feb 2023
Pic2Word: Mapping Pictures to Words for Zero-shot Composed Image Retrieval
Kuniaki Saito
Kihyuk Sohn
Xiang Zhang
Chun-Liang Li
Chen-Yu Lee
Kate Saenko
Tomas Pfister
127
124
0
06 Feb 2023
Learning to Agree on Vision Attention for Visual Commonsense Reasoning
Zhenyang Li
Yangyang Guo
Ke-Jyun Wang
Fan Liu
Liqiang Nie
Mohan S. Kankanhalli
95
10
0
04 Feb 2023
Controlling for Stereotypes in Multimodal Language Model Evaluation
Manuj Malik
Richard Johansson
148
1
0
03 Feb 2023
Self-Supervised Relation Alignment for Scene Graph Generation
Bicheng Xu
Renjie Liao
Leonid Sigal
80
0
0
02 Feb 2023
Language Quantized AutoEncoders: Towards Unsupervised Text-Image Alignment
Hao Liu
Wilson Yan
Pieter Abbeel
99
25
0
02 Feb 2023
CLIPood: Generalizing CLIP to Out-of-Distributions
Yang Shu
Xingzhuo Guo
Jialong Wu
Ximei Wang
Jianmin Wang
Mingsheng Long
OODD
VLM
164
80
0
02 Feb 2023
mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video
Haiyang Xu
Qinghao Ye
Mingshi Yan
Yaya Shi
Jiabo Ye
...
Guohai Xu
Ji Zhang
Songfang Huang
Feiran Huang
Jingren Zhou
MLLM
VLM
MoE
123
171
0
01 Feb 2023
Multimodality Representation Learning: A Survey on Evolution, Pretraining and Its Applications
Muhammad Arslan Manzoor
S. Albarri
Ziting Xian
Zaiqiao Meng
Preslav Nakov
Shangsong Liang
AI4TS
107
32
0
01 Feb 2023
Efficient Scopeformer: Towards Scalable and Rich Feature Extraction for Intracranial Hemorrhage Detection
Yassine Barhoumi
N. Bouaynaya
Ghulam Rasool
MedIm
48
5
0
01 Feb 2023
Grounding Language Models to Images for Multimodal Inputs and Outputs
Jing Yu Koh
Ruslan Salakhutdinov
Daniel Fried
MLLM
150
123
0
31 Jan 2023
UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers
Dachuan Shi
Chaofan Tao
Ying Jin
Zhendong Yang
Chun Yuan
Jiaqi Wang
VLM
ViT
133
39
0
31 Jan 2023
RREx-BoT: Remote Referring Expressions with a Bag of Tricks
Gunnar Sigurdsson
Jesse Thomason
Gaurav Sukhatme
Robinson Piramuthu
LM&Ro
116
9
0
30 Jan 2023
Debiased Fine-Tuning for Vision-language Models by Prompt Regularization
Beier Zhu
Yulei Niu
Saeil Lee
Minhoe Hur
Hanwang Zhang
VLM
VPVLM
130
24
0
29 Jan 2023
Learning the Effects of Physical Actions in a Multi-modal Environment
Gautier Dagan
Frank Keller
A. Lascarides
LM&Ro
94
4
0
27 Jan 2023
Semi-Parametric Video-Grounded Text Generation
Sungdong Kim
Jin-Hwa Kim
Jiyoung Lee
Minjoon Seo
VGen
80
14
0
27 Jan 2023
Characterizing the Entities in Harmful Memes: Who is the Hero, the Villain, the Victim?
Shivam Sharma
Atharva Kulkarni
Tharun Suresh
Himanshi Mathur
Preslav Nakov
Md. Shad Akhtar
Tanmoy Chakraborty
113
17
0
26 Jan 2023
Lexi: Self-Supervised Learning of the UI Language
Pratyay Banerjee
Shweti Mahajan
Kushal Arora
Chitta Baral
Oriana Riva
68
17
0
23 Jan 2023
Learning Open-vocabulary Semantic Segmentation Models From Natural Language Supervision
Jilan Xu
Junlin Hou
Yuejie Zhang
Rui Feng
Yi Wang
Yu Qiao
Weidi Xie
VLM
88
87
0
22 Jan 2023
A Survey of research in Deep Learning for Robotics for Undergraduate research interns
P. NarayananP.
Palacode Narayana Iyer Anantharaman
50
1
0
19 Jan 2023
Masked Autoencoding Does Not Help Natural Language Supervision at Scale
Floris Weers
Vaishaal Shankar
Angelos Katharopoulos
Yinfei Yang
Tom Gunter
CLIP
59
5
0
19 Jan 2023
Effective End-to-End Vision Language Pretraining with Semantic Visual Loss
Xiaofeng Yang
Fayao Liu
Guosheng Lin
VLM
49
8
0
18 Jan 2023
Curriculum Script Distillation for Multilingual Visual Question Answering
Khyathi Chandu
A. Geramifard
71
0
0
17 Jan 2023
USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text Retrieval
Yan Zhang
Zhong Ji
Dingrong Wang
Yanwei Pang
Xuelong Li
VLM
66
23
0
17 Jan 2023
Text to Point Cloud Localization with Relation-Enhanced Transformer
Guangzhi Wang
Hehe Fan
Mohan S. Kankanhalli
3DPC
86
15
0
13 Jan 2023
See, Think, Confirm: Interactive Prompting Between Vision and Language Models for Knowledge-based Visual Reasoning
Zhenfang Chen
Qinhong Zhou
Songlin Yang
Yining Hong
Hao Zhang
Chuang Gan
LRM
VLM
118
41
0
12 Jan 2023
Toward Building General Foundation Models for Language, Vision, and Vision-Language Understanding Tasks
Xinsong Zhang
Yan Zeng
Jipeng Zhang
Hang Li
VLM
AI4CE
LRM
128
17
0
12 Jan 2023
Multimodal Inverse Cloze Task for Knowledge-based Visual Question Answering
Paul Lerner
O. Ferret
C. Guinaudeau
84
9
0
11 Jan 2023
Transformers as Policies for Variable Action Environments
Niklas Zwingenberger
37
2
0
09 Jan 2023
Learning Bidirectional Action-Language Translation with Limited Supervision and Incongruent Input
Ozan Ozdemir
Matthias Kerzel
C. Weber
Jae Hee Lee
Muhammad Burhan Hafez
P. Bruns
S. Wermter
70
1
0
09 Jan 2023
Universal Multimodal Representation for Language Understanding
Zhuosheng Zhang
Kehai Chen
Rui Wang
Masao Utiyama
Eiichiro Sumita
Z. Li
Hai Zhao
SSL
109
22
0
09 Jan 2023
MAQA: A Multimodal QA Benchmark for Negation
Judith Yue Li
Aren Jansen
Qingqing Huang
Joonseok Lee
Ravi Ganti
Dima Kuzmin
81
5
0
09 Jan 2023
Transferring Pre-trained Multimodal Representations with Cross-modal Similarity Matching
Byoungjip Kim
Sun Choi
Dasol Hwang
Moontae Lee
Honglak Lee
76
11
0
07 Jan 2023
Learning Trajectory-Word Alignments for Video-Language Tasks
Xu Yang
Zhang Li
Haiyang Xu
Hanwang Zhang
Qinghao Ye
Chenliang Li
Ming Yan
Yu Zhang
Fei Huang
Songfang Huang
95
7
0
05 Jan 2023