Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1912.02315
Cited By
v1
v2 (latest)
12-in-1: Multi-Task Vision and Language Representation Learning
5 December 2019
Jiasen Lu
Vedanuj Goswami
Marcus Rohrbach
Devi Parikh
Stefan Lee
VLM
ObjD
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"12-in-1: Multi-Task Vision and Language Representation Learning"
5 / 55 papers shown
Title
Visual7W: Grounded Question Answering in Images
Yuke Zhu
Oliver Groth
Michael S. Bernstein
Li Fei-Fei
91
884
0
11 Nov 2015
Generation and Comprehension of Unambiguous Object Descriptions
Junhua Mao
Jonathan Huang
Alexander Toshev
Oana-Maria Camburu
Alan Yuille
Kevin Patrick Murphy
ObjD
126
1,345
0
07 Nov 2015
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Shaoqing Ren
Kaiming He
Ross B. Girshick
Jian Sun
AIMat
ObjD
520
62,294
0
04 Jun 2015
Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models
Bryan A. Plummer
Liwei Wang
Christopher M. Cervantes
Juan C. Caicedo
Julia Hockenmaier
Svetlana Lazebnik
199
2,060
0
19 May 2015
Microsoft COCO Captions: Data Collection and Evaluation Server
Xinlei Chen
Hao Fang
Nayeon Lee
Ramakrishna Vedantam
Saurabh Gupta
Piotr Dollar
C. L. Zitnick
215
2,478
0
01 Apr 2015
Previous
1
2