2312.17109
Cited By
MIVC: Multiple Instance Visual Component for Visual-Language Models
28 December 2023
Wenyi Wu
Qi Li
Leon Wenliang Zhong
Junzhou Huang
Papers citing "MIVC: Multiple Instance Visual Component for Visual-Language Models"
17 / 17 papers shown
InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning
Wenliang Dai
Junnan Li
Dongxu Li
A. M. H. Tiong
Junqi Zhao
Weisheng Wang
Boyang Albert Li
Pascale Fung
Steven C. H. Hoi
MLLM
VLM
119
2,067
0
11 May 2023
Scaling Instruction-Finetuned Language Models
Hyung Won Chung
Le Hou
Shayne Longpre
Barret Zoph
Yi Tay
...
Jacob Devlin
Adam Roberts
Denny Zhou
Quoc V. Le
Jason W. Wei
ReLM
LRM
192
3,128
0
20 Oct 2022
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Pan Lu
Swaroop Mishra
Tony Xia
Liang Qiu
Kai-Wei Chang
Song-Chun Zhu
Oyvind Tafjord
Peter Clark
Ashwin Kalyan
ELM
ReLM
LRM
278
1,286
0
20 Sep 2022
Flamingo: a Visual Language Model for Few-Shot Learning
Jean-Baptiste Alayrac
Jeff Donahue
Pauline Luc
Antoine Miech
Iain Barr
...
Mikolaj Binkowski
Ricardo Barreira
Oriol Vinyals
Andrew Zisserman
Karen Simonyan
MLLM
VLM
392
3,542
0
29 Apr 2022
DTFD-MIL: Double-Tier Feature Distillation Multiple Instance Learning for Histopathology Whole Slide Image Classification
Hongrun Zhang
Y. Meng
Yitian Zhao
Yihong Qiao
Xiaoyun Yang
Sarah E Coupland
Yalin Zheng
89
290
0
22 Mar 2022
Vision-Language Pre-Training with Triple Contrastive Learning
Jinyu Yang
Jiali Duan
Son N. Tran
Yi Xu
Sampath Chanda
Liqun Chen
Belinda Zeng
Trishul Chilimbi
Junzhou Huang
VLM
108
295
0
21 Feb 2022
ABO: Dataset and Benchmarks for Real-World 3D Object Understanding
Jasmine Collins
Shubham Goel
Kenan Deng
Achleshwar Luthra
Leon L. Xu
...
T. F. Y. Vicente
T. Dideriksen
H. Arora
M. Guillaumin
Jitendra Malik
213
229
0
12 Oct 2021
TransMIL: Transformer based Correlated Multiple Instance Learning for Whole Slide Image Classification
Zhucheng Shao
Hao Bian
Yang Chen
Yifeng Wang
Jian Zhang
Xiangyang Ji
Yongbing Zhang
ViT
MedIm
95
674
0
02 Jun 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
415
4,953
0
24 Feb 2021
Dual-stream Multiple Instance Learning Network for Whole Slide Image Classification with Self-supervised Contrastive Learning
Bin Li
Yin Li
K. Eliceiri
82
618
0
17 Nov 2020
Contrastive Learning for Weakly Supervised Phrase Grounding
Tanmay Gupta
Arash Vahdat
Gal Chechik
Xiaodong Yang
Jan Kautz
Derek Hoiem
ObjD
SSL
119
142
0
17 Jun 2020
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
445
20,181
0
23 Oct 2019
Image Captioning
Vikram Mullachery
Vishal Motwani
49
56
0
13 May 2018
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering
Yash Goyal
Tejas Khot
D. Summers-Stay
Dhruv Batra
Devi Parikh
CoGe
345
3,246
0
02 Dec 2016
Image Captioning with Semantic Attention
Quanzeng You
Hailin Jin
Zhaowen Wang
Chen Fang
Jiebo Luo
VLM
171
1,662
0
12 Mar 2016
Yin and Yang: Balancing and Answering Binary Visual Questions
Peng Zhang
Yash Goyal
D. Summers-Stay
Dhruv Batra
Devi Parikh
CoGe
87
352
0
16 Nov 2015
VQA: Visual Question Answering
Aishwarya Agrawal
Jiasen Lu
Stanislaw Antol
Margaret Mitchell
C. L. Zitnick
Dhruv Batra
Devi Parikh
CoGe
211
5,478
0
03 May 2015