arXiv 2302.14794 · Cited By
Meta Learning to Bridge Vision and Language Models for Multimodal Few-Shot Learning
Ivona Najdenkoska, Xiantong Zhen, M. Worring
28 February 2023 · VLM
Papers citing "Meta Learning to Bridge Vision and Language Models for Multimodal Few-Shot Learning" (18 of 18 papers shown)
Few-shot target-driven instance detection based on open-vocabulary object detection models
Ben Crulis, Barthélémy Serres, Cyril de Runz, Gilles Venturini
21 Oct 2024 · VLM, ObjD
FastEdit: Fast Text-Guided Single-Image Editing via Semantic-Aware Diffusion Fine-Tuning
Zhi Chen, Zecheng Zhao, Yadan Luo, Zi Huang
06 Aug 2024 · DiffM
One-Stage Open-Vocabulary Temporal Action Detection Leveraging Temporal Multi-scale and Action Label Features
Trung Thanh Nguyen, Yasutomo Kawanishi, Takahiro Komamizu, Ichiro Ide
30 Apr 2024 · VLM
LuoJiaHOG: A Hierarchy Oriented Geo-aware Image Caption Dataset for Remote Sensing Image-Text Retrival
Yuanxin Zhao, Mi Zhang, Bingnan Yang, Zhan Zhang, Jiaju Kang, Jianya Gong
16 Mar 2024
Task Attribute Distance for Few-Shot Learning: Theoretical Analysis and Applications
Minyang Hu, Hong Chang, Zong Guo, Bingpeng Ma, Shiguang Shan, Xilin Chen
06 Mar 2024 · VLM
X-InstructBLIP: A Framework for aligning X-Modal instruction-aware representations to LLMs and Emergent Cross-modal Reasoning
Artemis Panagopoulou, Le Xue, Ning Yu, Junnan Li, Dongxu Li, Shafiq R. Joty, Ran Xu, Silvio Savarese, Caiming Xiong, Juan Carlos Niebles
30 Nov 2023 · VLM, MLLM
Self-Supervised Open-Ended Classification with Small Visual Language Models
Mohammad Mahdi Derakhshani, Ivona Najdenkoska, Cees G. M. Snoek, M. Worring, Yuki M. Asano
30 Sep 2023 · VLM
Jointly Training Large Autoregressive Multimodal Models
Emanuele Aiello, L. Yu, Yixin Nie, Armen Aghajanyan, Barlas Oğuz
27 Sep 2023
MMICL: Empowering Vision-language Model with Multi-Modal In-Context Learning
Haozhe Zhao, Zefan Cai, Shuzheng Si, Xiaojian Ma, Kaikai An, Liang Chen, Zixuan Liu, Sheng Wang, Wenjuan Han, Baobao Chang
14 Sep 2023 · MLLM, VLM
Mobile Foundation Model as Firmware
Jinliang Yuan, Chenchen Yang, Dongqi Cai, Shihe Wang, Xin Yuan, ..., Di Zhang, Hanzi Mei, Xianqing Jia, Shangguang Wang, Mengwei Xu
28 Aug 2023
Cross-Modal Concept Learning and Inference for Vision-Language Models
Yi Zhang, Ce Zhang, Yushun Tang, Z. He
28 Jul 2023 · VLM, MLLM, CLIP
RSGPT: A Remote Sensing Vision Language Model and Benchmark
Yuan Hu, Jianlong Yuan, Congcong Wen, Xiaonan Lu, Xiang Li
28 Jul 2023 · VLM
Open-Ended Medical Visual Question Answering Through Prefix Tuning of Language Models
Tom van Sonsbeek, Mohammad Mahdi Derakhshani, Ivona Najdenkoska, Cees G. M. Snoek, M. Worring
10 Mar 2023 · LM&MA
How Much Can CLIP Benefit Vision-and-Language Tasks?
Sheng Shen, Liunian Harold Li, Hao Tan, Joey Tianyi Zhou, Anna Rohrbach, Kai-Wei Chang, Z. Yao, Kurt Keutzer
13 Jul 2021 · CLIP, VLM, MLLM
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester, Rami Al-Rfou, Noah Constant
18 Apr 2021 · VPVLM
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu H. Pham, Quoc V. Le, Yun-hsuan Sung, Zhen Li, Tom Duerig
11 Feb 2021 · VLM, CLIP
Rapid Learning or Feature Reuse? Towards Understanding the Effectiveness of MAML
Aniruddh Raghu, M. Raghu, Samy Bengio, Oriol Vinyals
19 Sep 2019
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
Chelsea Finn, Pieter Abbeel, Sergey Levine
09 Mar 2017 · OOD