ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2312.01564
  4. Cited By
APoLLo: Unified Adapter and Prompt Learning for Vision Language Models

APoLLo: Unified Adapter and Prompt Learning for Vision Language Models

4 December 2023
Sanjoy Chowdhury
Sayan Nag
Dinesh Manocha
    VLM
ArXiv (abs)PDFHTML

Papers citing "APoLLo: Unified Adapter and Prompt Learning for Vision Language Models"

34 / 34 papers shown
Title
Aurelia: Test-time Reasoning Distillation in Audio-Visual LLMs
Aurelia: Test-time Reasoning Distillation in Audio-Visual LLMs
Sanjoy Chowdhury
Hanan Gani
Nishit Anand
Sayan Nag
Ruohan Gao
Mohamed Elhoseiny
Salman Khan
Dinesh Manocha
LRM
168
1
0
29 Mar 2025
Consolidator: Mergeable Adapter with Grouped Connections for Visual
  Adaptation
Consolidator: Mergeable Adapter with Grouped Connections for Visual Adaptation
Tianxiang Hao
Hui Chen
Yuchen Guo
Guiguang Ding
128
16
0
30 Apr 2023
Synthetic Data from Diffusion Models Improves ImageNet Classification
Synthetic Data from Diffusion Models Improves ImageNet Classification
Shekoofeh Azizi
Simon Kornblith
Chitwan Saharia
Mohammad Norouzi
David J. Fleet
VLMDiffM
112
315
0
17 Apr 2023
Effective Data Augmentation With Diffusion Models
Effective Data Augmentation With Diffusion Models
Brandon Trabucco
Kyle Doherty
Max Gurinas
Ruslan Salakhutdinov
DiffMVLM
94
256
0
07 Feb 2023
Position-guided Text Prompt for Vision-Language Pre-training
Position-guided Text Prompt for Vision-Language Pre-training
Alex Jinpeng Wang
Pan Zhou
Mike Zheng Shou
Shuicheng Yan
VLM
52
38
0
19 Dec 2022
Uni-Perceiver v2: A Generalist Model for Large-Scale Vision and
  Vision-Language Tasks
Uni-Perceiver v2: A Generalist Model for Large-Scale Vision and Vision-Language Tasks
Hao Li
Jinguo Zhu
Xiaohu Jiang
Xizhou Zhu
Hongsheng Li
...
Xiaohua Wang
Yu Qiao
Xiaogang Wang
Wenhai Wang
Jifeng Dai
MLLM
74
57
0
17 Nov 2022
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BigScience Workshop
:
Teven Le Scao
Angela Fan
Christopher Akiki
...
Zhongli Xie
Zifan Ye
M. Bras
Younes Belkada
Thomas Wolf
VLM
411
2,393
0
09 Nov 2022
Bridging the Gap between Object and Image-level Representations for
  Open-Vocabulary Detection
Bridging the Gap between Object and Image-level Representations for Open-Vocabulary Detection
H. Rasheed
Muhammad Maaz
Muhammad Uzair Khattak
Salman Khan
Fahad Shahbaz Khan
ObjDVLM
109
154
0
07 Jul 2022
Open-Vocabulary DETR with Conditional Matching
Open-Vocabulary DETR with Conditional Matching
Yuhang Zang
Wei Li
Kaiyang Zhou
Chen Huang
Chen Change Loy
ObjDVLM
133
205
0
22 Mar 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified
  Vision-Language Understanding and Generation
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
Guosheng Lin
MLLMBDLVLMCLIP
555
4,413
0
28 Jan 2022
Detecting Twenty-thousand Classes using Image-level Supervision
Detecting Twenty-thousand Classes using Image-level Supervision
Xingyi Zhou
Rohit Girdhar
Armand Joulin
Phillip Krahenbuhl
Ishan Misra
CLIPVLM
113
618
0
07 Jan 2022
High-Resolution Image Synthesis with Latent Diffusion Models
High-Resolution Image Synthesis with Latent Diffusion Models
Robin Rombach
A. Blattmann
Dominik Lorenz
Patrick Esser
Bjorn Ommer
3DV
502
15,788
0
20 Dec 2021
VL-Adapter: Parameter-Efficient Transfer Learning for
  Vision-and-Language Tasks
VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks
Yi-Lin Sung
Jaemin Cho
Joey Tianyi Zhou
VLMVPVLM
112
356
0
13 Dec 2021
UniTAB: Unifying Text and Box Outputs for Grounded Vision-Language
  Modeling
UniTAB: Unifying Text and Box Outputs for Grounded Vision-Language Modeling
Zhengyuan Yang
Zhe Gan
Jianfeng Wang
Xiaowei Hu
Faisal Ahmed
Zicheng Liu
Yumao Lu
Lijuan Wang
115
116
0
23 Nov 2021
Florence: A New Foundation Model for Computer Vision
Florence: A New Foundation Model for Computer Vision
Lu Yuan
Dongdong Chen
Yi-Ling Chen
Noel Codella
Xiyang Dai
...
Zhen Xiao
Jianwei Yang
Michael Zeng
Luowei Zhou
Pengchuan Zhang
VLM
141
908
0
22 Nov 2021
Tip-Adapter: Training-free CLIP-Adapter for Better Vision-Language
  Modeling
Tip-Adapter: Training-free CLIP-Adapter for Better Vision-Language Modeling
Renrui Zhang
Rongyao Fang
Wei Zhang
Peng Gao
Kunchang Li
Jifeng Dai
Yu Qiao
Hongsheng Li
VLM
284
402
0
06 Nov 2021
CLIP-Adapter: Better Vision-Language Models with Feature Adapters
CLIP-Adapter: Better Vision-Language Models with Feature Adapters
Peng Gao
Shijie Geng
Renrui Zhang
Teli Ma
Rongyao Fang
Yongfeng Zhang
Hongsheng Li
Yu Qiao
VLMCLIP
324
1,050
0
09 Oct 2021
The Power of Scale for Parameter-Efficient Prompt Tuning
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
592
4,093
0
18 Apr 2021
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng Zhang
Stephen Lin
B. Guo
ViT
467
21,603
0
25 Mar 2021
Barlow Twins: Self-Supervised Learning via Redundancy Reduction
Barlow Twins: Self-Supervised Learning via Redundancy Reduction
Jure Zbontar
Li Jing
Ishan Misra
Yann LeCun
Stéphane Deny
SSL
347
2,366
0
04 Mar 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy
  Text Supervision
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLMCLIP
463
3,901
0
11 Feb 2021
Prefix-Tuning: Optimizing Continuous Prompts for Generation
Prefix-Tuning: Optimizing Continuous Prompts for Generation
Xiang Lisa Li
Percy Liang
252
4,305
0
01 Jan 2021
CO2: Consistent Contrast for Unsupervised Visual Representation Learning
CO2: Consistent Contrast for Unsupervised Visual Representation Learning
Chen Wei
Huiyu Wang
Wei Shen
Alan Yuille
SSL
83
63
0
05 Oct 2020
The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution
  Generalization
The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization
Dan Hendrycks
Steven Basart
Norman Mu
Saurav Kadavath
Frank Wang
...
Samyak Parajuli
Mike Guo
Basel Alomair
Jacob Steinhardt
Justin Gilmer
OOD
363
1,757
0
29 Jun 2020
A Simple Framework for Contrastive Learning of Visual Representations
A Simple Framework for Contrastive Learning of Visual Representations
Ting-Li Chen
Simon Kornblith
Mohammad Norouzi
Geoffrey E. Hinton
SSL
390
18,897
0
13 Feb 2020
Exploring the Limits of Transfer Learning with a Unified Text-to-Text
  Transformer
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
490
20,342
0
23 Oct 2019
RoBERTa: A Robustly Optimized BERT Pretraining Approach
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
AIMat
692
24,557
0
26 Jul 2019
Learning Robust Global Representations by Penalizing Local Predictive
  Power
Learning Robust Global Representations by Penalizing Local Predictive Power
Haohan Wang
Songwei Ge
Eric Xing
Zachary Chase Lipton
OOD
122
967
0
29 May 2019
Do ImageNet Classifiers Generalize to ImageNet?
Do ImageNet Classifiers Generalize to ImageNet?
Benjamin Recht
Rebecca Roelofs
Ludwig Schmidt
Vaishaal Shankar
OODSSegVLM
121
1,728
0
13 Feb 2019
EDA: Easy Data Augmentation Techniques for Boosting Performance on Text
  Classification Tasks
EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks
Jason W. Wei
Kai Zou
119
1,963
0
31 Jan 2019
Learning multiple visual domains with residual adapters
Learning multiple visual domains with residual adapters
Sylvestre-Alvise Rebuffi
Hakan Bilen
Andrea Vedaldi
OOD
176
939
0
22 May 2017
Incremental Learning Through Deep Adaptation
Incremental Learning Through Deep Adaptation
Amir Rosenfeld
John K. Tsotsos
CLL
76
278
0
11 May 2017
Fine-Grained Visual Classification of Aircraft
Fine-Grained Visual Classification of Aircraft
Subhransu Maji
Esa Rahtu
Arno Solin
Matthew Blaschko
Andrea Vedaldi
126
2,272
0
21 Jun 2013
UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild
UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild
K. Soomro
Amir Zamir
M. Shah
CLIPVGen
163
6,170
0
03 Dec 2012
1