2305.18010
Test-Time Adaptation with CLIP Reward for Zero-Shot Generalization in Vision-Language Models
29 May 2023
Shuai Zhao
Xiaohan Wang
Linchao Zhu
Yezhou Yang
VLM
Papers citing
"Test-Time Adaptation with CLIP Reward for Zero-Shot Generalization in Vision-Language Models"
24 / 24 papers shown
Search-TTA: A Multimodal Test-Time Adaptation Framework for Visual Search in the Wild
Derek Ming Siang Tan
Shailesh
Boyang Liu
Alok Raj
Qi Xuan Ang
...
Tanishq Duhan
Jimmy Chiun
Yuhong Cao
Florian Shkurti
Guillaume Sartoretti
22
0
0
16 May 2025
Mitigating Image Captioning Hallucinations in Vision-Language Models
Fei Zhao
Chengcui Zhang
Runlin Zhang
Tianyang Wang
Xi Li
VLM
44
0
0
06 May 2025
Learning from Reference Answers: Versatile Language Model Alignment without Binary Human Preference Data
Shuai Zhao
Linchao Zhu
Yi Yang
39
2
0
14 Apr 2025
Noise is an Efficient Learner for Zero-Shot Vision-Language Models
Raza Imam
Asif Hanif
Jian Zhang
Khaled Waleed Dawoud
Yova Kementchedjhieva
Mohammad Yaqub
VLM
58
0
0
09 Feb 2025
Historical Test-time Prompt Tuning for Vision Foundation Models
Jingyi Zhang
Jiaxing Huang
Xiaoqin Zhang
Ling Shao
Shijian Lu
VLM
37
4
0
27 Oct 2024
Is Less More? Exploring Token Condensation as Training-free Test-time Adaptation
Zixin Wang
Dong Gong
Sen Wang
Zi Huang
Yadan Luo
VLM
34
0
0
16 Oct 2024
LatteCLIP: Unsupervised CLIP Fine-Tuning via LMM-Synthetic Texts
Anh-Quan Cao
M. Jaritz
Matthieu Guillaumin
Raoul de Charette
Loris Bazzani
VLM
CLIP
52
2
0
10 Oct 2024
Frustratingly Easy Test-Time Adaptation of Vision-Language Models
Matteo Farina
Gianni Franchi
Giovanni Iacca
Massimiliano Mancini
Elisa Ricci
VLM
45
5
0
28 May 2024
Just Shift It: Test-Time Prototype Shifting for Zero-Shot Generalization with Vision-Language Models
Elaine Sui
Xiaohan Wang
Serena Yeung-Levy
VLM
30
5
0
19 Mar 2024
AnySkill: Learning Open-Vocabulary Physical Skill for Interactive Agents
Jieming Cui
Tengyu Liu
Nian Liu
Yaodong Yang
Yixin Zhu
Siyuan Huang
59
21
0
19 Mar 2024
AutoCLIP: Auto-tuning Zero-Shot Classifiers for Vision-Language Models
Sanghwan Kim
Hao Tang
Fisher Yu
VLM
CLIP
21
4
0
28 Sep 2023
Spectrum-guided Multi-granularity Referring Video Object Segmentation
Bo Miao
Bennamoun
Yongsheng Gao
Ajmal Mian
VOS
42
34
0
25 Jul 2023
CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model
Shuai Zhao
Xiaohan Wang
Linchao Zhu
Yezhou Yang
CLIP
VLM
23
25
0
23 May 2023
Reinforcement Learning with Knowledge Representation and Reasoning: A Brief Survey
Chao Yu
Xuejing Zheng
H. Zhuo
OffRL
LRM
55
7
0
24 Apr 2023
A Comprehensive Survey on Test-Time Adaptation under Distribution Shifts
Jian Liang
Ran He
Tien-Ping Tan
OOD
VLM
TTA
38
205
0
27 Mar 2023
DeCap: Decoding CLIP Latents for Zero-Shot Captioning via Text-Only Training
Wei Li
Linchao Zhu
Longyin Wen
Yi Yang
VLM
45
86
0
06 Mar 2023
Text-Only Training for Image Captioning using Noise-Injected CLIP
David Nukrai
Ron Mokady
Amir Globerson
VLM
CLIP
63
94
0
01 Nov 2022
Improving alignment of dialogue agents via targeted human judgements
Amelia Glaese
Nat McAleese
Maja Trębacz
John Aslanides
Vlad Firoiu
...
John F. J. Mellor
Demis Hassabis
Koray Kavukcuoglu
Lisa Anne Hendricks
G. Irving
ALM
AAML
227
506
0
28 Sep 2022
Test-Time Prompt Tuning for Zero-Shot Generalization in Vision-Language Models
Manli Shu
Weili Nie
De-An Huang
Zhiding Yu
Tom Goldstein
Anima Anandkumar
Chaowei Xiao
VLM
VPVLM
186
286
0
15 Sep 2022
Fine-grained Image Captioning with CLIP Reward
Jaemin Cho
Seunghyun Yoon
Ajinkya Kale
Franck Dernoncourt
Trung Bui
Joey Tianyi Zhou
CLIP
131
76
0
26 May 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
333
12,003
0
04 Mar 2022
Learning to Prompt for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VPVLM
CLIP
VLM
348
2,271
0
02 Sep 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
322
3,708
0
11 Feb 2021
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
292
1,595
0
18 Sep 2019