Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2212.01803
Cited By
Controllable Image Captioning via Prompting
4 December 2022
Ning Wang
Jiahao Xie
Jihao Wu
Mingbo Jia
Linlin Li
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Controllable Image Captioning via Prompting"
19 / 19 papers shown
Title
Modeling Thousands of Human Annotators for Generalizable Text-to-Image Person Re-identification
Jiayu Jiang
Changxing Ding
Wentao Tan
Junhong Wang
Jin Tao
Xiangmin Xu
51
1
0
13 Mar 2025
Continual Panoptic Perception: Towards Multi-modal Incremental Interpretation of Remote Sensing Images
Bo Yuan
Danpei Zhao
Zhuoran Liu
Wentao Li
Tian Li
CLL
VLM
30
2
0
19 Jul 2024
Controllable Contextualized Image Captioning: Directing the Visual Narrative through User-Defined Highlights
Shunqi Mao
Chaoyi Zhang
Hang Su
Hwanjun Song
Igor Shalyminov
Weidong Cai
30
1
0
16 Jul 2024
Prompt Learning for Generalized Vehicle Routing
Fei Liu
Xi Lin
Weiduo Liao
Zhenkun Wang
Qingfu Zhang
Xialiang Tong
Mingxuan Yuan
VLM
42
3
0
20 May 2024
HistGen: Histopathology Report Generation via Local-Global Feature Encoding and Cross-modal Context Interaction
Zhengrui Guo
Jiabo Ma
Ying Xu
Yihui Wang
Liansheng Wang
Hao Chen
50
17
0
08 Mar 2024
Open-Vocabulary Calibration for Fine-tuned CLIP
Shuoyuan Wang
Jindong Wang
Guoqing Wang
Bob Zhang
Kaiyang Zhou
Hongxin Wei
VLM
31
5
0
07 Feb 2024
Can Prompt Learning Benefit Radiology Report Generation?
Jun Wang
Lixing Zhu
A. Bhalerao
Yulan He
MedIm
36
2
0
30 Aug 2023
Caption Anything: Interactive Image Description with Diverse Multimodal Controls
Teng Wang
Jinrui Zhang
Junjie Fei
Hao Zheng
Yunlong Tang
Zhe Li
Mingqi Gao
Shanshan Zhao
MLLM
102
82
0
04 May 2023
Learning Combinatorial Prompts for Universal Controllable Image Captioning
Zhen Wang
Jun Xiao
Yueting Zhuang
Fei Gao
Jian Shao
Long Chen
54
5
0
11 Mar 2023
DEVICE: Depth and Visual Concepts Aware Transformer for OCR-based Image Captioning
Dongsheng Xu
Qingbao Huang
Shuang Feng
Yiru Cai
Feng Shuang
Yi Cai
ViT
VLM
27
1
0
03 Feb 2023
OSIC: A New One-Stage Image Captioner Coined
Bo Wang
Zhao Zhang
Ming Zhao
Xiaojie Jin
Mingliang Xu
Meng Wang
VLM
23
3
0
04 Nov 2022
PreSTU: Pre-Training for Scene-Text Understanding
Jihyung Kil
Soravit Changpinyo
Xi Chen
Hexiang Hu
Sebastian Goodman
Wei-Lun Chao
Radu Soricut
VLM
135
29
0
12 Sep 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
392
4,137
0
28 Jan 2022
Learning to Prompt for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VPVLM
CLIP
VLM
330
2,267
0
02 Sep 2021
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
280
3,848
0
18 Apr 2021
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo
P. Sharma
Nan Ding
Radu Soricut
VLM
278
1,082
0
17 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
298
3,700
0
11 Feb 2021
Improving Image Captioning by Leveraging Intra- and Inter-layer Global Representation in Transformer Network
Jiayi Ji
Yunpeng Luo
Xiaoshuai Sun
Fuhai Chen
Gen Luo
Yongjian Wu
Yue Gao
Rongrong Ji
ViT
43
170
0
13 Dec 2020
Unified Vision-Language Pre-Training for Image Captioning and VQA
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
252
927
0
24 Sep 2019
1