arXiv: 2302.06098
Towards Local Visual Modeling for Image Captioning
13 February 2023
Yiwei Ma
Jiayi Ji
Xiaoshuai Sun
Yiyi Zhou
Rongrong Ji
ViT
Papers citing "Towards Local Visual Modeling for Image Captioning"
20 / 20 papers shown
MLLM-Selector: Necessity and Diversity-driven High-Value Data Selection for Enhanced Visual Instruction Tuning
Yiwei Ma
Guohai Xu
Xiaoshuai Sun
Jiayi Ji
Jie Lou
Debing Zhang
Rongrong Ji
155
0
0
26 Mar 2025
TAR3D: Creating High-Quality 3D Assets via Next-Part Prediction
Xuying Zhang
Yutong Liu
Yangguang Li
Renrui Zhang
Yong Liu
...
Wanli Ouyang
Zhiwei Xiong
Peng Gao
Qibin Hou
Ming-Ming Cheng
168
3
0
13 Mar 2025
An Ensemble Model with Attention Based Mechanism for Image Captioning
Israa Al Badarneh
Bassam Hammo
Omar Al-Kadi
122
5
0
28 Jan 2025
Semantics Prompting Data-Free Quantization for Low-Bit Vision Transformers
Mingliang Xu
Yuyao Zhou
Yuxin Zhang
Shen Li
Yong Li
Chia-Wen Lin
Zhanpeng Zeng
Rongrong Ji
MQ
218
0
0
31 Dec 2024
Referring Camouflaged Object Detection
Xuying Zhang
Bo Yin
Zheng Lin
Qibin Hou
Deng-Ping Fan
Ming-Ming Cheng
99
17
0
13 Jun 2023
Diverse Branch Block: Building a Convolution as an Inception-like Unit
Xiaohan Ding
Xiangyu Zhang
Jungong Han
Guiguang Ding
AI4CE
54
282
0
24 Mar 2021
RepVGG: Making VGG-style ConvNets Great Again
Xiaohan Ding
Xiangyu Zhang
Ningning Ma
Jungong Han
Guiguang Ding
Jian Sun
239
1,574
0
11 Jan 2021
Meshed-Memory Transformer for Image Captioning
Marcella Cornia
Matteo Stefanini
Lorenzo Baraldi
Rita Cucchiara
51
873
0
17 Dec 2019
Attention on Attention for Image Captioning
Lun Huang
Wenmin Wang
Jie Chen
Xiao-Yong Wei
54
828
0
19 Aug 2019
Image Captioning: Transforming Objects into Words
Simão Herdade
Armin Kappeler
K. Boakye
Joao Soares
ViT
92
467
0
14 Jun 2019
Auto-Encoding Scene Graphs for Image Captioning
Xu Yang
Kaihua Tang
Hanwang Zhang
Jianfei Cai
129
698
0
06 Dec 2018
Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
DiffM
62
175
0
26 Nov 2018
Exploring Visual Relationship for Image Captioning
Ting Yao
Yingwei Pan
Yehao Li
Tao Mei
72
830
0
19 Sep 2018
Recurrent Fusion Network for Image Captioning
Wenhao Jiang
Lin Ma
Yu-Gang Jiang
Wen Liu
Tong Zhang
ObjD
56
234
0
26 Jul 2018
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
Peter Anderson
Xiaodong He
Chris Buehler
Damien Teney
Mark Johnson
Stephen Gould
Lei Zhang
AIMat
109
4,201
0
25 Jul 2017
Aggregated Residual Transformations for Deep Neural Networks
Saining Xie
Ross B. Girshick
Piotr Dollár
Zhuowen Tu
Kaiming He
463
10,281
0
16 Nov 2016
Boosting Image Captioning with Attributes
Ting Yao
Yingwei Pan
Yehao Li
Zhaofan Qiu
Tao Mei
VLM
80
621
0
05 Nov 2016
Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models
Bryan A. Plummer
Liwei Wang
Christopher M. Cervantes
Juan C. Caicedo
Julia Hockenmaier
Svetlana Lazebnik
185
2,033
0
19 May 2015
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
Kelvin Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
DiffM
298
10,034
0
10 Feb 2015
CIDEr: Consensus-based Image Description Evaluation
Ramakrishna Vedantam
C. L. Zitnick
Devi Parikh
244
4,451
0
20 Nov 2014