Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2306.07490
Cited By
v1
v2
v3 (latest)
Top-Down Framework for Weakly-supervised Grounded Image Captioning
13 June 2023
Chen Cai
Suchen Wang
Kim-Hui Yap
Yi Wang
ObjD
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Top-Down Framework for Weakly-supervised Grounded Image Captioning"
23 / 23 papers shown
Title
Prophet Attention: Predicting Attention with Future Attention for Image Captioning
Fenglin Liu
Xuancheng Ren
Xian Wu
Wei Fan
Yuexian Zou
Xu Sun
65
47
0
19 Oct 2022
Improving Visual Grounding with Visual-Linguistic Verification and Iterative Reasoning
Li Yang
Yan Xu
Chunfen Yuan
Wei Liu
Bing Li
Weiming Hu
ObjD
68
117
0
30 Apr 2022
Injecting Semantic Concepts into End-to-End Image Captioning
Zhiyuan Fang
Jianfeng Wang
Xiaowei Hu
Lin Liang
Zhe Gan
Lijuan Wang
Yezhou Yang
Zicheng Liu
ViT
VLM
73
88
0
09 Dec 2021
Distributed Attention for Grounded Image Captioning
Nenglun Chen
Xingjia Pan
Runnan Chen
Lei Yang
Zhiwen Lin
Yuqiang Ren
Haolei Yuan
Xiaowei Guo
Feiyue Huang
Wenping Wang
55
21
0
02 Aug 2021
TransVG: End-to-End Visual Grounding with Transformers
Jiajun Deng
Zhengyuan Yang
Tianlang Chen
Wen-gang Zhou
Houqiang Li
ViT
74
345
0
17 Apr 2021
Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation
Liwei Wang
Jing-ling Huang
Yin Li
Kun Xu
Zhengyuan Yang
Dong Yu
ObjD
65
83
0
03 Jul 2020
More Grounded Image Captioning by Distilling Image-Text Matching Model
Yuanen Zhou
Meng Wang
Daqing Liu
Zhenzhen Hu
Hanwang Zhang
65
126
0
01 Apr 2020
Meshed-Memory Transformer for Image Captioning
Marcella Cornia
Matteo Stefanini
Lorenzo Baraldi
Rita Cucchiara
78
884
0
17 Dec 2019
Attention on Attention for Image Captioning
Lun Huang
Wenmin Wang
Jie Chen
Xiao-Yong Wei
72
832
0
19 Aug 2019
Image Captioning: Transforming Objects into Words
Simão Herdade
Armin Kappeler
K. Boakye
Joao Soares
ViT
130
470
0
14 Jun 2019
Multimodal Transformer with Multi-View Visual Representation for Image Captioning
Jun-chen Yu
Jing Li
Zhou Yu
Qingming Huang
ViT
63
384
0
20 May 2019
Deep Learning for Generic Object Detection: A Survey
Li Liu
Wanli Ouyang
Xiaogang Wang
Paul Fieguth
Jie Chen
Xinwang Liu
M. Pietikäinen
ObjD
VLM
OOD
174
2,458
0
06 Sep 2018
Recurrent Fusion Network for Image Captioning
Wenhao Jiang
Lin Ma
Yu-Gang Jiang
Wen Liu
Tong Zhang
ObjD
62
235
0
26 Jul 2018
Self-produced Guidance for Weakly-supervised Object Localization
Xiaolin Zhang
Yunchao Wei
Guoliang Kang
Yi Yang
Thomas Huang
WSOL
111
253
0
24 Jul 2018
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
Peter Anderson
Xiaodong He
Chris Buehler
Damien Teney
Mark Johnson
Stephen Gould
Lei Zhang
AIMat
123
4,221
0
25 Jul 2017
Language Modeling with Gated Convolutional Networks
Yann N. Dauphin
Angela Fan
Michael Auli
David Grangier
242
2,404
0
23 Dec 2016
Self-critical Sequence Training for Image Captioning
Steven J. Rennie
E. Marcheret
Youssef Mroueh
Jerret Ross
Vaibhava Goel
109
1,890
0
02 Dec 2016
SPICE: Semantic Propositional Image Caption Evaluation
Peter Anderson
Basura Fernando
Mark Johnson
Stephen Gould
EGVM
108
1,919
0
29 Jul 2016
Learning Deep Features for Discriminative Localization
Bolei Zhou
A. Khosla
Àgata Lapedriza
A. Oliva
Antonio Torralba
SSL
SSeg
FAtt
253
9,338
0
14 Dec 2015
Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models
Bryan A. Plummer
Liwei Wang
Christopher M. Cervantes
Juan C. Caicedo
Julia Hockenmaier
Svetlana Lazebnik
208
2,072
0
19 May 2015
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
Ke Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
DiffM
350
10,079
0
10 Feb 2015
Deep Visual-Semantic Alignments for Generating Image Descriptions
A. Karpathy
Li Fei-Fei
152
5,591
0
07 Dec 2014
CIDEr: Consensus-based Image Description Evaluation
Ramakrishna Vedantam
C. L. Zitnick
Devi Parikh
297
4,508
0
20 Nov 2014
1