2207.09666
GRIT: Faster and Better Image captioning Transformer Using Dual Visual Features
20 July 2022
Van-Quang Nguyen
Masanori Suganuma
Takayuki Okatani
ViT
Papers citing "GRIT: Faster and Better Image captioning Transformer Using Dual Visual Features"
7 / 57 papers shown
Title
Deep Learning Approaches on Image Captioning: A Review
Taraneh Ghandi
H. Pourreza
H. Mahyar
VLM
16
89
0
31 Jan 2022
ViDT: An Efficient and Effective Fully Transformer-based Object Detector
Hwanjun Song
Deqing Sun
Sanghyuk Chun
Varun Jampani
Dongyoon Han
Byeongho Heo
Wonjae Kim
Ming-Hsuan Yang
87
76
0
08 Oct 2021
Improving Image Captioning by Leveraging Intra- and Inter-layer Global Representation in Transformer Network
Jiayi Ji
Yunpeng Luo
Xiaoshuai Sun
Fuhai Chen
Gen Luo
Yongjian Wu
Yue Gao
Rongrong Ji
ViT
46
170
0
13 Dec 2020
Normalized and Geometry-Aware Self-Attention Network for Image Captioning
Longteng Guo
Jing Liu
Xinxin Zhu
Peng Yao
Shichen Lu
Hanqing Lu
ViT
120
189
0
19 Mar 2020
Unified Vision-Language Pre-Training for Image Captioning and VQA
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
252
927
0
24 Sep 2019
Neural Baby Talk
Jiasen Lu
Jianwei Yang
Dhruv Batra
Devi Parikh
VLM
200
434
0
27 Mar 2018
Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning
Jiasen Lu
Caiming Xiong
Devi Parikh
R. Socher
85
1,442
0
06 Dec 2016