2105.05964
Cited By
Connecting What to Say With Where to Look by Modeling Human Attention Traces
12 May 2021
Zihang Meng
Licheng Yu
Ning Zhang
Tamara L. Berg
Babak Damavandi
Vikas Singh
Amy Bearman
Papers citing "Connecting What to Say With Where to Look by Modeling Human Attention Traces"
5 / 5 papers shown
Title
A look under the hood of the Interactive Deep Learning Enterprise (No-IDLE)
Daniel Sonntag
Michael Barz
Thiago S. Gouvêa
VLM
52
4
0
27 Jun 2024
Who are you referring to? Coreference resolution in image narrations
A. Goel
Basura Fernando
Frank Keller
Hakan Bilen
27
3
0
26 Nov 2022
Object-Centric Unsupervised Image Captioning
Zihang Meng
David Yang
Xuefei Cao
Ashish Shah
Ser-Nam Lim
OCL
VLM
21
11
0
02 Dec 2021
From Show to Tell: A Survey on Deep Learning-based Image Captioning
Matteo Stefanini
Marcella Cornia
Lorenzo Baraldi
S. Cascianelli
G. Fiameni
Rita Cucchiara
3DV
VLM
MLLM
67
256
0
14 Jul 2021
Unified Vision-Language Pre-Training for Image Captioning and VQA
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
252
927
0
24 Sep 2019