Connecting What to Say With Where to Look by Modeling Human Attention Traces

12 May 2021

Babak Damavandi

Papers citing "Connecting What to Say With Where to Look by Modeling Human Attention Traces"

Title | Authors | Tags | Counts | Date
A look under the hood of the Interactive Deep Learning Enterprise (No-IDLE) | Daniel Sonntag, Michael Barz, Thiago S. Gouvêa | VLM | 52 4 0 | 27 Jun 2024
Who are you referring to? Coreference resolution in image narrations | A. Goel, Basura Fernando, Frank Keller, Hakan Bilen | - | 27 3 0 | 26 Nov 2022
Object-Centric Unsupervised Image Captioning | Zihang Meng, David Yang, Xuefei Cao, Ashish Shah, Ser-Nam Lim | OCL, VLM | 21 11 0 | 02 Dec 2021
From Show to Tell: A Survey on Deep Learning-based Image Captioning | Matteo Stefanini, Marcella Cornia, Lorenzo Baraldi, S. Cascianelli, G. Fiameni, Rita Cucchiara | 3DV, VLM, MLLM | 67 256 0 | 14 Jul 2021
Unified Vision-Language Pre-Training for Image Captioning and VQA | Luowei Zhou, Hamid Palangi, Lei Zhang, Houdong Hu, Jason J. Corso, Jianfeng Gao | MLLM, VLM | 252 927 0 | 24 Sep 2019