Generation and Comprehension of Unambiguous Object Descriptions

7 November 2015

Papers citing "Generation and Comprehension of Unambiguous Object Descriptions"

25 / 275 papers shown

| Title | Authors | Tags | | Date |
|---|---|---|---|---|
| Discriminability objective for training descriptive captions | Ruotian Luo, Brian L. Price, Scott D. Cohen, Gregory Shakhnarovich | | 30 202 0 | 12 Mar 2018 |
| Grounding Referring Expressions in Images by Variational Context | Hanwang Zhang, Yulei Niu, Shih-Fu Chang | BDL, ObjD | 21 219 0 | 05 Dec 2017 |
| Object Referring in Visual Scene with Spoken Language | A. Vasudevan, Dengxin Dai, Luc Van Gool | | 37 18 0 | 10 Nov 2017 |
| Semantic Image Retrieval via Active Grounding of Visual Situations | Max H. Quinn, E. Conser, Jordan M. Witte, Melanie Mitchell | | 16 9 0 | 31 Oct 2017 |
| Self-Guiding Multimodal LSTM - when we do not have a perfect training dataset for image captioning | Yang Xian, Yingli Tian | VLM | 25 22 0 | 15 Sep 2017 |
| Reasoning about Fine-grained Attribute Phrases using Reference Games | Jong-Chyi Su, Chenyun Wu, Huaizu Jiang, Subhransu Maji | | 34 16 0 | 29 Aug 2017 |
| VQS: Linking Segmentations to Questions and Answers for Supervised Attention in VQA and Question-Focused Semantic Segmentation | Chuang Gan, Yandong Li, Haoxiang Li, Chen Sun, Boqing Gong | | 27 126 0 | 15 Aug 2017 |
| Localizing Moments in Video with Natural Language | Lisa Anne Hendricks, Oliver Wang, Eli Shechtman, Josef Sivic, Trevor Darrell, Bryan C. Russell | | 55 927 0 | 04 Aug 2017 |
| Multimodal Machine Learning: A Survey and Taxonomy | T. Baltrušaitis, Chaitanya Ahuja, Louis-Philippe Morency | | 15 2,865 0 | 26 May 2017 |
| TALL: Temporal Activity Localization via Language Query | J. Gao, Chen Sun, Zhenheng Yang, Ram Nevatia | | 68 799 0 | 05 May 2017 |
| Show, Ask, Attend, and Answer: A Strong Baseline For Visual Question Answering | V. Kazemi, Ali Elqursh | OOD | 28 183 0 | 11 Apr 2017 |
| Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation | Albert Gatt, E. Krahmer | LM&MA, ELM | 27 810 0 | 29 Mar 2017 |
| Recurrent Multimodal Interaction for Referring Image Segmentation | Chenxi Liu, Zhe-nan Lin, Xiaohui Shen, Jimei Yang, Xin Lu, Alan Yuille | EgoV | 36 234 0 | 23 Mar 2017 |
| An End-to-End Approach to Natural Language Object Retrieval via Context-Aware Deep Reinforcement Learning | Fan Wu, Zhongwen Xu, Yi Yang | ObjD | 34 11 0 | 22 Mar 2017 |
| Comprehension-guided referring expressions | Ruotian Luo, Gregory Shakhnarovich | ObjD | 29 171 0 | 12 Jan 2017 |
| Context-aware Captions from Context-agnostic Supervision | Ramakrishna Vedantam, Samy Bengio, Kevin Patrick Murphy, Devi Parikh, Gal Chechik | | 22 152 0 | 11 Jan 2017 |
| A Joint Speaker-Listener-Reinforcer Model for Referring Expressions | Licheng Yu, Hao Tan, Joey Tianyi Zhou, Tamara L. Berg | ObjD | 46 273 0 | 30 Dec 2016 |
| Top-down Visual Saliency Guided by Captions | Vasili Ramanishka, Abir Das, Jianming Zhang, Kate Saenko | | 21 142 0 | 21 Dec 2016 |
| ImageNet pre-trained models with batch normalization | Marcel Simon, E. Rodner, Joachim Denzler | VLM, SSeg | 44 165 0 | 05 Dec 2016 |
| Modeling Relationships in Referential Expressions with Compositional Modular Networks | Ronghang Hu, Marcus Rohrbach, Jacob Andreas, Trevor Darrell, Kate Saenko | | 42 401 0 | 30 Nov 2016 |
| GuessWhat?! Visual object discovery through multi-modal dialogue | H. D. Vries, Florian Strub, A. Chandar, Olivier Pietquin, Hugo Larochelle, Aaron Courville | VLM | 32 426 0 | 23 Nov 2016 |
| Dense Captioning with Joint Inference and Visual Context | L. Yang, K. Tang, Jianchao Yang, Li-Jia Li | VLM | 30 169 0 | 21 Nov 2016 |
| Title Generation for User Generated Videos | Kuo-Hao Zeng, Tseng-Hung Chen, Juan Carlos Niebles, Min Sun | | 35 69 0 | 25 Aug 2016 |
| Modeling Context in Referring Expressions | Licheng Yu, Patrick Poirson, Shan Yang, Alexander C. Berg, Tamara L. Berg | | 30 1,227 0 | 31 Jul 2016 |
| Generating Visual Explanations | Lisa Anne Hendricks, Zeynep Akata, Marcus Rohrbach, Jeff Donahue, Bernt Schiele, Trevor Darrell | VLM, FAtt | 44 618 0 | 28 Mar 2016 |