Finding beans in burgers: Deep semantic-visual embedding with localization

5 April 2018

Papers citing "Finding beans in burgers: Deep semantic-visual embedding with localization"


Title | Authors | Tags | Counts | Date
GOOD: Towards Domain Generalized Orientated Object Detection | Qi Bi, Beichen Zhou, Jingjun Yi, Wei Ji, Haolan Zhan, Gui-Song Xia | ObjD, OOD | 85 2 0 | 20 Feb 2024
Unified Medical Image Pre-training in Language-Guided Common Semantic Space | Xiaoxuan He, Yifan Yang, Xinyang Jiang, Xufang Luo, Haoji Hu, Siyun Zhao, Dongsheng Li, Yuqing Yang, Lili Qiu | - | 48 1 0 | 24 Nov 2023
Object-Centric Open-Vocabulary Image-Retrieval with Aggregated Features | Hila Levi, Guy Heller, Dan Levi, Ethan Fetaya | OCL, VLM | 29 3 0 | 26 Sep 2023
A Joint Study of Phrase Grounding and Task Performance in Vision and Language Models | Noriyuki Kojima, Hadar Averbuch-Elor, Yoav Artzi | - | 34 2 0 | 06 Sep 2023
Multi-Granularity Cross-modal Alignment for Generalized Medical Visual Representation Learning | Fuying Wang, Yuyin Zhou, Shujun Wang, V. Vardhanabhuti, Lequan Yu | - | 34 137 0 | 12 Oct 2022
Embedding Arithmetic of Multimodal Queries for Image Retrieval | Guillaume Couairon, Matthieu Cord, Matthijs Douze, Holger Schwenk | - | 37 23 0 | 06 Dec 2021
Contrastive Learning of Visual-Semantic Embeddings | Anurag Jain, Yashaswi Verma | SSL | 33 1 0 | 17 Oct 2021
Structured Multi-modal Feature Embedding and Alignment for Image-Sentence Retrieval | Xuri Ge, Fuhai Chen, J. Jose, Zhilong Ji, Zhongqin Wu, Xiao-Chang Liu | - | 34 55 0 | 05 Aug 2021
Probabilistic Embeddings for Cross-Modal Retrieval | Sanghyuk Chun, Seong Joon Oh, Rafael Sampaio de Rezende, Yannis Kalantidis, Diane Larlus | UQCV | 415 203 0 | 13 Jan 2021
Semantics for Robotic Mapping, Perception and Interaction: A Survey | Sourav Garg, Niko Sünderhauf, Feras Dayoub, D. Morrison, Akansel Cosgun, ..., Tat-Jun Chin, Ian Reid, Stephen Gould, Peter Corke, Michael Milford | - | 26 115 0 | 02 Jan 2021
Cosine meets Softmax: A tough-to-beat baseline for visual grounding | N. Rufus, U. R. Nair, K. M. Krishna, Vineet Gandhi | - | 27 13 0 | 13 Sep 2020
Retrieving and Highlighting Action with Spatiotemporal Reference | Seito Kasai, Yuchi Ishikawa, Masaki Hayashi, Y. Aoki, Kensho Hara, Hirokatsu Kataoka | - | 11 0 0 | 19 May 2020
Sketch Less for More: On-the-Fly Fine-Grained Sketch Based Image Retrieval | A. Bhunia, Yongxin Yang, Timothy M. Hospedales, Tao Xiang, Yi-Zhe Song | - | 35 103 0 | 24 Feb 2020
Probing Contextualized Sentence Representations with Visual Awareness | ZhuoSheng Zhang, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Hai Zhao | - | 14 2 0 | 07 Nov 2019
Target-Oriented Deformation of Visual-Semantic Embedding Space | Takashi Matsubara | - | 26 7 0 | 15 Oct 2019
Align2Ground: Weakly Supervised Phrase Grounding Guided by Image-Caption Alignment | Samyak Datta, Karan Sikka, Anirban Roy, Karuna Ahuja, Devi Parikh, Ajay Divakaran | - | 19 103 0 | 27 Mar 2019
Image search using multilingual texts: a cross-modal learning approach between image and text | Maxime Portaz, Hicham Randrianarivo, A. Nivaggioli, Estelle Maudet, Christophe Servan, Sylvain Peyronnet | - | 26 12 0 | 27 Mar 2019
MUREL: Multimodal Relational Reasoning for Visual Question Answering | Rémi Cadène, H. Ben-younes, Matthieu Cord, Nicolas Thome | LRM | 19 271 0 | 25 Feb 2019
Engaging Image Captioning Via Personality | Kurt Shuster, Samuel Humeau, Hexiang Hu, Antoine Bordes, Jason Weston | - | 37 149 0 | 25 Oct 2018
Recipe1M+: A Dataset for Learning Cross-Modal Embeddings for Cooking Recipes and Food Images | Javier Marín, Aritro Biswas, Ferda Ofli, Nick Hynes, Amaia Salvador, Y. Aytar, Ingmar Weber, Antonio Torralba | - | 16 320 0 | 14 Oct 2018
Convolutional Neural Networks for Sentence Classification | Yoon Kim | AILaw, VLM | 309 13,373 0 | 25 Aug 2014