Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval

29 September 2023

Lianli Gao

Papers citing "Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval"

31 / 31 papers shown

Title
Entropy Heat-Mapping: Localizing GPT-Based OCR Errors with Sliding-Window Shannon Analysis Alexei Kaltchenko 72 0 0 30 Apr 2025
Leveraging Modality Tags for Enhanced Cross-Modal Video Retrieval A. Fragomeni Dima Damen Michael Wray 93 0 0 02 Apr 2025
Narrating the Video: Boosting Text-Video Retrieval via Comprehensive Utilization of Frame-Level Captions Chan hur Jeong-hun Hong Dong-hun Lee Dabin Kang Semin Myeong Sang-hyo Park Hyeyoung Park 131 1 0 07 Mar 2025
Improved Probabilistic Image-Text Representations Sanghyuk Chun VLM 72 28 0 29 May 2023
Improving Cross-Modal Retrieval with Set of Diverse Embeddings Dongwon Kim Nam-Won Kim Suha Kwak 85 38 0 30 Nov 2022
MILES: Visual BERT Pre-training with Injected Language Semantics for Video-text Retrieval Yuying Ge Yixiao Ge Xihui Liu Alex Jinpeng Wang Jianping Wu Ying Shan Xiaohu Qie Ping Luo VLM 42 44 0 26 Apr 2022
X-Pool: Cross-Modal Language-Video Attention for Text-Video Retrieval S. Gorti Noël Vouitsis Junwei Ma Keyvan Golestan M. Volkovs Animesh Garg Guangwei Yu 71 160 0 28 Mar 2022
CLIP2Video: Mastering Video-Text Retrieval via Image CLIP Han Fang Pengfei Xiong Luhui Xu Yu Chen CLIP VLM 93 297 0 21 Jun 2021
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text Hassan Akbari Liangzhe Yuan Rui Qian Wei-Hong Chuang Shih-Fu Chang Huayu Chen Boqing Gong ViT 305 588 0 22 Apr 2021
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval Max Bain Arsha Nagrani Gül Varol Andrew Zisserman VGen 136 1,172 0 01 Apr 2021
MDMMT: Multidomain Multimodal Transformer for Video Retrieval Maksim Dzabraev M. Kalashnikov Stepan Alekseevich Komkov Aleksandr Petiushko 63 132 0 19 Mar 2021
Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling Jie Lei Linjie Li Luowei Zhou Zhe Gan Tamara L. Berg Joey Tianyi Zhou Jingjing Liu CLIP 116 661 0 11 Feb 2021
Trusted Multi-View Classification Zongbo Han Changqing Zhang Huazhu Fu Qiufeng Wang EDL 41 168 0 03 Feb 2021
Probabilistic Embeddings for Cross-Modal Retrieval Sanghyuk Chun Seong Joon Oh Rafael Sampaio de Rezende Yannis Kalantidis Diane Larlus UQCV 463 206 0 13 Jan 2021
Similarity Reasoning and Filtration for Image-Text Matching Haiwen Diao Ying Zhang Lingyun Ma Huchuan Lu 278 335 0 05 Jan 2021
Multi-modal Transformer for Video Retrieval Valentin Gabeur Chen Sun Alahari Karteek Cordelia Schmid ViT 523 608 0 21 Jul 2020
Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning Shizhe Chen Yida Zhao Qin Jin Qi Wu 82 314 0 01 Mar 2020
Use What You Have: Video Retrieval Using Representations From Collaborative Experts Yang Liu Samuel Albanie Arsha Nagrani Andrew Zisserman 71 389 0 31 Jul 2019
Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval Yale Song M. Soleymani 52 244 0 11 Jun 2019
Evidential Deep Learning to Quantify Classification Uncertainty Murat Sensoy Lance M. Kaplan M. Kandemir OOD UQCV EDL BDL 177 991 0 05 Jun 2018
AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks Tao Xu Pengchuan Zhang Qiuyuan Huang Han Zhang Zhe Gan Xiaolei Huang Xiaodong He GAN ViT 105 1,716 0 28 Nov 2017
Localizing Moments in Video with Natural Language Lisa Anne Hendricks Oliver Wang Eli Shechtman Josef Sivic Trevor Darrell Bryan C. Russell 110 946 0 04 Aug 2017
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering Peter Anderson Xiaodong He Chris Buehler Damien Teney Mark Johnson Stephen Gould Lei Zhang AIMat 117 4,214 0 25 Jul 2017
Variational Dropout Sparsifies Deep Neural Networks Dmitry Molchanov Arsenii Ashukha Dmitry Vetrov BDL 139 828 0 19 Jan 2017
SGDR: Stochastic Gradient Descent with Warm Restarts I. Loshchilov Frank Hutter ODL 307 8,114 0 13 Aug 2016
DenseCap: Fully Convolutional Localization Networks for Dense Captioning Justin Johnson A. Karpathy Li Fei-Fei VLM 126 1,168 0 24 Nov 2015
Variational Dropout and the Local Reparameterization Trick Diederik P. Kingma Tim Salimans Max Welling BDL 220 1,510 0 08 Jun 2015
Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification Kaiming He Xinming Zhang Shaoqing Ren Jian Sun VLM 315 18,609 0 06 Feb 2015
Deep Visual-Semantic Alignments for Generating Image Descriptions A. Karpathy Li Fei-Fei 118 5,583 0 07 Dec 2014
Black Box Variational Inference Rajesh Ranganath S. Gerrish David M. Blei DRL BDL 131 1,166 0 31 Dec 2013
Exact solutions to the nonlinear dynamics of learning in deep linear neural networks Andrew M. Saxe James L. McClelland Surya Ganguli ODL 165 1,843 0 20 Dec 2013