2309.17093
Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval
29 September 2023
Hao Li
Jingkuan Song
Lianli Gao
Xiaosu Zhu
Heng Tao Shen
EDL
Papers citing
"Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval"
31 / 31 papers shown
Title
Entropy Heat-Mapping: Localizing GPT-Based OCR Errors with Sliding-Window Shannon Analysis
Alexei Kaltchenko
72
0
0
30 Apr 2025
Leveraging Modality Tags for Enhanced Cross-Modal Video Retrieval
A. Fragomeni
Dima Damen
Michael Wray
93
0
0
02 Apr 2025
Narrating the Video: Boosting Text-Video Retrieval via Comprehensive Utilization of Frame-Level Captions
Chan Hur
Jeong-hun Hong
Dong-hun Lee
Dabin Kang
Semin Myeong
Sang-hyo Park
Hyeyoung Park
131
1
0
07 Mar 2025
Improved Probabilistic Image-Text Representations
Sanghyuk Chun
VLM
72
28
0
29 May 2023
Improving Cross-Modal Retrieval with Set of Diverse Embeddings
Dongwon Kim
Nam-Won Kim
Suha Kwak
85
38
0
30 Nov 2022
MILES: Visual BERT Pre-training with Injected Language Semantics for Video-text Retrieval
Yuying Ge
Yixiao Ge
Xihui Liu
Alex Jinpeng Wang
Jianping Wu
Ying Shan
Xiaohu Qie
Ping Luo
VLM
42
44
0
26 Apr 2022
X-Pool: Cross-Modal Language-Video Attention for Text-Video Retrieval
S. Gorti
Noël Vouitsis
Junwei Ma
Keyvan Golestan
M. Volkovs
Animesh Garg
Guangwei Yu
71
160
0
28 Mar 2022
CLIP2Video: Mastering Video-Text Retrieval via Image CLIP
Han Fang
Pengfei Xiong
Luhui Xu
Yu Chen
CLIP
VLM
93
297
0
21 Jun 2021
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
Hassan Akbari
Liangzhe Yuan
Rui Qian
Wei-Hong Chuang
Shih-Fu Chang
Yin Cui
Boqing Gong
ViT
305
588
0
22 Apr 2021
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval
Max Bain
Arsha Nagrani
Gül Varol
Andrew Zisserman
VGen
136
1,172
0
01 Apr 2021
MDMMT: Multidomain Multimodal Transformer for Video Retrieval
Maksim Dzabraev
M. Kalashnikov
Stepan Alekseevich Komkov
Aleksandr Petiushko
63
132
0
19 Mar 2021
Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling
Jie Lei
Linjie Li
Luowei Zhou
Zhe Gan
Tamara L. Berg
Joey Tianyi Zhou
Jingjing Liu
CLIP
116
661
0
11 Feb 2021
Trusted Multi-View Classification
Zongbo Han
Changqing Zhang
Huazhu Fu
Qiufeng Wang
EDL
41
168
0
03 Feb 2021
Probabilistic Embeddings for Cross-Modal Retrieval
Sanghyuk Chun
Seong Joon Oh
Rafael Sampaio de Rezende
Yannis Kalantidis
Diane Larlus
UQCV
463
206
0
13 Jan 2021
Similarity Reasoning and Filtration for Image-Text Matching
Haiwen Diao
Ying Zhang
Lingyun Ma
Huchuan Lu
278
335
0
05 Jan 2021
Multi-modal Transformer for Video Retrieval
Valentin Gabeur
Chen Sun
Karteek Alahari
Cordelia Schmid
ViT
523
608
0
21 Jul 2020
Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning
Shizhe Chen
Yida Zhao
Qin Jin
Qi Wu
82
314
0
01 Mar 2020
Use What You Have: Video Retrieval Using Representations From Collaborative Experts
Yang Liu
Samuel Albanie
Arsha Nagrani
Andrew Zisserman
71
389
0
31 Jul 2019
Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval
Yale Song
M. Soleymani
52
244
0
11 Jun 2019
Evidential Deep Learning to Quantify Classification Uncertainty
Murat Sensoy
Lance M. Kaplan
M. Kandemir
OOD
UQCV
EDL
BDL
177
991
0
05 Jun 2018
AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks
Tao Xu
Pengchuan Zhang
Qiuyuan Huang
Han Zhang
Zhe Gan
Xiaolei Huang
Xiaodong He
GAN
ViT
105
1,716
0
28 Nov 2017
Localizing Moments in Video with Natural Language
Lisa Anne Hendricks
Oliver Wang
Eli Shechtman
Josef Sivic
Trevor Darrell
Bryan C. Russell
110
946
0
04 Aug 2017
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
Peter Anderson
Xiaodong He
Chris Buehler
Damien Teney
Mark Johnson
Stephen Gould
Lei Zhang
AIMat
117
4,214
0
25 Jul 2017
Variational Dropout Sparsifies Deep Neural Networks
Dmitry Molchanov
Arsenii Ashukha
Dmitry Vetrov
BDL
139
828
0
19 Jan 2017
SGDR: Stochastic Gradient Descent with Warm Restarts
I. Loshchilov
Frank Hutter
ODL
307
8,114
0
13 Aug 2016
DenseCap: Fully Convolutional Localization Networks for Dense Captioning
Justin Johnson
A. Karpathy
Li Fei-Fei
VLM
126
1,168
0
24 Nov 2015
Variational Dropout and the Local Reparameterization Trick
Diederik P. Kingma
Tim Salimans
Max Welling
BDL
220
1,510
0
08 Jun 2015
Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
VLM
315
18,609
0
06 Feb 2015
Deep Visual-Semantic Alignments for Generating Image Descriptions
A. Karpathy
Li Fei-Fei
118
5,583
0
07 Dec 2014
Black Box Variational Inference
Rajesh Ranganath
S. Gerrish
David M. Blei
DRL
BDL
131
1,166
0
31 Dec 2013
Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
Andrew M. Saxe
James L. McClelland
Surya Ganguli
ODL
165
1,843
0
20 Dec 2013