arXiv:2408.13705
Cross-Modal Denoising: A Novel Training Paradigm for Enhancing Speech-Image Retrieval
15 August 2024
Lifeng Zhou
Yuke Li
Rui Deng
Yuting Yang
Haoqi Zhu
Papers citing
"Cross-Modal Denoising: A Novel Training Paradigm for Enhancing Speech-Image Retrieval"
1 / 1 papers shown
Title
SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model
Yi-Jen Shih
Hsuan-Fu Wang
Heng-Jui Chang
Layne Berry
Hung-yi Lee
David F. Harwath
VLM
CLIP
46
32
0
03 Oct 2022