Revisiting Deep Audio-Text Retrieval Through the Lens of Transportation


16 May 2024
Manh Luong
Khai Nguyen
Nhat Ho
Reza Haf
D.Q. Phung
Lizhen Qu

Papers citing "Revisiting Deep Audio-Text Retrieval Through the Lens of Transportation"

9 / 9 papers shown
Tree-Sliced Wasserstein Distance with Nonlinear Projection
T. Tran
Viet-Hoang Tran
Thanh T. Chu
Trang Pham
Laurent El Ghaoui
Tam Le
T. Nguyen
26
0
0
02 May 2025
Expanding on EnCLAP with Auxiliary Retrieval Model for Automated Audio Captioning
Jaeyeon Kim
Jaeyoon Jung
Minjeong Jeon
Sang Hoon Woo
Jinjoo Lee
24
1
0
02 Sep 2024
Hierarchical Hybrid Sliced Wasserstein: A Scalable Metric for Heterogeneous Joint Distributions
Khai Nguyen
Nhat Ho
44
3
0
23 Apr 2024
Parameter Estimation in DAGs from Incomplete Data via Optimal Transport
Vy Vo
Trung Le
L. Vuong
He Zhao
Edwin V. Bonilla
Dinh Q. Phung
OT
34
4
0
25 May 2023
Sliced Wasserstein Estimation with Control Variates
Khai Nguyen
Nhat Ho
39
12
0
30 Apr 2023
Audio Retrieval with WavText5K and CLAP Training
Soham Deshmukh
Benjamin Elizalde
Huaming Wang
3DV
CLIP
124
51
0
28 Sep 2022
HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection
Ke Chen
Xingjian Du
Bilei Zhu
Zejun Ma
Taylor Berg-Kirkpatrick
Shlomo Dubnov
ViT
127
263
0
02 Feb 2022
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding
Hu Xu
Gargi Ghosh
Po-Yao (Bernie) Huang
Dmytro Okhonko
Armen Aghajanyan
Florian Metze
Luke Zettlemoyer
Christoph Feichtenhofer
CLIP
VLM
259
561
0
28 Sep 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
337
3,720
0
11 Feb 2021