2307.15344
Improving Audio-Text Retrieval via Hierarchical Cross-Modal Interaction and Auxiliary Captions
28 July 2023
Yifei Xin, Yuexian Zou
Papers citing "Improving Audio-Text Retrieval via Hierarchical Cross-Modal Interaction and Auxiliary Captions"
7 / 7 papers shown
Title
DiffATR: Diffusion-based Generative Modeling for Audio-Text Retrieval
Yifei Xin, Xuxin Cheng, Zhihong Zhu, Xusheng Yang, Yuexian Zou
DiffM
28  5  0
16 Sep 2024

Towards Spoken Language Understanding via Multi-level Multi-grained Contrastive Learning
Xuxin Cheng, Wanshi Xu, Zhihong Zhu, Hongxiang Li, Yuexian Zou
61  13  0
31 May 2024

MINT: Boosting Audio-Language Model via Multi-Target Pre-Training and Instruction Tuning
Hang Zhao, Yifei Xin, Zhesong Yu, Bilei Zhu, Lu Lu, Zejun Ma
AuLLM
28  4  0
12 Feb 2024

Masked Audio Modeling with CLAP and Multi-Objective Learning
Yifei Xin, Xiulian Peng, Yan Lu
44  8  0
29 Jan 2024

Improving Weakly Supervised Sound Event Detection with Causal Intervention
Yifei Xin, Dongchao Yang, Fan Cui, Yujun Wang, Yuexian Zou
CML
46  8  0
10 Mar 2023

Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?
Wenhao Wu, Haipeng Luo, Bo Fang, Jingdong Wang, Wanli Ouyang
98  80  0
31 Dec 2022

HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection
Ke Chen, Xingjian Du, Bilei Zhu, Zejun Ma, Taylor Berg-Kirkpatrick, Shlomo Dubnov
ViT
118  264  0
02 Feb 2022