2309.05767
Cited By
Natural Language Supervision for General-Purpose Audio Representations
11 September 2023
Benjamin Elizalde
Soham Deshmukh
Huaming Wang
AuLLM
AI4TS
Papers citing
"Natural Language Supervision for General-Purpose Audio Representations"
14 / 14 papers shown
Title
Audio-Language Datasets of Scenes and Events: A Survey
Gijs Wijngaard
Elia Formisano
Michele Esposito
M. Dumontier
81
2
0
10 Jan 2025
MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis
Ho Kei Cheng
Masato Ishii
Akio Hayakawa
Takashi Shibuya
A. Schwing
Yuki Mitsufuji
VGen
126
12
0
19 Dec 2024
Do Audio-Language Models Understand Linguistic Variations?
Ramaneswaran Selvakumar
Sonal Kumar
Hemant Kumar Giri
Nishit Anand
Ashish Seth
Sreyan Ghosh
Dinesh Manocha
AuLLM
VLM
55
1
0
21 Oct 2024
MT2KD: Towards A General-Purpose Encoder for Speech, Speaker, and Audio Events
Xiaoyu Yang
Qiujia Li
Chao Zhang
P. Woodland
29
0
0
25 Sep 2024
Semi-Supervised Cognitive State Classification from Speech with Multi-View Pseudo-Labeling
Yuanchao Li
Zixing Zhang
Jing Han
P. Bell
Catherine Lai
77
0
0
25 Sep 2024
Language-based Audio Moment Retrieval
Hokuto Munakata
Taichi Nishimura
Shota Nakada
Tatsuya Komatsu
40
1
0
24 Sep 2024
Semi-intrusive audio evaluation: Casting non-intrusive assessment as a multi-modal text prediction task
Jozef Coldenhoff
Milos Cernak
41
0
0
21 Sep 2024
Leveraging Audio-Only Data for Text-Queried Target Sound Extraction
Kohei Saijo
Janek Ebbers
François Germain
Sameer Khurana
G. Wichern
Jonathan Le Roux
44
1
0
20 Sep 2024
Sequential Contrastive Audio-Visual Learning
Ioannis Tsiamas
Santiago Pascual
Chunghsin Yeh
Joan Serrà
44
2
0
08 Jul 2024
Bridging Language Gaps in Audio-Text Retrieval
Zhiyong Yan
Heinrich Dinkel
Yongqing Wang
Jizhong Liu
Junbo Zhang
Yujun Wang
Bin Wang
VLM
39
4
0
11 Jun 2024
Correlation of Fréchet Audio Distance With Human Perception of Environmental Audio Is Embedding Dependant
Modan Tailleur
Junwon Lee
Mathieu Lagrange
Keunwoo Choi
Laurie M. Heller
Keisuke Imoto
Yuki Okamoto
30
10
0
26 Mar 2024
Audio Retrieval with WavText5K and CLAP Training
Soham Deshmukh
Benjamin Elizalde
Huaming Wang
3DV
CLIP
124
50
0
28 Sep 2022
HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection
Ke Chen
Xingjian Du
Bilei Zhu
Zejun Ma
Taylor Berg-Kirkpatrick
Shlomo Dubnov
ViT
127
264
0
02 Feb 2022
Identifying Actions for Sound Event Classification
Benjamin Elizalde
Radu Revutchi
Samarjit Das
Bhiksha Raj
Ian Lane
Laurie M. Heller
19
5
0
26 Apr 2021