ResearchTrend.AI
Transformation of audio embeddings into interpretable, concept-based representations

18 April 2025
Alice Zhang
Edison Thomaz
Lie Lu

Papers citing "Transformation of audio embeddings into interpretable, concept-based representations"

18 / 18 papers shown
SPES: Spectrogram Perturbation for Explainable Speech-to-Text Generation
Dennis Fucci
Marco Gaido
Beatrice Savoldi
Matteo Negri
Mauro Cettolo
L. Bentivogli
246
2
0
03 Nov 2024
Interpreting CLIP with Sparse Linear Concept Embeddings (SpLiCE)
Usha Bhalla
Alexander X. Oesterling
Suraj Srinivas
Flavio du Pin Calmon
Himabindu Lakkaraju
105
40
0
16 Feb 2024
Label-Free Concept Bottleneck Models
Tuomas P. Oikarinen
Subhro Das
Lam M. Nguyen
Tsui-Wei Weng
86
177
0
12 Apr 2023
WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research
Xinhao Mei
Chutong Meng
Haohe Liu
Qiuqiang Kong
Tom Ko
Chengqi Zhao
Mark D. Plumbley
Yuexian Zou
Wenwu Wang
117
211
0
30 Mar 2023
Large-scale Contrastive Language-Audio Pretraining with Feature Fusion and Keyword-to-Caption Augmentation
Yusong Wu
Kai Chen
Tianyu Zhang
Yuchen Hui
Marianna Nezhurina
Taylor Berg-Kirkpatrick
Shlomo Dubnov
CLIP
122
531
0
12 Nov 2022
Disentangling visual and written concepts in CLIP
Joanna Materzyńska
Antonio Torralba
David Bau
CoGe
61
51
0
15 Jun 2022
Vocalsound: A Dataset for Improving Human Vocal Sounds Recognition
Yuan Gong
Jingbo Yu
James R. Glass
67
40
0
06 May 2022
CLIP-Dissect: Automatic Description of Neuron Representations in Deep Vision Networks
Tuomas P. Oikarinen
Tsui-Wei Weng
VLM
50
88
1
23 Apr 2022
Listen to Interpret: Post-hoc Interpretability for Audio Networks with NMF
Jayneel Parekh
Sanjeel Parekh
Pavlo Mozharovskyi
Florence d'Alché-Buc
G. Richard
52
25
0
23 Feb 2022
AudioCLIP: Extending CLIP to Image, Text and Audio
A. Guzhov
Federico Raue
Jörn Hees
Andreas Dengel
CLIP
VLM
119
366
0
24 Jun 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
929
29,436
0
26 Feb 2021
FSD50K: An Open Dataset of Human-Labeled Sound Events
Eduardo Fonseca
Xavier Favory
Jordi Pons
F. Font
Xavier Serra
77
459
0
01 Oct 2020
Concept Bottleneck Models
Pang Wei Koh
Thao Nguyen
Y. S. Tang
Stephen Mussmann
Emma Pierson
Been Kim
Percy Liang
96
823
0
09 Jul 2020
Invertible Concept-based Explanations for CNN Models with Non-negative Concept Activation Vectors
Ruihan Zhang
Prashan Madumal
Tim Miller
Krista A. Ehinger
Benjamin I. P. Rubinstein
FAtt
61
104
0
27 Jun 2020
PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition
Qiuqiang Kong
Yin Cao
Turab Iqbal
Yuxuan Wang
Wenwu Wang
Mark D. Plumbley
VLM
SSL
189
1,076
0
21 Dec 2019
Clotho: An Audio Captioning Dataset
Konstantinos Drossos
Samuel Lipping
Tuomas Virtanen
98
389
0
21 Oct 2019
European Union regulations on algorithmic decision-making and a "right to explanation"
B. Goodman
Seth Flaxman
FaML
AILaw
63
1,901
0
28 Jun 2016
"Why Should I Trust You?": Explaining the Predictions of Any Classifier
Marco Tulio Ribeiro
Sameer Singh
Carlos Guestrin
FAtt
FaML
1.2K
16,990
0
16 Feb 2016