2211.04772
Efficient Large-scale Audio Tagging via Transformer-to-CNN Knowledge Distillation
9 November 2022
Florian Schmid
Khaled Koutini
Gerhard Widmer
ViT
Papers citing
"Efficient Large-scale Audio Tagging via Transformer-to-CNN Knowledge Distillation"
30 / 30 papers shown
Title
TACOS: Temporally-aligned Audio CaptiOnS for Language-Audio Pretraining
Paul Primus
Florian Schmid
Gerhard Widmer
CLIP
AI4TS
VLM
33
0
0
12 May 2025
Exploring Performance-Complexity Trade-Offs in Sound Event Detection
T. Morocutti
Florian Schmid
Jonathan Greif
Francesco Foscarin
Gerhard Widmer
38
0
0
14 Mar 2025
Creating a Good Teacher for Knowledge Distillation in Acoustic Scene Classification
T. Morocutti
Florian Schmid
Khaled Koutini
Gerhard Widmer
39
0
0
14 Mar 2025
A Survey of Recent Advances and Challenges in Deep Audio-Visual Correlation Learning
Luis Vilaca
Yi Yu
Paula Vinan
75
0
0
24 Nov 2024
Data-Efficient Low-Complexity Acoustic Scene Classification via Distilling and Progressive Pruning
Bing Han
Wen Huang
Zhengyang Chen
Anbai Jiang
Pingyi Fan
Cheng Lu
Zhiqiang Lv
Jia Liu
W. Zhang
Yanmin Qian
31
2
0
28 Oct 2024
Spectral and Rhythm Features for Audio Classification with Deep Convolutional Neural Networks
Friedrich Wolf-Monheim
21
2
0
09 Oct 2024
MT2KD: Towards A General-Purpose Encoder for Speech, Speaker, and Audio Events
Xiaoyu Yang
Qiujia Li
Chao Zhang
P. Woodland
24
0
0
25 Sep 2024
Generalization in birdsong classification: impact of transfer learning methods and dataset characteristics
Burooj Ghani
Vincent J. Kalkman
Bob Planqué
Willem-Pier Vellinga
L. Gill
Dan Stowell
VLM
32
5
0
21 Sep 2024
Effective Pre-Training of Audio Transformers for Sound Event Detection
Florian Schmid
T. Morocutti
Francesco Foscarin
Jan Schlüter
Paul Primus
Gerhard Widmer
ViT
28
2
0
14 Sep 2024
Wav2Small: Distilling Wav2Vec2 to 72K parameters for Low-Resource Speech emotion recognition
Dionyssos Kounadis-Bastian
Oliver Schrufer
Anna Derington
H. Wierstorf
F. Eyben
Felix Burkhardt
Björn Schuller
29
1
0
25 Aug 2024
Macformer: Transformer with Random Maclaurin Feature Attention
Yuhan Guo
Lizhong Ding
Ye Yuan
Guoren Wang
46
0
0
21 Aug 2024
Estimated Audio-Caption Correspondences Improve Language-Based Audio Retrieval
Paul Primus
Florian Schmid
Gerhard Widmer
31
2
0
21 Aug 2024
TEAdapter: Supply abundant guidance for controllable text-to-music generation
Jialing Zou
Jiahao Mei
Xudong Nan
Jinghua Li
Daoguo Dong
Liang He
31
0
0
09 Aug 2024
Integrating IP Broadcasting with Audio Tags: Workflow and Challenges
Rhys Burchett-Vass
Arshdeep Singh
Gabriel Bibbó
Mark D. Plumbley
29
0
0
22 Jul 2024
Efficient Audio Captioning with Encoder-Level Knowledge Distillation
Xuenan Xu
Haohe Liu
Mengyue Wu
Wenwu Wang
Mark D. Plumbley
40
1
0
19 Jul 2024
Improving Audio Spectrogram Transformers for Sound Event Detection Through Multi-Stage Training
Florian Schmid
Paul Primus
T. Morocutti
Jonathan Greif
Gerhard Widmer
32
5
0
17 Jul 2024
Fusing Audio and Metadata Embeddings Improves Language-based Audio Retrieval
Paul Primus
Gerhard Widmer
52
3
0
22 Jun 2024
AudioRepInceptionNeXt: A lightweight single-stream architecture for efficient audio recognition
Kin Wai Lau
Yasar Abbas Ur Rehman
L. Po
44
1
0
21 Apr 2024
Robust Active Speaker Detection in Noisy Environments
Siva Sai Nagender Vasireddy
Chenxu Zhang
Xiaohu Guo
Yapeng Tian
40
0
0
27 Mar 2024
Tuning In: Analysis of Audio Classifier Performance in Clinical Settings with Limited Data
Hamza Mahdi
Eptehal Nashnoush
Rami Saab
Arjun Balachandar
Rishit Dagli
Lucas X. Perri
H. Khosravani
24
1
0
07 Feb 2024
Dynamic Convolutional Neural Networks as Efficient Pre-trained Audio Models
Florian Schmid
Khaled Koutini
Gerhard Widmer
18
11
0
24 Oct 2023
Can Language Models Laugh at YouTube Short-form Videos?
Dayoon Ko
Sangho Lee
Gunhee Kim
36
6
0
22 Oct 2023
CED: Consistent ensemble distillation for audio tagging
Heinrich Dinkel
Yongqing Wang
Zhiyong Yan
Junbo Zhang
Yujun Wang
26
17
0
23 Aug 2023
Domain Information Control at Inference Time for Acoustic Scene Classification
Shahed Masoudian
Khaled Koutini
Markus Schedl
Gerhard Widmer
Navid Rekabsaz
26
1
0
13 Jun 2023
Adapting a ConvNeXt model to audio classification on AudioSet
Thomas Pellegrini
Ismail Khalfaoui-Hassani
Etienne Labbé
T. Masquelier
6
21
0
01 Jun 2023
Streaming Audio Transformers for Online Audio Tagging
Heinrich Dinkel
Zhiyong Yan
Yongqing Wang
Junbo Zhang
Yujun Wang
Bin Wang
34
4
0
29 May 2023
Low-Complexity Audio Embedding Extractors
Florian Schmid
Khaled Koutini
Gerhard Widmer
21
4
0
03 Mar 2023
HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection
Ke Chen
Xingjian Du
Bilei Zhu
Zejun Ma
Taylor Berg-Kirkpatrick
Shlomo Dubnov
ViT
118
264
0
02 Feb 2022
PSLA: Improving Audio Tagging with Pretraining, Sampling, Labeling, and Aggregation
Yuan Gong
Yu-An Chung
James R. Glass
VLM
104
144
0
02 Feb 2021
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Andrew G. Howard
Menglong Zhu
Bo Chen
Dmitry Kalenichenko
Weijun Wang
Tobias Weyand
M. Andreetto
Hartwig Adam
3DH
950
20,567
0
17 Apr 2017