ResearchTrend.AI
arXiv: 2211.04772
Efficient Large-scale Audio Tagging via Transformer-to-CNN Knowledge Distillation

9 November 2022
Florian Schmid
Khaled Koutini
Gerhard Widmer
    ViT

Papers citing "Efficient Large-scale Audio Tagging via Transformer-to-CNN Knowledge Distillation"

30 / 30 papers shown
TACOS: Temporally-aligned Audio CaptiOnS for Language-Audio Pretraining
Paul Primus
Florian Schmid
Gerhard Widmer
CLIP
AI4TS
VLM
33
0
0
12 May 2025
Exploring Performance-Complexity Trade-Offs in Sound Event Detection
T. Morocutti
Florian Schmid
Jonathan Greif
Francesco Foscarin
Gerhard Widmer
38
0
0
14 Mar 2025
Creating a Good Teacher for Knowledge Distillation in Acoustic Scene Classification
T. Morocutti
Florian Schmid
Khaled Koutini
Gerhard Widmer
39
0
0
14 Mar 2025
A Survey of Recent Advances and Challenges in Deep Audio-Visual Correlation Learning
Luís Vilaça
Yi Yu
Paula Viana
75
0
0
24 Nov 2024
Data-Efficient Low-Complexity Acoustic Scene Classification via Distilling and Progressive Pruning
Bing Han
Wen Huang
Zhengyang Chen
Anbai Jiang
Pingyi Fan
Cheng Lu
Zhiqiang Lv
Jia Liu
W. Zhang
Yanmin Qian
31
2
0
28 Oct 2024
Spectral and Rhythm Features for Audio Classification with Deep Convolutional Neural Networks
Friedrich Wolf-Monheim
21
2
0
09 Oct 2024
MT2KD: Towards A General-Purpose Encoder for Speech, Speaker, and Audio Events
Xiaoyu Yang
Qiujia Li
Chao Zhang
P. Woodland
24
0
0
25 Sep 2024
Generalization in birdsong classification: impact of transfer learning methods and dataset characteristics
Burooj Ghani
Vincent J. Kalkman
Bob Planqué
Willem-Pier Vellinga
L. Gill
Dan Stowell
VLM
32
5
0
21 Sep 2024
Effective Pre-Training of Audio Transformers for Sound Event Detection
Florian Schmid
T. Morocutti
Francesco Foscarin
Jan Schlüter
Paul Primus
Gerhard Widmer
ViT
28
2
0
14 Sep 2024
Wav2Small: Distilling Wav2Vec2 to 72K parameters for Low-Resource Speech emotion recognition
Dionyssos Kounadis-Bastian
Oliver Schrüfer
Anna Derington
H. Wierstorf
F. Eyben
Felix Burkhardt
Björn Schuller
29
1
0
25 Aug 2024
Macformer: Transformer with Random Maclaurin Feature Attention
Yuhan Guo
Lizhong Ding
Ye Yuan
Guoren Wang
46
0
0
21 Aug 2024
Estimated Audio-Caption Correspondences Improve Language-Based Audio Retrieval
Paul Primus
Florian Schmid
Gerhard Widmer
31
2
0
21 Aug 2024
TEAdapter: Supply abundant guidance for controllable text-to-music generation
Jialing Zou
Jiahao Mei
Xudong Nan
Jinghua Li
Daoguo Dong
Liang He
31
0
0
09 Aug 2024
Integrating IP Broadcasting with Audio Tags: Workflow and Challenges
Rhys Burchett-Vass
Arshdeep Singh
Gabriel Bibbó
Mark D. Plumbley
29
0
0
22 Jul 2024
Efficient Audio Captioning with Encoder-Level Knowledge Distillation
Xuenan Xu
Haohe Liu
Mengyue Wu
Wenwu Wang
Mark D. Plumbley
40
1
0
19 Jul 2024
Improving Audio Spectrogram Transformers for Sound Event Detection Through Multi-Stage Training
Florian Schmid
Paul Primus
T. Morocutti
Jonathan Greif
Gerhard Widmer
32
5
0
17 Jul 2024
Fusing Audio and Metadata Embeddings Improves Language-based Audio Retrieval
Paul Primus
Gerhard Widmer
52
3
0
22 Jun 2024
AudioRepInceptionNeXt: A lightweight single-stream architecture for efficient audio recognition
Kin Wai Lau
Yasar Abbas Ur Rehman
L. Po
44
1
0
21 Apr 2024
Robust Active Speaker Detection in Noisy Environments
Siva Sai Nagender Vasireddy
Chenxu Zhang
Xiaohu Guo
Yapeng Tian
40
0
0
27 Mar 2024
Tuning In: Analysis of Audio Classifier Performance in Clinical Settings with Limited Data
Hamza Mahdi
Eptehal Nashnoush
Rami Saab
Arjun Balachandar
Rishit Dagli
Lucas X. Perri
H. Khosravani
24
1
0
07 Feb 2024
Dynamic Convolutional Neural Networks as Efficient Pre-trained Audio Models
Florian Schmid
Khaled Koutini
Gerhard Widmer
18
11
0
24 Oct 2023
Can Language Models Laugh at YouTube Short-form Videos?
Dayoon Ko
Sangho Lee
Gunhee Kim
36
6
0
22 Oct 2023
CED: Consistent ensemble distillation for audio tagging
Heinrich Dinkel
Yongqing Wang
Zhiyong Yan
Junbo Zhang
Yujun Wang
26
17
0
23 Aug 2023
Domain Information Control at Inference Time for Acoustic Scene Classification
Shahed Masoudian
Khaled Koutini
Markus Schedl
Gerhard Widmer
Navid Rekabsaz
26
1
0
13 Jun 2023
Adapting a ConvNeXt model to audio classification on AudioSet
Thomas Pellegrini
Ismail Khalfaoui-Hassani
Etienne Labbé
T. Masquelier
6
21
0
01 Jun 2023
Streaming Audio Transformers for Online Audio Tagging
Heinrich Dinkel
Zhiyong Yan
Yongqing Wang
Junbo Zhang
Yujun Wang
Bin Wang
34
4
0
29 May 2023
Low-Complexity Audio Embedding Extractors
Florian Schmid
Khaled Koutini
Gerhard Widmer
21
4
0
03 Mar 2023
HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection
Ke Chen
Xingjian Du
Bilei Zhu
Zejun Ma
Taylor Berg-Kirkpatrick
Shlomo Dubnov
ViT
118
264
0
02 Feb 2022
PSLA: Improving Audio Tagging with Pretraining, Sampling, Labeling, and Aggregation
Yuan Gong
Yu-An Chung
James R. Glass
VLM
104
144
0
02 Feb 2021
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Andrew G. Howard
Menglong Zhu
Bo Chen
Dmitry Kalenichenko
Weijun Wang
Tobias Weyand
M. Andreetto
Hartwig Adam
3DH
950
20,567
0
17 Apr 2017