ResearchTrend.AI
arXiv:2007.11154
Rethinking CNN Models for Audio Classification

22 July 2020
Kamalesh Palanisamy
Dipika Singhania
Angela Yao
    SSL

Papers citing "Rethinking CNN Models for Audio Classification"

48 / 48 papers shown
When Vision Models Meet Parameter Efficient Look-Aside Adapters Without Large-Scale Audio Pretraining
Juan Yeo
Jinkwan Jang
Kyubyung Chae
Seongkyu Mun
Taesup Kim
VLM
62
0
0
08 Dec 2024
Transfer Learning in Vocal Education: Technical Evaluation of Limited Samples Describing Mezzo-soprano
Zhenyi Hou
Xu Zhao
Kejie Ye
Xinyu Sheng
Shanggerile Jiang
...
Jiaxing Chen
Yan Zou
Yuchao Feng
Guangyu Fan
Xin Yuan
DiffM
35
1
0
30 Oct 2024
Towards reliable respiratory disease diagnosis based on cough sounds and vision transformers
Qian Wang
Zhaoyang Bu
Jiaxuan Mao
Wenyu Zhu
Jingya Zhao
Wei Du
Guochao Shi
Min Zhou
Si Chen
Jieming Qu
MedIm
41
0
0
28 Aug 2024
Imperceptible Rhythm Backdoor Attacks: Exploring Rhythm Transformation for Embedding Undetectable Vulnerabilities on Speech Recognition
Wenhan Yao
Jiangkun Yang
Yongqiang He
Jia Liu
Weiping Wen
52
1
0
16 Jun 2024
FastAST: Accelerating Audio Spectrogram Transformer via Token Merging and Cross-Model Knowledge Distillation
Swarup Ranjan Behera
Abhishek Dhiman
Karthik Gowda
Aalekhya Satya Narayani
26
1
0
11 Jun 2024
Images that Sound: Composing Images and Sounds on a Single Canvas
Ziyang Chen
Daniel Geng
Andrew Owens
DiffM
50
9
0
20 May 2024
Exploring Vision Transformers for 3D Human Motion-Language Models with Motion Patches
Qing Yu
Mikihiro Tanaka
Kent Fujiwara
ViT
47
2
0
08 May 2024
Tuning In: Analysis of Audio Classifier Performance in Clinical Settings with Limited Data
Hamza Mahdi
Eptehal Nashnoush
Rami Saab
Arjun Balachandar
Rishit Dagli
Lucas X. Perri
H. Khosravani
24
1
0
07 Feb 2024
Emotional Speech-driven 3D Body Animation via Disentangled Latent Diffusion
Kiran Chhatre
Radek Daněček
Nikos Athanasiou
Giorgio Becherini
Christopher Peters
Michael J. Black
Timo Bolkart
DiffM
36
16
0
07 Dec 2023
A Holistic Evaluation of Piano Sound Quality
Monan Zhou
Shangda Wu
Shaohua Ji
Zijin Li
Wei Li
29
0
0
07 Oct 2023
LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
Bin Zhu
Bin Lin
Munan Ning
Yang Yan
Jiaxi Cui
...
Zongwei Li
Wancai Zhang
Zhifeng Li
Wei Liu
Liejie Yuan
VLM
MLLM
32
205
0
03 Oct 2023
Real-Time Emergency Vehicle Detection using Mel Spectrograms and Regular Expressions
Alberto Pacheco-Gonzalez
Raymund J. Torres
R. Chacon
Isidro Robledo
16
1
0
25 Sep 2023
A Large-scale Dataset for Audio-Language Representation Learning
Luoyi Sun
Xuenan Xu
Mengyue Wu
Weidi Xie
34
20
0
20 Sep 2023
AudRandAug: Random Image Augmentations for Audio Classification
Teerath Kumar
Muhammad Turab
Alessandra Mileo
Malika Bendechache
Takfarinas Saber
23
7
0
09 Sep 2023
Example-Based Framework for Perceptually Guided Audio Texture Generation
Purnima Kamath
Chitralekha Gupta
L. Wyse
Suranga Nanayakkara
24
4
0
23 Aug 2023
Improving Primate Sounds Classification using Binary Presorting for Deep Learning
Michael Kolle
Steffen Illium
Maximilian Zorn
Jonas Nusslein
Patrick Suchostawski
Claudia Linnhoff-Popien
35
1
0
28 Jun 2023
Dual Bayesian ResNet: A Deep Learning Approach to Heart Murmur Detection
Benjamin Walker
Felix H. Krones
Ivan Kiskin
Guy Parsons
Terry Lyons
Adam Mahdi
9
15
0
26 May 2023
Towards generalizing deep-audio fake detection networks
Konstantin Gasenzer
Moritz Wolter
36
4
0
22 May 2023
Towards Controllable Audio Texture Morphing
Chitralekha Gupta
Purnima Kamath
Yize Wei
Zhuoyao Li
Suranga Nanayakkara
L. Wyse
26
4
0
23 Apr 2023
Detection and classification of vocal productions in large scale audio recordings
Guillem Bonafos
Pierre Pudlo
Jean-Marc Freyermuth
T. Legou
J. Fagot
Samuel Tronçon
Arnaud Rey
AI4TS
21
1
0
14 Feb 2023
SemanticAC: Semantics-Assisted Framework for Audio Classification
Yicheng Xiao
Yue Ma
Shuyan Li
Hantao Zhou
Ran Liao
Xiu Li
13
8
0
12 Feb 2023
Revisiting Pre-training in Audio-Visual Learning
Ruoxuan Feng
Wenke Xia
Di Hu
36
1
0
07 Feb 2023
GAFX: A General Audio Feature eXtractor
Zhaoyang Bu
Han Zhang
Xiaohu Zhu
30
0
0
19 Jul 2022
Sound Model Factory: An Integrated System Architecture for Generative Audio Modelling
L. Wyse
Purnima Kamath
Chitralekha Gupta
11
9
0
27 Jun 2022
Domain Generalization with Relaxed Instance Frequency-wise Normalization for Multi-device Acoustic Scene Classification
Byeonggeun Kim
Seunghan Yang
Jangho Kim
Hyunsin Park
Juntae Lee
Simyung Chang
43
28
0
24 Jun 2022
QbyE-MLPMixer: Query-by-Example Open-Vocabulary Keyword Spotting using MLPMixer
Jinmiao Huang
W. Gharbieh
Qianhui Wan
Han Suk Shim
Chul Lee
22
9
0
23 Jun 2022
Investigating Multi-Feature Selection and Ensembling for Audio Classification
Muhammad Turab
Teerath Kumar
Malika Bendechache
Takfarinas Saber
33
41
0
15 Jun 2022
Deep Learning-based automated classification of Chinese Speech Sound Disorders
Yao-Ming Kuo
S. Ruan
Yu-Chin Chen
Ya-Wen Tu
19
6
0
24 May 2022
UFRC: A Unified Framework for Reliable COVID-19 Detection on Crowdsourced Cough Audio
Jiangeng Chang
Y. Ruan
Shaoze Cui
John Soong Tshon Yit
Mengling Feng
24
6
0
16 Apr 2022
CMKD: CNN/Transformer-Based Cross-Model Knowledge Distillation for Audio Classification
Yuan Gong
Sameer Khurana
Andrew Rouditchenko
James R. Glass
VLM
25
29
0
13 Mar 2022
Maximizing Audio Event Detection Model Performance on Small Datasets Through Knowledge Transfer, Data Augmentation, And Pretraining: An Ablation Study
Daniel C. Tompkins
Kshitiz Kumar
Jian Wu
17
5
0
07 Feb 2022
Continual Transformers: Redundancy-Free Attention for Online Inference
Lukas Hedegaard
Arian Bakhtiarnia
Alexandros Iosifidis
CLL
27
11
0
17 Jan 2022
Deep Learning for Enhanced Scratch Input
Aman Bhargava
Alice Zhou
Adam Carnaffan
Steve Mann
20
0
0
30 Nov 2021
NeuroView: Explainable Deep Network Decision Making
C. Barberan
Randall Balestriero
Richard G. Baraniuk
FAtt
13
2
0
15 Oct 2021
HumBugDB: A Large-scale Acoustic Mosquito Dataset
Ivan Kiskin
Marianne E. Sinka
Adam D. Cobb
Waqas Rafique
Lawrence Wang
...
E. Kaindoa
G. Killeen
Eva Herreros-Moya
Katherine J. Willis
Stephen J. Roberts
46
28
0
14 Oct 2021
Cross-domain Semi-Supervised Audio Event Classification Using Contrastive Regularization
Donmoon Lee
Kyogu Lee
25
3
0
29 Sep 2021
A Visual Domain Transfer Learning Approach for Heartbeat Sound Classification
Uddipan Mukherjee
Sidharth Pancholi
27
0
0
28 Jul 2021
AudioCLIP: Extending CLIP to Image, Text and Audio
A. Guzhov
Federico Raue
Jörn Hees
Andreas Dengel
CLIP
VLM
25
360
0
24 Jun 2021
Do sound event representations generalize to other audio tasks? A case study in audio transfer learning
Anurag Kumar
Yun Wang
V. Ithapu
Christian Fuegen
24
3
0
21 Jun 2021
Single-Layer Vision Transformers for More Accurate Early Exits with Less Overhead
Arian Bakhtiarnia
Qi Zhang
Alexandros Iosifidis
27
35
0
19 May 2021
ESResNe(X)t-fbsp: Learning Robust Time-Frequency Transformation of Audio
A. Guzhov
Federico Raue
Jörn Hees
Andreas Dengel
21
38
0
23 Apr 2021
Detection of Audio-Video Synchronization Errors Via Event Detection
Joshua Peter Ebenezer
Yongjun Wu
Hai Wei
S. Sethuraman
Z. Liu
37
12
0
20 Apr 2021
AST: Audio Spectrogram Transformer
Yuan Gong
Yu-An Chung
James R. Glass
ViT
40
830
0
05 Apr 2021
SoundCLR: Contrastive Learning of Representations For Improved Environmental Sound Classification
Alireza Nasiri
Jianjun Hu
30
17
0
02 Mar 2021
PSLA: Improving Audio Tagging with Pretraining, Sampling, Labeling, and Aggregation
Yuan Gong
Yu-An Chung
James R. Glass
VLM
104
144
0
02 Feb 2021
Triplet Entropy Loss: Improving The Generalisation of Short Speech Language Identification Systems
Ruan van der Merwe
20
7
0
03 Dec 2020
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
Balaji Lakshminarayanan
Alexander Pritzel
Charles Blundell
UQCV
BDL
276
5,675
0
05 Dec 2016
SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation
Vijay Badrinarayanan
Alex Kendall
R. Cipolla
SSeg
446
15,645
0
02 Nov 2015