2007.11154
Rethinking CNN Models for Audio Classification
22 July 2020
Kamalesh Palanisamy
Dipika Singhania
Angela Yao
SSL
Papers citing "Rethinking CNN Models for Audio Classification"
48 / 48 papers shown
Title
When Vision Models Meet Parameter Efficient Look-Aside Adapters Without Large-Scale Audio Pretraining
Juan Yeo
Jinkwan Jang
Kyubyung Chae
Seongkyu Mun
Taesup Kim
VLM
62
0
0
08 Dec 2024
Transfer Learning in Vocal Education: Technical Evaluation of Limited Samples Describing Mezzo-soprano
Zhenyi Hou
Xu Zhao
Kejie Ye
Xinyu Sheng
Shanggerile Jiang
...
Jiaxing Chen
Yan Zou
Yuchao Feng
Guangyu Fan
Xin Yuan
DiffM
35
1
0
30 Oct 2024
Towards reliable respiratory disease diagnosis based on cough sounds and vision transformers
Qian Wang
Zhaoyang Bu
Jiaxuan Mao
Wenyu Zhu
Jingya Zhao
Wei Du
Guochao Shi
Min Zhou
Si Chen
Jieming Qu
MedIm
41
0
0
28 Aug 2024
Imperceptible Rhythm Backdoor Attacks: Exploring Rhythm Transformation for Embedding Undetectable Vulnerabilities on Speech Recognition
Wenhan Yao
Jiangkun Yang
Yongqiang He
Jia Liu
Weiping Wen
52
1
0
16 Jun 2024
FastAST: Accelerating Audio Spectrogram Transformer via Token Merging and Cross-Model Knowledge Distillation
Swarup Ranjan Behera
Abhishek Dhiman
Karthik Gowda
Aalekhya Satya Narayani
26
1
0
11 Jun 2024
Images that Sound: Composing Images and Sounds on a Single Canvas
Ziyang Chen
Daniel Geng
Andrew Owens
DiffM
50
9
0
20 May 2024
Exploring Vision Transformers for 3D Human Motion-Language Models with Motion Patches
Qing Yu
Mikihiro Tanaka
Kent Fujiwara
ViT
47
2
0
08 May 2024
Tuning In: Analysis of Audio Classifier Performance in Clinical Settings with Limited Data
Hamza Mahdi
Eptehal Nashnoush
Rami Saab
Arjun Balachandar
Rishit Dagli
Lucas X. Perri
H. Khosravani
24
1
0
07 Feb 2024
Emotional Speech-driven 3D Body Animation via Disentangled Latent Diffusion
Kiran Chhatre
Radek Daněček
Nikos Athanasiou
Giorgio Becherini
Christopher Peters
Michael J. Black
Timo Bolkart
DiffM
36
16
0
07 Dec 2023
A Holistic Evaluation of Piano Sound Quality
Monan Zhou
Shangda Wu
Shaohua Ji
Zijin Li
Wei Li
29
0
0
07 Oct 2023
LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
Bin Zhu
Bin Lin
Munan Ning
Yang Yan
Jiaxi Cui
...
Zongwei Li
Wancai Zhang
Zhifeng Li
Wei Liu
Liejie Yuan
VLM
MLLM
32
205
0
03 Oct 2023
Real-Time Emergency Vehicle Detection using Mel Spectrograms and Regular Expressions
Alberto Pacheco-Gonzalez
Raymund J. Torres
R. Chacon
Isidro Robledo
16
1
0
25 Sep 2023
A Large-scale Dataset for Audio-Language Representation Learning
Luoyi Sun
Xuenan Xu
Mengyue Wu
Weidi Xie
34
20
0
20 Sep 2023
AudRandAug: Random Image Augmentations for Audio Classification
Teerath Kumar
Muhammad Turab
Alessandra Mileo
Malika Bendechache
Takfarinas Saber
23
7
0
09 Sep 2023
Example-Based Framework for Perceptually Guided Audio Texture Generation
Purnima Kamath
Chitralekha Gupta
L. Wyse
Suranga Nanayakkara
24
4
0
23 Aug 2023
Improving Primate Sounds Classification using Binary Presorting for Deep Learning
Michael Kölle
Steffen Illium
Maximilian Zorn
Jonas Nüßlein
Patrick Suchostawski
Claudia Linnhoff-Popien
35
1
0
28 Jun 2023
Dual Bayesian ResNet: A Deep Learning Approach to Heart Murmur Detection
Benjamin Walker
Felix H. Krones
Ivan Kiskin
Guy Parsons
Terry Lyons
Adam Mahdi
9
15
0
26 May 2023
Towards generalizing deep-audio fake detection networks
Konstantin Gasenzer
Moritz Wolter
36
4
0
22 May 2023
Towards Controllable Audio Texture Morphing
Chitralekha Gupta
Purnima Kamath
Yize Wei
Zhuoyao Li
Suranga Nanayakkara
L. Wyse
26
4
0
23 Apr 2023
Detection and classification of vocal productions in large scale audio recordings
Guillem Bonafos
Pierre Pudlo
Jean-Marc Freyermuth
T. Legou
J. Fagot
Samuel Tronçon
Arnaud Rey
AI4TS
21
1
0
14 Feb 2023
SemanticAC: Semantics-Assisted Framework for Audio Classification
Yicheng Xiao
Yue Ma
Shuyan Li
Hantao Zhou
Ran Liao
Xiu Li
13
8
0
12 Feb 2023
Revisiting Pre-training in Audio-Visual Learning
Ruoxuan Feng
Wenke Xia
Di Hu
36
1
0
07 Feb 2023
GAFX: A General Audio Feature eXtractor
Zhaoyang Bu
Han Zhang
Xiaohu Zhu
30
0
0
19 Jul 2022
Sound Model Factory: An Integrated System Architecture for Generative Audio Modelling
L. Wyse
Purnima Kamath
Chitralekha Gupta
11
9
0
27 Jun 2022
Domain Generalization with Relaxed Instance Frequency-wise Normalization for Multi-device Acoustic Scene Classification
Byeonggeun Kim
Seunghan Yang
Jangho Kim
Hyunsin Park
Juntae Lee
Simyung Chang
43
28
0
24 Jun 2022
QbyE-MLPMixer: Query-by-Example Open-Vocabulary Keyword Spotting using MLPMixer
Jinmiao Huang
W. Gharbieh
Qianhui Wan
Han Suk Shim
Chul Lee
22
9
0
23 Jun 2022
Investigating Multi-Feature Selection and Ensembling for Audio Classification
Muhammad Turab
Teerath Kumar
Malika Bendechache
Takfarinas Saber
33
41
0
15 Jun 2022
Deep Learning-based automated classification of Chinese Speech Sound Disorders
Yao-Ming Kuo
S. Ruan
Yu-Chin Chen
Ya-Wen Tu
19
6
0
24 May 2022
UFRC: A Unified Framework for Reliable COVID-19 Detection on Crowdsourced Cough Audio
Jiangeng Chang
Y. Ruan
Shaoze Cui
John Soong Tshon Yit
Mengling Feng
24
6
0
16 Apr 2022
CMKD: CNN/Transformer-Based Cross-Model Knowledge Distillation for Audio Classification
Yuan Gong
Sameer Khurana
Andrew Rouditchenko
James R. Glass
VLM
25
29
0
13 Mar 2022
Maximizing Audio Event Detection Model Performance on Small Datasets Through Knowledge Transfer, Data Augmentation, And Pretraining: An Ablation Study
Daniel C. Tompkins
Kshitiz Kumar
Jian Wu
17
5
0
07 Feb 2022
Continual Transformers: Redundancy-Free Attention for Online Inference
Lukas Hedegaard
Arian Bakhtiarnia
Alexandros Iosifidis
CLL
27
11
0
17 Jan 2022
Deep Learning for Enhanced Scratch Input
Aman Bhargava
Alice Zhou
Adam Carnaffan
Steve Mann
20
0
0
30 Nov 2021
NeuroView: Explainable Deep Network Decision Making
C. Barberan
Randall Balestriero
Richard G. Baraniuk
FAtt
13
2
0
15 Oct 2021
HumBugDB: A Large-scale Acoustic Mosquito Dataset
Ivan Kiskin
Marianne E. Sinka
Adam D. Cobb
Waqas Rafique
Lawrence Wang
...
E. Kaindoa
G. Killeen
Eva Herreros-Moya
Katherine J. Willis
Stephen J. Roberts
46
28
0
14 Oct 2021
Cross-domain Semi-Supervised Audio Event Classification Using Contrastive Regularization
Donmoon Lee
Kyogu Lee
25
3
0
29 Sep 2021
A Visual Domain Transfer Learning Approach for Heartbeat Sound Classification
Uddipan Mukherjee
Sidharth Pancholi
27
0
0
28 Jul 2021
AudioCLIP: Extending CLIP to Image, Text and Audio
A. Guzhov
Federico Raue
Jörn Hees
Andreas Dengel
CLIP
VLM
25
360
0
24 Jun 2021
Do sound event representations generalize to other audio tasks? A case study in audio transfer learning
Anurag Kumar
Yun Wang
V. Ithapu
Christian Fuegen
24
3
0
21 Jun 2021
Single-Layer Vision Transformers for More Accurate Early Exits with Less Overhead
Arian Bakhtiarnia
Qi Zhang
Alexandros Iosifidis
27
35
0
19 May 2021
ESResNe(X)t-fbsp: Learning Robust Time-Frequency Transformation of Audio
A. Guzhov
Federico Raue
Jörn Hees
Andreas Dengel
21
38
0
23 Apr 2021
Detection of Audio-Video Synchronization Errors Via Event Detection
Joshua Peter Ebenezer
Yongjun Wu
Hai Wei
S. Sethuraman
Z. Liu
37
12
0
20 Apr 2021
AST: Audio Spectrogram Transformer
Yuan Gong
Yu-An Chung
James R. Glass
ViT
40
830
0
05 Apr 2021
SoundCLR: Contrastive Learning of Representations For Improved Environmental Sound Classification
Alireza Nasiri
Jianjun Hu
30
17
0
02 Mar 2021
PSLA: Improving Audio Tagging with Pretraining, Sampling, Labeling, and Aggregation
Yuan Gong
Yu-An Chung
James R. Glass
VLM
104
144
0
02 Feb 2021
Triplet Entropy Loss: Improving The Generalisation of Short Speech Language Identification Systems
Ruan van der Merwe
20
7
0
03 Dec 2020
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
Balaji Lakshminarayanan
Alexander Pritzel
Charles Blundell
UQCV
BDL
276
5,675
0
05 Dec 2016
SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation
Vijay Badrinarayanan
Alex Kendall
R. Cipolla
SSeg
446
15,645
0
02 Nov 2015