1609.09430
CNN Architectures for Large-Scale Audio Classification
29 September 2016
Shawn Hershey
Sourish Chaudhuri
D. Ellis
J. Gemmeke
A. Jansen
R. C. Moore
Manoj Plakal
D. Platt
Rif A. Saurous
Bryan Seybold
M. Slaney
Ron J. Weiss
K. Wilson
Papers citing
"CNN Architectures for Large-Scale Audio Classification"
50 / 336 papers shown
Title
Audio-Visual Event Localization via Recursive Fusion by Joint Co-Attention
Bin Duan
Hao Tang
Wei Wang
Ziliang Zong
Guowei Yang
Yan Yan
33
59
0
14 Aug 2020
Surgical Mask Detection with Convolutional Neural Networks and Data Augmentations on Spectrograms
Steffen Illium
Robert Muller
Andreas Sedlmeier
Claudia Linnhoff-Popien
18
11
0
11 Aug 2020
Rethinking CNN Models for Audio Classification
Kamalesh Palanisamy
Dipika Singhania
Angela Yao
SSL
25
144
0
22 Jul 2020
Multi-modal Transformer for Video Retrieval
Valentin Gabeur
Chen Sun
Alahari Karteek
Cordelia Schmid
ViT
424
596
0
21 Jul 2020
Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing
Yapeng Tian
Dingzeyu Li
Chenliang Xu
34
180
0
21 Jul 2020
Streaming ResLSTM with Causal Mean Aggregation for Device-Directed Utterance Detection
Xiaosu Tong
Che-Wei Huang
Sri Harish Reddy Mallidi
Shaun Joseph
Sonal Pareek
Chander Chandak
Ariya Rastrow
Roland Maas
14
5
0
17 Jul 2020
Deep multi-metric learning for text-independent speaker verification
Jiwei Xu
Xinggang Wang
Bin Feng
Wenyu Liu
46
25
0
17 Jul 2020
Multiple Sound Sources Localization from Coarse to Fine
Rui Qian
Di Hu
Heinrich Dinkel
Mengyue Wu
N. Xu
Weiyao Lin
28
155
0
13 Jul 2020
Visualizing Classification Structure of Large-Scale Classifiers
B. Alsallakh
Zhixin Yan
Shabnam Ghaffarzadegan
Zeng Dai
Liu Ren
FAtt
10
1
0
12 Jul 2020
A Sequential Self Teaching Approach for Improving Generalization in Sound Event Recognition
Anurag Kumar
V. Ithapu
19
35
0
30 Jun 2020
Implicit Neural Representations with Periodic Activation Functions
Vincent Sitzmann
Julien N. P. Martel
Alexander W. Bergman
David B. Lindell
Gordon Wetzstein
AI4TS
47
2,486
0
17 Jun 2020
Visual Attention for Musical Instrument Recognition
Karn N. Watcharasupat
Francesco Ferroni
Alexander Lerch
19
3
0
17 Jun 2020
Telling Left from Right: Learning Spatial Correspondence of Sight and Sound
Karren D. Yang
Bryan C. Russell
Justin Salamon
SSL
24
75
0
11 Jun 2020
Exploring Automatic Diagnosis of COVID-19 from Crowdsourced Respiratory Sound Data
Chloë Brown
Jagmohan Chauhan
Andreas Grammenos
Jing Han
Apinan Hasthanasombat
Dimitris Spathis
Tong Xia
Pietro Cicuta
Cecilia Mascolo
36
411
0
10 Jun 2020
Toward Automated Classroom Observation: Multimodal Machine Learning to Estimate CLASS Positive Climate and Negative Climate
Anand Ramakrishnan
Brian Zylich
Erin Ottmar
Jennifer LoCasale-Crouch
Jacob Whitehill
24
25
0
19 May 2020
Learning to Segment Actions from Observation and Narration
Daniel Fried
Jean-Baptiste Alayrac
Phil Blunsom
Chris Dyer
S. Clark
Aida Nematzadeh
33
31
0
07 May 2020
Addressing Missing Labels in Large-Scale Sound Event Recognition Using a Teacher-Student Framework With Loss Masking
Eduardo Fonseca
Shawn Hershey
Manoj Plakal
D. Ellis
A. Jansen
R. C. Moore
Xavier Serra
NoLa
25
23
0
02 May 2020
Deep Neural Network for Respiratory Sound Classification in Wearable Devices Enabled by Patient Specific Model Tuning
Jyotibdha Acharya
A. Basu
19
157
0
16 Apr 2020
Semi-supervised acoustic modelling for five-lingual code-switched ASR using automatically-segmented soap opera speech
N. Wilkinson
A. Biswas
Emre Yilmaz
Febe de Wet
Ewald van der Westhuizen
T. Niesler
25
10
0
08 Apr 2020
Attribution in Scale and Space
Shawn Xu
Subhashini Venugopalan
Mukund Sundararajan
FAtt
BDL
9
71
0
03 Apr 2020
Can Machine Learning Be Used to Recognize and Diagnose Coughs?
Charles Bales
Muhammad Nabeel
Charles N. John
Usama Masood
Haneya N. Qureshi
Hasan Farooq
Iryna Posokhova
A. Imran
9
41
0
01 Apr 2020
Multi-modal Dense Video Captioning
Vladimir E. Iashin
Esa Rahtu
22
164
0
17 Mar 2020
On Translation Invariance in CNNs: Convolutional Layers can Exploit Absolute Spatial Location
O. Kayhan
Jan van Gemert
211
232
0
16 Mar 2020
An Open-set Recognition and Few-Shot Learning Dataset for Audio Event Classification in Domestic Environments
Javier Naranjo-Alcazar
Sergi Perez-Castanos
P. Zuccarello
Ana M. Torres
Jose J. Lopez
Francesc J. Ferri
M. Cobos
23
15
0
26 Feb 2020
Towards Learning a Universal Non-Semantic Representation of Speech
Joel Shor
A. Jansen
Ronnie Maor
Oran Lang
Omry Tuval
Félix de Chaumont Quitry
Marco Tagliasacchi
Ira Shavitt
Dotan Emanuel
Yinnon A. Haviv
SSL
44
155
0
25 Feb 2020
Sound Event Detection by Multitask Learning of Sound Events and Scenes with Soft Scene Labels
Keisuke Imoto
Noriyuki Tonami
Yuma Koizumi
Masahiro Yasuda
Ryosuke Yamanishi
Y. Yamashita
19
37
0
14 Feb 2020
Limitations of weak labels for embedding and tagging
Nicolas Turpault
Romain Serizel
Emmanuel Vincent
18
9
0
05 Feb 2020
Compact recurrent neural networks for acoustic event detection on low-energy low-complexity platforms
G. Cerutti
Rahul Prasad
A. Brutti
Elisabetta Farella
21
47
0
29 Jan 2020
Audiovisual SlowFast Networks for Video Recognition
Fanyi Xiao
Yong Jae Lee
Kristen Grauman
Jitendra Malik
Christoph Feichtenhofer
197
207
0
23 Jan 2020
Neural Architecture Search on Acoustic Scene Classification
Jixiang Li
Chuming Liang
Bo-Wen Zhang
Zhao Wang
Fei Xiang
Xiangxiang Chu
26
15
0
30 Dec 2019
Leveraging Topics and Audio Features with Multimodal Attention for Audio Visual Scene-Aware Dialog
Shachi H. Kumar
Eda Okur
Saurav Sahay
Jonathan Huang
L. Nachman
8
7
0
20 Dec 2019
Predominant Musical Instrument Classification based on Spectral Features
Karthikeya Racharla
Vineet Kumar
Chaudhari Bhushan Jayant
Ankit Khairkar
P. Harish
8
19
0
30 Nov 2019
Scene-Aware Audio Rendering via Deep Acoustic Analysis
Zhenyu Tang
Nicholas J. Bryan
Dingzeyu Li
Timothy R. Langlois
Tianyi Zhou
33
40
0
14 Nov 2019
RandAugment: Practical automated data augmentation with a reduced search space
E. D. Cubuk
Barret Zoph
Jonathon Shlens
Quoc V. Le
MQ
96
3,416
0
30 Sep 2019
Multimodal Deep Models for Predicting Affective Responses Evoked by Movies
Ha Thi Phuong Thao
Dorien Herremans
Gemma Roig
31
16
0
16 Sep 2019
Receptive-field-regularized CNN variants for acoustic scene classification
Khaled Koutini
Hamid Eghbalzadeh
Gerhard Widmer
24
29
0
05 Sep 2019
AI for Earth: Rainforest Conservation by Acoustic Surveillance
Yuan Liu
Zhongwei Cheng
Jie Liu
Bourhan Yassin
Zhe Nan
Jiebo Luo
11
5
0
20 Aug 2019
Use What You Have: Video Retrieval Using Representations From Collaborative Experts
Yang Liu
Samuel Albanie
Arsha Nagrani
Andrew Zisserman
36
387
0
31 Jul 2019
Sub-band Convolutional Neural Networks for Small-footprint Spoken Term Classification
Chieh-Chi Kao
Ming Sun
Yixin Gao
S. Vitaladevuni
Chao Wang
21
13
0
02 Jul 2019
On the performance of residual block design alternatives in convolutional neural networks for end-to-end audio classification
Javier Naranjo-Alcazar
Sergi Perez-Castanos
Irene Martín-Morató
P. Zuccarello
M. Cobos
6
7
0
26 Jun 2019
Specifying Weight Priors in Bayesian Deep Neural Networks with Empirical Bayes
R. Krishnan
Mahesh Subedar
Omesh Tickoo
BDL
20
46
0
12 Jun 2019
Learning Individual Styles of Conversational Gesture
Shiry Ginosar
Amir Bar
Gefen Kohavi
Caroline Chan
Andrew Owens
Jitendra Malik
SLR
18
326
0
10 Jun 2019
ET-GAN: Cross-Language Emotion Transfer Based on Cycle-Consistent Generative Adversarial Networks
Xiaoqi Jia
Jianwei Tai
Hang Zhou
Yakai Li
Weijuan Zhang
Haichao Du
Qingjia Huang
GAN
22
6
0
27 May 2019
Machine learning in acoustics: theory and applications
Michael J. Bianco
Peter Gerstoft
James Traer
Emma Ozanich
M. Roch
Sharon Gannot
Charles-Alban Deledalle
AI4CE
28
376
0
11 May 2019
Joint Analysis of Acoustic Events and Scenes Based on Multitask Learning
Noriyuki Tonami
Keisuke Imoto
M. Niitsuma
Ryosuke Yamanishi
Y. Yamashita
14
13
0
27 Apr 2019
End-to-End Environmental Sound Classification using a 1D Convolutional Neural Network
Sajjad Abdoli
P. Cardinal
Alessandro Lameiras Koerich
34
270
0
18 Apr 2019
A Simple Baseline for Audio-Visual Scene-Aware Dialog
Idan Schwartz
A. Schwing
Tamir Hazan
24
69
0
11 Apr 2019
Multiscale CNN based Deep Metric Learning for Bioacoustic Classification: Overcoming Training Data Scarcity Using Dynamic Triplet Loss
Anshul Thakur
Daksh Thapar
Padmanabhan Rajan
A. Nigam
22
36
0
26 Mar 2019
Weakly Labelled AudioSet Tagging with Attention Neural Networks
Qiuqiang Kong
Changsong Yu
Turab Iqbal
Yong-mei Xu
Wenwu Wang
Mark D. Plumbley
NoLa
24
78
0
02 Mar 2019
Audio Caption: Listen and Tell
Mengyue Wu
Heinrich Dinkel
Kai Yu
22
61
0
25 Feb 2019