Self-supervised audio representation learning for mobile devices

24 May 2019

Marco Tagliasacchi

Beat Gfeller

Félix de Chaumont Quitry

Papers citing "Self-supervised audio representation learning for mobile devices"

Learning Problem-agnostic Speech Representations from Multiple Self-supervised Tasks Santiago Pascual Mirco Ravanelli Joan Serrà Antonio Bonafonte Yoshua Bengio SSL 113 251 0 06 Apr 2019
Towards Federated Learning at Scale: System Design Keith Bonawitz Hubert Eichner W. Grieskamp Dzmitry Huba A. Ingerman ... H. B. McMahan Timon Van Overveldt David Petrou Daniel Ramage Jason Roselander FedML 121 2,660 0 04 Feb 2019
Automatic acoustic detection of birds through deep learning: the first Bird Audio Detection challenge D. Stowell Y. Stylianou Mike Wood H. Pamula H. Glotin 71 310 0 16 Jul 2018
Representation Learning with Contrastive Predictive Coding Aaron van den Oord Yazhe Li Oriol Vinyals DRL SSL 298 10,253 0 10 Jul 2018
Cooperative Learning of Audio and Video Models from Self-Supervised Synchronization Bruno Korbar Du Tran Lorenzo Torresani 95 475 0 30 Jun 2018
Unsupervised Cross-Modal Alignment of Speech and Text Embedding Spaces Yu-An Chung W. Weng S. Tong James R. Glass 71 100 0 18 May 2018
Audio-Visual Scene Analysis with Self-Supervised Multisensory Features Andrew Owens Alexei A. Efros SSL 89 748 0 10 Apr 2018
Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition Pete Warden 74 1,615 0 09 Apr 2018
Learning to Separate Object Sounds by Watching Unlabeled Video Ruohan Gao Rogerio Feris Kristen Grauman SSL 63 284 0 05 Apr 2018
Speech2Vec: A Sequence-to-Sequence Framework for Learning Word Embeddings from Speech Yu-An Chung James R. Glass 3DV 64 184 0 23 Mar 2018
Unsupervised Representation Learning by Predicting Image Rotations Spyros Gidaris Praveer Singh N. Komodakis OOD SSL DRL 245 3,283 0 21 Mar 2018
The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks Jonathan Frankle Michael Carbin 219 3,457 0 09 Mar 2018
MobileNetV2: Inverted Residuals and Linear Bottlenecks Mark Sandler Andrew G. Howard Menglong Zhu A. Zhmoginov Liang-Chieh Chen 171 19,204 0 13 Jan 2018
Learning Sight from Sound: Ambient Sound Provides Supervision for Visual Learning Andrew Owens Jiajun Wu Josh H. McDermott William T. Freeman Antonio Torralba SSL 65 176 0 20 Dec 2017
Objects that Sound Relja Arandjelović Andrew Zisserman ObjD VOS 92 529 0 18 Dec 2017
Unsupervised Feature Learning for Audio Analysis Matthias Meyer J. Beutel Lothar Thiele SSL 39 18 0 11 Dec 2017
Now Playing: Continuous low-power music recognition Blaise Agüera y Arcas Beat Gfeller Ruiqi Guo Kevin Kilgour Sanjiv Kumar ... J. Odell Marvin Ritter Dominik Roblek Matthew Sharifi Mihajlo Velimirović MGen 38 35 0 29 Nov 2017
Unsupervised Learning of Semantic Audio Representations A. Jansen Manoj Plakal R. Pandya D. Ellis Shawn Hershey Jiayang Liu R. C. Moore Rif A. Saurous SSL 79 131 0 06 Nov 2017
Multi-task Self-Supervised Visual Learning Carl Doersch Andrew Zisserman SSL 75 631 0 25 Aug 2017
Learning and Evaluating Musical Features with Deep Autoencoders Mason Bretan Sageev Oore Douglas Eck Larry Heck 34 6 0 14 Jun 2017
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications Andrew G. Howard Menglong Zhu Bo Chen Dmitry Kalenichenko Weijun Wang Tobias Weyand M. Andreetto Hartwig Adam 3DH 1.1K 20,813 0 17 Apr 2017
Learning Features by Watching Objects Move Deepak Pathak Ross B. Girshick Piotr Dollár Trevor Darrell Bharath Hariharan SSL VOS OCL 67 525 0 19 Dec 2016
Self-Supervised Video Representation Learning With Odd-One-Out Networks Basura Fernando Hakan Bilen E. Gavves Stephen Gould SSL 42 450 0 21 Nov 2016
CNN Architectures for Large-Scale Audio Classification Shawn Hershey Sourish Chaudhuri D. Ellis J. Gemmeke A. Jansen ... Rif A. Saurous Bryan Seybold M. Slaney Ron J. Weiss K. Wilson 111 2,497 0 29 Sep 2016
Unsupervised Feature Learning Based on Deep Models for Environmental Audio Tagging Yong-mei Xu Qiang Huang Wenwu Wang Peter Foster Siddharth Sigtia Philip J. B. Jackson Mark D. Plumbley 49 79 0 13 Jul 2016
Context Encoders: Feature Learning by Inpainting Deepak Pathak Philipp Krahenbuhl Jeff Donahue Trevor Darrell Alexei A. Efros SSL 67 5,287 0 25 Apr 2016
Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles M. Noroozi Paolo Favaro SSL 157 2,980 0 30 Mar 2016
Colorful Image Colorization Richard Y. Zhang Phillip Isola Alexei A. Efros 124 3,530 0 28 Mar 2016
Audio Word2Vec: Unsupervised Learning of Audio Segment Representations using Sequence-to-sequence Autoencoder Yu-An Chung Chao-Chung Wu Chia-Hao Shen Hung-yi Lee Lin-Shan Lee AI4TS 60 182 0 03 Mar 2016
MUSAN: A Music, Speech, and Noise Corpus David Snyder Guoguo Chen Daniel Povey 75 1,346 0 28 Oct 2015
Listen, Attend and Spell William Chan Navdeep Jaitly Quoc V. Le Oriol Vinyals RALM 147 2,265 0 05 Aug 2015
Unsupervised Visual Representation Learning by Context Prediction Carl Doersch Abhinav Gupta Alexei A. Efros DRL SSL 164 2,782 0 19 May 2015
Efficient Estimation of Word Representations in Vector Space Tomas Mikolov Kai Chen G. Corrado J. Dean 3DV 637 31,469 0 16 Jan 2013