wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations

20 June 2020

Papers cited by "wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations"

Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech David Harwath Wei-Ning Hsu James R. Glass 69 84 0 21 Nov 2019
End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures Gabriel Synnaeve Qiantong Xu Jacob Kahn Tatiana Likhomanenko Edouard Grave Vineel Pratap Anuroop Sriram Vitaliy Liptchinsky R. Collobert SSL AI4TS 107 247 0 19 Nov 2019
Momentum Contrast for Unsupervised Visual Representation Learning Kaiming He Haoqi Fan Yuxin Wu Saining Xie Ross B. Girshick SSL 187 12,073 0 13 Nov 2019
Effectiveness of self-supervised pre-training for speech recognition Alexei Baevski Michael Auli Abdel-rahman Mohamed SSL 74 147 0 10 Nov 2019
Towards Unsupervised Speech Recognition and Synthesis with Quantized Speech Representation Learning Alexander H. Liu Tao Tu Hung-yi Lee Lin-Shan Lee SSL 64 50 0 28 Oct 2019
Improving Transformer-based Speech Recognition Using Unsupervised Pre-training Dongwei Jiang Xiaoning Lei Wubo Li Ne Luo Yuxuan Hu Wei Zou Xiangang Li 48 99 0 22 Oct 2019
vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations Alexei Baevski Steffen Schneider Michael Auli SSL 150 666 0 12 Oct 2019
Reducing Transformer Depth on Demand with Structured Dropout Angela Fan Edouard Grave Armand Joulin 113 592 0 25 Sep 2019
RoBERTa: A Robustly Optimized BERT Pretraining Approach Yinhan Liu Myle Ott Naman Goyal Jingfei Du Mandar Joshi Danqi Chen Omer Levy M. Lewis Luke Zettlemoyer Veselin Stoyanov AIMat 582 24,422 0 26 Jul 2019
Learning Representations by Maximizing Mutual Information Across Views Philip Bachman R. Devon Hjelm William Buchwalter SSL 189 1,472 0 03 Jun 2019
VQVAE Unsupervised Unit Discovery and Multi-scale Code2Spec Inverter for Zerospeech Challenge 2019 Andros Tjandra Berrak Sisman Mingyang Zhang S. Sakti Haizhou Li Satoshi Nakamura 68 71 0 27 May 2019
Data-Efficient Image Recognition with Contrastive Predictive Coding Olivier J. Hénaff A. Srinivas J. Fauw Ali Razavi Carl Doersch S. M. Ali Eslami Aaron van den Oord SSL 115 1,428 0 22 May 2019
RWTH ASR Systems for LibriSpeech: Hybrid vs Attention -- w/o Data Augmentation Christoph Luscher Eugen Beck Kazuki Irie M. Kitza Wilfried Michel Albert Zeyer Ralf Schluter Hermann Ney VLM 103 234 0 08 May 2019
Transformers with convolutional context for ASR Abdel-rahman Mohamed Dmytro Okhonko Luke Zettlemoyer 56 168 0 26 Apr 2019
SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition Daniel S. Park William Chan Yu Zhang Chung-Cheng Chiu Barret Zoph E. D. Cubuk Quoc V. Le VLM 174 3,455 0 18 Apr 2019
Unsupervised acoustic unit discovery for speech synthesis using discrete latent-variable neural networks Ryan Eloff A. Nortje Benjamin van Niekerk Avashna Govender Leanne Nortje Arnu Pretorius Elan Van Biljon Ewald van der Westhuizen Lisa van Staden Herman Kamper DRL 63 57 0 16 Apr 2019
An Unsupervised Autoregressive Model for Speech Representation Learning Yu-An Chung Wei-Ning Hsu Hao Tang James R. Glass SSL 74 408 0 05 Apr 2019
fairseq: A Fast, Extensible Toolkit for Sequence Modeling Myle Ott Sergey Edunov Alexei Baevski Angela Fan Sam Gross Nathan Ng David Grangier Michael Auli VLM FaML 97 3,150 0 01 Apr 2019
Pay Less Attention with Lightweight and Dynamic Convolutions Felix Wu Angela Fan Alexei Baevski Yann N. Dauphin Michael Auli 72 609 0 29 Jan 2019
Unsupervised speech representation learning using WaveNet autoencoders J. Chorowski Ron J. Weiss Samy Bengio Aaron van den Oord SSL 72 318 0 25 Jan 2019
wav2letter++: The Fastest Open-source Speech Recognition System Vineel Pratap Awni Y. Hannun Qiantong Xu Jeff Cai Jacob Kahn Gabriel Synnaeve Vitaliy Liptchinsky R. Collobert VLM 54 156 0 18 Dec 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding Jacob Devlin Ming-Wei Chang Kenton Lee Kristina Toutanova VLM SSL SSeg 1.7K 94,729 0 11 Oct 2018
Adaptive Input Representations for Neural Language Modeling Alexei Baevski Michael Auli 101 390 0 28 Sep 2018
Representation Learning with Contrastive Predictive Coding Aaron van den Oord Yazhe Li Oriol Vinyals DRL SSL 302 10,282 0 10 Jul 2018
The challenge of realistic music generation: modelling raw audio at scale Sander Dieleman Aaron van den Oord Karen Simonyan 86 185 0 26 Jun 2018
Scaling Neural Machine Translation Myle Ott Sergey Edunov David Grangier Michael Auli AIMat 172 614 0 01 Jun 2018
Light Gated Recurrent Units for Speech Recognition Mirco Ravanelli Philemon Brakel M. Omologo Yoshua Bengio 45 317 0 26 Mar 2018
Deep contextualized word representations Matthew E. Peters Mark Neumann Mohit Iyyer Matt Gardner Christopher Clark Kenton Lee Luke Zettlemoyer NAI 204 11,546 0 15 Feb 2018
Learning Filterbanks from Raw Speech for Phone Recognition Neil Zeghidour Nicolas Usunier Iasonas Kokkinos Thomas Schatz Gabriel Synnaeve Emmanuel Dupoux 64 120 0 03 Nov 2017
Neural Discrete Representation Learning Aaron van den Oord Oriol Vinyals Koray Kavukcuoglu BDL SSL OCL 220 5,004 0 02 Nov 2017
Attention Is All You Need Ashish Vaswani Noam M. Shazeer Niki Parmar Jakob Uszkoreit Llion Jones Aidan Gomez Lukasz Kaiser Illia Polosukhin 3DV 670 131,414 0 12 Jun 2017
Categorical Reparameterization with Gumbel-Softmax Eric Jang S. Gu Ben Poole BDL 303 5,364 0 03 Nov 2016
Layer Normalization Jimmy Lei Ba J. Kiros Geoffrey E. Hinton 386 10,481 0 21 Jul 2016
Gaussian Error Linear Units (GELUs) Dan Hendrycks Kevin Gimpel 169 5,001 0 27 Jun 2016
Deep Networks with Stochastic Depth Gao Huang Yu Sun Zhuang Liu Daniel Sedra Kilian Q. Weinberger 209 2,356 0 30 Mar 2016
Adam: A Method for Stochastic Optimization Diederik P. Kingma Jimmy Ba ODL 1.7K 150,006 0 22 Dec 2014
A* Sampling Chris J. Maddison Daniel Tarlow T. Minka 77 392 0 31 Oct 2014