Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1910.05453
Cited By
vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations
12 October 2019
Alexei Baevski
Steffen Schneider
Michael Auli
SSL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations"
39 / 189 papers shown
Title
Multi-Task Self-Supervised Pre-Training for Music Classification
Ho-Hsiang Wu
Chieh-Chi Kao
Qingming Tang
Ming Sun
Brian McFee
J. P. Bello
Chao Wang
SSL
39
37
0
05 Feb 2021
General-Purpose Speech Representation Learning through a Self-Supervised Multi-Granularity Framework
Yucheng Zhao
Dacheng Yin
Chong Luo
Zhiyuan Zhao
Chuanxin Tang
Wenjun Zeng
Zhengjun Zha
SSL
11
6
0
03 Feb 2021
UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data
Chengyi Wang
Yu-Huan Wu
Yao Qian
K. Kumatani
Shujie Liu
Furu Wei
Michael Zeng
Xuedong Huang
OT
SSL
38
112
0
19 Jan 2021
Efficiently Fusing Pretrained Acoustic and Linguistic Encoders for Low-resource Speech Recognition
Cheng Yi
Shiyu Zhou
Bo Xu
51
40
0
17 Jan 2021
What all do audio transformer models hear? Probing Acoustic Representations for Language Delivery and its Structure
Jui Shah
Yaman Kumar Singla
Changyou Chen
R. Shah
33
81
0
02 Jan 2021
Reservoir Transformers
Sheng Shen
Alexei Baevski
Ari S. Morcos
Kurt Keutzer
Michael Auli
Douwe Kiela
35
17
0
30 Dec 2020
Applying Wav2vec2.0 to Speech Recognition in Various Low-resource Languages
Cheng Yi
Jianzhong Wang
Ning Cheng
Shiyu Zhou
Bo Xu
SSL
VLM
24
80
0
22 Dec 2020
Towards unsupervised phone and word segmentation using self-supervised vector-quantized neural networks
Herman Kamper
Benjamin van Niekerk
SSL
MQ
23
35
0
14 Dec 2020
Towards Semi-Supervised Semantics Understanding from Speech
Cheng-I Jeff Lai
Jin Cao
S. Bodapati
Shang-Wen Li
SSL
22
7
0
11 Nov 2020
Non-Autoregressive Predictive Coding for Learning Speech Representations from Local Dependencies
Alexander H. Liu
Yu-An Chung
James R. Glass
SSL
27
87
0
01 Nov 2020
Multitask Training with Text Data for End-to-End Speech Recognition
Peidong Wang
Tara N. Sainath
Ron J. Weiss
18
27
0
27 Oct 2020
Speech SIMCLR: Combining Contrastive and Reconstruction Objective for Self-supervised Speech Representation Learning
Dongwei Jiang
Wubo Li
Miao Cao
Wei Zou
Xiangang Li
SSL
27
65
0
27 Oct 2020
Two-stage Textual Knowledge Distillation for End-to-End Spoken Language Understanding
Seongbin Kim
Gyuwan Kim
Seongjin Shin
Sangmin Lee
VLM
18
19
0
25 Oct 2020
Probing Acoustic Representations for Phonetic Properties
Danni Ma
Neville Ryant
M. Liberman
25
45
0
25 Oct 2020
Multilingual Speech Translation with Efficient Finetuning of Pretrained Models
Xian Li
Changhan Wang
Yun Tang
C. Tran
Yuqing Tang
J. Pino
Alexei Baevski
Alexis Conneau
Michael Auli
21
6
0
24 Oct 2020
Any-to-One Sequence-to-Sequence Voice Conversion using Self-Supervised Discrete Speech Representations
Wen-Chin Huang
Yi-Chiao Wu
Tomoki Hayashi
T. Toda
BDL
57
37
0
23 Oct 2020
Similarity Analysis of Self-Supervised Speech Representations
Yu-An Chung
Yonatan Belinkov
James R. Glass
SSL
36
36
0
22 Oct 2020
Contrastive Learning of General-Purpose Audio Representations
Aaqib Saeed
David Grangier
Neil Zeghidour
VLM
SSL
24
262
0
21 Oct 2020
Contrastive Representation Learning: A Framework and Review
Phúc H. Lê Khắc
Graham Healy
Alan F. Smeaton
SSL
AI4TS
186
687
0
10 Oct 2020
Understanding Self-supervised Learning with Dual Deep Networks
Yuandong Tian
Lantao Yu
Xinlei Chen
Surya Ganguli
SSL
13
78
0
01 Oct 2020
Identity-Based Patterns in Deep Convolutional Networks: Generative Adversarial Phonology and Reduplication
Gašper Beguš
GAN
SSL
13
15
0
13 Sep 2020
A Primer on Motion Capture with Deep Learning: Principles, Pitfalls and Perspectives
Alexander Mathis
Steffen Schneider
Jessy Lauer
Mackenzie W. Mathis
35
165
0
01 Sep 2020
Jointly Fine-Tuning "BERT-like" Self Supervised Models to Improve Multimodal Speech Emotion Recognition
Shamane Siriwardhana
Andrew Reis
Rivindu Weerasekera
Suranga Nanayakkara
21
112
0
15 Aug 2020
Pretraining Techniques for Sequence-to-Sequence Voice Conversion
Wen-Chin Huang
Tomoki Hayashi
Yi-Chiao Wu
Hirokazu Kameoka
T. Toda
27
38
0
07 Aug 2020
Privacy-preserving Voice Analysis via Disentangled Representations
Ranya Aloufi
Hamed Haddadi
David E. Boyle
DRL
28
58
0
29 Jul 2020
TERA: Self-Supervised Learning of Transformer Encoder Representation for Speech
Andy T. Liu
Shang-Wen Li
Hung-yi Lee
SSL
62
356
0
12 Jul 2020
Unsupervised Cross-lingual Representation Learning for Speech Recognition
Alexis Conneau
Alexei Baevski
R. Collobert
Abdel-rahman Mohamed
Michael Auli
SSL
70
755
0
24 Jun 2020
Self-Supervised Representations Improve End-to-End Speech Translation
Anne Wu
Changhan Wang
J. Pino
Jiatao Gu
SSL
25
40
0
22 Jun 2020
High-Fidelity Audio Generation and Representation Learning with Guided Adversarial Autoencoder
Kazi Nazmul Haque
R. Rana
Björn W Schuller
DRL
28
12
0
01 Jun 2020
Vector-quantized neural networks for acoustic unit discovery in the ZeroSpeech 2020 challenge
Benjamin van Niekerk
Leanne Nortje
Herman Kamper
33
115
0
19 May 2020
Audio ALBERT: A Lite BERT for Self-supervised Learning of Audio Representation
Po-Han Chi
Pei-Hung Chung
Tsung-Han Wu
Chun-Cheng Hsieh
Yen-Hao Chen
Shang-Wen Li
Hung-yi Lee
SSL
9
147
0
18 May 2020
Vector-Quantized Autoregressive Predictive Coding
Yu-An Chung
Hao Tang
James R. Glass
SSL
19
114
0
17 May 2020
DiscreTalk: Text-to-Speech as a Machine Translation Problem
Tomoki Hayashi
Shinji Watanabe
27
32
0
12 May 2020
Vector Quantized Contrastive Predictive Coding for Template-based Music Generation
Gaëtan Hadjeres
Léopold Crestel
34
18
0
21 Apr 2020
Effectiveness of self-supervised pre-training for speech recognition
Alexei Baevski
Michael Auli
Abdel-rahman Mohamed
SSL
27
147
0
10 Nov 2019
Towards Unsupervised Speech Recognition and Synthesis with Quantized Speech Representation Learning
Alexander H. Liu
Tao Tu
Hung-yi Lee
Lin-Shan Lee
SSL
37
50
0
28 Oct 2019
Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders
Andy T. Liu
Shu-Wen Yang
Po-Han Chi
Po-Chun Hsu
Hung-yi Lee
SSL
45
372
0
25 Oct 2019
Self-supervised pre-training with acoustic configurations for replay spoofing detection
Hye-jin Shim
Hee-Soo Heo
Jee-weon Jung
Ha-Jin Yu
33
6
0
22 Oct 2019
BERTphone: Phonetically-Aware Encoder Representations for Utterance-Level Speaker and Language Recognition
Shaoshi Ling
Julian Salazar
Yuzong Liu
Katrin Kirchhoff
SSL
30
28
0
30 Jun 2019
Previous
1
2
3
4