ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1910.05453
  4. Cited By
vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations

vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations

12 October 2019
Alexei Baevski
Steffen Schneider
Michael Auli
    SSL
ArXivPDFHTML

Papers citing "vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations"

39 / 189 papers shown
Title
Multi-Task Self-Supervised Pre-Training for Music Classification
Multi-Task Self-Supervised Pre-Training for Music Classification
Ho-Hsiang Wu
Chieh-Chi Kao
Qingming Tang
Ming Sun
Brian McFee
J. P. Bello
Chao Wang
SSL
39
37
0
05 Feb 2021
General-Purpose Speech Representation Learning through a Self-Supervised
  Multi-Granularity Framework
General-Purpose Speech Representation Learning through a Self-Supervised Multi-Granularity Framework
Yucheng Zhao
Dacheng Yin
Chong Luo
Zhiyuan Zhao
Chuanxin Tang
Wenjun Zeng
Zhengjun Zha
SSL
11
6
0
03 Feb 2021
UniSpeech: Unified Speech Representation Learning with Labeled and
  Unlabeled Data
UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data
Chengyi Wang
Yu-Huan Wu
Yao Qian
K. Kumatani
Shujie Liu
Furu Wei
Michael Zeng
Xuedong Huang
OT
SSL
38
112
0
19 Jan 2021
Efficiently Fusing Pretrained Acoustic and Linguistic Encoders for
  Low-resource Speech Recognition
Efficiently Fusing Pretrained Acoustic and Linguistic Encoders for Low-resource Speech Recognition
Cheng Yi
Shiyu Zhou
Bo Xu
51
40
0
17 Jan 2021
What all do audio transformer models hear? Probing Acoustic
  Representations for Language Delivery and its Structure
What all do audio transformer models hear? Probing Acoustic Representations for Language Delivery and its Structure
Jui Shah
Yaman Kumar Singla
Changyou Chen
R. Shah
33
81
0
02 Jan 2021
Reservoir Transformers
Reservoir Transformers
Sheng Shen
Alexei Baevski
Ari S. Morcos
Kurt Keutzer
Michael Auli
Douwe Kiela
35
17
0
30 Dec 2020
Applying Wav2vec2.0 to Speech Recognition in Various Low-resource
  Languages
Applying Wav2vec2.0 to Speech Recognition in Various Low-resource Languages
Cheng Yi
Jianzhong Wang
Ning Cheng
Shiyu Zhou
Bo Xu
SSL
VLM
24
80
0
22 Dec 2020
Towards unsupervised phone and word segmentation using self-supervised
  vector-quantized neural networks
Towards unsupervised phone and word segmentation using self-supervised vector-quantized neural networks
Herman Kamper
Benjamin van Niekerk
SSL
MQ
23
35
0
14 Dec 2020
Towards Semi-Supervised Semantics Understanding from Speech
Towards Semi-Supervised Semantics Understanding from Speech
Cheng-I Jeff Lai
Jin Cao
S. Bodapati
Shang-Wen Li
SSL
22
7
0
11 Nov 2020
Non-Autoregressive Predictive Coding for Learning Speech Representations
  from Local Dependencies
Non-Autoregressive Predictive Coding for Learning Speech Representations from Local Dependencies
Alexander H. Liu
Yu-An Chung
James R. Glass
SSL
27
87
0
01 Nov 2020
Multitask Training with Text Data for End-to-End Speech Recognition
Multitask Training with Text Data for End-to-End Speech Recognition
Peidong Wang
Tara N. Sainath
Ron J. Weiss
18
27
0
27 Oct 2020
Speech SIMCLR: Combining Contrastive and Reconstruction Objective for
  Self-supervised Speech Representation Learning
Speech SIMCLR: Combining Contrastive and Reconstruction Objective for Self-supervised Speech Representation Learning
Dongwei Jiang
Wubo Li
Miao Cao
Wei Zou
Xiangang Li
SSL
27
65
0
27 Oct 2020
Two-stage Textual Knowledge Distillation for End-to-End Spoken Language
  Understanding
Two-stage Textual Knowledge Distillation for End-to-End Spoken Language Understanding
Seongbin Kim
Gyuwan Kim
Seongjin Shin
Sangmin Lee
VLM
18
19
0
25 Oct 2020
Probing Acoustic Representations for Phonetic Properties
Probing Acoustic Representations for Phonetic Properties
Danni Ma
Neville Ryant
M. Liberman
25
45
0
25 Oct 2020
Multilingual Speech Translation with Efficient Finetuning of Pretrained
  Models
Multilingual Speech Translation with Efficient Finetuning of Pretrained Models
Xian Li
Changhan Wang
Yun Tang
C. Tran
Yuqing Tang
J. Pino
Alexei Baevski
Alexis Conneau
Michael Auli
21
6
0
24 Oct 2020
Any-to-One Sequence-to-Sequence Voice Conversion using Self-Supervised
  Discrete Speech Representations
Any-to-One Sequence-to-Sequence Voice Conversion using Self-Supervised Discrete Speech Representations
Wen-Chin Huang
Yi-Chiao Wu
Tomoki Hayashi
T. Toda
BDL
57
37
0
23 Oct 2020
Similarity Analysis of Self-Supervised Speech Representations
Similarity Analysis of Self-Supervised Speech Representations
Yu-An Chung
Yonatan Belinkov
James R. Glass
SSL
36
36
0
22 Oct 2020
Contrastive Learning of General-Purpose Audio Representations
Contrastive Learning of General-Purpose Audio Representations
Aaqib Saeed
David Grangier
Neil Zeghidour
VLM
SSL
24
262
0
21 Oct 2020
Contrastive Representation Learning: A Framework and Review
Contrastive Representation Learning: A Framework and Review
Phúc H. Lê Khắc
Graham Healy
Alan F. Smeaton
SSL
AI4TS
186
687
0
10 Oct 2020
Understanding Self-supervised Learning with Dual Deep Networks
Understanding Self-supervised Learning with Dual Deep Networks
Yuandong Tian
Lantao Yu
Xinlei Chen
Surya Ganguli
SSL
13
78
0
01 Oct 2020
Identity-Based Patterns in Deep Convolutional Networks: Generative
  Adversarial Phonology and Reduplication
Identity-Based Patterns in Deep Convolutional Networks: Generative Adversarial Phonology and Reduplication
Gašper Beguš
GAN
SSL
13
15
0
13 Sep 2020
A Primer on Motion Capture with Deep Learning: Principles, Pitfalls and
  Perspectives
A Primer on Motion Capture with Deep Learning: Principles, Pitfalls and Perspectives
Alexander Mathis
Steffen Schneider
Jessy Lauer
Mackenzie W. Mathis
35
165
0
01 Sep 2020
Jointly Fine-Tuning "BERT-like" Self Supervised Models to Improve
  Multimodal Speech Emotion Recognition
Jointly Fine-Tuning "BERT-like" Self Supervised Models to Improve Multimodal Speech Emotion Recognition
Shamane Siriwardhana
Andrew Reis
Rivindu Weerasekera
Suranga Nanayakkara
21
112
0
15 Aug 2020
Pretraining Techniques for Sequence-to-Sequence Voice Conversion
Pretraining Techniques for Sequence-to-Sequence Voice Conversion
Wen-Chin Huang
Tomoki Hayashi
Yi-Chiao Wu
Hirokazu Kameoka
T. Toda
27
38
0
07 Aug 2020
Privacy-preserving Voice Analysis via Disentangled Representations
Privacy-preserving Voice Analysis via Disentangled Representations
Ranya Aloufi
Hamed Haddadi
David E. Boyle
DRL
28
58
0
29 Jul 2020
TERA: Self-Supervised Learning of Transformer Encoder Representation for
  Speech
TERA: Self-Supervised Learning of Transformer Encoder Representation for Speech
Andy T. Liu
Shang-Wen Li
Hung-yi Lee
SSL
62
356
0
12 Jul 2020
Unsupervised Cross-lingual Representation Learning for Speech
  Recognition
Unsupervised Cross-lingual Representation Learning for Speech Recognition
Alexis Conneau
Alexei Baevski
R. Collobert
Abdel-rahman Mohamed
Michael Auli
SSL
70
755
0
24 Jun 2020
Self-Supervised Representations Improve End-to-End Speech Translation
Self-Supervised Representations Improve End-to-End Speech Translation
Anne Wu
Changhan Wang
J. Pino
Jiatao Gu
SSL
25
40
0
22 Jun 2020
High-Fidelity Audio Generation and Representation Learning with Guided
  Adversarial Autoencoder
High-Fidelity Audio Generation and Representation Learning with Guided Adversarial Autoencoder
Kazi Nazmul Haque
R. Rana
Björn W Schuller
DRL
28
12
0
01 Jun 2020
Vector-quantized neural networks for acoustic unit discovery in the
  ZeroSpeech 2020 challenge
Vector-quantized neural networks for acoustic unit discovery in the ZeroSpeech 2020 challenge
Benjamin van Niekerk
Leanne Nortje
Herman Kamper
33
115
0
19 May 2020
Audio ALBERT: A Lite BERT for Self-supervised Learning of Audio
  Representation
Audio ALBERT: A Lite BERT for Self-supervised Learning of Audio Representation
Po-Han Chi
Pei-Hung Chung
Tsung-Han Wu
Chun-Cheng Hsieh
Yen-Hao Chen
Shang-Wen Li
Hung-yi Lee
SSL
9
147
0
18 May 2020
Vector-Quantized Autoregressive Predictive Coding
Vector-Quantized Autoregressive Predictive Coding
Yu-An Chung
Hao Tang
James R. Glass
SSL
19
114
0
17 May 2020
DiscreTalk: Text-to-Speech as a Machine Translation Problem
DiscreTalk: Text-to-Speech as a Machine Translation Problem
Tomoki Hayashi
Shinji Watanabe
27
32
0
12 May 2020
Vector Quantized Contrastive Predictive Coding for Template-based Music
  Generation
Vector Quantized Contrastive Predictive Coding for Template-based Music Generation
Gaëtan Hadjeres
Léopold Crestel
34
18
0
21 Apr 2020
Effectiveness of self-supervised pre-training for speech recognition
Effectiveness of self-supervised pre-training for speech recognition
Alexei Baevski
Michael Auli
Abdel-rahman Mohamed
SSL
27
147
0
10 Nov 2019
Towards Unsupervised Speech Recognition and Synthesis with Quantized
  Speech Representation Learning
Towards Unsupervised Speech Recognition and Synthesis with Quantized Speech Representation Learning
Alexander H. Liu
Tao Tu
Hung-yi Lee
Lin-Shan Lee
SSL
37
50
0
28 Oct 2019
Mockingjay: Unsupervised Speech Representation Learning with Deep
  Bidirectional Transformer Encoders
Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders
Andy T. Liu
Shu-Wen Yang
Po-Han Chi
Po-Chun Hsu
Hung-yi Lee
SSL
45
372
0
25 Oct 2019
Self-supervised pre-training with acoustic configurations for replay
  spoofing detection
Self-supervised pre-training with acoustic configurations for replay spoofing detection
Hye-jin Shim
Hee-Soo Heo
Jee-weon Jung
Ha-Jin Yu
33
6
0
22 Oct 2019
BERTphone: Phonetically-Aware Encoder Representations for
  Utterance-Level Speaker and Language Recognition
BERTphone: Phonetically-Aware Encoder Representations for Utterance-Level Speaker and Language Recognition
Shaoshi Ling
Julian Salazar
Yuzong Liu
Katrin Kirchhoff
SSL
30
28
0
30 Jun 2019
Previous
1234