R-Spin: Efficient Speaker and Noise-invariant Representation Learning with Acoustic Pieces

15 November 2023

Papers citing "R-Spin: Efficient Speaker and Noise-invariant Representation Learning with Acoustic Pieces"

19 / 19 papers shown

Title
DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning Alexander H. Liu Heng-Jui Chang Michael Auli Wei-Ning Hsu James R. Glass 44 26 0 17 May 2023
Wav2code: Restore Clean Speech Representations via Codebook Lookup for Noise-Robust ASR Yuchen Hu Cheng Chen Qiu-shi Zhu Eng Siong Chng 106 16 0 11 Apr 2023
Robust Data2vec: Noise-robust Speech Representation Learning for ASR by Combining Regression and Improved Contrastive Learning Qiu-shi Zhu Long Zhou Jie Zhang Shujie Liu Yu-Chen Hu Lirong Dai VLM SSL 73 37 0 27 Oct 2022
Content-Context Factorized Representations for Automated Speech Recognition David M. Chan Shalini Ghosh 64 13 0 19 May 2022
Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages Felix Wu Kwangyoun Kim Shinji Watanabe Kyu Jeong Han Ryan T. McDonald Kilian Q. Weinberger Yoav Artzi SyDa 73 38 0 02 May 2022
SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities Hsiang-Sheng Tsai Heng-Jui Chang Wen-Chin Huang Zili Huang Kushal Lakhotia ... Hsuan-Jui Chen Shang-Wen Li Shinji Watanabe Abdel-rahman Mohamed Hung-yi Lee 83 111 0 14 Mar 2022
data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language Alexei Baevski Wei-Ning Hsu Qiantong Xu Arun Babu Jiatao Gu Michael Auli SSL VLM ViT 97 855 0 07 Feb 2022
Neural Analysis and Synthesis: Reconstructing Speech from Self-Supervised Representations Hyeong-Seok Choi Juheon Lee W. Kim Jie Hwan Lee Hoon Heo Kyogu Lee 74 158 0 27 Oct 2021
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing Sanyuan Chen Chengyi Wang Zhengyang Chen Yu-Huan Wu Shujie Liu ... Yao Qian Jian Wu Micheal Zeng Xiangzhan Yu Furu Wei SSL 239 1,857 0 26 Oct 2021
DistilHuBERT: Speech Representation Learning by Layer-wise Distillation of Hidden-unit BERT Heng-Jui Chang Shu-Wen Yang Hung-yi Lee SSL 82 174 0 05 Oct 2021
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units Wei-Ning Hsu Benjamin Bolte Yao-Hung Hubert Tsai Kushal Lakhotia Ruslan Salakhutdinov Abdel-rahman Mohamed SSL 166 2,949 0 14 Jun 2021
Barlow Twins: Self-Supervised Learning via Redundancy Reduction Jure Zbontar Li Jing Ishan Misra Yann LeCun Stéphane Deny SSL 300 2,344 0 04 Mar 2021
Unsupervised Learning of Disentangled Speech Content and Style Representation Andros Tjandra Ruoming Pang Yu Zhang Shigeki Karita BDL DRL 41 15 0 24 Oct 2020
Self-Supervised Contrastive Learning for Unsupervised Phoneme Segmentation Felix Kreuk Joseph Keshet Yossi Adi SSL 55 79 0 27 Jul 2020
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations Alexei Baevski Henry Zhou Abdel-rahman Mohamed Michael Auli SSL 282 5,790 0 20 Jun 2020
Unsupervised Learning of Visual Features by Contrasting Cluster Assignments Mathilde Caron Ishan Misra Julien Mairal Priya Goyal Piotr Bojanowski Armand Joulin OCL SSL 230 4,074 0 17 Jun 2020
Speech Model Pre-training for End-to-End Spoken Language Understanding Loren Lugosch Mirco Ravanelli Patrick Ignoto Vikrant Singh Tomar Yoshua Bengio SyDa AuLLM 65 352 0 07 Apr 2019
fairseq: A Fast, Extensible Toolkit for Sequence Modeling Myle Ott Sergey Edunov Alexei Baevski Angela Fan Sam Gross Nathan Ng David Grangier Michael Auli VLM FaML 105 3,149 0 01 Apr 2019
Neural Machine Translation of Rare Words with Subword Units Rico Sennrich Barry Haddow Alexandra Birch 215 7,735 0 31 Aug 2015