ResearchTrend.AI
arXiv:2211.05172
Speech separation with large-scale self-supervised learning
9 November 2022
Zhuo Chen
Naoyuki Kanda
Jian Wu
Yu-Huan Wu
Xiaofei Wang
Takuya Yoshioka
Jinyu Li
S. Sivasankaran
Sefik Emre Eskimez

Papers citing "Speech separation with large-scale self-supervised learning"

TF-GridNet: Making Time-Frequency Domain Models Great Again for Monaural Speaker Separation
Zhong-Qiu Wang
Samuele Cornell
Shukjae Choi
Younglo Lee
Byeonghak Kim
Shinji Watanabe
127
108
0
08 Sep 2022
Why does Self-Supervised Learning for Speech Recognition Benefit Speaker Recognition?
Sanyuan Chen
Yu Wu
Chengyi Wang
Shujie Liu
Zhuo Chen
...
Gang Liu
Jinyu Li
Jian Wu
Xiangzhan Yu
Furu Wei
SSL
93
42
0
27 Apr 2022
Leveraging Real Conversational Data for Multi-Channel Continuous Speech Separation
Xiaofei Wang
Dongmei Wang
Naoyuki Kanda
Sefik Emre Eskimez
Takuya Yoshioka
86
8
0
07 Apr 2022
SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities
Hsiang-Sheng Tsai
Heng-Jui Chang
Wen-Chin Huang
Zili Huang
Kushal Lakhotia
...
Hsuan-Jui Chen
Shang-Wen Li
Shinji Watanabe
Abdel-rahman Mohamed
Hung-yi Lee
91
111
0
14 Mar 2022
RemixIT: Continual self-training of speech enhancement models via bootstrapped remixing
Efthymios Tzinis
Yossi Adi
V. Ithapu
Buye Xu
Paris Smaragdis
Anurag Kumar
CLL
70
54
0
17 Feb 2022
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing
Sanyuan Chen
Chengyi Wang
Zhengyang Chen
Yu-Huan Wu
Shujie Liu
...
Yao Qian
Jian Wu
Michael Zeng
Xiangzhan Yu
Furu Wei
SSL
271
1,905
0
26 Oct 2021
Adapting Speech Separation to Real-World Meetings Using Mixture Invariant Training
Aswin Sivaraman
Scott Wisdom
Hakan Erdogan
J. Hershey
44
22
0
20 Oct 2021
Large-scale Self-Supervised Speech Representation Learning for Automatic Speaker Verification
Zhengyang Chen
Sanyuan Chen
Yu-Huan Wu
Yao Qian
Chengyi Wang
Shujie Liu
Y. Qian
Michael Zeng
SSL
77
129
0
12 Oct 2021
Investigation of Practical Aspects of Single Channel Speech Separation for ASR
Jian Wu
Zhuo Chen
Sanyuan Chen
Yu-Huan Wu
Takuya Yoshioka
Naoyuki Kanda
Shujie Liu
Jinyu Li
63
17
0
05 Jul 2021
XLM-E: Cross-lingual Language Model Pre-training via ELECTRA
Zewen Chi
Shaohan Huang
Li Dong
Shuming Ma
Bo Zheng
...
Payal Bajaj
Xia Song
Xian-Ling Mao
Heyan Huang
Furu Wei
97
120
0
30 Jun 2021
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units
Wei-Ning Hsu
Benjamin Bolte
Yao-Hung Hubert Tsai
Kushal Lakhotia
Ruslan Salakhutdinov
Abdel-rahman Mohamed
SSL
184
3,004
0
14 Jun 2021
Large-Scale Pre-Training of End-to-End Multi-Talker ASR for Meeting Transcription with Single Distant Microphone
Naoyuki Kanda
Guoli Ye
Yu-Huan Wu
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Zhuo Chen
Takuya Yoshioka
94
42
0
31 Mar 2021
Attention is All You Need in Speech Separation
Cem Subakan
Mirco Ravanelli
Samuele Cornell
Mirko Bronzi
Jianyuan Zhong
97
565
0
25 Oct 2020
Continuous Speech Separation with Conformer
Sanyuan Chen
Yu-Huan Wu
Zhuo Chen
Jian Wu
Jinyu Li
Takuya Yoshioka
Chengyi Wang
Shujie Liu
M. Zhou
68
130
0
13 Aug 2020
The INTERSPEECH 2020 Deep Noise Suppression Challenge: Datasets, Subjective Testing Framework, and Challenge Results
Chandan K. A. Reddy
Vishak Gopal
Ross Cutler
Ebrahim Beyrami
R. Cheng
...
A. Aazami
Sebastian Braun
Puneet Rana
Sriram Srinivasan
J. Gehrke
96
318
0
16 May 2020
Common Voice: A Massively-Multilingual Speech Corpus
Rosana Ardila
Megan Branson
Kelly Davis
Michael Henretty
M. Kohler
Josh Meyer
Reuben Morais
Lindsay Saunders
Francis M. Tyers
Gregor Weber
VLM
96
1,620
0
13 Dec 2019
Advances in Online Audio-Visual Meeting Transcription
Takuya Yoshioka
Igor Abramovski
Cem Aksoylar
Zhuo Chen
Moshe David
...
Huaming Wang
Zhenghao Wang
Jun Zhang
Yong Zhao
Tianyan Zhou
87
75
0
10 Dec 2019
Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation
Yi Luo
Zhuo Chen
Takuya Yoshioka
AI4TS
98
775
0
14 Oct 2019
Low-Latency Speaker-Independent Continuous Speech Separation
Takuya Yoshioka
Zhuo Chen
Changliang Liu
Xiong Xiao
Hakan Erdogan
Dimitrios Dimitriadis
BDL
VLM
30
28
0
13 Apr 2019
Recognizing Overlapped Speech in Meetings: A Multichannel Separation Approach Using Neural Networks
Takuya Yoshioka
Hakan Erdogan
Zhuo Chen
Xiong Xiao
F. Alleva
BDL
68
82
0
08 Oct 2018
Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation
Yi Luo
N. Mesgarani
171
1,796
0
20 Sep 2018
VoxCeleb2: Deep Speaker Recognition
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
356
2,287
0
14 Jun 2018
VoxCeleb: a large-scale speaker identification dataset
Arsha Nagrani
Joon Son Chung
Andrew Zisserman
129
2,287
0
26 Jun 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
808
132,725
0
12 Jun 2017
Permutation Invariant Training of Deep Models for Speaker-Independent Multi-talker Speech Separation
Dong Yu
Morten Kolbæk
Zheng-Hua Tan
Jesper Jensen
103
859
0
01 Jul 2016
Deep clustering: Discriminative embeddings for segmentation and separation
J. Hershey
Zhuo Chen
Jonathan Le Roux
Shinji Watanabe
64
1,321
0
18 Aug 2015