Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2409.07841
Cited By
v1
v2
v3 (latest)
TSELM: Target Speaker Extraction using Discrete Tokens and Language Models
12 September 2024
Beilong Tang
Bang Zeng
Ming Li
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"TSELM: Target Speaker Extraction using Discrete Tokens and Language Models"
15 / 15 papers shown
Title
AnyEnhance: A Unified Generative Model with Prompt-Guidance and Self-Critic for Voice Enhancement
Junan Zhang
Jing Yang
Zihao Fang
Yansen Wang
Zehua Zhang
Zhuo Wang
Fan Fan
Zhikai Wu
125
4
0
26 Jan 2025
USEF-TSE: Universal Speaker Embedding Free Target Speaker Extraction
Bang Zeng
Ming Li
66
4
0
04 Sep 2024
TokenSplit: Using Discrete Speech Representations for Direct, Refined, and Transcript-Conditioned Speech Separation and Recognition
Hakan Erdogan
Scott Wisdom
Xuankai Chang
Zalan Borsos
Marco Tagliasacchi
Neil Zeghidour
J. Hershey
43
11
0
21 Aug 2023
Voice Conversion With Just Nearest Neighbors
Matthew Baas
Benjamin van Niekerk
Herman Kamper
SSL
93
60
0
30 May 2023
AudioGen: Textually Guided Audio Generation
Felix Kreuk
Gabriel Synnaeve
Adam Polyak
Uriel Singer
Alexandre Défossez
Jade Copet
Devi Parikh
Yaniv Taigman
Yossi Adi
DiffM
82
308
0
30 Sep 2022
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing
Sanyuan Chen
Chengyi Wang
Zhengyang Chen
Yu-Huan Wu
Shujie Liu
...
Yao Qian
Jian Wu
Micheal Zeng
Xiangzhan Yu
Furu Wei
SSL
252
1,873
0
26 Oct 2021
DNSMOS P.835: A Non-Intrusive Perceptual Objective Speech Quality Metric to Evaluate Noise Suppressors
Chandan K. A. Reddy
Vishak Gopal
Ross Cutler
83
218
0
05 Oct 2021
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units
Wei-Ning Hsu
Benjamin Bolte
Yao-Hung Hubert Tsai
Kushal Lakhotia
Ruslan Salakhutdinov
Abdel-rahman Mohamed
SSL
180
2,966
0
14 Jun 2021
SpeechBrain: A General-Purpose Speech Toolkit
Mirco Ravanelli
Titouan Parcollet
Peter William VanHarn Plantinga
Aku Rouhe
Samuele Cornell
...
William Aris
Hwidong Na
Yan Gao
R. Mori
Yoshua Bengio
83
767
0
08 Jun 2021
Attention is All You Need in Speech Separation
Cem Subakan
Mirco Ravanelli
Samuele Cornell
Mirko Bronzi
Jianyuan Zhong
95
557
0
25 Oct 2020
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
814
42,055
0
28 May 2020
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
445
20,181
0
23 Oct 2019
LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech
Heiga Zen
Viet Dang
R. Clark
Yu Zhang
Ron J. Weiss
Ye Jia
Zhiwen Chen
Yonghui Wu
104
954
0
05 Apr 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.8K
94,891
0
11 Oct 2018
Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation
Yi Luo
N. Mesgarani
159
1,787
0
20 Sep 2018
1