2409.09357
Cited By
Joint Semantic Knowledge Distillation and Masked Acoustic Modeling for Full-band Speech Restoration with Improved Intelligibility
14 September 2024
Xiaoyu Liu
Xu Li
Joan Serrà
Santiago Pascual
Papers citing
"Joint Semantic Knowledge Distillation and Masked Acoustic Modeling for Full-band Speech Restoration with Improved Intelligibility"
17 / 17 papers shown
Title
Miipher-2: A Universal Speech Restoration Model for Million-Hour Scale Data Restoration
Shigeki Karita
Yuma Koizumi
Heiga Zen
Haruko Ishikawa
Robin Scheibler
M. Bacchiani
VLM
400
1
0
07 May 2025
AnyEnhance: A Unified Generative Model with Prompt-Guidance and Self-Critic for Voice Enhancement
Junan Zhang
Jing Yang
Zihao Fang
Yansen Wang
Zehua Zhang
Zhuo Wang
Fan Fan
Zhikai Wu
120
4
0
26 Jan 2025
Universal Score-based Speech Enhancement with High Content Preservation
Robin Scheibler
Yusuke Fujita
Yuma Shirahata
Tatsuya Komatsu
DiffM
78
15
0
18 Jun 2024
High-Fidelity Audio Compression with Improved RVQGAN
Rithesh Kumar
Prem Seetharaman
Alejandro Luebs
I. Kumar
Kundan Kumar
94
327
0
11 Jun 2023
DeepFilterNet: Perceptually Motivated Real-Time Speech Enhancement
Hendrik Schröter
T. Rosenkranz
Alberto N. Escalante
Andreas Maier
53
18
0
14 May 2023
MusicLM: Generating Music From Text
A. Agostinelli
Timo I. Denk
Zalan Borsos
Jesse Engel
Mauro Verzetti
...
Adam Roberts
Marco Tagliasacchi
Matthew Sharifi
Neil Zeghidour
Christian Frank
MGen
141
444
0
26 Jan 2023
Classifier-Free Diffusion Guidance
Jonathan Ho
Tim Salimans
FaML
193
3,898
0
26 Jul 2022
Universal Speech Enhancement with Score-based Diffusion
Joan Serrà
Santiago Pascual
Jordi Pons
R. O. Araz
D. Scaini
DiffM
90
105
0
07 Jun 2022
Fast Real-time Personalized Speech Enhancement: End-to-End Enhancement Network (E3Net) and Knowledge Distillation
Manthan Thakker
Sefik Emre Eskimez
Takuya Yoshioka
Huaming Wang
41
29
0
02 Apr 2022
MaskGIT: Masked Generative Image Transformer
Huiwen Chang
Han Zhang
Lu Jiang
Ce Liu
William T. Freeman
ViT
153
678
0
08 Feb 2022
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing
Sanyuan Chen
Chengyi Wang
Zhengyang Chen
Yu-Huan Wu
Shujie Liu
...
Yao Qian
Jian Wu
Michael Zeng
Xiangzhan Yu
Furu Wei
SSL
250
1,873
0
26 Oct 2021
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units
Wei-Ning Hsu
Benjamin Bolte
Yao-Hung Hubert Tsai
Kushal Lakhotia
Ruslan Salakhutdinov
Abdel-rahman Mohamed
SSL
180
2,966
0
14 Jun 2021
DNSMOS: A Non-Intrusive Perceptual Objective Speech Quality metric to evaluate Noise Suppressors
Chandan K. A. Reddy
Vishak Gopal
Ross Cutler
72
311
0
28 Oct 2020
SESQA: semi-supervised learning for speech quality assessment
Joan Serrà
Jordi Pons
Santiago Pascual
40
42
0
01 Oct 2020
Real Time Speech Enhancement in the Waveform Domain
Alexandre Défossez
Gabriel Synnaeve
Yossi Adi
78
462
0
23 Jun 2020
The INTERSPEECH 2020 Deep Noise Suppression Challenge: Datasets, Subjective Testing Framework, and Challenge Results
Chandan K. A. Reddy
Vishak Gopal
Ross Cutler
Ebrahim Beyrami
R. Cheng
...
A. Aazami
Sebastian Braun
Puneet Rana
Sriram Srinivasan
J. Gehrke
92
316
0
16 May 2020
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.8K
94,891
0
11 Oct 2018