Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2406.12998
Cited By
Articulatory Encodec: Coding Speech through Vocal Tract Kinematics
18 June 2024
Cheol Jun Cho
Peter Wu
Tejas S. Prabhune
Dhruv Agarwal
Gopala K. Anumanchipalli
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Articulatory Encodec: Coding Speech through Vocal Tract Kinematics"
21 / 21 papers shown
Title
HierSpeech++: Bridging the Gap between Semantic and Acoustic Representation of Speech by Hierarchical Variational Inference for Zero-shot Speech Synthesis
Sang-Hoon Lee
Haram Choi
Seung-Bin Kim
Seong-Whan Lee
BDL
96
35
0
21 Nov 2023
SD-HuBERT: Sentence-Level Self-Distillation Induces Syllabic Organization in HuBERT
Cheol Jun Cho
Abdelrahman Mohamed
Shang-Wen Li
Alan W. Black
Gopala K. Anumanchipalli
77
9
0
16 Oct 2023
Improving Speech Inversion Through Self-Supervised Embeddings and Enhanced Tract Variables
Ahmed Adel Attia
Yashish M. Siriwardena
Carol Espy-Wilson
SSL
58
8
0
17 Sep 2023
QuickVC: Any-to-many Voice Conversion Using Inverse Short-time Fourier Transform for Faster Conversion
Houjian Guo
Chaoran Liu
C. Ishi
H. Ishiguro
BDL
63
12
0
16 Feb 2023
NANSY++: Unified Voice Synthesis with Neural Analysis and Synthesis
Hyeong-Seok Choi
Jinhyeok Yang
Juheon Lee
Hyeongju Kim
57
46
0
17 Nov 2022
The Secret Source : Incorporating Source Features to Improve Acoustic-to-Articulatory Speech Inversion
Yashish M. Siriwardena
C. Espy-Wilson
60
16
0
29 Oct 2022
Why does Self-Supervised Learning for Speech Recognition Benefit Speaker Recognition?
Sanyuan Chen
Yu Wu
Chengyi Wang
Shujie Liu
Zhuo Chen
...
Gang Liu
Jinyu Li
Jian Wu
Xiangzhan Yu
Furu Wei
SSL
78
42
0
27 Apr 2022
UTMOS: UTokyo-SaruLab System for VoiceMOS Challenge 2022
Takaaki Saeki
Detai Xin
Wataru Nakata
Tomoki Koriyama
Shinnosuke Takamichi
Hiroshi Saruwatari
90
207
0
05 Apr 2022
Deep Neural Convolutive Matrix Factorization for Articulatory Representation Decomposition
Jiachen Lian
A. Black
Louis Goldstein
Gopala Krishna Anumanchipalli
63
18
0
01 Apr 2022
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
Edresson Casanova
Julian Weber
C. Shulby
Arnaldo Cândido Júnior
Eren Golge
M. Ponti
221
408
0
04 Dec 2021
Neural Analysis and Synthesis: Reconstructing Speech from Self-Supervised Representations
Hyeong-Seok Choi
Juheon Lee
W. Kim
Jie Hwan Lee
Hoon Heo
Kyogu Lee
74
158
0
27 Oct 2021
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing
Sanyuan Chen
Chengyi Wang
Zhengyang Chen
Yu-Huan Wu
Shujie Liu
...
Yao Qian
Jian Wu
Micheal Zeng
Xiangzhan Yu
Furu Wei
SSL
239
1,857
0
26 Oct 2021
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units
Wei-Ning Hsu
Benjamin Bolte
Yao-Hung Hubert Tsai
Kushal Lakhotia
Ruslan Salakhutdinov
Abdel-rahman Mohamed
SSL
166
2,949
0
14 Jun 2021
Exploring wav2vec 2.0 on speaker verification and language identification
Zhiyun Fan
Meng Li
Shiyu Zhou
Bo Xu
133
203
0
11 Dec 2020
MLS: A Large-Scale Multilingual Dataset for Speech Research
Vineel Pratap
Qiantong Xu
Anuroop Sriram
Gabriel Synnaeve
R. Collobert
AuLLM
86
503
0
07 Dec 2020
AISHELL-3: A Multi-speaker Mandarin TTS Corpus and the Baselines
Yao Shi
Hui Bu
Xin Xu
Shaojing Zhang
Ming Li
70
222
0
22 Oct 2020
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Jungil Kong
Jaehyeon Kim
Jaekyoung Bae
177
1,931
0
12 Oct 2020
JVS corpus: free Japanese multi-speaker voice corpus
Shinnosuke Takamichi
Kentaro Mitsui
Yuki Saito
Tomoki Koriyama
Naoko Tanji
Hiroshi Saruwatari
40
71
0
17 Aug 2019
LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech
Heiga Zen
Viet Dang
R. Clark
Yu Zhang
Ron J. Weiss
Ye Jia
Zhiwen Chen
Yonghui Wu
104
951
0
05 Apr 2019
Neural Discrete Representation Learning
Aaron van den Oord
Oriol Vinyals
Koray Kavukcuoglu
BDL
SSL
OCL
226
5,008
0
02 Nov 2017
VoxCeleb: a large-scale speaker identification dataset
Arsha Nagrani
Joon Son Chung
Andrew Zisserman
122
2,273
0
26 Jun 2017
1