arXiv:2205.12007
PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit
20 May 2022
Hui Zhang
Tian Yuan
Junkun Chen
Xintong Li
Renjie Zheng
Yuxin Huang
Xiaojie Chen
Enlei Gong
Zeyu Chen
Xiaoguang Hu
Dianhai Yu
Yanjun Ma
Liang Huang
AuLLM
Github (11936★)
Papers cited by "PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit"
36 / 36 papers shown
SpeechBrain: A General-Purpose Speech Toolkit
Mirco Ravanelli
Titouan Parcollet
Peter William VanHarn Plantinga
Aku Rouhe
Samuele Cornell
...
William Aris
Hwidong Na
Yan Gao
R. Mori
Yoshua Bengio
83
767
0
08 Jun 2021
AST: Audio Spectrogram Transformer
Yuan Gong
Yu-An Chung
James R. Glass
ViT
125
865
0
05 Apr 2021
WeNet: Production oriented Streaming and Non-streaming End-to-End Speech Recognition Toolkit
Zhuoyuan Yao
Di Wu
Xiong Wang
Binbin Zhang
Fan Yu
Chao Yang
Zhendong Peng
Xiaoyu Chen
Lei Xie
X. Lei
73
267
0
02 Feb 2021
Automatic punctuation restoration with BERT models
A. Nagy
Bence Bial
Judit Ács
42
25
0
18 Jan 2021
NeurST: Neural Speech Translation Toolkit
Chengqi Zhao
Mingxuan Wang
Qianqian Dong
Rong Ye
Lei Li
68
32
0
18 Dec 2020
Unified Streaming and Non-streaming Two-pass End-to-end Model for Speech Recognition
Binbin Zhang
Di Wu
Zhuoyuan Yao
Xiong Wang
F. Yu
Chao Yang
Liyong Guo
Yaguang Hu
Lei Xie
X. Lei
76
81
0
10 Dec 2020
StyleMelGAN: An Efficient High-Fidelity Adversarial Vocoder with Temporal Adaptive Normalization
Ahmed Mustafa
N. Pia
Guillaume Fuchs
64
73
0
03 Nov 2020
AISHELL-3: A Multi-speaker Mandarin TTS Corpus and the Baselines
Yao Shi
Hui Bu
Xin Xu
Shaojing Zhang
Ming Li
83
222
0
22 Oct 2020
Fluent and Low-latency Simultaneous Speech-to-Speech Translation with Self-adaptive Training
Renjie Zheng
Mingbo Ma
Baigong Zheng
Kaibo Liu
Jiahong Yuan
Kenneth Church
Liang Huang
43
14
0
20 Oct 2020
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Jungil Kong
Jaehyeon Kim
Jaekyoung Bae
177
1,936
0
12 Oct 2020
fairseq S2T: Fast Speech-to-Text Modeling with fairseq
Changhan Wang
Yun Tang
Xutai Ma
Anne Wu
Sravya Popuri
Dmytro Okhonko
J. Pino
VLM
LRM
75
271
0
11 Oct 2020
SpeedySpeech: Efficient Neural Speech Synthesis
Jan Vainer
Ondrej Dusek
48
43
0
09 Aug 2020
FastPitch: Parallel Text-to-speech with Pitch Prediction
Adrian Łańcucki
75
340
0
11 Jun 2020
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
Yi Ren
Chenxu Hu
Xu Tan
Tao Qin
Sheng Zhao
Zhou Zhao
Tie-Yan Liu
105
1,396
0
08 Jun 2020
Multi-band MelGAN: Faster Waveform Generation for High-Quality Text-to-Speech
Geng Yang
Shan Yang
Kai-Chun Liu
Peng Fang
Wei Chen
Lei Xie
127
199
0
11 May 2020
ESPnet-ST: All-in-One Speech Translation Toolkit
Hirofumi Inaguma
Shun Kiyono
Kevin Duh
Shigeki Karita
Nelson Yalta
Tomoki Hayashi
Shinji Watanabe
89
165
0
21 Apr 2020
PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition
Qiuqiang Kong
Yin Cao
Turab Iqbal
Yuxuan Wang
Wenwu Wang
Mark D. Plumbley
VLM
SSL
192
1,082
0
21 Dec 2019
PyTorch: An Imperative Style, High-Performance Deep Learning Library
Adam Paszke
Sam Gross
Francisco Massa
Adam Lerer
James Bradbury
...
Sasank Chilamkurthy
Benoit Steiner
Lu Fang
Junjie Bai
Soumith Chintala
ODL
520
42,449
0
03 Dec 2019
WaveFlow: A Compact Flow-based Model for Raw Audio
Wei Ping
Kainan Peng
Kexin Zhao
Z. Song
75
117
0
03 Dec 2019
Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram
Ryuichi Yamamoto
Eunwoo Song
Jae-Min Kim
58
818
0
25 Oct 2019
MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis
Kundan Kumar
Rithesh Kumar
T. Boissière
L. Gestin
Wei Zhen Teoh
Jose M. R. Sotelo
A. D. Brébisson
Yoshua Bengio
Aaron Courville
GAN
161
953
0
08 Oct 2019
Learning Alignment for Multimodal Emotion Recognition from Speech
Haiyang Xu
Hui Zhang
Kun Han
Yun Wang
Yiping Peng
Xiangang Li
46
123
0
06 Sep 2019
DELTA: A DEep learning based Language Technology plAtform
Kun Han
Junwen Chen
Hui Zhang
Haiyang Xu
Yiping Peng
...
Cheng Gong
Yunbo Wang
Wei Zou
Hui Song
Xiangang Li
VLM
18
10
0
02 Aug 2019
ERNIE: Enhanced Representation through Knowledge Integration
Yu Sun
Shuohuan Wang
Yukun Li
Shikun Feng
Xuyi Chen
Han Zhang
Xin Tian
Danxiang Zhu
Hao Tian
Hua Wu
121
901
0
19 Apr 2019
Cross-task learning for audio tagging, sound event detection and spatial localization: DCASE 2019 baseline systems
Qiuqiang Kong
Yin Cao
Turab Iqbal
Yong-mei Xu
Wenwu Wang
Mark D. Plumbley
AI4TS
122
77
0
11 Apr 2019
Deep Segment Attentive Embedding for Duration Robust Speaker Verification
Bin Liu
Shuai Nie
Yaping Zhang
Shan Liang
Wenju Liu
47
4
0
01 Nov 2018
SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing
Taku Kudo
John Richardson
196
3,520
0
19 Aug 2018
Automatic acoustic detection of birds through deep learning: the first Bird Audio Detection challenge
D. Stowell
Y. Stylianou
Mike Wood
H. Pamula
H. Glotin
77
310
0
16 Jul 2018
Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition
Pete Warden
86
1,619
0
09 Apr 2018
ESPnet: End-to-End Speech Processing Toolkit
Shinji Watanabe
Takaaki Hori
Shigeki Karita
Tomoki Hayashi
Jiro Nishitoba
...
Jahn Heymann
Sanjeev Khudanpur
Nanxin Chen
Adithya Renduchintala
Tsubasa Ochiai
VLM
109
1,507
0
30 Mar 2018
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
Jonathan Shen
Ruoming Pang
Ron J. Weiss
M. Schuster
Navdeep Jaitly
...
Yuxuan Wang
RJ Skerry-Ryan
Rif A. Saurous
Yannis Agiomyrgiannakis
Yonghui Wu
79
2,698
0
16 Dec 2017
AISHELL-1: An Open-Source Mandarin Speech Corpus and A Speech Recognition Baseline
Hui Bu
Jiayu Du
Xingyu Na
Bengu Wu
Hao Zheng
CVBM
64
841
0
16 Sep 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
713
131,652
0
12 Jun 2017
Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders
Jesse Engel
Cinjon Resnick
Adam Roberts
Sander Dieleman
Douglas Eck
Karen Simonyan
Mohammad Norouzi
118
627
0
05 Apr 2017
TensorFlow: A system for large-scale machine learning
Martín Abadi
P. Barham
Jianmin Chen
Zhiwen Chen
Andy Davis
...
Vijay Vasudevan
Pete Warden
Martin Wicke
Yuan Yu
Xiaoqiang Zhang
GNN
AI4CE
433
18,361
0
27 May 2016
Deep Speech 2: End-to-End Speech Recognition in English and Mandarin
Dario Amodei
Rishita Anubhai
Eric Battenberg
Carl Case
Jared Casper
...
Chong-Jun Wang
Bo Xiao
Dani Yogatama
J. Zhan
Zhenyao Zhu
137
2,973
0
08 Dec 2015