Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2210.17098
Cited By
Structured State Space Decoder for Speech Recognition and Synthesis
31 October 2022
Koichi Miyazaki
Masato Murata
Tomoki Koriyama
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Structured State Space Decoder for Speech Recognition and Synthesis"
26 / 26 papers shown
Title
Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding
Yifan Peng
Siddharth Dalmia
Ian Lane
Shinji Watanabe
65
147
0
06 Jul 2022
It's Raw! Audio Generation with State-Space Models
Karan Goel
Albert Gu
Chris Donahue
Christopher Ré
55
191
0
20 Feb 2022
Efficiently Modeling Long Sequences with Structured State Spaces
Albert Gu
Karan Goel
Christopher Ré
205
1,773
0
31 Oct 2021
Combining Recurrent, Convolutional, and Continuous-time Models with Linear State-Space Layers
Albert Gu
Isys Johnson
Karan Goel
Khaled Kamal Saab
Tri Dao
Atri Rudra
Christopher Ré
112
595
0
26 Oct 2021
ESPnet2-TTS: Extending the Edge of TTS Research
Tomoki Hayashi
Ryuichi Yamamoto
Takenori Yoshimura
Peter Wu
Jiatong Shi
Takaaki Saeki
Yooncheol Ju
Yusuke Yasuda
Shinnosuke Takamichi
Shinji Watanabe
VLM
65
62
0
15 Oct 2021
SRU++: Pioneering Fast Recurrence with Attention for Speech Recognition
Jing Pan
Tao Lei
Kwangyoun Kim
Kyu Jeong Han
Shinji Watanabe
VLM
36
9
0
11 Oct 2021
A Comparative Study on Neural Architectures and Training Methods for Japanese Speech Recognition
Shigeki Karita
Yotaro Kubo
M. Bacchiani
Llion Jones
39
13
0
09 Jun 2021
Recent Developments on ESPnet Toolkit Boosted by Conformer
Pengcheng Guo
Florian Boyer
Xuankai Chang
Tomoki Hayashi
Yosuke Higuchi
...
Jing Shi
Shinji Watanabe
Kun Wei
Wangyou Zhang
Yuekai Zhang
63
263
0
26 Oct 2020
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Jungil Kong
Jaehyeon Kim
Jaekyoung Bae
177
1,931
0
12 Oct 2020
Big Bird: Transformers for Longer Sequences
Manzil Zaheer
Guru Guruganesh
Kumar Avinava Dubey
Joshua Ainslie
Chris Alberti
...
Philip Pham
Anirudh Ravula
Qifan Wang
Li Yang
Amr Ahmed
VLM
538
2,081
0
28 Jul 2020
Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention
Angelos Katharopoulos
Apoorv Vyas
Nikolaos Pappas
Franccois Fleuret
198
1,760
0
29 Jun 2020
Conformer: Convolution-augmented Transformer for Speech Recognition
Anmol Gulati
James Qin
Chung-Cheng Chiu
Niki Parmar
Yu Zhang
...
Wei Han
Shibo Wang
Zhengdong Zhang
Yonghui Wu
Ruoming Pang
218
3,130
0
16 May 2020
ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context
Wei Han
Zhengdong Zhang
Yu Zhang
Jiahui Yu
Chung-Cheng Chiu
James Qin
Anmol Gulati
Ruoming Pang
Yonghui Wu
61
263
0
07 May 2020
Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss
Qian Zhang
Han Lu
Hasim Sak
Anshuman Tripathi
Erik McDermott
Stephen Koo
Shankar Kumar
74
480
0
07 Feb 2020
A Comparative Study on Transformer vs RNN in Speech Applications
Shigeki Karita
Nanxin Chen
Tomoki Hayashi
Takaaki Hori
Hirofumi Inaguma
...
Ryuichi Yamamoto
Xiao-fei Wang
Shinji Watanabe
Takenori Yoshimura
Wangyou Zhang
68
720
0
13 Sep 2019
Robust Sequence-to-Sequence Acoustic Modeling with Stepwise Monotonic Attention for Neural TTS
Mutian He
Yan Deng
Lei He
56
81
0
03 Jun 2019
SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition
Daniel S. Park
William Chan
Yu Zhang
Chung-Cheng Chiu
Barret Zoph
E. D. Cubuk
Quoc V. Le
VLM
174
3,456
0
18 Apr 2019
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
Zihang Dai
Zhilin Yang
Yiming Yang
J. Carbonell
Quoc V. Le
Ruslan Salakhutdinov
VLM
216
3,726
0
09 Jan 2019
ESPnet: End-to-End Speech Processing Toolkit
Shinji Watanabe
Takaaki Hori
Shigeki Karita
Tomoki Hayashi
Jiro Nishitoba
...
Jahn Heymann
Sanjeev Khudanpur
Nanxin Chen
Adithya Renduchintala
Tsubasa Ochiai
VLM
102
1,504
0
30 Mar 2018
Self-Attention with Relative Position Representations
Peter Shaw
Jakob Uszkoreit
Ashish Vaswani
170
2,289
0
06 Mar 2018
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
Jonathan Shen
Ruoming Pang
Ron J. Weiss
M. Schuster
Navdeep Jaitly
...
Yuxuan Wang
RJ Skerry-Ryan
Rif A. Saurous
Yannis Agiomyrgiannakis
Yonghui Wu
77
2,697
0
16 Dec 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
687
131,526
0
12 Jun 2017
Language Modeling with Gated Convolutional Networks
Yann N. Dauphin
Angela Fan
Michael Auli
David Grangier
237
2,396
0
23 Dec 2016
Layer Normalization
Jimmy Lei Ba
J. Kiros
Geoffrey E. Hinton
407
10,481
0
21 Jul 2016
Deep Networks with Stochastic Depth
Gao Huang
Yu Sun
Zhuang Liu
Daniel Sedra
Kilian Q. Weinberger
209
2,356
0
30 Mar 2016
Listen, Attend and Spell
William Chan
Navdeep Jaitly
Quoc V. Le
Oriol Vinyals
RALM
153
2,266
0
05 Aug 2015
1