Online and Linear-Time Attention by Enforcing Monotonic Alignments

3 April 2017

Papers citing "Online and Linear-Time Attention by Enforcing Monotonic Alignments"

50 / 155 papers shown

Title
Decision Attentive Regularization to Improve Simultaneous Speech Translation Systems Mohd Abbas Zaidi Beomseok Lee Sangha Kim Chanwoo Kim 66 5 0 13 Oct 2021
Translating Images into Maps Avishkar Saha Oscar Alejandro Mendez Maldonado Chris Russell Richard Bowden ViT 103 148 0 03 Oct 2021
MeetDot: Videoconferencing with Live Translation Captions Arkady Arkhangorodsky Christopher Chu Scot Fang Yiqi Huang Denglin Jiang Ajay Nagesh Boliang Zhang Kevin Knight VLM 27 4 0 20 Sep 2021
On-device neural speech synthesis Sivanand Achanta Albert Antony L. Golipour Jiangchuan Li T. Raitio ... Francesco Rossi Jennifer Shi Jaimin Upadhyay David Winarsky Hepeng Zhang 118 17 0 17 Sep 2021
Universal Simultaneous Machine Translation with Mixture-of-Experts Wait-k Policy Shaolei Zhang Yang Feng MoE 95 43 0 11 Sep 2021
Infusing Future Information into Monotonic Attention Through Language Models Mohd Abbas Zaidi S. Indurthi Beomseok Lee Nikhil Kumar Lakumarapu Sangha Kim 60 2 0 07 Sep 2021
Sequence-to-Sequence Learning with Latent Neural Grammars Yoon Kim 168 40 0 02 Sep 2021
Enhancing audio quality for expressive Neural Text-to-Speech Abdelhamid Ezzerg Adam Gabryś Bartosz Putrycz Daniel Korzekwa Daniel Sáez-Trigueros David McHardy Kamil Pokora Jakub Lachowicz Jaime Lorenzo-Trueba V. Klimkov 140 6 0 13 Aug 2021
VAD-free Streaming Hybrid CTC/Attention ASR for Unsegmented Recording Hirofumi Inaguma Tatsuya Kawahara 113 2 0 15 Jul 2021
StableEmit: Selection Probability Discount for Reducing Emission Latency of Streaming Monotonic Attention ASR Hirofumi Inaguma Tatsuya Kawahara 68 4 0 01 Jul 2021
A Survey on Neural Speech Synthesis Xu Tan Tao Qin Frank Soong Tie-Yan Liu AI4TS 139 359 0 29 Jun 2021
Multi-mode Transformer Transducer with Stochastic Future Context Kwangyoun Kim Felix Wu Prashant Sridhar Kyu Jeong Han Shinji Watanabe 71 10 0 17 Jun 2021
Review of end-to-end speech synthesis technology based on deep learning Zhaoxi Mu Xinyu Yang Yizhuo Dong AuLLM ALM 94 25 0 20 Apr 2021
On Biasing Transformer Attention Towards Monotonicity Annette Rios Gonzales Chantal Amrhein Noëmi Aepli Rico Sennrich 31 7 0 08 Apr 2021
Extremely Low Footprint End-to-End ASR System for Smart Device Zhifu Gao Yiwu Yao Shiliang Zhang Jun Yang Ming Lei Ian Mcloughlin 43 13 0 06 Apr 2021
Attention, please! A survey of Neural Attention Models in Deep Learning Alana de Santana Correia Esther Luna Colombini HAI 130 198 0 31 Mar 2021
A study of latent monotonic attention variants Albert Zeyer Ralf Schlüter Hermann Ney 75 5 0 30 Mar 2021
Mutually-Constrained Monotonic Multihead Attention for Online ASR Jae-gyun Song Hajin Shim Eunho Yang 22 0 0 26 Mar 2021
Alleviate Exposure Bias in Sequence Prediction with Recurrent Neural Networks Liping Yuan Jiangtao Feng Xiaoqing Zheng Xuanjing Huang 42 1 0 22 Mar 2021
Alignment Knowledge Distillation for Online Streaming Attention-based Speech Recognition Hirofumi Inaguma Tatsuya Kawahara 130 14 0 28 Feb 2021
Fast End-to-End Speech Recognition via Non-Autoregressive Models and Cross-Modal Knowledge Transferring from BERT Ye Bai Jiangyan Yi J. Tao Zhengkun Tian Zhengqi Wen Shuai Zhang RALM 91 52 0 15 Feb 2021
Thank you for Attention: A survey on Attention-based Artificial Neural Networks for Automatic Speech Recognition Priyabrata Karmakar S. Teng Guojun Lu 58 27 0 14 Feb 2021
WeNet: Production oriented Streaming and Non-streaming End-to-End Speech Recognition Toolkit Zhuoyuan Yao Di Wu Xiong Wang Binbin Zhang Fan Yu Chao Yang Zhendong Peng Xiaoyu Chen Lei Xie X. Lei 137 268 0 02 Feb 2021
A Survey on Deep Reinforcement Learning for Audio-Based Applications S. Latif Heriberto Cuayáhuitl Farrukh Pervez Fahad Shamshad Hafiz Shehbaz Ali Min Zhang OffRL 125 75 0 01 Jan 2021
Learning to Rationalize for Nonmonotonic Reasoning with Distant Supervision Faeze Brahman Vered Shwartz Rachel Rudinger Yejin Choi LRM 98 42 0 14 Dec 2020
Unified Streaming and Non-streaming Two-pass End-to-end Model for Speech Recognition Binbin Zhang Di Wu Zhuoyuan Yao Xiong Wang F. Yu Chao Yang Liyong Guo Yaguang Hu Lei Xie X. Lei 93 81 0 10 Dec 2020
EfficientTTS: An Efficient and High-Quality Text-to-Speech Architecture Chenfeng Miao Shuang Liang Zhencheng Liu Minchuan Chen Jun Ma Shaojun Wang Jing Xiao 74 38 0 07 Dec 2020
A Better and Faster End-to-End Model for Streaming ASR Yue Liu Anmol Gulati Jiahui Yu Tara N. Sainath Chung-Cheng Chiu ... Wei Han Qiao Liang Yu Zhang Trevor Strohman Yonghui Wu AuLLM 169 123 0 21 Nov 2020
SimulMT to SimulST: Adapting Simultaneous Text Translation to End-to-End Simultaneous Speech Translation Xutai Ma J. Pino Philipp Koehn 74 97 0 03 Nov 2020
FeatherTTS: Robust and Efficient attention based Neural TTS Qiao Tian Zewang Zhang Chao-Jung Liu Heng Lu Linghui Chen Bin Wei P. He Shan Liu 69 4 0 02 Nov 2020
Universal ASR: Unifying Streaming and Non-Streaming ASR Using a Single Encoder-Decoder Model Zhifu Gao Shiliang Zhang Ming Lei Ian Mcloughlin CVBM 54 15 0 27 Oct 2020
Dual-mode ASR: Unify and Improve Streaming ASR with Full-context Modeling Jiahui Yu Wei Han Anmol Gulati Chung-Cheng Chiu Yue Liu Tara N. Sainath Yonghui Wu Ruoming Pang 125 19 0 12 Oct 2020
fairseq S2T: Fast Speech-to-Text Modeling with fairseq Changhan Wang Yun Tang Xutai Ma Anne Wu Sravya Popuri Dmytro Okhonko J. Pino VLM LRM 122 276 0 11 Oct 2020
Super-Human Performance in Online Low-latency Recognition of Conversational Speech T. Nguyen S. Stueker A. Waibel BDL 74 38 0 07 Oct 2020
Controllable neural text-to-speech synthesis using intuitive prosodic features T. Raitio Ramya Rasipuram D. Castellani 78 66 0 14 Sep 2020
Any-to-Many Voice Conversion with Location-Relative Sequence-to-Sequence Modeling Songxiang Liu Yuewen Cao Disong Wang Xixin Wu Xunying Liu Helen Meng BDL 120 92 0 06 Sep 2020
Online Automatic Speech Recognition with Listen, Attend and Spell Model Roger Hsiao Dogan Can Tim Ng R. Travadi Arnab Ghoshal RALM 61 17 0 12 Aug 2020
Incremental Text to Speech for Neural Sequence-to-Sequence Models using Reinforcement Learning D. Mohan R. Lenain Lorenzo Foglianti Tian Huey Teh Marlene Staib Alexandra Torresquintero Jiameng Gao AI4TS 61 11 0 07 Aug 2020
Class LM and word mapping for contextual biasing in End-to-End ASR Rongqing Huang Ossama Abdel-Hamid Xinwei Li G. Evermann 68 49 0 10 Jul 2020
Learning to Count Words in Fluent Speech enables Online Speech Recognition George Sterpu Christian Saam N. Harte 72 4 0 08 Jun 2020
End-to-End Adversarial Text-to-Speech Jeff Donahue Sander Dieleman Mikolaj Binkowski Erich Elsen Karen Simonyan 109 187 0 05 Jun 2020
Online Versus Offline NMT Quality: An In-depth Analysis on English-German and German-English Maha Elbayad M. Ustaszewski Emmanuelle Esperança-Rodier Francis Brunet Manquat Jakob Verbeek Laurent Besacier OffRL 85 10 0 01 Jun 2020
On the Comparison of Popular End-to-End Models for Large Scale Speech Recognition Jinyu Li Yu-Huan Wu Yashesh Gaur Chengyi Wang Rui Zhao Shujie Liu 73 137 0 28 May 2020
Low-Latency Sequence-to-Sequence Speech Recognition and Translation by Partial Hypothesis Selection Danni Liu Gerasimos Spanakis Jan Niehues 82 50 0 22 May 2020
Streaming Chunk-Aware Multihead Attention for Online End-to-End Speech Recognition Shiliang Zhang Zhifu Gao Haoneng Luo Ming Lei Jie Ying Gao Zhijie Yan Lei Xie 64 29 0 21 May 2020
A Comparison of Label-Synchronous and Frame-Synchronous End-to-End Models for Speech Recognition Linhao Dong Cheng Yi Jianzong Wang Shiyu Zhou Shuang Xu X. Jia Bo Xu 68 17 0 20 May 2020
Enhancing Monotonic Multihead Attention for Streaming ASR Hirofumi Inaguma Masato Mimura Tatsuya Kawahara 101 34 0 19 May 2020
Efficient Wait-k Models for Simultaneous Machine Translation Maha Elbayad Laurent Besacier Jakob Verbeek VLM 80 80 0 18 May 2020
MoBoAligner: a Neural Alignment Model for Non-autoregressive TTS with Monotonic Boundary Search Naihan Li Shujie Liu Yanqing Liu Sheng Zhao Ming-Yuan Liu Ming Zhou 53 6 0 18 May 2020
AdaDurIAN: Few-shot Adaptation for Neural Text-to-Speech with DurIAN Zewang Zhang Qiao Tian Heng Lu Ling-Hao Chen Shan Liu 62 27 0 12 May 2020