2307.01715
Cited By
Align With Purpose: Optimize Desired Properties in CTC Models with a General Plug-and-Play Framework
4 July 2023
Eliya Segev
Maya Alroy
Ronen Katsir
Noam Wies
Ayana Shenhav
Yael Ben-Oren
D. Zar
Oren Tadmor
Jacob Bitterman
Amnon Shashua
Tal Rosenwein
Papers citing "Align With Purpose: Optimize Desired Properties in CTC Models with a General Plug-and-Play Framework"
27 / 27 papers shown
Title
Powerful and Extensible WFST Framework for RNN-Transducer Losses
A. Laptev
Vladimir Bataev
Igor Gitman
Boris Ginsburg
46
3
0
18 Mar 2023
WhisperX: Time-Accurate Speech Transcription of Long-Form Audio
Max Bain
Jaesung Huh
Tengda Han
Andrew Zisserman
85
237
0
01 Mar 2023
Robust Speech Recognition via Large-Scale Weak Supervision
Alec Radford
Jong Wook Kim
Tao Xu
Greg Brockman
C. McLeavey
Ilya Sutskever
OffRL
130
3,623
0
06 Dec 2022
Peak-First CTC: Reducing the Peak Latency of CTC Models by Applying Peak-First Regularization
Zhengkun Tian
Hongyu Xiang
Min Li
Fei Lin
Ke Ding
Guanglu Wan
36
7
0
07 Nov 2022
Minimum Latency Training of Sequence Transducers for Streaming End-to-End Speech Recognition
Yusuke Shinohara
Shinji Watanabe
AI4TS
55
10
0
04 Nov 2022
TrimTail: Low-Latency Streaming ASR with Simple but Effective Spectrogram-Level Length Penalty
Xingcheng Song
Di Wu
Zhiyong Wu
Binbin Zhang
Yuekai Zhang
Zhendong Peng
Wenpeng Li
Fuping Pan
Changbao Zhu
83
8
0
01 Nov 2022
BRIO: Bringing Order to Abstractive Summarization
Yixin Liu
Pengfei Liu
Dragomir R. Radev
Graham Neubig
69
285
0
31 Mar 2022
Non-Autoregressive Translation with Layer-Wise Prediction and Deep Supervision
Chenyang Huang
Hao Zhou
Osmar R. Zaïane
Lili Mou
Lei Li
122
59
0
14 Oct 2021
Why does CTC result in peaky behavior?
Albert Zeyer
Ralf Schlüter
Hermann Ney
44
46
0
31 May 2021
VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation
Changhan Wang
M. Rivière
Ann Lee
Anne Wu
Chaitanya Talnikar
Daniel Haziza
Mary Williamson
J. Pino
Emmanuel Dupoux
SSL
80
484
0
02 Jan 2021
FastEmit: Low-latency Streaming ASR with Sequence-level Emission Regularization
Jiahui Yu
Chung-Cheng Chiu
Yue Liu
Shuo-yiin Chang
Tara N. Sainath
...
A. Narayanan
Wei Han
Anmol Gulati
Yonghui Wu
Ruoming Pang
52
92
0
21 Oct 2020
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations
Alexei Baevski
Henry Zhou
Abdel-rahman Mohamed
Michael Auli
SSL
228
5,774
0
20 Jun 2020
Conformer: Convolution-augmented Transformer for Speech Recognition
Anmol Gulati
James Qin
Chung-Cheng Chiu
Niki Parmar
Yu Zhang
...
Wei Han
Shibo Wang
Zhengdong Zhang
Yonghui Wu
Ruoming Pang
210
3,119
0
16 May 2020
Scaling Up Online Speech Recognition Using ConvNets
Vineel Pratap
Qiantong Xu
Jacob Kahn
Gilad Avidov
Tatiana Likhomanenko
Awni Y. Hannun
Vitaliy Liptchinsky
Gabriel Synnaeve
R. Collobert
186
38
0
27 Jan 2020
Transformer-based Acoustic Modeling for Hybrid Speech Recognition
Yongqiang Wang
Abdel-rahman Mohamed
Duc Le
Chunxi Liu
Alex Xiao
...
Xiaohui Zhang
Frank Zhang
Christian Fuegen
Geoffrey Zweig
M. Seltzer
48
248
0
22 Oct 2019
On the Variance of the Adaptive Learning Rate and Beyond
Liyuan Liu
Haoming Jiang
Pengcheng He
Weizhu Chen
Xiaodong Liu
Jianfeng Gao
Jiawei Han
ODL
230
1,900
0
08 Aug 2019
SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition
Daniel S. Park
William Chan
Yu Zhang
Chung-Cheng Chiu
Barret Zoph
E. D. Cubuk
Quoc V. Le
VLM
159
3,451
0
18 Apr 2019
Deep Audio-Visual Speech Recognition
Triantafyllos Afouras
Joon Son Chung
A. Senior
Oriol Vinyals
Andrew Zisserman
67
701
0
06 Sep 2018
Minimum Word Error Rate Training for Attention-based Sequence-to-Sequence Models
Rohit Prabhavalkar
Tara N. Sainath
Yonghui Wu
Patrick Nguyen
Zhifeng Chen
Chung-Cheng Chiu
Anjuli Kannan
48
162
0
05 Dec 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
628
130,942
0
12 Jun 2017
Categorical Reparameterization with Gumbel-Softmax
Eric Jang
S. Gu
Ben Poole
BDL
281
5,360
0
03 Nov 2016
Wav2Letter: an End-to-End ConvNet-based Speech Recognition System
R. Collobert
Christian Puhrsch
Gabriel Synnaeve
3DV
56
283
0
11 Sep 2016
Deep Residual Learning for Image Recognition
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
MedIm
1.9K
193,426
0
10 Dec 2015
Listen, Attend and Spell
William Chan
Navdeep Jaitly
Quoc V. Le
Oriol Vinyals
RALM
147
2,265
0
05 Aug 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
1.4K
149,842
0
22 Dec 2014
Deep Speech: Scaling up end-to-end speech recognition
Awni Y. Hannun
Carl Case
Jared Casper
Bryan Catanzaro
G. Diamos
...
R. Prenger
S. Satheesh
Shubho Sengupta
Adam Coates
A. Ng
176
2,124
0
17 Dec 2014
Sequence Transduction with Recurrent Neural Networks
Alex Graves
171
1,866
0
14 Nov 2012