1804.10959
Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates
29 April 2018
Taku Kudo
Papers citing
"Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates"
50 / 628 papers shown
Title
Large-Scale Contextualised Language Modelling for Norwegian
Andrey Kutuzov
Jeremy Barnes
Erik Velldal
Lilja Ovrelid
Stephan Oepen
84
38
0
13 Apr 2021
Source and Target Bidirectional Knowledge Distillation for End-to-end Speech Translation
Hirofumi Inaguma
Tatsuya Kawahara
Shinji Watanabe
82
43
0
13 Apr 2021
Restoring and Mining the Records of the Joseon Dynasty via Neural Language Modeling and Machine Translation
Kyeongpil Kang
Kyohoon Jin
Soyoung Yang
Show-Ling Jang
Jaegul Choo
Youngbin Kim
MU
119
18
0
13 Apr 2021
Assessing Reference-Free Peer Evaluation for Machine Translation
Sweta Agrawal
George F. Foster
Markus Freitag
Colin Cherry
LRM
51
10
0
12 Apr 2021
CodeTrans: Towards Cracking the Language of Silicon's Code Through Self-Supervised Deep Learning and High Performance Computing
Ahmed Elnaggar
Wei Ding
Llion Jones
Tom Gibbs
Tamas B. Fehér
Christoph Angerer
Silvia Severini
Florian Matthes
B. Rost
72
72
0
06 Apr 2021
Contextualized Streaming End-to-End Speech Recognition with Trie-Based Deep Biasing and Shallow Fusion
Duc Le
Mahaveer Jain
Gil Keren
Suyoun Kim
Yangyang Shi
...
Yuan Shangguan
Christian Fuegen
Ozlem Kalinli
Yatharth Saraf
M. Seltzer
89
102
0
05 Apr 2021
Semantic Distance: A New Metric for ASR Performance Analysis Towards Spoken Language Understanding
Suyoun Kim
Abhinav Arora
Duc Le
Ching-Feng Yeh
Christian Fuegen
Ozlem Kalinli
M. Seltzer
70
28
0
05 Apr 2021
End-to-End Speaker-Attributed ASR with Transformer
Naoyuki Kanda
Guoli Ye
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Zhuo Chen
Takuya Yoshioka
75
49
0
05 Apr 2021
IndT5: A Text-to-Text Transformer for 10 Indigenous Languages
El Moatez Billah Nagoudi
Wei-Rui Chen
Muhammad Abdul-Mageed
H. Cavusoglu
87
24
0
04 Apr 2021
Large-Scale Pre-Training of End-to-End Multi-Talker ASR for Meeting Transcription with Single Distant Microphone
Naoyuki Kanda
Guoli Ye
Yu-Huan Wu
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Zhuo Chen
Takuya Yoshioka
106
42
0
31 Mar 2021
Multi-view Subword Regularization
Xinyi Wang
Sebastian Ruder
Graham Neubig
82
46
0
15 Mar 2021
Dynamic Acoustic Unit Augmentation With BPE-Dropout for Low-Resource End-to-End Speech Recognition
A. Laptev
A. Andrusenko
Ivan Podluzhny
Anton Mitrofanov
Ivan Medennikov
Yuri N. Matveev
VLM
52
14
0
12 Mar 2021
CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation
J. Clark
Dan Garrette
Iulia Turc
John Wieting
121
224
0
11 Mar 2021
The Sensitivity of Word Embeddings-based Author Detection Models to Semantic-preserving Adversarial Perturbations
Jeremiah Duncan
Fabian Fallas
Christopher Gropp
Emily Herron
Maria Mahbub
...
Sudarshan Srinivasan
Maofeng Tang
V. Zenkov
Quan Zhou
Edmon Begoli
DeLMO
AAML
35
0
0
23 Feb 2021
Going Full-TILT Boogie on Document Understanding with Text-Image-Layout Transformer
Rafal Powalski
Łukasz Borchmann
Dawid Jurkiewicz
Tomasz Dwojak
Michal Pietruszka
Gabriela Pałka
ViT
94
160
0
18 Feb 2021
Gaussian Kernelized Self-Attention for Long Sequence Data and Its Application to CTC-based Speech Recognition
Yosuke Kashiwagi
E. Tsunoo
Shinji Watanabe
AI4TS
57
7
0
18 Feb 2021
Improving Zero-shot Neural Machine Translation on Language-specific Encoders-Decoders
Junwei Liao
Yu Shi
Ming Gong
Linjun Shou
Hong Qu
Michael Zeng
58
11
0
12 Feb 2021
Inducing Meaningful Units from Character Sequences with Dynamic Capacity Slot Attention
Melika Behjati
James Henderson
OCL
52
1
0
01 Feb 2021
BNLP: Natural language processing toolkit for Bengali language
Sagor Sarker
58
36
0
31 Jan 2021
WangchanBERTa: Pretraining transformer-based Thai Language Models
Lalita Lowphansirikul
Charin Polpanumas
Nawat Jantrakulchai
Sarana Nutanong
58
76
0
24 Jan 2021
Training Multilingual Pre-trained Language Model with Byte-level Subwords
Junqiu Wei
Qun Liu
Yinpeng Guo
Xin Jiang
63
20
0
23 Jan 2021
Does a Hybrid Neural Network based Feature Selection Model Improve Text Classification?
Suman Dowlagar
R. Mamidi
45
1
0
22 Jan 2021
CMSAOne@Dravidian-CodeMix-FIRE2020: A Meta Embedding and Transformer model for Code-Mixed Sentiment Analysis on Social Media Text
Suman Dowlagar
R. Mamidi
53
13
0
22 Jan 2021
Arabic Speech Recognition by End-to-End, Modular Systems and Human
A. Hussein
Shinji Watanabe
Ahmed M. Ali
VLM
72
50
0
21 Jan 2021
An evaluation of word-level confidence estimation for end-to-end automatic speech recognition
Dan Oneaţă
Alexandru Caranica
Adriana Stan
H. Cucu
UQCV
88
25
0
14 Jan 2021
Detecting Hostile Posts using Relational Graph Convolutional Network
Sarthak
Shikhar Shukla
K. V. Arya
GNN
97
2
0
10 Jan 2021
Trankit: A Light-Weight Transformer-based Toolkit for Multilingual Natural Language Processing
Minh Nguyen
Viet Dac Lai
Amir Pouran Ben Veyseh
Thien Huu Nguyen
141
137
0
09 Jan 2021
Fast WordPiece Tokenization
Xinying Song
Alexandru Salcianu
Yang Song
Dave Dopson
Denny Zhou
106
166
0
31 Dec 2020
Neural Machine Translation: A Review of Methods, Resources, and Tools
Zhixing Tan
Shuo Wang
Zonghan Yang
Gang Chen
Xuancheng Huang
Maosong Sun
Yang Liu
3DV
AI4TS
97
110
0
31 Dec 2020
Generating Adversarial Examples in Chinese Texts Using Sentence-Pieces
Linyang Li
Yunfan Shao
Demin Song
Xipeng Qiu
Xuanjing Huang
AAML
GAN
40
7
0
29 Dec 2020
SubICap: Towards Subword-informed Image Captioning
Naeha Sharif
Bennamoun
Wei Liu
Syed Afaq Ali Shah
45
2
0
24 Dec 2020
Domain Adaptation of NMT models for English-Hindi Machine Translation Task at AdapMT ICON 2020
Ramchandra Joshi
Rushabh Karnavat
Kaustubh Jirapure
Raviraj Joshi
35
0
0
22 Dec 2020
Adversarial Meta Sampling for Multilingual Low-Resource Speech Recognition
Yubei Xiao
Ke Gong
Pan Zhou
Guolin Zheng
Xiaodan Liang
Liang Lin
72
35
0
22 Dec 2020
Subword Sampling for Low Resource Word Alignment
Ehsaneddin Asgari
Masoud Jalili Sabet
Philipp Dufter
Christoph Ringlstetter
Hinrich Schütze
41
5
0
21 Dec 2020
Morphology Matters: A Multilingual Language Modeling Analysis
Hyunji Hayley Park
Katherine J. Zhang
Coleman Haley
K. Steimel
Han Liu
Lane Schwartz
105
49
0
11 Dec 2020
MLS: A Large-Scale Multilingual Dataset for Speech Research
Vineel Pratap
Qiantong Xu
Anuroop Sriram
Gabriel Synnaeve
R. Collobert
AuLLM
186
513
0
07 Dec 2020
Adapt-and-Adjust: Overcoming the Long-Tail Problem of Multilingual Speech Recognition
Genta Indra Winata
Guangsen Wang
Caiming Xiong
Guosheng Lin
VLM
65
50
0
03 Dec 2020
Improving accuracy of rare words for RNN-Transducer through unigram shallow fusion
Vijay Ravi
Yile Gu
Ankur Gandhe
Ariya Rastrow
Linda Liu
Denis Filimonov
Scott Novotney
I. Bulyko
56
9
0
30 Nov 2020
Using Multiple Subwords to Improve English-Esperanto Automated Literary Translation Quality
Alberto Poncelas
J. Buts
J. Hadley
Andy Way
41
2
0
28 Nov 2020
Evaluating Input Representation for Language Identification in Hindi-English Code Mixed Text
Ramchandra Joshi
Raviraj Joshi
43
14
0
23 Nov 2020
Deep Shallow Fusion for RNN-T Personalization
Duc Le
Gil Keren
Julian Chan
Jay Mahadeokar
Christian Fuegen
M. Seltzer
76
80
0
16 Nov 2020
Simultaneous Speech-to-Speech Translation System with Neural Incremental ASR, MT, and TTS
Katsuhito Sudoh
Takatomo Kano
Sashi Novitasari
Tomoya Yanagita
S. Sakti
Satoshi Nakamura
36
13
0
10 Nov 2020
From Dataset Recycling to Multi-Property Extraction and Beyond
Tomasz Dwojak
Michal Pietruszka
Łukasz Borchmann
Jakub Chłędowski
Filip Graliński
91
5
0
06 Nov 2020
Minimum Bayes Risk Training for End-to-End Speaker-Attributed ASR
Naoyuki Kanda
Zhong Meng
Liang Lu
Yashesh Gaur
Xiaofei Wang
Zhuo Chen
Takuya Yoshioka
71
17
0
03 Nov 2020
Subword Segmentation and a Single Bridge Language Affect Zero-Shot Neural Machine Translation
Annette Rios Gonzales
Mathias Müller
Rico Sennrich
52
19
0
03 Nov 2020
Improved Neural Language Model Fusion for Streaming Recurrent Neural Network Transducer
Suyoun Kim
Shangguan Yuan
Jay Mahadeokar
A. Bruguier
Christian Fuegen
M. Seltzer
Duc Le
71
29
0
26 Oct 2020
Orthros: Non-autoregressive End-to-end Speech Translation with Dual-decoder
Hirofumi Inaguma
Yosuke Higuchi
Kevin Duh
Tatsuya Kawahara
Shinji Watanabe
66
22
0
25 Oct 2020
Char2Subword: Extending the Subword Embedding Space Using Robust Character Compositionality
Gustavo Aguilar
Bryan McCann
Tong Niu
Nazneen Rajani
N. Keskar
Thamar Solorio
108
13
0
24 Oct 2020
UniCase -- Rethinking Casing in Language Models
Rafal Powalski
Tomasz Stanislawek
49
4
0
22 Oct 2020
mT5: A massively multilingual pre-trained text-to-text transformer
Linting Xue
Noah Constant
Adam Roberts
Mihir Kale
Rami Al-Rfou
Aditya Siddhant
Aditya Barua
Colin Raffel
182
2,570
0
22 Oct 2020