arXiv:1804.10959
Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates
29 April 2018
Taku Kudo
Papers citing
"Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates"
50 / 619 papers shown
Title
Adam Mickiewicz University at WMT 2022: NER-Assisted and Quality-Aware Neural Machine Translation
Artur Nowakowski
Gabriela Pałka
Kamil Guttmann
Miko Pokrywka
30
5
0
07 Sep 2022
Improving Contextual Recognition of Rare Words with an Alternate Spelling Prediction Model
Jennifer Drexler Fox
Natalie Delworth
KELM
30
18
0
02 Sep 2022
A Novel Multi-Task Learning Approach for Context-Sensitive Compound Type Identification in Sanskrit
Jivnesh Sandhan
Ashish Gupta
Hrishikesh Terdalkar
Tushar Sandhan
S. Samanta
Laxmidhar Behera
Pawan Goyal
29
3
0
22 Aug 2022
Domain-Specific Text Generation for Machine Translation
Yasmin Moslem
Rejwanul Haque
John D. Kelleher
Andy Way
30
16
0
11 Aug 2022
How Effective is Byte Pair Encoding for Out-Of-Vocabulary Words in Neural Machine Translation?
Ali Araabi
Christof Monz
Vlad Niculae
36
10
0
10 Aug 2022
A High-Quality and Large-Scale Dataset for English-Vietnamese Speech Translation
L. T. Nguyen
Nguyen Luong Tran
Long Doan
Manh Luong
Dat Quoc Nguyen
29
4
0
08 Aug 2022
Lost in Space Marking
Cassandra L. Jacobs
Yuval Pinter
21
1
0
02 Aug 2022
Benchmarking Azerbaijani Neural Machine Translation
Chih-Chen Chen
William Chen
29
0
0
29 Jul 2022
ILASR: Privacy-Preserving Incremental Learning for Automatic Speech Recognition at Production Scale
Gopinath Chennupati
Milind Rao
Gurpreet Chadha
Aaron Eakin
A. Raju
...
Andrew Oberlin
Buddha Nandanoor
Prahalad Venkataramanan
Zheng Wu
Pankaj Sitpure
CLL
27
8
0
19 Jul 2022
MAD for Robust Reinforcement Learning in Machine Translation
Domenic Donato
Lei Yu
Wang Ling
Chris Dyer
MoE
22
7
0
18 Jul 2022
u-HuBERT: Unified Mixed-Modal Speech Pretraining And Zero-Shot Transfer to Unlabeled Modality
Wei-Ning Hsu
Bowen Shi
SSL
VLM
29
42
0
14 Jul 2022
Language Modelling with Pixels
Phillip Rust
Jonas F. Lotz
Emanuele Bugliarello
Elizabeth Salesky
Miryam de Lhoneux
Desmond Elliott
VLM
43
46
0
14 Jul 2022
Speaker Anonymization with Phonetic Intermediate Representations
Sarina Meyer
Florian Lux
Pavel Denisov
Julia Koch
Pascal Tilli
Ngoc Thang Vu
34
27
0
11 Jul 2022
Tandem Multitask Training of Speaker Diarisation and Speech Recognition for Meeting Transcription
Xianrui Zheng
Chuxu Zhang
P. Woodland
34
16
0
08 Jul 2022
Reduce Indonesian Vocabularies with an Indonesian Sub-word Separator
Mukhlis Amien
Chong Feng
Heyan Huang
27
0
0
01 Jul 2022
The THUEE System Description for the IARPA OpenASR21 Challenge
Jing Zhao
Haoyu Wang
Jinpeng Li
Shuzhou Chai
Guan-Bo Wang
Guoguo Chen
Weiqiang Zhang
VLM
27
1
0
29 Jun 2022
On Comparison of Encoders for Attention based End to End Speech Recognition in Standalone and Rescoring Mode
Raviraj Joshi
Subodh Kumar
36
2
0
26 Jun 2022
A Simple Baseline for Domain Adaptation in End to End ASR Systems Using Synthetic Data
Raviraj Joshi
Ashutosh Kumar Singh
14
7
0
22 Jun 2022
The SIGMORPHON 2022 Shared Task on Morpheme Segmentation
Khuyagbaatar Batsuren
Gábor Bella
Aryaman Arora
Viktor Martinović
Kyle Gorman
...
Magda Ševčíková
Kateřina Pelegrinová
Fausto Giunchiglia
Ryan Cotterell
Ekaterina Vylomova
33
39
0
15 Jun 2022
1Cademy at Semeval-2022 Task 1: Investigating the Effectiveness of Multilingual, Multitask, and Language-Agnostic Tricks for the Reverse Dictionary Task
Zhiyong Wang
Ge Zhang
Nineli Lashkarashvili
24
3
0
08 Jun 2022
Searching for Optimal Subword Tokenization in Cross-domain NER
Ruotian Ma
Yiding Tan
Xin Zhou
Xuanting Chen
Di Liang
Sirui Wang
Wei Wu
Tao Gui
Qi Zhang
OOD
64
14
0
07 Jun 2022
What do tokens know about their characters and how do they know it?
Ayush Kaushal
Kyle Mahowald
34
28
0
06 Jun 2022
EMS: Efficient and Effective Massively Multilingual Sentence Embedding Learning
Zhuoyuan Mao
Chenhui Chu
Sadao Kurohashi
45
1
0
31 May 2022
Transformer with Tree-order Encoding for Neural Program Generation
Klaudia Thellmann
Bernhard Stadler
Ricardo Usbeck
Jens Lehmann
29
1
0
30 May 2022
Contextual Adapters for Personalized Speech Recognition in Neural Transducers
Kanthashree Mysore Sathyendra
Thejaswi Muniyappa
Feng-Ju Chang
Jing Liu
Jinru Su
Grant P. Strimel
Athanasios Mouchtaris
Siegfried Kunzmann
19
75
0
26 May 2022
Local Byte Fusion for Neural Machine Translation
Makesh Narsimhan Sreedhar
Xiangpeng Wan
Yu-Jie Cheng
Junjie Hu
34
4
0
23 May 2022
Translating Hanja Historical Documents to Contemporary Korean and English
Juhee Son
Jiho Jin
Haneul Yoo
Jinyeong Bak
Kyunghyun Cho
Alice Oh
35
4
0
20 May 2022
Evaluation of Transfer Learning for Polish with a Text-to-Text Model
Aleksandra Chrabrowa
Lukasz Dragan
Karol Grzegorczyk
D. Kajtoch
Mikołaj Koszowski
Robert Mroczkowski
Piotr Rybak
42
18
0
18 May 2022
FactPEGASUS: Factuality-Aware Pre-training and Fine-tuning for Abstractive Summarization
David Wan
Joey Tianyi Zhou
HILM
25
68
0
16 May 2022
IRB-NLP at SemEval-2022 Task 1: Exploring the Relationship Between Words and Their Semantic Representations
Damir Korenčić
Ivan Grubišić
24
3
0
13 May 2022
Quantifying Synthesis and Fusion and their Impact on Machine Translation
Arturo Oncevay
Duygu Ataman
N. V. Berkel
Barry Haddow
Alexandra Birch
Johannes Bjerva
25
3
0
06 May 2022
A Few Thousand Translations Go a Long Way! Leveraging Pre-trained Models for African News Translation
David Ifeoluwa Adelani
Jesujoba Oluwadara Alabi
Angela Fan
Julia Kreutzer
Xiaoyu Shen
...
Ayodele Awokoya
Happy Buzaaba
Blessing K. Sibanda
Andiswa Bukula
Sam Manthalu
34
111
0
04 May 2022
CoCa: Contrastive Captioners are Image-Text Foundation Models
Jiahui Yu
Zirui Wang
Vijay Vasudevan
Legg Yeung
Mojtaba Seyedhosseini
Yonghui Wu
VLM
CLIP
OffRL
85
1,265
0
04 May 2022
How Robust is Neural Machine Translation to Language Imbalance in Multilingual Tokenizer Training?
Shiyue Zhang
Vishrav Chaudhary
Naman Goyal
James Cross
Guillaume Wenzek
Joey Tianyi Zhou
Francisco Guzman
38
16
0
29 Apr 2022
CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers
Ming Ding
Wendi Zheng
Wenyi Hong
Jie Tang
VLM
41
324
0
28 Apr 2022
Improving Multimodal Speech Recognition by Data Augmentation and Speech Representations
Dan Oneaţă
H. Cucu
19
19
0
27 Apr 2022
How can NLP Help Revitalize Endangered Languages? A Case Study and Roadmap for the Cherokee Language
Shiyue Zhang
B. Frey
Joey Tianyi Zhou
24
36
0
25 Apr 2022
A Vocabulary-Free Multilingual Neural Tokenizer for End-to-End Task Learning
Md. Mofijul Islam
Gustavo Aguilar
Pragaash Ponnusamy
Clint Solomon Mathialagan
Chengyuan Ma
Chenlei Guo
VLM
19
10
0
22 Apr 2022
Layer-wise Fast Adaptation for End-to-End Multi-Accent Speech Recognition
Xun Gong
Y. Qian
Houjun Huang
Yanmin Qian
34
44
0
21 Apr 2022
Impact of Tokenization on Language Models: An Analysis for Turkish
Cagri Toraman
E. Yilmaz
Furkan Şahinuç
Oguzhan Ozcelik
38
74
0
19 Apr 2022
Improving Tokenisation by Alternative Treatment of Spaces
Edward Gow-Smith
Harish Tayyar Madabushi
Carolina Scarton
Aline Villavicencio
37
20
0
08 Apr 2022
Deliberation Model for On-Device Spoken Language Understanding
Duc Le
Akshat Shrivastava
Paden Tomasello
Suyoun Kim
Aleksandr Livshits
Ozlem Kalinli
M. Seltzer
AuLLM
37
12
0
04 Apr 2022
Multi-task RNN-T with Semantic Decoder for Streamable Spoken Language Understanding
Xuandi Fu
Feng-Ju Chang
Martin H. Radfar
Kailin Wei
Jing Liu
Grant P. Strimel
Kanthashree Mysore Sathyendra
21
4
0
01 Apr 2022
Single Model Ensemble for Subword Regularized Models in Low-Resource Machine Translation
Sho Takase
Tatsuya Hiraoka
Naoaki Okazaki
16
5
0
25 Mar 2022
One Country, 700+ Languages: NLP Challenges for Underrepresented Languages and Dialects in Indonesia
Alham Fikri Aji
Genta Indra Winata
Fajri Koto
Samuel Cahyawijaya
Ade Romadhony
...
David Moeljadi
Radityo Eko Prasojo
Timothy Baldwin
Jey Han Lau
Sebastian Ruder
42
100
0
24 Mar 2022
Small Batch Sizes Improve Training of Low-Resource Neural MT
Àlex R. Atrio
Andrei Popescu-Belis
35
6
0
20 Mar 2022
ScienceWorld: Is your Agent Smarter than a 5th Grader?
Ruoyao Wang
Peter Alexander Jansen
Marc-Alexandre Côté
Prithviraj Ammanabrolu
LLMAG
ReLM
LRM
36
109
0
14 Mar 2022
IT5: Text-to-text Pretraining for Italian Language Understanding and Generation
Gabriele Sarti
Malvina Nissim
AILaw
23
42
0
07 Mar 2022
Extracting linguistic speech patterns of Japanese fictional characters using subword units
Mika Kishino
Kanako Komiya
16
0
0
05 Mar 2022
Overlap-based Vocabulary Generation Improves Cross-lingual Transfer Among Related Languages
Vaidehi Patil
Partha P. Talukdar
Sunita Sarawagi
24
21
0
03 Mar 2022