Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates

29 April 2018
Taku Kudo
arXiv:1804.10959

Papers citing "Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates"

50 / 619 papers shown
Title
Adam Mickiewicz University at WMT 2022: NER-Assisted and Quality-Aware Neural Machine Translation
Artur Nowakowski
Gabriela Pałka
Kamil Guttmann
Miko Pokrywka
30
5
0
07 Sep 2022
Improving Contextual Recognition of Rare Words with an Alternate Spelling Prediction Model
Jennifer Drexler Fox
Natalie Delworth
KELM
30
18
0
02 Sep 2022
A Novel Multi-Task Learning Approach for Context-Sensitive Compound Type Identification in Sanskrit
Jivnesh Sandhan
Ashish Gupta
Hrishikesh Terdalkar
Tushar Sandhan
S. Samanta
Laxmidhar Behera
Pawan Goyal
29
3
0
22 Aug 2022
Domain-Specific Text Generation for Machine Translation
Yasmin Moslem
Rejwanul Haque
John D. Kelleher
Andy Way
30
16
0
11 Aug 2022
How Effective is Byte Pair Encoding for Out-Of-Vocabulary Words in Neural Machine Translation?
Ali Araabi
Christof Monz
Vlad Niculae
36
10
0
10 Aug 2022
A High-Quality and Large-Scale Dataset for English-Vietnamese Speech Translation
L. T. Nguyen
Nguyen Luong Tran
Long Doan
Manh Luong
Dat Quoc Nguyen
29
4
0
08 Aug 2022
Lost in Space Marking
Cassandra L. Jacobs
Yuval Pinter
21
1
0
02 Aug 2022
Benchmarking Azerbaijani Neural Machine Translation
Chih-Chen Chen
William Chen
29
0
0
29 Jul 2022
ILASR: Privacy-Preserving Incremental Learning for Automatic Speech Recognition at Production Scale
Gopinath Chennupati
Milind Rao
Gurpreet Chadha
Aaron Eakin
A. Raju
...
Andrew Oberlin
Buddha Nandanoor
Prahalad Venkataramanan
Zheng Wu
Pankaj Sitpure
CLL
27
8
0
19 Jul 2022
MAD for Robust Reinforcement Learning in Machine Translation
Domenic Donato
Lei Yu
Wang Ling
Chris Dyer
MoE
22
7
0
18 Jul 2022
u-HuBERT: Unified Mixed-Modal Speech Pretraining And Zero-Shot Transfer to Unlabeled Modality
Wei-Ning Hsu
Bowen Shi
SSL
VLM
29
42
0
14 Jul 2022
Language Modelling with Pixels
Phillip Rust
Jonas F. Lotz
Emanuele Bugliarello
Elizabeth Salesky
Miryam de Lhoneux
Desmond Elliott
VLM
43
46
0
14 Jul 2022
Speaker Anonymization with Phonetic Intermediate Representations
Sarina Meyer
Florian Lux
Pavel Denisov
Julia Koch
Pascal Tilli
Ngoc Thang Vu
34
27
0
11 Jul 2022
Tandem Multitask Training of Speaker Diarisation and Speech Recognition for Meeting Transcription
Xianrui Zheng
Chuxu Zhang
P. Woodland
34
16
0
08 Jul 2022
Reduce Indonesian Vocabularies with an Indonesian Sub-word Separator
Mukhlis Amien
Chong Feng
Heyan Huang
27
0
0
01 Jul 2022
The THUEE System Description for the IARPA OpenASR21 Challenge
Jing Zhao
Haoyu Wang
Jinpeng Li
Shuzhou Chai
Guan-Bo Wang
Guoguo Chen
Weiqiang Zhang
VLM
27
1
0
29 Jun 2022
On Comparison of Encoders for Attention based End to End Speech Recognition in Standalone and Rescoring Mode
Raviraj Joshi
Subodh Kumar
36
2
0
26 Jun 2022
A Simple Baseline for Domain Adaptation in End to End ASR Systems Using Synthetic Data
Raviraj Joshi
Ashutosh Kumar Singh
14
7
0
22 Jun 2022
The SIGMORPHON 2022 Shared Task on Morpheme Segmentation
Khuyagbaatar Batsuren
Gábor Bella
Aryaman Arora
Viktor Martinović
Kyle Gorman
...
Magda Ševčíková
Kateřina Pelegrinová
Fausto Giunchiglia
Ryan Cotterell
Ekaterina Vylomova
33
39
0
15 Jun 2022
1Cademy at Semeval-2022 Task 1: Investigating the Effectiveness of Multilingual, Multitask, and Language-Agnostic Tricks for the Reverse Dictionary Task
Zhiyong Wang
Ge Zhang
Nineli Lashkarashvili
24
3
0
08 Jun 2022
Searching for Optimal Subword Tokenization in Cross-domain NER
Ruotian Ma
Yiding Tan
Xin Zhou
Xuanting Chen
Di Liang
Sirui Wang
Wei Wu
Tao Gui
Qi Zhang
OOD
64
14
0
07 Jun 2022
What do tokens know about their characters and how do they know it?
Ayush Kaushal
Kyle Mahowald
34
28
0
06 Jun 2022
EMS: Efficient and Effective Massively Multilingual Sentence Embedding Learning
Zhuoyuan Mao
Chenhui Chu
Sadao Kurohashi
45
1
0
31 May 2022
Transformer with Tree-order Encoding for Neural Program Generation
Klaudia Thellmann
Bernhard Stadler
Ricardo Usbeck
Jens Lehmann
29
1
0
30 May 2022
Contextual Adapters for Personalized Speech Recognition in Neural Transducers
Kanthashree Mysore Sathyendra
Thejaswi Muniyappa
Feng-Ju Chang
Jing Liu
Jinru Su
Grant P. Strimel
Athanasios Mouchtaris
Siegfried Kunzmann
19
75
0
26 May 2022
Local Byte Fusion for Neural Machine Translation
Makesh Narsimhan Sreedhar
Xiangpeng Wan
Yu-Jie Cheng
Junjie Hu
34
4
0
23 May 2022
Translating Hanja Historical Documents to Contemporary Korean and English
Juhee Son
Jiho Jin
Haneul Yoo
Jinyeong Bak
Kyunghyun Cho
Alice Oh
35
4
0
20 May 2022
Evaluation of Transfer Learning for Polish with a Text-to-Text Model
Aleksandra Chrabrowa
Lukasz Dragan
Karol Grzegorczyk
D. Kajtoch
Mikołaj Koszowski
Robert Mroczkowski
Piotr Rybak
42
18
0
18 May 2022
FactPEGASUS: Factuality-Aware Pre-training and Fine-tuning for Abstractive Summarization
David Wan
Joey Tianyi Zhou
HILM
25
68
0
16 May 2022
IRB-NLP at SemEval-2022 Task 1: Exploring the Relationship Between Words and Their Semantic Representations
Damir Korenčić
Ivan Grubišić
24
3
0
13 May 2022
Quantifying Synthesis and Fusion and their Impact on Machine Translation
Arturo Oncevay
Duygu Ataman
N. V. Berkel
Barry Haddow
Alexandra Birch
Johannes Bjerva
25
3
0
06 May 2022
A Few Thousand Translations Go a Long Way! Leveraging Pre-trained Models for African News Translation
David Ifeoluwa Adelani
Jesujoba Oluwadara Alabi
Angela Fan
Julia Kreutzer
Xiaoyu Shen
...
Ayodele Awokoya
Happy Buzaaba
Blessing K. Sibanda
Andiswa Bukula
Sam Manthalu
34
111
0
04 May 2022
CoCa: Contrastive Captioners are Image-Text Foundation Models
Jiahui Yu
Zirui Wang
Vijay Vasudevan
Legg Yeung
Mojtaba Seyedhosseini
Yonghui Wu
VLM
CLIP
OffRL
85
1,265
0
04 May 2022
How Robust is Neural Machine Translation to Language Imbalance in Multilingual Tokenizer Training?
Shiyue Zhang
Vishrav Chaudhary
Naman Goyal
James Cross
Guillaume Wenzek
Joey Tianyi Zhou
Francisco Guzman
38
16
0
29 Apr 2022
CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers
Ming Ding
Wendi Zheng
Wenyi Hong
Jie Tang
VLM
41
324
0
28 Apr 2022
Improving Multimodal Speech Recognition by Data Augmentation and Speech Representations
Dan Oneaţă
H. Cucu
19
19
0
27 Apr 2022
How can NLP Help Revitalize Endangered Languages? A Case Study and Roadmap for the Cherokee Language
Shiyue Zhang
B. Frey
Joey Tianyi Zhou
24
36
0
25 Apr 2022
A Vocabulary-Free Multilingual Neural Tokenizer for End-to-End Task Learning
Md. Mofijul Islam
Gustavo Aguilar
Pragaash Ponnusamy
Clint Solomon Mathialagan
Chengyuan Ma
Chenlei Guo
VLM
19
10
0
22 Apr 2022
Layer-wise Fast Adaptation for End-to-End Multi-Accent Speech Recognition
Xun Gong
Y. Qian
Houjun Huang
Yanmin Qian
34
44
0
21 Apr 2022
Impact of Tokenization on Language Models: An Analysis for Turkish
Cagri Toraman
E. Yilmaz
Furkan Şahinuç
Oguzhan Ozcelik
38
74
0
19 Apr 2022
Improving Tokenisation by Alternative Treatment of Spaces
Edward Gow-Smith
Harish Tayyar Madabushi
Carolina Scarton
Aline Villavicencio
37
20
0
08 Apr 2022
Deliberation Model for On-Device Spoken Language Understanding
Duc Le
Akshat Shrivastava
Paden Tomasello
Suyoun Kim
Aleksandr Livshits
Ozlem Kalinli
M. Seltzer
AuLLM
37
12
0
04 Apr 2022
Multi-task RNN-T with Semantic Decoder for Streamable Spoken Language Understanding
Xuandi Fu
Feng-Ju Chang
Martin H. Radfar
Kailin Wei
Jing Liu
Grant P. Strimel
Kanthashree Mysore Sathyendra
21
4
0
01 Apr 2022
Single Model Ensemble for Subword Regularized Models in Low-Resource Machine Translation
Sho Takase
Tatsuya Hiraoka
Naoaki Okazaki
16
5
0
25 Mar 2022
One Country, 700+ Languages: NLP Challenges for Underrepresented Languages and Dialects in Indonesia
Alham Fikri Aji
Genta Indra Winata
Fajri Koto
Samuel Cahyawijaya
Ade Romadhony
...
David Moeljadi
Radityo Eko Prasojo
Timothy Baldwin
Jey Han Lau
Sebastian Ruder
42
100
0
24 Mar 2022
Small Batch Sizes Improve Training of Low-Resource Neural MT
Àlex R. Atrio
Andrei Popescu-Belis
35
6
0
20 Mar 2022
ScienceWorld: Is your Agent Smarter than a 5th Grader?
Ruoyao Wang
Peter Alexander Jansen
Marc-Alexandre Côté
Prithviraj Ammanabrolu
LLMAG
ReLM
LRM
36
109
0
14 Mar 2022
IT5: Text-to-text Pretraining for Italian Language Understanding and Generation
Gabriele Sarti
Malvina Nissim
AILaw
23
42
0
07 Mar 2022
Extracting linguistic speech patterns of Japanese fictional characters using subword units
Mika Kishino
Kanako Komiya
16
0
0
05 Mar 2022
Overlap-based Vocabulary Generation Improves Cross-lingual Transfer Among Related Languages
Vaidehi Patil
Partha P. Talukdar
Sunita Sarawagi
24
21
0
03 Mar 2022