Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates

29 April 2018
Taku Kudo

Papers citing "Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates"

50 / 628 papers shown
Integrating Approaches to Word Representation
Yuval Pinter
NAI
94
5
0
10 Sep 2021
Speechformer: Reducing Information Loss in Direct Speech Translation
Sara Papi
Marco Gaido
Matteo Negri
Marco Turchi
129
24
0
09 Sep 2021
Subword Mapping and Anchoring across Languages
Giorgos Vernikos
Andrei Popescu-Belis
112
12
0
09 Sep 2021
Generalised Unsupervised Domain Adaptation of Neural Machine Translation with Cross-Lingual Data Selection
Thuy-Trang Vu
Xuanli He
D.Q. Phung
Gholamreza Haffari
79
10
0
09 Sep 2021
ARMAN: Pre-training with Semantically Selecting and Reordering of Sentences for Persian Abstractive Summarization
Alireza Salemi
Emad Kebriaei
Ghazal Neisi Minaei
A. Shakery
CVBM
42
6
0
09 Sep 2021
Biomedical and Clinical Language Models for Spanish: On the Benefits of Domain-Specific Pretraining in a Mid-Resource Scenario
C. Carrino
Jordi Armengol-Estapé
Asier Gutiérrez-Fandiño
Joan Llop-Palao
Marc Pàmies
Aitor Gonzalez-Agirre
Marta Villegas
58
44
0
08 Sep 2021
IndicBART: A Pre-trained Model for Indic Natural Language Generation
Raj Dabre
Himani Shrotriya
Anoop Kunchukuttan
Ratish Puduppully
Mitesh M. Khapra
Pratyush Kumar
127
74
0
07 Sep 2021
You should evaluate your language model on marginal likelihood over tokenisations
Kris Cao
Laura Rimell
101
26
0
06 Sep 2021
How Suitable Are Subword Segmentation Strategies for Translating Non-Concatenative Morphology?
Chantal Amrhein
Rico Sennrich
93
13
0
02 Sep 2021
Survey of Low-Resource Machine Translation
Barry Haddow
Rachel Bawden
Antonio Valerio Miceli Barone
Jindřich Helcl
Alexandra Birch
AIMat
118
163
0
01 Sep 2021
AraT5: Text-to-Text Transformers for Arabic Language Generation
El Moatez Billah Nagoudi
AbdelRahim Elmadany
Muhammad Abdul-Mageed
144
125
0
31 Aug 2021
Lingxi: A Diversity-aware Chinese Modern Poetry Generation System
Xinran Zhang
Maosong Sun
Jiafeng Liu
Xiaobing Li
70
2
0
27 Aug 2021
Towards Offensive Language Identification for Tamil Code-Mixed YouTube Comments and Posts
Charangan Vasantharajan
Uthayasanker Thayasivam
57
38
0
24 Aug 2021
AMMUS : A Survey of Transformer-based Pretrained Models in Natural Language Processing
Katikapalli Subramanyam Kalyan
A. Rajasekharan
S. Sangeetha
VLM
LM&MA
107
270
0
12 Aug 2021
Learning to Look Inside: Augmenting Token-Based Encoders with Character-Level Information
Yuval Pinter
Amanda Stent
Mark Dredze
Jacob Eisenstein
35
7
0
01 Aug 2021
Simultaneous Speech Translation for Live Subtitling: from Delay to Display
Alina Karakanta
Sara Papi
Matteo Negri
Marco Turchi
57
10
0
19 Jul 2021
Direct speech-to-speech translation with discrete units
Ann Lee
Peng-Jen Chen
Changhan Wang
Jiatao Gu
Sravya Popuri
...
Yossi Adi
Qing He
Yun Tang
J. Pino
Wei-Ning Hsu
91
192
0
12 Jul 2021
A Comparative Study of Modular and Joint Approaches for Speaker-Attributed ASR on Monaural Long-Form Audio
Naoyuki Kanda
Xiong Xiao
Jian Wu
Tianyan Zhou
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Zhuo Chen
Takuya Yoshioka
73
14
0
06 Jul 2021
Instant One-Shot Word-Learning for Context-Specific Neural Sequence-to-Sequence Speech Recognition
Christian Huber
Juan Hussain
Sebastian Stüker
A. Waibel
73
27
0
05 Jul 2021
Modeling Target-side Inflection in Placeholder Translation
Ryokan Ri
Toshiaki Nakazawa
Yoshimasa Tsuruoka
46
1
0
01 Jul 2021
On joint training with interfaces for spoken language understanding
A. Raju
Milind Rao
Gautam Tiwari
Pranav Dheram
Bryan Anderson
Zhe Zhang
Chul Lee
Bach Bui
Ariya Rastrow
VLM
55
11
0
30 Jun 2021
XL-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Languages
Tahmid Hasan
Abhik Bhattacharjee
Md. Saiful Islam
Kazi Samin Mubasshir
Yuan-Fang Li
Yong-Bin Kang
M. Rahman
Rifat Shahriyar
104
373
0
25 Jun 2021
Adapt-and-Distill: Developing Small, Fast and Effective Pretrained Language Models for Domains
Yunzhi Yao
Shaohan Huang
Wenhui Wang
Li Dong
Furu Wei
VLM
ALM
77
49
0
25 Jun 2021
Charformer: Fast Character Transformers via Gradient-based Subword Tokenization
Yi Tay
Vinh Q. Tran
Sebastian Ruder
Jai Gupta
Hyung Won Chung
Dara Bahri
Zhen Qin
Simon Baumgartner
Cong Yu
Donald Metzler
157
162
0
23 Jun 2021
Information Retrieval for ZeroSpeech 2021: The Submission by University of Wroclaw
J. Chorowski
Grzegorz Ciesielski
Jaroslaw Dzikowski
Adrian Łańcucki
R. Marxer
Mateusz Opala
P. Pusz
Paweł Rychlikowski
Michal Stypulkowski
72
12
0
22 Jun 2021
Distributed Deep Learning in Open Collaborations
Michael Diskin
Alexey Bukhtiyarov
Max Ryabinin
Lucile Saulnier
Quentin Lhoest
...
Denis Mazur
Ilia Kobelev
Yacine Jernite
Thomas Wolf
Gennady Pekhimenko
FedML
129
59
0
18 Jun 2021
Modeling Worlds in Text
Prithviraj Ammanabrolu
Mark O. Riedl
VGen
LM&Ro
63
14
0
17 Jun 2021
Momentum Pseudo-Labeling for Semi-Supervised Speech Recognition
Yosuke Higuchi
Niko Moritz
Jonathan Le Roux
Takaaki Hori
VLM
129
52
0
16 Jun 2021
Consistency Regularization for Cross-Lingual Fine-Tuning
Bo Zheng
Li Dong
Shaohan Huang
Wenhui Wang
Zewen Chi
Saksham Singhal
Wanxiang Che
Ting Liu
Xia Song
Furu Wei
60
58
0
15 Jun 2021
Overcoming Domain Mismatch in Low Resource Sequence-to-Sequence ASR Models using Hybrid Generated Pseudotranscripts
Chak-Fai Li
Francis Keith
William Hartmann
M. Snover
O. Kimball
57
4
0
14 Jun 2021
Evaluating Various Tokenizers for Arabic Text Classification
Zaid Alyafeai
Maged S. Al-Shaibani
Mustafa Ghaleb
Irfan Ahmad
84
44
0
14 Jun 2021
Bridging Subword Gaps in Pretrain-Finetune Paradigm for Natural Language Generation
Xin Liu
Baosong Yang
Dayiheng Liu
Haibo Zhang
Weihua Luo
Min Zhang
Haiying Zhang
Jinsong Su
63
18
0
11 Jun 2021
Diverse Pretrained Context Encodings Improve Document Translation
Domenic Donato
Lei Yu
Chris Dyer
54
16
0
07 Jun 2021
Dual Script E2E framework for Multilingual and Code-Switching ASR
Mari Ganesh Kumar
Jom Kuriakose
Anand Thyagachandran
A. Arunkumar
Ashish Seth
L. D. Prasad
Saish Jaiswal
Anusha Prakash
H. Murthy
92
10
0
02 Jun 2021
Sub-Character Tokenization for Chinese Pretrained Language Models
Chenglei Si
Zhengyan Zhang
Yingfa Chen
Fanchao Qi
Xiaozhi Wang
Zhiyuan Liu
Yasheng Wang
Qun Liu
Maosong Sun
53
12
0
01 Jun 2021
Lightweight Cross-Lingual Sentence Representation Learning
Zhuoyuan Mao
Prakhar Gupta
Pei Wang
Chenhui Chu
Martin Jaggi
Sadao Kurohashi
VLM
124
9
0
28 May 2021
ByT5: Towards a token-free future with pre-trained byte-to-byte models
Linting Xue
Aditya Barua
Noah Constant
Rami Al-Rfou
Sharan Narang
Mihir Kale
Adam Roberts
Colin Raffel
149
508
0
28 May 2021
Joint Optimization of Tokenization and Downstream Model
Tatsuya Hiraoka
Sho Takase
Kei Uchiumi
Atsushi Keyaki
Naoaki Okazaki
66
17
0
26 May 2021
IntelliCAT: Intelligent Machine Translation Post-Editing with Quality Estimation and Translation Suggestion
Dongjun Lee
Junhyeong Ahn
Heesoo Park
Jaemin Jo
28
18
0
25 May 2021
Understanding the Properties of Minimum Bayes Risk Decoding in Neural Machine Translation
Mathias Müller
Rico Sennrich
63
62
0
18 May 2021
A Deep Metric Learning Approach to Account Linking
Aleem Khan
Elizabeth Fleming
N. Schofield
M. Bishop
Nicholas Andrews
59
23
0
15 May 2021
A Novel Estimator of Mutual Information for Learning to Disentangle Textual Representations
Pierre Colombo
Chloé Clavel
Pablo Piantanida
AAML
DRL
186
51
0
06 May 2021
Streaming end-to-end speech recognition with jointly trained neural feature enhancement
Chanwoo Kim
Abhinav Garg
Dhananjaya N. Gowda
Seongkyu Mun
C. Han
AuLLM
56
6
0
04 May 2021
Generating abstractive summaries of Lithuanian news articles using a transformer model
Lukas Stankevicius
M. Lukoševičius
50
3
0
23 Apr 2021
Acoustic Data-Driven Subword Modeling for End-to-End Speech Recognition
Wei Zhou
Mohammad Zeineldeen
Zuoyun Zheng
Ralf Schlüter
Hermann Ney
75
14
0
19 Apr 2021
Zero-shot Cross-lingual Transfer of Neural Machine Translation with Multilingual Pretrained Encoders
Guanhua Chen
Shuming Ma
Yun-Nung Chen
Li Dong
Dongdong Zhang
Jianxiong Pan
Wenping Wang
Furu Wei
73
41
0
18 Apr 2021
AMMU : A Survey of Transformer-based Biomedical Pretrained Language Models
Katikapalli Subramanyam Kalyan
A. Rajasekharan
S. Sangeetha
LM&MA
MedIm
115
170
0
16 Apr 2021
Robust Open-Vocabulary Translation from Visual Text Representations
Elizabeth Salesky
David Etter
Matt Post
VLM
74
42
0
16 Apr 2021
On the Robustness of Intent Classification and Slot Labeling in Goal-oriented Dialog Systems to Real-world Noise
Sailik Sengupta
Jason Krone
Saab Mansour
NoLa
22
13
0
14 Apr 2021
Domain Adaptation and Multi-Domain Adaptation for Neural Machine Translation: A Survey
Danielle Saunders
AI4CE
132
91
0
14 Apr 2021