Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates

29 April 2018
Taku Kudo

Papers citing "Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates"

50 / 617 papers shown
Title
Knowledge-Aware Audio-Grounded Generative Slot Filling for Limited Annotated Data
Guangzhi Sun
C. Zhang
Ivan Vulić
Paweł Budzianowski
P. Woodland
39
6
0
04 Jul 2023
Should you marginalize over possible tokenizations?
Nadezhda Chirkova
Germán Kruszewski
Jos Rozen
Marc Dymetman
22
10
0
30 Jun 2023
Tokenization and the Noiseless Channel
Vilém Zouhar
Clara Meister
Juan Luis Gastaldi
Li Du
Mrinmaya Sachan
Ryan Cotterell
30
31
0
29 Jun 2023
Federated Self-Learning with Weak Supervision for Speech Recognition
Milind Rao
Gopinath Chennupati
Gautam Tiwari
Anit Kumar Sahu
A. Raju
Ariya Rastrow
J. Droppo
26
3
0
21 Jun 2023
MIR-GAN: Refining Frame-Level Modality-Invariant Representations with Adversarial Network for Audio-Visual Speech Recognition
Yuchen Hu
Chen Chen
Ruizhe Li
Heqing Zou
Chng Eng Siong
GAN
42
9
0
18 Jun 2023
Hearing Lips in Noise: Universal Viseme-Phoneme Mapping and Transfer for Robust Audio-Visual Speech Recognition
Yuchen Hu
Ruizhe Li
Chen Chen
Chengwei Qin
Qiu-shi Zhu
Eng Siong Chng
39
5
0
18 Jun 2023
How do different tokenizers perform on downstream tasks in scriptio continua languages?: A case study in Japanese
T. Fujii
Koki Shibata
Atsuki Yamaguchi
Terufumi Morishita
Yasuhiro Sogawa
26
13
0
16 Jun 2023
Learning Cross-lingual Mappings for Data Augmentation to Improve Low-Resource Speech Recognition
Muhammad Umar Farooq
Thomas Hain
22
2
0
14 Jun 2023
Better Generalization with Semantic IDs: A Case Study in Ranking for Recommendations
Anima Singh
Trung Vu
Nikhil Mehta
Raghunandan H. Keshavan
M. Sathiamoorthy
...
Lukasz Heldt
Li Wei
Devansh Tandon
Ed H. Chi
Xinyang Yi
29
19
0
13 Jun 2023
AutoML in the Age of Large Language Models: Current Challenges, Future Opportunities and Risks
Alexander Tornede
Difan Deng
Theresa Eimer
Joseph Giovanelli
Aditya Mohan
...
Sarah Segel
Daphne Theodorakopoulos
Tanja Tornede
Henning Wachsmuth
Marius Lindauer
36
23
0
13 Jun 2023
Improving Long Context Document-Level Machine Translation
Christian Herold
Hermann Ney
20
10
0
08 Jun 2023
On Search Strategies for Document-Level Neural Machine Translation
Christian Herold
Hermann Ney
10
1
0
08 Jun 2023
Improving Language Model Integration for Neural Machine Translation
Christian Herold
Yingbo Gao
Mohammad Zeineldeen
Hermann Ney
29
2
0
08 Jun 2023
Assessing the Importance of Frequency versus Compositionality for Subword-based Tokenization in NMT
Benoist Wolleb
Romain Silvestri
Giorgos Vernikos
Ljiljana Dolamic
Andrei Popescu-Belis
22
4
0
02 Jun 2023
Strategies for improving low resource speech to text translation relying on pre-trained ASR models
Santosh Kesiraju
Marek Sarvaš
T. Pavlíček
Cécile Macaire
Alejandro Ciuba
15
4
0
31 May 2023
Exploration of Efficient End-to-End ASR using Discretized Input from Self-Supervised Learning
Xuankai Chang
Brian Yan
Yuya Fujita
Takashi Maekaku
Shinji Watanabe
24
37
0
29 May 2023
Byte-Level Grammatical Error Correction Using Synthetic and Curated Corpora
Svanhvít Lilja Ingólfsdóttir
Pétur Orri Ragnarsson
H. Jónsson
Haukur Barri Símonarson
Vilhjálmur Þorsteinsson
Vésteinn Snæbjarnarson
SyDa
38
9
0
29 May 2023
An Open-Source Gloss-Based Baseline for Spoken to Signed Language Translation
Amit Moryossef
Mathias Müller
Anne Gohring
Zifan Jiang
Yoav Goldberg
Sarah Ebling
SLR
23
11
0
28 May 2023
From Characters to Words: Hierarchical Pre-trained Language Model for Open-vocabulary Language Understanding
Li Sun
F. Luisier
Kayhan Batmanghelich
D. Florêncio
Changrong Zhang
VLM
23
6
0
23 May 2023
Multilingual Pixel Representations for Translation and Effective Cross-lingual Transfer
Elizabeth Salesky
Neha Verma
Philipp Koehn
Matt Post
26
14
0
23 May 2023
CompoundPiece: Evaluating and Improving Decompounding Performance of Language Models
Benjamin Minixhofer
Jonas Pfeiffer
Ivan Vulić
32
6
0
23 May 2023
BM25 Query Augmentation Learned End-to-End
Xiaoyin Chen
Sam Wiseman
33
1
0
23 May 2023
Do All Languages Cost the Same? Tokenization in the Era of Commercial Language Models
Orevaoghene Ahia
Sachin Kumar
Hila Gonen
Jungo Kasai
David R. Mortensen
Noah A. Smith
Yulia Tsvetkov
53
82
0
23 May 2023
Machine Translation by Projecting Text into the Same Phonetic-Orthographic Space Using a Common Encoding
Amit Kumar
Shantipriya Parida
A. Pratap
Anil Kumar Singh
16
1
0
21 May 2023
Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages
Ayyoob Imani
Peiqin Lin
Amir Hossein Kargaran
Silvia Severini
Masoud Jalili Sabet
...
Chunlan Ma
Helmut Schmid
André F. T. Martins
François Yvon
Hinrich Schütze
ALM
LRM
42
95
0
20 May 2023
Pseudo-Label Training and Model Inertia in Neural Machine Translation
B. Hsu
Anna Currey
Xing Niu
Maria Nădejde
Georgiana Dinu
ODL
58
2
0
19 May 2023
Accelerating Transformer Inference for Translation via Parallel Decoding
Andrea Santilli
Silvio Severino
Emilian Postolache
Valentino Maiorca
Michele Mancusi
R. Marin
Emanuele Rodolà
41
79
0
17 May 2023
Language Model Tokenizers Introduce Unfairness Between Languages
Aleksandar Petrov
Emanuele La Malfa
Philip Torr
Adel Bibi
45
97
0
17 May 2023
Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition
Yuchen Hu
Ruizhe Li
Chen Chen
Heqing Zou
Qiu-shi Zhu
Eng Siong Chng
34
7
0
16 May 2023
Subword Segmental Machine Translation: Unifying Segmentation and Target Sentence Generation
Francois Meyer
Jan Buys
43
8
0
11 May 2023
Effects of sub-word segmentation on performance of transformer language models
Jue Hou
Anisia Katinskaia
Anh Vu
R. Yangarber
21
4
0
09 May 2023
Robust Acoustic and Semantic Contextual Biasing in Neural Transducers for Speech Recognition
Xuandi Fu
Kanthashree Mysore Sathyendra
Ankur Gandhe
Jing Liu
Grant P. Strimel
Ross McGowan
Athanasios Mouchtaris
33
14
0
09 May 2023
Target-Side Augmentation for Document-Level Machine Translation
Guangsheng Bao
Zhiyang Teng
Yue Zhang
49
10
0
08 May 2023
What changes when you randomly choose BPE merge operations? Not much
Jonne Saleva
Constantine Lignos
33
6
0
04 May 2023
Training and Evaluation of a Multilingual Tokenizer for GPT-SW3
Felix Stollenwerk
31
7
0
28 Apr 2023
Semantic Tokenizer for Enhanced Natural Language Processing
Sandeep Mehta
Darpan Shah
Ravindra Kulkarni
Cornelia Caragea
VLM
18
3
0
24 Apr 2023
Downstream Task-Oriented Neural Tokenizer Optimization with Vocabulary Restriction as Post Processing
Tatsuya Hiraoka
Tomoya Iwakura
20
0
0
21 Apr 2023
From Words to Music: A Study of Subword Tokenization Techniques in Symbolic Music Generation
Adarsh Kumar
Pedro Sarmento
36
4
0
18 Apr 2023
Dual-Attention Neural Transducers for Efficient Wake Word Spotting in Speech Recognition
Saumya Yashmohini Sahai
Jing Liu
Thejaswi Muniyappa
Kanthashree Mysore Sathyendra
Anastasios Alexandridis
...
Ross McGowan
Ariya Rastrow
Feng-Ju Chang
Athanasios Mouchtaris
Siegfried Kunzmann
39
5
0
03 Apr 2023
Vision Transformers with Mixed-Resolution Tokenization
Tomer Ronen
Omer Levy
A. Golbert
ViT
11
21
0
01 Apr 2023
Dialog act guided contextual adapter for personalized speech recognition
Feng-Ju Chang
Thejaswi Muniyappa
Kanthashree Mysore Sathyendra
Kailin Wei
Grant P. Strimel
Ross McGowan
24
4
0
31 Mar 2023
BloombergGPT: A Large Language Model for Finance
Shijie Wu
Ozan Irsoy
Steven Lu
Vadim Dabravolski
Mark Dredze
Sebastian Gehrmann
P. Kambadur
David S. Rosenberg
Gideon Mann
AIFin
99
789
0
30 Mar 2023
SynthVSR: Scaling Up Visual Speech Recognition With Synthetic Supervision
Xubo Liu
Egor Lakomkin
Konstantinos Vougioukas
Pingchuan Ma
Honglie Chen
...
Niko Moritz
J. Kolár
Stavros Petridis
M. Pantic
Christian Fuegen
52
19
0
30 Mar 2023
TreePiece: Faster Semantic Parsing via Tree Tokenization
Sida I. Wang
Akshat Shrivastava
S. Livshits
20
5
0
30 Mar 2023
Exploring Natural Language Processing Methods for Interactive Behaviour Modelling
Guanhua Zhang
Matteo Bortoletto
Zhiming Hu
Lei Shi
Mihai Bâce
Andreas Bulling
17
3
0
28 Mar 2023
Auto-AVSR: Audio-Visual Speech Recognition with Automatic Labels
Pingchuan Ma
A. Haliassos
Adriana Fernandez-Lopez
Honglie Chen
Stavros Petridis
M. Pantic
27
107
0
25 Mar 2023
SwissBERT: The Multilingual Language Model for Switzerland
Jannis Vamvas
Johannes Graen
Rico Sennrich
38
6
0
23 Mar 2023
Language Model Behavior: A Comprehensive Survey
Tyler A. Chang
Benjamin Bergen
VLM
LRM
LM&MA
27
103
0
20 Mar 2023
On-the-fly Text Retrieval for End-to-End ASR Adaptation
Bolaji Yusuf
Aditya Gourav
Ankur Gandhe
I. Bulyko
KELM
RALM
43
4
0
20 Mar 2023
An Overview on Language Models: Recent Developments and Outlook
Chengwei Wei
Yun Cheng Wang
Bin Wang
C.-C. Jay Kuo
33
42
0
10 Mar 2023