1808.06226
Cited By
SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing
19 August 2018
Taku Kudo
John Richardson
GitHub (10925★)
Papers citing
"SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing"
50 / 1,950 papers shown
Title
Incorporating Context into Subword Vocabularies
Shaked Yehezkel
Yuval Pinter
100
10
0
13 Oct 2022
CLASP: Few-Shot Cross-Lingual Data Augmentation for Semantic Parsing
Andrew Rosenbaum
Saleh Soltan
Wael Hamza
Amir Saffari
Marco Damonte
Isabel Groves
97
32
0
13 Oct 2022
Foundation Transformers
Hongyu Wang
Shuming Ma
Shaohan Huang
Li Dong
Wenhui Wang
...
Barun Patra
Zhun Liu
Vishrav Chaudhary
Xia Song
Furu Wei
AI4CE
91
27
0
12 Oct 2022
SilverAlign: MT-Based Silver Data Algorithm For Evaluating Word Alignment
Abdullatif Köksal
Silvia Severini
Hinrich Schütze
73
0
0
12 Oct 2022
Exploring Segmentation Approaches for Neural Machine Translation of Code-Switched Egyptian Arabic-English Text
Marwa Gaser
Manuel Mager
Injy Hamed
Nizar Habash
Slim Abdennadher
Ngoc Thang Vu
81
7
0
11 Oct 2022
Decoupled Context Processing for Context Augmented Language Modeling
Zonglin Li
Ruiqi Guo
Surinder Kumar
RALM
KELM
82
24
0
11 Oct 2022
Better Than Whitespace: Information Retrieval for Languages without Custom Tokenizers
Odunayo Ogundepo
Xinyu Crystina Zhang
Jimmy J. Lin
41
2
0
11 Oct 2022
CTC Alignments Improve Autoregressive Translation
Brian Yan
Siddharth Dalmia
Yosuke Higuchi
Graham Neubig
Florian Metze
A. Black
Shinji Watanabe
93
33
0
11 Oct 2022
Improving Robustness of Retrieval Augmented Translation via Shuffling of Suggestions
Cuong Hoang
Devendra Singh Sachan
Prashant Mathur
Brian Thompson
Marcello Federico
77
2
0
11 Oct 2022
Improving Retrieval Augmented Neural Machine Translation by Controlling Source and Fuzzy-Match Interactions
Cuong Hoang
Devendra Singh Sachan
Prashant Mathur
Brian Thompson
Marcello Federico
149
9
0
10 Oct 2022
Automatic Evaluation and Analysis of Idioms in Neural Machine Translation
Christos Baziotis
Prashant Mathur
Eva Hasler
69
9
0
10 Oct 2022
SpeechUT: Bridging Speech and Text with Hidden-Unit for Encoder-Decoder Based Speech-Text Pre-training
Zi-Hua Zhang
Long Zhou
Junyi Ao
Shujie Liu
Lirong Dai
Jinyu Li
Furu Wei
128
58
0
07 Oct 2022
A New Path: Scaling Vision-and-Language Navigation with Synthetic Instructions and Imitation Learning
Aishwarya Kamath
Peter Anderson
Su Wang
Jing Yu Koh
Alexander Ku
Austin Waters
Yinfei Yang
Jason Baldridge
Zarana Parekh
LM&Ro
100
48
0
06 Oct 2022
Toxicity in Multilingual Machine Translation at Scale
Marta R. Costa-jussá
Eric Michael Smith
C. Ropers
Daniel Licht
Jean Maillard
Javier Ferrando
Carlos Escolano
96
27
0
06 Oct 2022
JoeyS2T: Minimalistic Speech-to-Text Modeling with JoeyNMT
Mayumi Ohta
Julia Kreutzer
Stefan Riezler
53
0
0
05 Oct 2022
Revisiting Syllables in Language Modelling and their Application on Low-Resource Machine Translation
Arturo Oncevay
Kervy Rivas Rojas
Liz Karen Chavez Sanchez
Roberto Zariquiey
63
0
0
05 Oct 2022
Honest Students from Untrusted Teachers: Learning an Interpretable Question-Answering Pipeline from a Pretrained Language Model
Jacob Eisenstein
D. Andor
Bernd Bohnet
Michael Collins
David M. Mimno
LRM
290
25
0
05 Oct 2022
Code-Switching without Switching: Language Agnostic End-to-End Speech Translation
Christian Huber
Enes Yavuz Ugan
A. Waibel
45
14
0
04 Oct 2022
Recitation-Augmented Language Models
Zhiqing Sun
Xuezhi Wang
Yi Tay
Yiming Yang
Denny Zhou
RALM
275
65
0
04 Oct 2022
Enriching Vulnerability Reports Through Automated and Augmented Description Summarization
Hattan Althebeiti
David A. Mohaisen
32
4
0
03 Oct 2022
The boundaries of meaning: a case study in neural machine translation
Yuri Balashov
33
2
0
02 Oct 2022
QUAK: A Synthetic Quality Estimation Dataset for Korean-English Neural Machine Translation
Sugyeong Eo
Chanjun Park
Hyeonseok Moon
Jaehyung Seo
Gyeongmin Kim
Jungseob Lee
Heu-Jeoung Lim
51
1
0
30 Sep 2022
Language-Family Adapters for Low-Resource Multilingual Neural Machine Translation
Alexandra Chronopoulou
Dario Stojanovski
Alexander Fraser
161
19
0
30 Sep 2022
polyBERT: A chemical language model to enable fully machine-driven ultrafast polymer informatics
Christopher Kuenneth
R. Ramprasad
101
113
0
29 Sep 2022
TVLT: Textless Vision-Language Transformer
Zineng Tang
Jaemin Cho
Yixin Nie
Joey Tianyi Zhou
VLM
137
31
0
28 Sep 2022
Revamping Multilingual Agreement Bidirectionally via Switched Back-translation for Multilingual Neural Machine Translation
Hongyuan Lu
Haoyang Huang
Shuming Ma
Dongdong Zhang
Furu Wei
Wai Lam
52
0
0
28 Sep 2022
Structured Summarization: Unified Text Segmentation and Segment Labeling as a Generation Task
Hakan Inan
Rashi Rungta
Yashar Mehdad
67
9
0
28 Sep 2022
Improving Multilingual Neural Machine Translation System for Indic Languages
Sudhansu Bala Das
Atharv Biradar
Tapas Kumar Mishra
B. Patra
109
31
0
27 Sep 2022
Meta-Learning a Cross-lingual Manifold for Semantic Parsing
Tom Sherborne
Mirella Lapata
103
13
0
26 Sep 2022
WinoDict: Probing language models for in-context word acquisition
Julian Martin Eisenschlos
Jeremy R. Cole
Fangyu Liu
William W. Cohen
KELM
58
13
0
25 Sep 2022
Dodging the Data Bottleneck: Automatic Subtitling with Automatically Segmented ST Corpora
Sara Papi
Alina Karakanta
Matteo Negri
Marco Turchi
73
8
0
21 Sep 2022
Show, Interpret and Tell: Entity-aware Contextualised Image Captioning in Wikipedia
K. Nguyen
Ali Furkan Biten
Andrés Mafla
Lluís Gómez
Dimosthenis Karatzas
56
11
0
21 Sep 2022
WeLM: A Well-Read Pre-trained Language Model for Chinese
Hui Su
Xiao Zhou
Houjin Yu
Xiaoyu Shen
Yuwen Chen
Zilin Zhu
Yang Yu
Jie Zhou
87
23
0
21 Sep 2022
LINGUIST: Language Model Instruction Tuning to Generate Annotated Utterances for Intent Classification and Slot Tagging
Andrew Rosenbaum
Saleh Soltan
Wael Hamza
Yannick Versley
M. Boese
75
44
0
20 Sep 2022
Relaxed Attention for Transformer Models
Timo Lohrenz
Björn Möller
Zhengyang Li
Tim Fingscheidt
KELM
53
12
0
20 Sep 2022
Vega-MT: The JD Explore Academy Translation System for WMT22
Changtong Zan
Keqin Peng
Liang Ding
Baopu Qiu
Boan Liu
...
Zhenghang Zhang
Chuang Liu
Weifeng Liu
Yibing Zhan
Dacheng Tao
VLM
91
14
0
20 Sep 2022
Distribution Aware Metrics for Conditional Natural Language Generation
David M. Chan
Yiming Ni
David A. Ross
Sudheendra Vijayanarasimhan
Austin Myers
John F. Canny
77
4
0
15 Sep 2022
Rethinking Round-Trip Translation for Machine Translation Evaluation
Terry Yue Zhuo
Xingliang Yuan
Xuanli He
Trevor Cohn
LRM
47
2
0
15 Sep 2022
Simple and Effective Gradient-Based Tuning of Sequence-to-Sequence Models
Jared Lichtarge
Chris Alberti
Shankar Kumar
88
4
0
10 Sep 2022
Adapting to Non-Centered Languages for Zero-shot Multilingual Translation
Zhi Qu
Taro Watanabe
110
7
0
09 Sep 2022
MaxMatch-Dropout: Subword Regularization for WordPiece
Tatsuya Hiraoka
93
9
0
09 Sep 2022
Pre-Training a Graph Recurrent Network for Language Representation
Yile Wang
Linyi Yang
Zhiyang Teng
M. Zhou
Yue Zhang
GNN
81
1
0
08 Sep 2022
On the Complementarity between Pre-Training and Random-Initialization for Resource-Rich Machine Translation
Changtong Zan
Liang Ding
Li Shen
Yu Cao
Weifeng Liu
Dacheng Tao
97
21
0
07 Sep 2022
Improving the Cross-Lingual Generalisation in Visual Question Answering
Farhad Nooralahzadeh
Rico Sennrich
89
6
0
07 Sep 2022
Adam Mickiewicz University at WMT 2022: NER-Assisted and Quality-Aware Neural Machine Translation
Artur Nowakowski
Gabriela Pałka
Kamil Guttmann
Miko Pokrywka
62
5
0
07 Sep 2022
Informative Language Representation Learning for Massively Multilingual Neural Machine Translation
Renren Jin
Deyi Xiong
76
5
0
04 Sep 2022
Topic Detection in Continuous Sign Language Videos
Álvaro Budria
Laia Tarrés
Gerard I. Gállego
Francesc Moreno-Noguer
Jordi Torres
Xavier Giró-i-Nieto
SLR
VLM
91
1
0
01 Sep 2022
Transformers are Sample-Efficient World Models
Vincent Micheli
Eloi Alonso
François Fleuret
VLM
OffRL
185
189
0
01 Sep 2022
Attention Enhanced Citrinet for Speech Recognition
Xianchao Wu
77
1
0
01 Sep 2022
Deep Sparse Conformer for Speech Recognition
Xianchao Wu
41
2
0
01 Sep 2022