SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing

19 August 2018
Taku Kudo
John Richardson

Papers citing "SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing"

50 / 1,923 papers shown
Modelling Latent Skills for Multitask Language Generation
Kris Cao
Dani Yogatama
14
3
0
21 Feb 2020
Imputer: Sequence Modelling via Imputation and Dynamic Programming
William Chan
Chitwan Saharia
Geoffrey E. Hinton
Mohammad Norouzi
Navdeep Jaitly
BDL
AI4TS
21
114
0
20 Feb 2020
Estimating Training Data Influence by Tracing Gradient Descent
G. Pruthi
Frederick Liu
Mukund Sundararajan
Satyen Kale
TDI
10
380
0
19 Feb 2020
Controlling Computation versus Quality for Neural Sequence Models
Ankur Bapna
N. Arivazhagan
Orhan Firat
27
30
0
17 Feb 2020
Speech Corpus of Ainu Folklore and End-to-end Speech Recognition for Ainu Language
Kohei Matsuura
Sei Ueno
Masato Mimura
S. Sakai
Tatsuya Kawahara
CVBM
18
13
0
16 Feb 2020
FQuAD: French Question Answering Dataset
Martin d'Hoffschmidt
Wacim Belblidia
Tom Brendlé
Quentin Heinrich
Maxime Vidal
26
98
0
14 Feb 2020
fastai: A Layered API for Deep Learning
Jeremy Howard
Sylvain Gugger
AI4CE
20
857
0
11 Feb 2020
Learning Coupled Policies for Simultaneous Machine Translation using Imitation Learning
Philip Arthur
Trevor Cohn
Gholamreza Haffari
14
18
0
11 Feb 2020
Accelerating RNN Transducer Inference via One-Step Constrained Beam Search
Juntae Kim
Yoonhan Lee
20
22
0
10 Feb 2020
A Multilingual View of Unsupervised Machine Translation
Xavier Garcia
Pierre Foret
Thibault Sellam
Ankur P. Parikh
54
37
0
07 Feb 2020
A deep-learning view of chemical space designed to facilitate drug discovery
P. Maragakis
Hunter M. Nisonoff
B. Cole
D. Shaw
49
28
0
07 Feb 2020
Graph Constrained Reinforcement Learning for Natural Language Action Spaces
Prithviraj Ammanabrolu
Matthew J. Hausknecht
AI4CE
LLMAG
17
127
0
23 Jan 2020
Pre-training via Leveraging Assisting Languages and Data Selection for Neural Machine Translation
Haiyue Song
Raj Dabre
Zhuoyuan Mao
Fei Cheng
Sadao Kurohashi
Eiichiro Sumita
16
2
0
23 Jan 2020
Multilingual Denoising Pre-training for Neural Machine Translation
Yinhan Liu
Jiatao Gu
Naman Goyal
Xian Li
Sergey Edunov
Marjan Ghazvininejad
M. Lewis
Luke Zettlemoyer
AI4CE
AIMat
55
1,773
0
22 Jan 2020
Normalization of Input-output Shared Embeddings in Text Generation Models
Jinyang Liu
Yujia Zhai
Zizhong Chen
28
0
0
22 Jan 2020
Unsupervised Sentiment Analysis for Code-mixed Data
Siddharth Yadav
Tanmoy Chakraborty
21
15
0
20 Jan 2020
Streaming automatic speech recognition with the transformer model
Niko Moritz
Takaaki Hori
Jonathan Le Roux
21
184
0
08 Jan 2020
Language Models Are An Effective Patient Representation Learning Technique For Electronic Health Record Data
E. Steinberg
Kenneth Jung
Jason Alan Fries
Conor K. Corbin
Stephen R. Pfohl
N. Shah
29
103
0
06 Jan 2020
Exploring Benefits of Transfer Learning in Neural Machine Translation
Tom Kocmi
29
17
0
06 Jan 2020
A Comprehensive Survey of Multilingual Neural Machine Translation
Raj Dabre
Chenhui Chu
Anoop Kunchukuttan
LRM
36
33
0
04 Jan 2020
TED: A Pretrained Unsupervised Summarization Model with Theme Modeling and Denoising
Ziyi Yang
Chenguang Zhu
R. Gmyr
Michael Zeng
Xuedong Huang
Eric Darve
23
61
0
03 Jan 2020
Leveraging Lead Bias for Zero-shot Abstractive News Summarization
Chenguang Zhu
Ziyi Yang
R. Gmyr
Michael Zeng
Xuedong Huang
16
19
0
25 Dec 2019
BERTje: A Dutch BERT Model
Wietse de Vries
Andreas van Cranenburgh
Arianna Bisazza
Tommaso Caselli
Gertjan van Noord
Malvina Nissim
VLM
SSeg
25
291
0
19 Dec 2019
Multilingual is not enough: BERT for Finnish
Antti Virtanen
Jenna Kanerva
Rami Ilo
Jouni Luoma
Juhani Luotolahti
T. Salakoski
Filip Ginter
S. Pyysalo
36
277
0
15 Dec 2019
Personalized Patent Claim Generation and Measurement
Jieh-Sheng Lee
16
4
0
07 Dec 2019
Neural Machine Translation: A Review and Survey
Felix Stahlberg
3DV
AI4TS
MedIm
28
313
0
04 Dec 2019
Using Sequence-to-Sequence Learning for Repairing C Vulnerabilities
Zimin Chen
Steve Kommrusch
Martin Monperrus
11
5
0
04 Dec 2019
Leveraging Contextual Embeddings for Detecting Diachronic Semantic Shift
Matej Martinc
Petra Kralj Novak
Senja Pollak
8
72
0
02 Dec 2019
Fiction Sentence Expansion and Enhancement via Focused Objective and Novelty Curve Sampling
Yuri Safovich
A. Azaria
9
7
0
02 Dec 2019
Jejueo Datasets for Machine Translation and Speech Synthesis
Kyubyong Park
Yo Joong Choe
Jiyeon Ham
14
5
0
27 Nov 2019
Simultaneous Neural Machine Translation using Connectionist Temporal Classification
Katsuki Chousa
Katsuhito Sudoh
Satoshi Nakamura
23
5
0
27 Nov 2019
JParaCrawl: A Large Scale Web-Based English-Japanese Parallel Corpus
Makoto Morishita
Jun Suzuki
Masaaki Nagata
LRM
38
64
0
25 Nov 2019
End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures
Gabriel Synnaeve
Qiantong Xu
Jacob Kahn
Tatiana Likhomanenko
Edouard Grave
Vineel Pratap
Anuroop Sriram
Vitaliy Liptchinsky
R. Collobert
SSL
AI4TS
36
246
0
19 Nov 2019
The Eighth Dialog System Technology Challenge
Seokhwan Kim
Michel Galley
Chulaka Gunasekara
Sungjin Lee
Adam Atkinson
...
Tim K. Marks
Abhinav Rastogi
Xiaoxue Zang
Srinivas Sunkara
Raghav Gupta
VLM
18
65
0
14 Nov 2019
CCMatrix: Mining Billions of High-Quality Parallel Sentences on the WEB
Holger Schwenk
Guillaume Wenzek
Sergey Edunov
Edouard Grave
Armand Joulin
33
256
0
10 Nov 2019
A Bilingual Generative Transformer for Semantic Sentence Embedding
John Wieting
Graham Neubig
Taylor Berg-Kirkpatrick
22
28
0
10 Nov 2019
CamemBERT: a Tasty French Language Model
Louis Martin
Benjamin Muller
Pedro Ortiz Suarez
Yoann Dupont
Laurent Romary
Eric Villemonte de la Clergerie
Djamé Seddah
Benoît Sagot
42
956
0
10 Nov 2019
CCAligned: A Massive Collection of Cross-Lingual Web-Document Pairs
Ahmed El-Kishky
Vishrav Chaudhary
Francisco Guzmán
Philipp Koehn
25
198
0
10 Nov 2019
Enforcing Encoder-Decoder Modularity in Sequence-to-Sequence Models
Siddharth Dalmia
Abdel-rahman Mohamed
M. Lewis
Florian Metze
Luke Zettlemoyer
19
10
0
09 Nov 2019
A Simplified Fully Quantized Transformer for End-to-end Speech Recognition
Alex Bie
Bharat Venkitesh
João Monteiro
Md. Akmal Haidar
Mehdi Rezagholizadeh
MQ
32
27
0
09 Nov 2019
Unsupervised Cross-lingual Representation Learning at Scale
Alexis Conneau
Kartikay Khandelwal
Naman Goyal
Vishrav Chaudhary
Guillaume Wenzek
Francisco Guzmán
Edouard Grave
Myle Ott
Luke Zettlemoyer
Veselin Stoyanov
44
6,393
0
05 Nov 2019
RNN-T For Latency Controlled ASR With Improved Beam Search
Mahaveer Jain
Kjell Schubert
Jay Mahadeokar
Ching-Feng Yeh
Kaustubh Kalgaonkar
Anuroop Sriram
Christian Fuegen
M. Seltzer
14
44
0
05 Nov 2019
Machine Translation of Restaurant Reviews: New Corpus for Domain Adaptation and Robustness
Alexandre Berard
Ioan Calapodescu
Marc Dymetman
Claude Roux
Jean-Luc Meunier
Vassilina Nikoulina
13
27
0
31 Oct 2019
Naver Labs Europe's Systems for the Document-Level Generation and Translation Task at WNGT 2019
Fahimeh Saleh
Alexandre Berard
Ioan Calapodescu
Laurent Besacier
VLM
23
14
0
31 Oct 2019
Fill in the Blanks: Imputing Missing Sentences for Larger-Context Neural Machine Translation
Sébastien Jean
Ankur Bapna
Orhan Firat
19
7
0
30 Oct 2019
Transformer-based Cascaded Multimodal Speech Translation
Zixiu "Alex" Wu
Ozan Caglayan
Julia Ive
Josiah Wang
Lucia Specia
25
7
0
29 Oct 2019
Big Bidirectional Insertion Representations for Documents
Lala Li
William Chan
6
4
0
29 Oct 2019
Transformer-Transducer: End-to-End Speech Recognition with Self-Attention
Ching-Feng Yeh
Jay Mahadeokar
Kaustubh Kalgaonkar
Yongqiang Wang
Duc Le
Mahaveer Jain
Kjell Schubert
Christian Fuegen
M. Seltzer
27
147
0
28 Oct 2019
Evaluating Lottery Tickets Under Distributional Shifts
Shrey Desai
Hongyuan Zhan
Ahmed Aly
UQCV
OOD
21
41
0
28 Oct 2019
Modeling Inter-Speaker Relationship in XLNet for Contextual Spoken Language Understanding
Jonggu Kim
Jong-Hyeok Lee
11
1
0
28 Oct 2019