1808.06226
Cited By
SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing
19 August 2018
Taku Kudo
John Richardson
Github (10925★)
Papers citing
"SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing"
50 / 1,950 papers shown
Title
LongFNT: Long-form Speech Recognition with Factorized Neural Transducer
Xun Gong
Yu-Huan Wu
Jinyu Li
Shujie Liu
Rui Zhao
Xie Chen
Y. Qian
RALM
67
11
0
17 Nov 2022
Speaker Adaptation for End-To-End Speech Recognition Systems in Noisy Environments
Dominik Wagner
Ilja Baumann
Sebastian P. Bayerl
Korbinian Riedhammer
Tobias Bocklet
77
2
0
16 Nov 2022
A Stable, Fast, and Fully Automatic Learning Algorithm for Predictive Coding Networks
Tommaso Salvatori
Yuhang Song
Yordan Yordanov
Beren Millidge
Zheng R. Xu
Lei Sha
Cornelius Emde
Rafal Bogacz
Thomas Lukasiewicz
99
13
0
16 Nov 2022
Findings of the Covid-19 MLIA Machine Translation Task
F. Casacuberta
Alexandru Ceausu
K. Choukri
Miltos Deligiannis
Miguel Domingo
...
V. Papavassiliou
Stelios Piperidis
Prokopis Prokopidis
Dimitris Roussis
M. Salah
30
0
0
14 Nov 2022
Calibrated Interpretation: Confidence Estimation in Semantic Parsing
Elias Stengel-Eskin
Benjamin Van Durme
UQLM
160
25
0
14 Nov 2022
ALBERT with Knowledge Graph Encoder Utilizing Semantic Similarity for Commonsense Question Answering
Byeongmin Choi
Yong-Sook Lee
Yeunwoong Kyung
Eunchan Kim
55
10
0
14 Nov 2022
Addressing Segmentation Ambiguity in Neural Linguistic Steganography
Jumon Nozaki
Yugo Murawaki
44
5
0
12 Nov 2022
Speech-to-Speech Translation For A Real-world Unwritten Language
Peng-Jen Chen
Ke M. Tran
Yilin Yang
Jingfei Du
Justine T. Kao
...
Sravya Popuri
Changhan Wang
J. Pino
Wei-Ning Hsu
Ann Lee
93
26
0
11 Nov 2022
Using Developer Discussions to Guide Fixing Bugs in Software
Sheena Panthaplackel
Miloš Gligorić
Junyi Jessy Li
Raymond J. Mooney
56
5
0
11 Nov 2022
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BigScience Workshop:
Teven Le Scao
Angela Fan
Christopher Akiki
...
Zhongli Xie
Zifan Ye
M. Bras
Younes Belkada
Thomas Wolf
VLM
474
2,398
0
09 Nov 2022
Self-conditioned Embedding Diffusion for Text Generation
Robin Strudel
Corentin Tallec
Florent Altché
Yilun Du
Yaroslav Ganin
...
Will Grathwohl
Nikolay Savinov
Sander Dieleman
Laurent Sifre
Rémi Leblond
DiffM
89
88
0
08 Nov 2022
Conciseness: An Overlooked Language Task
Felix Stahlberg
Aashish Kumar
Chris Alberti
Shankar Kumar
40
1
0
08 Nov 2022
Streaming, fast and accurate on-device Inverse Text Normalization for Automatic Speech Recognition
Yashesh Gaur
Nick Kibre
Jian Xue
Kangyuan Shu
Yuhui Wang
Issac Alphonso
Jinyu Li
Jiawei Liu
34
7
0
07 Nov 2022
Predictive Coding beyond Gaussian Distributions
Luca Pinchetti
Tommaso Salvatori
Yordan Yordanov
Beren Millidge
Yuhang Song
Thomas Lukasiewicz
UQCV
BDL
74
11
0
07 Nov 2022
Biased Self-supervised learning for ASR
Florian Kreyssig
Yangyang Shi
Jinxi Guo
Leda Sari
Abdel-rahman Mohamed
P. Woodland
SSL
87
3
0
04 Nov 2022
Phonetic-assisted Multi-Target Units Modeling for Improving Conformer-Transducer ASR system
Li Li
Dongxing Xu
Haoran Wei
Yanhua Long
98
2
0
03 Nov 2022
Continual Learning of Neural Machine Translation within Low Forgetting Risk Regions
Shuhao Gu
Bojie Hu
Yang Feng
CLL
85
15
0
03 Nov 2022
Conversation-oriented ASR with multi-look-ahead CBS architecture
Huaibo Zhao
S. Fujie
Tetsuji Ogawa
Jin Sakuma
Yusuke Kida
Tetsunori Kobayashi
92
3
0
02 Nov 2022
Fast and parallel decoding for transducer
Wei Kang
Liyong Guo
Fangjun Kuang
Long Lin
Mingshuang Luo
Zengwei Yao
Xiaoyu Yang
Piotr Żelasko
Daniel Povey
AI4TS
80
17
0
31 Oct 2022
Efficient Speech Translation with Dynamic Latent Perceivers
Ioannis Tsiamas
Gerard I. Gállego
José A. R. Fonollosa
Marta R. Costa-jussá
52
3
0
28 Oct 2022
Modeling structure-building in the brain with CCG parsing and large language models
Miloš Stanojević
Jonathan Brennan
Donald Dunagan
Mark Steedman
John T. Hale
44
14
0
28 Oct 2022
Random Utterance Concatenation Based Data Augmentation for Improving Short-video Speech Recognition
Yist Y. Lin
Tao Han
Haihua Xu
Van Tung Pham
Yerbolat Khassanov
Tze Yuang Chong
Yi He
Lu Lu
Zejun Ma
65
2
0
28 Oct 2022
Residual Adapters for Few-Shot Text-to-Speech Speaker Adaptation
Nobuyuki Morioka
Heiga Zen
Nanxin Chen
Yu Zhang
Yifan Ding
98
16
0
28 Oct 2022
Domain Adaptation of Machine Translation with Crowdworkers
Makoto Morishita
Jun Suzuki
Masaaki Nagata
44
3
0
28 Oct 2022
Token-level Sequence Labeling for Spoken Language Understanding using Compositional End-to-End Models
Siddhant Arora
Siddharth Dalmia
Brian Yan
Florian Metze
A. Black
Shinji Watanabe
37
12
0
27 Oct 2022
ACES: Translation Accuracy Challenge Sets for Evaluating Machine Translation Metrics
Chantal Amrhein
Nikita Moghe
Liane Guillou
ELM
106
23
0
27 Oct 2022
Make More of Your Data: Minimal Effort Data Augmentation for Automatic Speech Recognition and Translation
Tsz Kin Lam
Shigehiko Schamoni
Stefan Riezler
VLM
86
10
0
27 Oct 2022
Can language models handle recursively nested grammatical structures? A case study on comparing models and humans
Andrew Kyle Lampinen
ReLM
ELM
121
36
0
27 Oct 2022
Weight Averaging: A Simple Yet Effective Method to Overcome Catastrophic Forgetting in Automatic Speech Recognition
Steven Vander Eeckt
Hugo Van hamme
CLL
MoMe
113
15
0
27 Oct 2022
Too Brittle To Touch: Comparing the Stability of Quantization and Distillation Towards Developing Lightweight Low-Resource MT Models
Harshita Diddee
Sandipan Dandapat
Monojit Choudhury
T. Ganu
Kalika Bali
79
5
0
27 Oct 2022
End-to-End Speech to Intent Prediction to improve E-commerce Customer Support Voicebot in Hindi and English
Abhinav Goyal
Ashutosh Kumar Singh
Nikesh Garera
42
4
0
26 Oct 2022
Beyond English-Centric Bitexts for Better Multilingual Language Representation Learning
Barun Patra
Saksham Singhal
Shaohan Huang
Zewen Chi
Li Dong
Furu Wei
Vishrav Chaudhary
Xia Song
127
24
0
26 Oct 2022
Towards automatic generation of Piping and Instrumentation Diagrams (P&IDs) with Artificial Intelligence
Edwin Hirtreiter
Lukas Schulze Balhorn
Artur M. Schweidtmann
AI4CE
48
20
0
26 Oct 2022
Towards Better Few-Shot and Finetuning Performance with Forgetful Causal Language Models
Hao Liu
Xinyang Geng
Lisa Lee
Igor Mordatch
Sergey Levine
Sharan Narang
Pieter Abbeel
KELM
CLL
89
2
0
24 Oct 2022
ESB: A Benchmark For Multi-Domain End-to-End Speech Recognition
Sanchit Gandhi
Patrick von Platen
Alexander M. Rush
70
25
0
24 Oct 2022
Finding Memo: Extractive Memorization in Constrained Sequence Generation Tasks
Vikas Raunak
Arul Menezes
71
14
0
24 Oct 2022
Bootstrapping meaning through listening: Unsupervised learning of spoken sentence embeddings
Jian Zhu
Zuoyu Tian
Yadong Liu
Cong Zhang
Chia-wen Lo
SSL
82
2
0
23 Oct 2022
Translation Word-Level Auto-Completion: What can we achieve out of the box?
Yasmin Moslem
Rejwanul Haque
Andy Way
100
5
0
23 Oct 2022
Additive Interventions Yield Robust Multi-Domain Machine Translation Models
Elijah Matthew Rippeth
Matt Post
20
0
0
23 Oct 2022
Information-Transport-based Policy for Simultaneous Translation
Shaolei Zhang
Yang Feng
104
52
0
22 Oct 2022
Guided contrastive self-supervised pre-training for automatic speech recognition
Aparna Khare
Minhua Wu
Saurabhchand Bhati
J. Droppo
Roland Maas
SSL
59
0
0
22 Oct 2022
Audio-to-Intent Using Acoustic-Textual Subword Representations from End-to-End ASR
Pranay Dighe
Prateeth Nayak
Oggi Rudovic
Erik Marchi
Xiaochuan Niu
Ahmed H. Tewfik
84
4
0
21 Oct 2022
m^4Adapter: Multilingual Multi-Domain Adaptation for Machine Translation with a Meta-Adapter
Wen Lai
Alexandra Chronopoulou
Alexander Fraser
80
3
0
21 Oct 2022
SIT at MixMT 2022: Fluent Translation Built on Giant Pre-trained Models
A. Khan
Hrishikesh Kanade
G. Budhrani
Preet Jhanglani
Jia Xu
138
2
0
21 Oct 2022
Separating Grains from the Chaff: Using Data Filtering to Improve Multilingual Translation for Low-Resourced African Languages
Idris Abdulmumin
Michael Beukman
Jesujoba Oluwadara Alabi
Chris C. Emezue
Everlyn Asiko
...
Shamsuddeen Hassan Muhammad
Mofetoluwa Adeyemi
Oreen Yousuf
Sahib Singh
T. Gwadabe
105
9
0
19 Oct 2022
Simultaneous Translation for Unsegmented Input: A Sliding Window Approach
Sukanta Sen
Ondrej Bojar
Barry Haddow
21
4
0
18 Oct 2022
Acoustic-aware Non-autoregressive Spell Correction with Mask Sample Decoding
Ruchao Fan
Guoli Ye
Yashesh Gaur
Jinyu Li
38
4
0
16 Oct 2022
A Policy-based Approach to the SpecAugment Method for Low Resource E2E ASR
Rui Li
Guodong Ma
Dexin Zhao
Ranran Zeng
Xiaoyu Li
Haolin Huang
69
2
0
16 Oct 2022
HashFormers: Towards Vocabulary-independent Pre-trained Transformers
Huiyin Xue
Nikolaos Aletras
51
4
0
14 Oct 2022
On Compressing Sequences for Self-Supervised Speech Models
Yen Meng
Hsuan-Jui Chen
Jiatong Shi
Shinji Watanabe
Paola García
Hung-yi Lee
Hao Tang
SSL
56
15
0
13 Oct 2022