1312.3005
Cited By
One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling
11 December 2013
Ciprian Chelba
Tomas Mikolov
M. Schuster
Qi Ge
T. Brants
P. Koehn
T. Robinson
Papers citing
"One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling"
20 / 20 papers shown
Title
Beyond Masked and Unmasked: Discrete Diffusion Models via Partial Masking
Chen-Hao Chao
Wei-Fang Sun
Hanwen Liang
Chun-Yi Lee
Rahul G. Krishnan
DiffM
226
0
0
24 May 2025
Variational Autoencoding Discrete Diffusion with Enhanced Dimensional Correlations Modeling
Tianyu Xie
Shuchen Xue
Zijin Feng
Tianyang Hu
Jiacheng Sun
Zhenguo Li
Cheng Zhang
DiffM
726
0
0
23 May 2025
Insertion Language Models: Sequence Generation with Arbitrary-Position Insertions
Dhruvesh Patel
Aishwarya Sahoo
Avinash Amballa
Tahira Naseem
Tim G. J. Rudner
Andrew McCallum
KELM
79
0
0
09 May 2025
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
Marianne Arriola
Aaron Gokaslan
Justin T Chiu
Zhihan Yang
Zhixuan Qi
Jiaqi Han
Subham Sekhar Sahoo
Volodymyr Kuleshov
DiffM
144
11
0
12 Mar 2025
Adversarial Attacks on AI-Generated Text Detection Models: A Token Probability-Based Approach Using Embeddings
Ahmed K. Kadhim
Lei Jiao
Rishad Shafik
Ole-Christoffer Granmo
DeLMO
128
0
0
31 Jan 2025
Merino: Entropy-driven Design for Generative Language Models on IoT Devices
Youpeng Zhao
Ming Lin
Huadong Tang
Qiang Wu
Jun Wang
98
0
0
28 Jan 2025
Simplified and Generalized Masked Diffusion for Discrete Data
Jiaxin Shi
Kehang Han
Zehao Wang
Arnaud Doucet
Michalis K. Titsias
DiffM
115
89
0
17 Jan 2025
Selective Attention Improves Transformer
Yaniv Leviathan
Matan Kalman
Yossi Matias
79
10
0
03 Oct 2024
Undesirable Memorization in Large Language Models: A Survey
Ali Satvaty
Suzan Verberne
Fatih Turkmen
ELM
PILM
124
7
0
03 Oct 2024
Accelerating Large Language Model Pretraining via LFR Pedagogy: Learn, Focus, and Review
Neha Prakriya
Jui-Nan Yen
Cho-Jui Hsieh
Jason Cong
KELM
AI4CE
LRM
53
1
0
10 Sep 2024
Large Vocabulary Size Improves Large Language Models
Sho Takase
Ryokan Ri
Shun Kiyono
Takuya Kato
65
4
0
24 Jun 2024
Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data
Jingyang Ou
Shen Nie
Kaiwen Xue
Fengqi Zhu
Jiacheng Sun
Zhenguo Li
Chongxuan Li
DiffM
70
44
0
06 Jun 2024
IMUTube: Automatic Extraction of Virtual on-body Accelerometry from Video for Human Activity Recognition
Hyeokhyen Kwon
C. Tong
H. Haresamudram
Yan Gao
G. Abowd
Nicholas D. Lane
Thomas Ploetz
60
83
0
29 May 2020
Sentence Meta-Embeddings for Unsupervised Semantic Textual Similarity
Nina Poerner
Ulli Waltinger
Hinrich Schütze
AI4TS
98
20
0
09 Nov 2019
Fast Transformer Decoding: One Write-Head is All You Need
Noam M. Shazeer
69
452
0
06 Nov 2019
Conversational Emotion Analysis via Attention Mechanisms
Zheng Lian
J. Tao
Bin Liu
Jian Huang
31
27
0
24 Oct 2019
Frustratingly Short Attention Spans in Neural Language Modeling
Michal Daniluk
Tim Rocktäschel
Johannes Welbl
Sebastian Riedel
76
111
0
15 Feb 2017
Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
Noam M. Shazeer
Azalia Mirhoseini
Krzysztof Maziarz
Andy Davis
Quoc V. Le
Geoffrey E. Hinton
J. Dean
MoE
158
2,614
0
23 Jan 2017
Pointer Sentinel Mixture Models
Stephen Merity
Caiming Xiong
James Bradbury
R. Socher
RALM
166
2,814
0
26 Sep 2016
Exploring the Limits of Language Modeling
Rafal Jozefowicz
Oriol Vinyals
M. Schuster
Noam M. Shazeer
Yonghui Wu
120
1,143
0
07 Feb 2016