Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1609.07843
Cited By
Pointer Sentinel Mixture Models
26 September 2016
Stephen Merity
Caiming Xiong
James Bradbury
R. Socher
RALM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Pointer Sentinel Mixture Models"
50 / 705 papers shown
Title
Nearest Neighbor Speculative Decoding for LLM Generation and Attribution
Minghan Li
Xilun Chen
Ari Holtzman
Beidi Chen
Jimmy Lin
Wen-tau Yih
Xi Lin
RALM
BDL
108
10
0
29 May 2024
fMRI predictors based on language models of increasing complexity recover brain left lateralization
Laurent Bonnasse-Gahot
Christophe Pallier
52
3
0
28 May 2024
I-LLM: Efficient Integer-Only Inference for Fully-Quantized Low-Bit Large Language Models
Xing Hu
Yuan Cheng
Dawei Yang
Zhihang Yuan
Jiangyong Yu
Chen Xu
Sifan Zhou
MQ
40
8
0
28 May 2024
Linguistic Collapse: Neural Collapse in (Large) Language Models
Robert Wu
Vardan Papyan
53
13
0
28 May 2024
Dual Process Learning: Controlling Use of In-Context vs. In-Weights Strategies with Weight Forgetting
Suraj Anand
Michael A. Lepori
Jack Merullo
Ellie Pavlick
CLL
36
6
0
28 May 2024
AdaFisher: Adaptive Second Order Optimization via Fisher Information
Damien Martins Gomes
Yanlei Zhang
Eugene Belilovsky
Guy Wolf
Mahdi S. Hosseini
ODL
78
2
0
26 May 2024
BiSup: Bidirectional Quantization Error Suppression for Large Language Models
Minghui Zou
Ronghui Guo
Sai Zhang
Xiaowang Zhang
Zhiyong Feng
MQ
44
1
0
24 May 2024
Emergence of a High-Dimensional Abstraction Phase in Language Transformers
Emily Cheng
Diego Doimo
Corentin Kervadec
Iuri Macocco
Jade Yu
Alessandro Laio
Marco Baroni
112
11
0
24 May 2024
OAC: Output-adaptive Calibration for Accurate Post-training Quantization
Ali Edalati
Alireza Ghaffari
M. Asgharian
Lu Hou
Boxing Chen
Vahid Partovi Nia
V. Nia
MQ
93
0
0
23 May 2024
PyramidInfer: Pyramid KV Cache Compression for High-throughput LLM Inference
Dongjie Yang
Xiaodong Han
Yan Gao
Yao Hu
Shilin Zhang
Hai Zhao
41
53
0
21 May 2024
SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization
Jialong Guo
Xinghao Chen
Yehui Tang
Yunhe Wang
ViT
49
9
0
19 May 2024
FlashBack:Efficient Retrieval-Augmented Language Modeling for Long Context Inference
Runheng Liu
Xingchen Xiao
Heyan Huang
Zewen Chi
Zhijing Wu
RALM
KELM
34
0
0
07 May 2024
QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving
Chengyue Wu
Haotian Tang
Shang Yang
Zhekai Zhang
Guangxuan Xiao
Chuang Gan
Song Han
90
77
0
07 May 2024
Q-Newton: Hybrid Quantum-Classical Scheduling for Accelerating Neural Network Training with Newton's Gradient Descent
Pingzhi Li
Junyu Liu
Hanrui Wang
Tianlong Chen
96
1
0
30 Apr 2024
SOUL: Unlocking the Power of Second-Order Optimization for LLM Unlearning
Jinghan Jia
Yihua Zhang
Yimeng Zhang
Jiancheng Liu
Bharat Runwal
James Diffenderfer
B. Kailkhura
Sijia Liu
MU
45
35
0
28 Apr 2024
Temporal Scaling Law for Large Language Models
Yizhe Xiong
Xiansheng Chen
Xin Ye
Hui Chen
Zijia Lin
...
Zhenpeng Su
Wei Huang
Jianwei Niu
Jiawei Han
Guiguang Ding
45
10
0
27 Apr 2024
Mamba-360: Survey of State Space Models as Transformer Alternative for Long Sequence Modelling: Methods, Applications, and Challenges
Badri N. Patro
Vijay Srinivas Agneeswaran
Mamba
48
38
0
24 Apr 2024
An empirical study of LLaMA3 quantization: from LLMs to MLLMs
Wei Huang
Xingyu Zheng
Xudong Ma
Haotong Qin
Chengtao Lv
Hong Chen
Jie Luo
Xiaojuan Qi
Xianglong Liu
Michele Magno
MQ
63
38
0
22 Apr 2024
Adaptive Memory Replay for Continual Learning
James Seale Smith
Lazar Valkov
Shaunak Halbe
V. Gutta
Rogerio Feris
Z. Kira
Leonid Karlinsky
46
6
0
18 Apr 2024
σ-GPTs: A New Approach to Autoregressive Models
Arnaud Pannatier
Evann Courdier
Franccois Fleuret
AI4TS
28
7
0
15 Apr 2024
LATTE: Low-Precision Approximate Attention with Head-wise Trainable Threshold for Efficient Transformer
Jiing-Ping Wang
Ming-Guang Lin
An-Yeu Wu
Wu
32
1
0
11 Apr 2024
Continuous Language Model Interpolation for Dynamic and Controllable Text Generation
Sara Kangaslahti
David Alvarez-Melis
KELM
39
0
0
10 Apr 2024
Privacy Preserving Prompt Engineering: A Survey
Kennedy Edemacu
Xintao Wu
63
18
0
09 Apr 2024
Multilingual Brain Surgeon: Large Language Models Can be Compressed Leaving No Language Behind
Hongchuan Zeng
Hongshen Xu
Lu Chen
Kai Yu
59
5
0
06 Apr 2024
Cherry on Top: Parameter Heterogeneity and Quantization in Large Language Models
Wanyun Cui
Qianle Wang
MQ
42
2
0
03 Apr 2024
Laying Anchors: Semantically Priming Numerals in Language Modeling
Mandar Sharma
Rutuja Murlidhar Taware
Pravesh Koirala
Nikhil Muralidhar
Naren Ramakrishnan
31
2
0
02 Apr 2024
Accurate Block Quantization in LLMs with Outliers
Nikita Trukhanov
I. Soloveychik
MQ
31
4
0
29 Mar 2024
The Role of
n
n
n
-gram Smoothing in the Age of Neural Networks
Luca Malagutti
Andrius Buinovskij
Anej Svete
Clara Meister
Afra Amini
Ryan Cotterell
43
6
0
25 Mar 2024
SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression
Xin Wang
Yu Zheng
Zhongwei Wan
Mi Zhang
MQ
57
44
0
12 Mar 2024
GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM
Hao Kang
Qingru Zhang
Souvik Kundu
Geonhwa Jeong
Zaoxing Liu
Tushar Krishna
Tuo Zhao
MQ
43
79
0
08 Mar 2024
Do You Trust Your Model? Emerging Malware Threats in the Deep Learning Ecosystem
Dorjan Hitaj
Giulio Pagnotta
Fabio De Gaspari
Sediola Ruko
Briland Hitaj
Luigi V. Mancini
Fernando Perez-Cruz
42
4
0
06 Mar 2024
Merging Text Transformer Models from Different Initializations
Neha Verma
Maha Elbayad
MoMe
67
7
0
01 Mar 2024
Heavy-Tailed Class Imbalance and Why Adam Outperforms Gradient Descent on Language Models
Frederik Kunstner
Robin Yadav
Alan Milligan
Mark Schmidt
Alberto Bietti
39
26
0
29 Feb 2024
CLLMs: Consistency Large Language Models
Siqi Kou
Lanxiang Hu
Zhe He
Zhijie Deng
Hao Zhang
49
28
0
28 Feb 2024
SparseLLM: Towards Global Pruning for Pre-trained Language Models
Guangji Bai
Yijiang Li
Chen Ling
Kibaek Kim
Liang Zhao
33
7
0
28 Feb 2024
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Shuming Ma
Hongyu Wang
Lingxiao Ma
Lei Wang
Wenhui Wang
Shaohan Huang
Lifeng Dong
Ruiping Wang
Jilong Xue
Furu Wei
MQ
45
208
0
27 Feb 2024
A Comprehensive Evaluation of Quantization Strategies for Large Language Models
Renren Jin
Jiangcun Du
Wuwei Huang
Wei Liu
Jian Luan
Bin Wang
Deyi Xiong
MQ
32
31
0
26 Feb 2024
GPTVQ: The Blessing of Dimensionality for LLM Quantization
M. V. Baalen
Andrey Kuzmin
Markus Nagel
Peter Couperus
Cédric Bastoul
E. Mahurin
Tijmen Blankevoort
Paul N. Whatmough
MQ
38
28
0
23 Feb 2024
Mudjacking: Patching Backdoor Vulnerabilities in Foundation Models
Hongbin Liu
Michael K. Reiter
Neil Zhenqiang Gong
AAML
44
2
0
22 Feb 2024
WKVQuant: Quantizing Weight and Key/Value Cache for Large Language Models Gains More
Yuxuan Yue
Zhihang Yuan
Haojie Duanmu
Sifan Zhou
Jianlong Wu
Liqiang Nie
MQ
40
42
0
19 Feb 2024
Machine-Generated Text Localization
Zhongping Zhang
Wenda Qin
Bryan A. Plummer
DeLMO
36
5
0
19 Feb 2024
Prompt-Based Bias Calibration for Better Zero/Few-Shot Learning of Language Models
Kang He
Yinghan Long
Kaushik Roy
28
2
0
15 Feb 2024
Quantized Embedding Vectors for Controllable Diffusion Language Models
Cheng Kang
Xinye Chen
Yong Hu
Daniel Novak
31
0
0
15 Feb 2024
Towards Next-Level Post-Training Quantization of Hyper-Scale Transformers
Junhan Kim
Kyungphil Park
Chungman Lee
Ho-Young Kim
Joonyoung Kim
Yongkweon Jeon
MQ
28
2
0
14 Feb 2024
RepQuant: Towards Accurate Post-Training Quantization of Large Transformer Models via Scale Reparameterization
Zhikai Li
Xuewen Liu
Jing Zhang
Qingyi Gu
MQ
51
7
0
08 Feb 2024
Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes
Lucio Dery
Steven Kolawole
Jean-Francois Kagey
Virginia Smith
Graham Neubig
Ameet Talwalkar
47
28
0
08 Feb 2024
The RL/LLM Taxonomy Tree: Reviewing Synergies Between Reinforcement Learning and Large Language Models
M. Pternea
Prerna Singh
Abir Chakraborty
Y. Oruganti
M. Milletarí
Sayli Bapat
Kebei Jiang
OffRL
33
7
0
02 Feb 2024
Positional Encoding Helps Recurrent Neural Networks Handle a Large Vocabulary
Takashi Morita
24
3
0
31 Jan 2024
TQCompressor: improving tensor decomposition methods in neural networks via permutations
V. Abronin
A. Naumov
D. Mazur
D. Bystrov
K. Tsarova
Ar. Melnikov
Ivan Oseledets
S. Dolgov
R. Brasher
M. Perelshtein
28
6
0
29 Jan 2024
FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design
Haojun Xia
Zhen Zheng
Xiaoxia Wu
Shiyang Chen
Zhewei Yao
...
Donglin Zhuang
Zhongzhu Zhou
Olatunji Ruwase
Yuxiong He
Shuaiwen Leon Song
MQ
38
14
0
25 Jan 2024
Previous
1
2
3
4
5
6
...
13
14
15
Next