Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2205.01068
Cited By
OPT: Open Pre-trained Transformer Language Models
2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
VLM
OSLM
AI4CE
Re-assign community
ArXiv
PDF
HTML
Papers citing
"OPT: Open Pre-trained Transformer Language Models"
50 / 767 papers shown
Title
Constrained Decoding with Speculative Lookaheads
Nishanth Nakshatri
Shamik Roy
Rajarshi Das
Suthee Chaidaroon
Leonid Boytsov
Rashmi Gangadharaiah
79
0
0
09 Dec 2024
Taming Sensitive Weights : Noise Perturbation Fine-tuning for Robust LLM Quantization
Dongwei Wang
Huanrui Yang
MQ
90
1
0
08 Dec 2024
Beyond Text-Visual Attention: Exploiting Visual Cues for Effective Token Pruning in VLMs
Qizhe Zhang
Aosong Cheng
Ming Lu
Zhiyong Zhuo
Minqi Wang
Jiajun Cao
Shaobo Guo
Qi She
Shanghang Zhang
VLM
92
11
0
02 Dec 2024
Efficient LLM Inference using Dynamic Input Pruning and Cache-Aware Masking
Marco Federici
Davide Belli
M. V. Baalen
Amir Jalalirad
Andrii Skliar
Bence Major
Markus Nagel
Paul N. Whatmough
76
0
0
02 Dec 2024
Lifelong Knowledge Editing for Vision Language Models with Low-Rank Mixture-of-Experts
Qizhou Chen
Chengyu Wang
Dakan Wang
Taolin Zhang
Wangyue Li
Xiaofeng He
KELM
83
1
0
23 Nov 2024
Hardware Scaling Trends and Diminishing Returns in Large-Scale Distributed Training
Jared Fernandez
Luca Wehrstedt
Leonid Shamis
Mostafa Elhoushi
Kalyan Saladi
Yonatan Bisk
Emma Strubell
Jacob Kahn
245
3
0
20 Nov 2024
BitMoD: Bit-serial Mixture-of-Datatype LLM Acceleration
Yuzong Chen
Ahmed F. AbouElhamayed
Xilai Dai
Yang Wang
Marta Andronic
George A. Constantinides
Mohamed S. Abdelfattah
MQ
108
1
0
18 Nov 2024
Beyond the Safety Bundle: Auditing the Helpful and Harmless Dataset
Khaoula Chehbouni
Jonathan Colaço-Carr
Yash More
Jackie CK Cheung
G. Farnadi
78
0
0
12 Nov 2024
Zeroth-Order Adaptive Neuron Alignment Based Pruning without Re-Training
Elia Cunegatti
Leonardo Lucio Custode
Giovanni Iacca
52
0
0
11 Nov 2024
Prompt-Guided Internal States for Hallucination Detection of Large Language Models
Fujie Zhang
Peiqi Yu
Biao Yi
Baolei Zhang
Tong Li
Zheli Liu
HILM
LRM
57
0
0
07 Nov 2024
What Goes Into a LM Acceptability Judgment? Rethinking the Impact of Frequency and Length
Lindia Tjuatja
Graham Neubig
Tal Linzen
Sophie Hao
47
1
0
04 Nov 2024
TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters
Haiyang Wang
Yue Fan
Muhammad Ferjad Naeem
Yongqin Xian
J. E. Lenssen
Liwei Wang
F. Tombari
Bernt Schiele
49
2
0
30 Oct 2024
ProMoE: Fast MoE-based LLM Serving using Proactive Caching
Xiaoniu Song
Zihang Zhong
Rong Chen
Haibo Chen
MoE
65
4
0
29 Oct 2024
FACTS: A Factored State-Space Framework For World Modelling
Li Nanbo
Firas Laakom
Yucheng Xu
Wenyi Wang
Jürgen Schmidhuber
AI4TS
196
0
0
28 Oct 2024
SWITCH: Studying with Teacher for Knowledge Distillation of Large Language Models
Jahyun Koo
Yerin Hwang
Yongil Kim
Taegwan Kang
Hyunkyung Bae
Kyomin Jung
60
0
0
25 Oct 2024
Towards Better Open-Ended Text Generation: A Multicriteria Evaluation Framework
Esteban Garces Arias
Hannah Blocher
Julian Rodemann
Meimingwei Li
Christian Heumann
Matthias Aßenmacher
28
1
0
24 Oct 2024
Self-calibration for Language Model Quantization and Pruning
Miles Williams
G. Chrysostomou
Nikolaos Aletras
MQ
186
0
0
22 Oct 2024
Rethinking Transformer for Long Contextual Histopathology Whole Slide Image Analysis
Honglin Li
Yunlong Zhang
Pingyi Chen
Zhongyi Shui
Chenglu Zhu
Lin Yang
MedIm
49
4
0
18 Oct 2024
From Gradient Clipping to Normalization for Heavy Tailed SGD
Florian Hübler
Ilyas Fatkhullin
Niao He
40
5
0
17 Oct 2024
HerO at AVeriTeC: The Herd of Open Large Language Models for Verifying Real-World Claims
Yejun Yoon
Jaeyoon Jung
Seunghyun Yoon
Kunwoo Park
VLM
LRM
29
2
0
16 Oct 2024
DAQ: Density-Aware Post-Training Weight-Only Quantization For LLMs
Yingsong Luo
Ling Chen
MQ
23
0
0
16 Oct 2024
Reconstruction of Differentially Private Text Sanitization via Large Language Models
Shuchao Pang
Zhigang Lu
Haoran Wang
Peng Fu
Yongbin Zhou
Minhui Xue
AAML
61
4
0
16 Oct 2024
Scaling laws for post-training quantized large language models
Zifei Xu
Alexander Lan
W. Yazar
T. Webb
Sayeh Sharify
Xin Wang
MQ
33
0
0
15 Oct 2024
Converging to a Lingua Franca: Evolution of Linguistic Regions and Semantics Alignment in Multilingual Large Language Models
Hongchuan Zeng
Senyu Han
Lu Chen
Kai Yu
62
6
0
15 Oct 2024
ControlMM: Controllable Masked Motion Generation
Ekkasit Pinyoanuntapong
Muhammad Usama Saleem
Korrawe Karunratanakul
Pu Wang
Hongfei Xue
Chong Chen
Chuan Guo
Junli Cao
J. Ren
Sergey Tulyakov
VGen
37
4
0
14 Oct 2024
ImagineNav: Prompting Vision-Language Models as Embodied Navigator through Scene Imagination
Xinxin Zhao
Wenzhe Cai
Likun Tang
Teng Wang
LM&Ro
40
3
0
13 Oct 2024
Simultaneous Reward Distillation and Preference Learning: Get You a Language Model Who Can Do Both
Abhijnan Nath
Changsoo Jung
Ethan Seefried
Nikhil Krishnaswamy
179
1
0
11 Oct 2024
StablePrompt: Automatic Prompt Tuning using Reinforcement Learning for Large Language Models
Minchan Kwon
Gaeun Kim
Jongsuk Kim
Haeil Lee
Junmo Kim
OffRL
LRM
LLMAG
26
2
0
10 Oct 2024
No Free Lunch: Retrieval-Augmented Generation Undermines Fairness in LLMs, Even for Vigilant Users
Mengxuan Hu
Hongyi Wu
Zihan Guan
Ronghang Zhu
Dongliang Guo
Daiqing Qi
Sheng Li
SILM
41
3
0
10 Oct 2024
Detecting Training Data of Large Language Models via Expectation Maximization
Gyuwan Kim
Yang Li
Evangelia Spiliopoulou
Jie Ma
Miguel Ballesteros
William Yang Wang
MIALM
98
4
2
10 Oct 2024
Fine-tuning can Help Detect Pretraining Data from Large Language Models
Han Zhang
Songxin Zhang
Bingyi Jing
Hongxin Wei
43
0
0
09 Oct 2024
Rodimus*: Breaking the Accuracy-Efficiency Trade-Off with Efficient Attentions
Zhihao He
Hang Yu
Zi Gong
Shizhan Liu
J. Li
Weiyao Lin
VLM
38
1
0
09 Oct 2024
An Eye for an Ear: Zero-shot Audio Description Leveraging an Image Captioner using Audiovisual Distribution Alignment
Hugo Malard
Michel Olvera
Stéphane Lathuilière
S. Essid
VLM
39
0
0
08 Oct 2024
DEPT: Decoupled Embeddings for Pre-training Language Models
Alex Iacob
Lorenzo Sani
Meghdad Kurmanji
William F. Shen
Xinchi Qiu
Dongqi Cai
Yan Gao
Nicholas D. Lane
VLM
191
0
0
07 Oct 2024
Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective
Jinhao Li
Jiaming Xu
Shan Huang
Yonghua Chen
Wen Li
...
Jiayi Pan
Li Ding
Hao Zhou
Yu Wang
Guohao Dai
62
16
0
06 Oct 2024
Can Watermarked LLMs be Identified by Users via Crafted Prompts?
Aiwei Liu
Sheng Guan
Yong-Jin Liu
L. Pan
Yifei Zhang
Liancheng Fang
Lijie Wen
Philip S. Yu
Xuming Hu
WaLM
167
2
0
04 Oct 2024
ARB-LLM: Alternating Refined Binarizations for Large Language Models
Zhiteng Li
Xinyu Yan
Tianao Zhang
Haotong Qin
Dong Xie
Jiang Tian
Zhongchao Shi
Linghe Kong
Yulun Zhang
Xiaokang Yang
MQ
37
2
0
04 Oct 2024
X-ALMA: Plug & Play Modules and Adaptive Rejection for Quality Translation at Scale
Haoran Xu
Kenton W. Murray
Philipp Koehn
Hieu T. Hoang
Akiko Eriguchi
Huda Khayrallah
37
8
0
04 Oct 2024
Undesirable Memorization in Large Language Models: A Survey
Ali Satvaty
Suzan Verberne
Fatih Turkmen
ELM
PILM
86
7
0
03 Oct 2024
Agent-Oriented Planning in Multi-Agent Systems
Ao Li
Yuexiang Xie
Songze Li
Fugee Tsung
Bolin Ding
Yaliang Li
AIFin
143
6
0
03 Oct 2024
U-shaped and Inverted-U Scaling behind Emergent Abilities of Large Language Models
Tung-Yu Wu
Pei-Yu Lo
ReLM
LRM
46
2
0
02 Oct 2024
Recent Advances in Speech Language Models: A Survey
Wenqian Cui
Dianzhi Yu
Xiaoqi Jiao
Ziqiao Meng
Guangyan Zhang
Qichao Wang
Yiwen Guo
Irwin King
AuLLM
61
14
0
01 Oct 2024
CoTKR: Chain-of-Thought Enhanced Knowledge Rewriting for Complex Knowledge Graph Question Answering
Yike Wu
Yi Huang
Nan Hu
Yuncheng Hua
Guilin Qi
Jiaoyan Chen
Jeff Z. Pan
44
6
0
29 Sep 2024
Accumulator-Aware Post-Training Quantization
Ian Colbert
Fabian Grob
Giuseppe Franco
Jinjie Zhang
Rayan Saab
MQ
35
4
0
25 Sep 2024
Pretraining Data Detection for Large Language Models: A Divergence-based Calibration Method
Weichao Zhang
Ruqing Zhang
Jiafeng Guo
Maarten de Rijke
Yixing Fan
Xueqi Cheng
38
8
0
23 Sep 2024
TinyVLA: Towards Fast, Data-Efficient Vision-Language-Action Models for Robotic Manipulation
Junjie Wen
Bo Li
Jinming Li
Minjie Zhu
Kun Wu
...
Ran Cheng
Chaomin Shen
Yaxin Peng
Feifei Feng
Jian Tang
LM&Ro
74
48
0
19 Sep 2024
Large Language Models are Good Multi-lingual Learners : When LLMs Meet Cross-lingual Prompts
Teng Wang
Zhenqi He
Wing-Yin Yu
Xiaojin Fu
Xiongwei Han
LRM
59
5
0
17 Sep 2024
Fit and Prune: Fast and Training-free Visual Token Pruning for Multi-modal Large Language Models
Weihao Ye
Qiong Wu
Wenhao Lin
Yiyi Zhou
VLM
41
10
0
16 Sep 2024
Your Weak LLM is Secretly a Strong Teacher for Alignment
Leitian Tao
Yixuan Li
88
5
0
13 Sep 2024
Multi-modal Speech Transformer Decoders: When Do Multiple Modalities Improve Accuracy?
Yiwen Guan
V. Trinh
Vivek Voleti
Jacob Whitehill
42
1
0
13 Sep 2024
Previous
1
2
3
4
5
6
...
14
15
16
Next