2205.01068
Cited By
OPT: Open Pre-trained Transformer Language Models
2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
VLM
OSLM
AI4CE
Papers citing "OPT: Open Pre-trained Transformer Language Models"
50 / 2,439 papers shown
Title
Ripple: Accelerating LLM Inference on Smartphones with Correlation-Aware Neuron Management
Tuowei Wang
Ruwen Fan
Minxing Huang
Zixu Hao
Kun Li
Ting Cao
Youyou Lu
Yaoxue Zhang
Ju Ren
53
2
0
25 Oct 2024
SWITCH: Studying with Teacher for Knowledge Distillation of Large Language Models
Jahyun Koo
Yerin Hwang
Yongil Kim
Taegwan Kang
Hyunkyung Bae
Kyomin Jung
60
0
0
25 Oct 2024
Visual Text Matters: Improving Text-KVQA with Visual Text Entity Knowledge-aware Large Multimodal Assistant
A. S. Penamakuri
Anand Mishra
32
1
0
24 Oct 2024
On the Crucial Role of Initialization for Matrix Factorization
Bingcong Li
Liang Zhang
Aryan Mokhtari
Niao He
31
1
0
24 Oct 2024
Provably Robust Watermarks for Open-Source Language Models
Miranda Christ
Sam Gunn
Tal Malkin
Mariana Raykova
WaLM
45
2
0
24 Oct 2024
Delving into the Reversal Curse: How Far Can Large Language Models Generalize?
Zhengkai Lin
Z. Fu
Kai Liu
Liang Xie
Binbin Lin
Wenxiao Wang
D. Cai
Yue Wu
Jieping Ye
LRM
25
3
0
24 Oct 2024
Towards Better Open-Ended Text Generation: A Multicriteria Evaluation Framework
Esteban Garces Arias
Hannah Blocher
Julian Rodemann
Meimingwei Li
Christian Heumann
Matthias Aßenmacher
28
1
0
24 Oct 2024
CoreInfer: Accelerating Large Language Model Inference with Semantics-Inspired Adaptive Sparse Activation
Qinsi Wang
Saeed Vahidian
Hancheng Ye
Jianyang Gu
Jianyi Zhang
Yiran Chen
16
3
0
23 Oct 2024
Influential Language Data Selection via Gradient Trajectory Pursuit
Zhiwei Deng
Tao Li
Yang Li
26
1
0
22 Oct 2024
Self-calibration for Language Model Quantization and Pruning
Miles Williams
G. Chrysostomou
Nikolaos Aletras
MQ
195
0
0
22 Oct 2024
MoRE: Multi-Modal Contrastive Pre-training with Transformers on X-Rays, ECGs, and Diagnostic Report
Samrajya Thapa
Koushik Howlader
Subhankar Bhattacharjee
Wei Le
MedIm
32
1
0
21 Oct 2024
Pruning Foundation Models for High Accuracy without Retraining
Pu Zhao
Fei Sun
Xuan Shen
Pinrui Yu
Zhenglun Kong
Yanzhi Wang
Xue Lin
41
10
0
21 Oct 2024
Taming Mambas for Voxel Level 3D Medical Image Segmentation
Luca Lumetti
Vittorio Pipoli
Kevin Marchesini
Elisa Ficarra
C. Grana
Federico Bolelli
MedIm
Mamba
29
0
0
20 Oct 2024
Neural Normalized Compression Distance and the Disconnect Between Compression and Classification
John Hurwitz
Charles K. Nicholas
Edward Raff
21
0
0
20 Oct 2024
GDPO: Learning to Directly Align Language Models with Diversity Using GFlowNets
Oh Joon Kwon
Daiki E. Matsunaga
Kee-Eung Kim
AI4CE
31
0
0
19 Oct 2024
Group Diffusion Transformers are Unsupervised Multitask Learners
Lianghua Huang
Wei Wang
Zhi-Fan Wu
Huanzhang Dou
Yupeng Shi
Yutong Feng
C. Liang
Yu Liu
Jingren Zhou
VLM
49
12
0
19 Oct 2024
Implicit Regularization of Sharpness-Aware Minimization for Scale-Invariant Problems
Bingcong Li
Liang Zhang
Niao He
58
3
0
18 Oct 2024
The Propensity for Density in Feed-forward Models
Nandi Schoots
Alex Jackson
Ali Kholmovaia
Peter McBurney
Murray Shanahan
CVBM
26
0
0
18 Oct 2024
Rethinking Transformer for Long Contextual Histopathology Whole Slide Image Analysis
Honglin Li
Yunlong Zhang
Pingyi Chen
Zhongyi Shui
Chenglu Zhu
Lin Yang
MedIm
55
4
0
18 Oct 2024
From Babbling to Fluency: Evaluating the Evolution of Language Models in Terms of Human Language Acquisition
Qiyuan Yang
Pengda Wang
Luke D. Plonsky
Frederick L. Oswald
Hanjie Chen
ELM
28
2
0
17 Oct 2024
From Gradient Clipping to Normalization for Heavy Tailed SGD
Florian Hübler
Ilyas Fatkhullin
Niao He
40
5
0
17 Oct 2024
FALCON: Pinpointing and Mitigating Stragglers for Large-Scale Hybrid-Parallel Training
Tianyuan Wu
Wei Wang
Yinghao Yu
Siran Yang
Wenchao Wu
Qinkai Duan
Guodong Yang
Jiamang Wang
Lin Qu
Liping Zhang
43
6
0
16 Oct 2024
MlingConf: A Comprehensive Study of Multilingual Confidence Estimation on Large Language Models
Boyang Xue
Hongru Wang
Rui Wang
Sheng Wang
Zezhong Wang
Yiming Du
Bin Liang
Kam-Fai Wong
34
0
0
16 Oct 2024
HerO at AVeriTeC: The Herd of Open Large Language Models for Verifying Real-World Claims
Yejun Yoon
Jaeyoon Jung
Seunghyun Yoon
Kunwoo Park
VLM
LRM
29
2
0
16 Oct 2024
DAQ: Density-Aware Post-Training Weight-Only Quantization For LLMs
Yingsong Luo
Ling Chen
MQ
23
0
0
16 Oct 2024
Channel-Wise Mixed-Precision Quantization for Large Language Models
Zihan Chen
Bike Xie
Jundong Li
Cong Shen
MQ
39
2
0
16 Oct 2024
Reconstruction of Differentially Private Text Sanitization via Large Language Models
Shuchao Pang
Zhigang Lu
Haoran Wang
Peng Fu
Yongbin Zhou
Minhui Xue
AAML
61
4
0
16 Oct 2024
Scaling laws for post-training quantized large language models
Zifei Xu
Alexander Lan
W. Yazar
T. Webb
Sayeh Sharify
Xin Wang
MQ
35
0
0
15 Oct 2024
DISP-LLM: Dimension-Independent Structural Pruning for Large Language Models
Shangqian Gao
Chi-Heng Lin
Ting Hua
Tang Zheng
Yilin Shen
Hongxia Jin
Yen-Chang Hsu
30
3
0
15 Oct 2024
On the Training Convergence of Transformers for In-Context Classification
Wei Shen
Ruida Zhou
Jing Yang
Cong Shen
31
3
0
15 Oct 2024
Layer-wise Importance Matters: Less Memory for Better Performance in Parameter-efficient Fine-tuning of Large Language Models
Kai Yao
P. Gao
Lichun Li
Yuan Zhao
Xiaofeng Wang
Wei Wang
Jianke Zhu
26
1
0
15 Oct 2024
Converging to a Lingua Franca: Evolution of Linguistic Regions and Semantics Alignment in Multilingual Large Language Models
Hongchuan Zeng
Senyu Han
Lu Chen
Kai Yu
62
6
0
15 Oct 2024
LLM Unlearning via Loss Adjustment with Only Forget Data
Yaxuan Wang
Jiaheng Wei
Chris Liu
Jinlong Pang
Qiang Liu
A. Shah
Yujia Bao
Yang Liu
Wei Wei
KELM
MU
43
8
0
14 Oct 2024
ControlMM: Controllable Masked Motion Generation
Ekkasit Pinyoanuntapong
Muhammad Usama Saleem
Korrawe Karunratanakul
Pu Wang
Hongfei Xue
Chong Chen
Chuan Guo
Junli Cao
J. Ren
Sergey Tulyakov
VGen
37
4
0
14 Oct 2024
MLP-SLAM: Multilayer Perceptron-Based Simultaneous Localization and Mapping With a Dynamic and Static Object Discriminator
Taozhe Li
Wei Sun
34
0
0
14 Oct 2024
Safety-Aware Fine-Tuning of Large Language Models
Hyeong Kyu Choi
Xuefeng Du
Yixuan Li
45
12
0
13 Oct 2024
ImagineNav: Prompting Vision-Language Models as Embodied Navigator through Scene Imagination
Xinxin Zhao
Wenzhe Cai
Likun Tang
Teng Wang
LM&Ro
40
3
0
13 Oct 2024
Simultaneous Computation and Memory Efficient Zeroth-Order Optimizer for Fine-Tuning Large Language Models
Fei Wang
Li Shen
Liang Ding
Chao Xue
Ye Liu
Changxing Ding
32
0
0
13 Oct 2024
Surgical-LLaVA: Toward Surgical Scenario Understanding via Large Language and Vision Models
Juseong Jin
Chang Wook Jeong
33
3
0
13 Oct 2024
Skipping Computations in Multimodal LLMs
Mustafa Shukor
Matthieu Cord
31
2
0
12 Oct 2024
Prompting Video-Language Foundation Models with Domain-specific Fine-grained Heuristics for Video Question Answering
Ting Yu
Kunhao Fu
Shuhui Wang
Qingming Huang
Jun Yu
49
0
0
12 Oct 2024
Multi-granularity Contrastive Cross-modal Collaborative Generation for End-to-End Long-term Video Question Answering
Ting Yu
Kunhao Fu
Jian Zhang
Qingming Huang
Jun Yu
39
2
0
12 Oct 2024
Zero-shot Commonsense Reasoning over Machine Imagination
Hyuntae Park
Yeachan Kim
Jun-Hyung Park
S. Lee
ReLM
VLM
LRM
29
1
0
12 Oct 2024
DeltaDQ: Ultra-High Delta Compression for Fine-Tuned LLMs via Group-wise Dropout and Separate Quantization
Yanfeng Jiang
Zelan Yang
B. Chen
Shen Li
Yong Li
Tao Li
MQ
36
0
0
11 Oct 2024
QEFT: Quantization for Efficient Fine-Tuning of LLMs
Changhun Lee
Jun-gyu Jin
Younghyun Cho
Eunhyeok Park
MQ
48
1
0
11 Oct 2024
Simultaneous Reward Distillation and Preference Learning: Get You a Language Model Who Can Do Both
Abhijnan Nath
Changsoo Jung
Ethan Seefried
Nikhil Krishnaswamy
191
1
0
11 Oct 2024
The Large Language Model GreekLegalRoBERTa
Vasileios Saketos
D. Pantazi
Manolis Koubarakis
AILaw
34
0
0
10 Oct 2024
A Target-Aware Analysis of Data Augmentation for Hate Speech Detection
Camilla Casula
Sara Tonelli
31
0
0
10 Oct 2024
StablePrompt: Automatic Prompt Tuning using Reinforcement Learning for Large Language Models
Minchan Kwon
Gaeun Kim
Jongsuk Kim
Haeil Lee
Junmo Kim
OffRL
LRM
LLMAG
26
2
0
10 Oct 2024
No Free Lunch: Retrieval-Augmented Generation Undermines Fairness in LLMs, Even for Vigilant Users
Mengxuan Hu
Hongyi Wu
Zihan Guan
Ronghang Zhu
Dongliang Guo
Daiqing Qi
Sheng Li
SILM
41
3
0
10 Oct 2024