2205.01068
Cited By
OPT: Open Pre-trained Transformer Language Models
2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
VLM
OSLM
AI4CE
Papers citing
"OPT: Open Pre-trained Transformer Language Models"
50 / 2,456 papers shown
Title
When Only Time Will Tell: Interpreting How Transformers Process Local Ambiguities Through the Lens of Restart-Incrementality
Brielen Madureira
Patrick Kahardipraja
David Schlangen
44
2
0
20 Feb 2024
Comparing Inferential Strategies of Humans and Large Language Models in Deductive Reasoning
Philipp Mondorf
Barbara Plank
LRM
40
9
0
20 Feb 2024
Exploring the Impact of Table-to-Text Methods on Augmenting LLM-based Question Answering with Domain Hybrid Data
Dehai Min
Nan Hu
Rihui Jin
Nuo Lin
Jiaoyan Chen
...
Yu Li
Guilin Qi
Yun Li
Nijun Li
Qianren Wang
LMTD
33
14
0
20 Feb 2024
Instruction-tuned Language Models are Better Knowledge Learners
Zhengbao Jiang
Zhiqing Sun
Weijia Shi
Pedro Rodriguez
Chunting Zhou
Graham Neubig
Xi Lin
Wen-tau Yih
Srinivasan Iyer
KELM
46
34
0
20 Feb 2024
PromptKD: Distilling Student-Friendly Knowledge for Generative Language Models via Prompt Tuning
Gyeongman Kim
Doohyuk Jang
Eunho Yang
VLM
46
13
0
20 Feb 2024
Standardize: Aligning Language Models with Expert-Defined Standards for Content Generation
Joseph Marvin Imperial
Gail Forey
Harish Tayyar Madabushi
ALM
50
3
0
19 Feb 2024
Do Pre-Trained Language Models Detect and Understand Semantic Underspecification? Ask the DUST!
Frank Wildenburg
Michael Hanna
Sandro Pezzelle
44
3
0
19 Feb 2024
The Revolution of Multimodal Large Language Models: A Survey
Davide Caffagni
Federico Cocchi
Luca Barsellotti
Nicholas Moratelli
Sara Sarto
Lorenzo Baraldi
Lorenzo Baraldi
Marcella Cornia
Rita Cucchiara
LRM
VLM
66
43
0
19 Feb 2024
Amplifying Training Data Exposure through Fine-Tuning with Pseudo-Labeled Memberships
Myung Gyo Oh
Hong Eun Ahn
L. Park
T.-H. Kwon
MIALM
AAML
37
0
0
19 Feb 2024
Is It a Free Lunch for Removing Outliers during Pretraining?
Baohao Liao
Christof Monz
MQ
37
1
0
19 Feb 2024
WKVQuant: Quantizing Weight and Key/Value Cache for Large Language Models Gains More
Yuxuan Yue
Zhihang Yuan
Haojie Duanmu
Sifan Zhou
Jianlong Wu
Liqiang Nie
MQ
40
42
0
19 Feb 2024
Language Model Adaptation to Specialized Domains through Selective Masking based on Genre and Topical Characteristics
Anas Belfathi
Ygor Gallina
Nicolas Hernandez
Richard Dufour
Laura Monceaux
44
1
0
19 Feb 2024
Investigating Multi-Hop Factual Shortcuts in Knowledge Editing of Large Language Models
Tianjie Ju
Yijin Chen
Xinwei Yuan
Zhuosheng Zhang
Wei Du
Yubin Zheng
Gongshen Liu
KELM
33
8
0
19 Feb 2024
Discerning and Resolving Knowledge Conflicts through Adaptive Decoding with Contextual Information-Entropy Constraint
Xiaowei Yuan
Zhao Yang
Yequan Wang
Shengping Liu
Jun Zhao
Kang Liu
26
9
0
19 Feb 2024
Revisiting Knowledge Distillation for Autoregressive Language Models
Qihuang Zhong
Liang Ding
Li Shen
Juhua Liu
Bo Du
Dacheng Tao
KELM
49
19
0
19 Feb 2024
Machine-Generated Text Localization
Zhongping Zhang
Wenda Qin
Bryan A. Plummer
DeLMO
36
5
0
19 Feb 2024
Stumbling Blocks: Stress Testing the Robustness of Machine-Generated Text Detectors Under Attacks
Yichen Wang
Shangbin Feng
Abe Bohan Hou
Xiao Pu
Chao Shen
Xiaoming Liu
Yulia Tsvetkov
Tianxing He
DeLMO
48
17
0
18 Feb 2024
Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark
Yihua Zhang
Pingzhi Li
Junyuan Hong
Jiaxiang Li
Yimeng Zhang
...
Wotao Yin
Mingyi Hong
Zhangyang Wang
Sijia Liu
Tianlong Chen
38
45
0
18 Feb 2024
BESA: Pruning Large Language Models with Blockwise Parameter-Efficient Sparsity Allocation
Peng Xu
Wenqi Shao
Yonghong Tian
Shitao Tang
Kai-Chuang Zhang
Peng Gao
Fengwei An
Yu Qiao
Ping Luo
MoE
40
27
0
18 Feb 2024
In-Context Example Ordering Guided by Label Distributions
Zhichao Xu
Daniel Cohen
Bei Wang
Vivek Srikumar
44
7
0
18 Feb 2024
Why Lift so Heavy? Slimming Large Language Models by Cutting Off the Layers
Shuzhou Yuan
Ercong Nie
Bolei Ma
Michael Färber
47
3
0
18 Feb 2024
k-SemStamp: A Clustering-Based Semantic Watermark for Detection of Machine-Generated Text
Abe Bohan Hou
Jingyu Zhang
Yichen Wang
Daniel Khashabi
Tianxing He
WaLM
89
16
0
17 Feb 2024
Controlled Text Generation for Large Language Model with Dynamic Attribute Graphs
Xun Liang
Hanyu Wang
Shichao Song
Mengting Hu
Xunzhi Wang
Zhiyu Li
Zhiyu Li
Simin Niu
28
9
0
17 Feb 2024
Exploring ChatGPT for Next-generation Information Retrieval: Opportunities and Challenges
Yizheng Huang
Jimmy X. Huang
45
10
0
17 Feb 2024
Disclosure and Mitigation of Gender Bias in LLMs
Xiangjue Dong
Yibo Wang
Philip S. Yu
James Caverlee
12
30
0
17 Feb 2024
Speculative Streaming: Fast LLM Inference without Auxiliary Models
Nikhil Bhendawade
Irina Belousova
Qichen Fu
Henry Mason
Mohammad Rastegari
Mahyar Najibi
LRM
36
29
0
16 Feb 2024
EdgeQAT: Entropy and Distribution Guided Quantization-Aware Training for the Acceleration of Lightweight LLMs on the Edge
Xuan Shen
Zhenglun Kong
Changdi Yang
Zhaoyang Han
Lei Lu
...
Zhihao Shu
Wei Niu
Miriam Leeser
Pu Zhao
Yanzhi Wang
MQ
56
18
0
16 Feb 2024
Rethinking Human-like Translation Strategy: Integrating Drift-Diffusion Model with Large Language Models for Machine Translation
Hongbin Na
Zimu Wang
M. Maimaiti
Tong Chen
Wei Wang
Tao Shen
Ling Chen
LRM
30
5
0
16 Feb 2024
Exploring Precision and Recall to assess the quality and diversity of LLMs
Florian Le Bronnec
Alexandre Verine
Benjamin Négrevergne
Y. Chevaleyre
Alexandre Allauzen
46
14
0
16 Feb 2024
Can Separators Improve Chain-of-Thought Prompting?
Yoonjeong Park
Hyunjin Kim
Chanyeol Choi
Junseong Kim
Jy-yong Sohn
LRM
ReLM
26
2
0
16 Feb 2024
Smaller Language Models are capable of selecting Instruction-Tuning Training Data for Larger Language Models
Dheeraj Mekala
Alex Nguyen
Jingbo Shang
ALM
33
19
0
16 Feb 2024
Measuring and Reducing LLM Hallucination without Gold-Standard Answers
Jiaheng Wei
Yuanshun Yao
Jean-François Ton
Hongyi Guo
Andrew Estornell
Yang Liu
HILM
55
18
0
16 Feb 2024
Uncertainty Quantification for In-Context Learning of Large Language Models
Chen Ling
Xujiang Zhao
Xuchao Zhang
Wei Cheng
Yanchi Liu
...
Katsushi Matsuda
Jie Ji
Guangji Bai
Liang Zhao
Haifeng Chen
29
14
0
15 Feb 2024
TOAD: Task-Oriented Automatic Dialogs with Diverse Response Styles
Yinhong Liu
Yimai Fang
David Vandyke
Nigel Collier
49
3
0
15 Feb 2024
Quantized Embedding Vectors for Controllable Diffusion Language Models
Cheng Kang
Xinye Chen
Yong Hu
Daniel Novak
31
0
0
15 Feb 2024
Towards Safer Large Language Models through Machine Unlearning
Zheyuan Liu
Guangyao Dou
Zhaoxuan Tan
Yijun Tian
Meng Jiang
KELM
MU
43
73
0
15 Feb 2024
NutePrune: Efficient Progressive Pruning with Numerous Teachers for Large Language Models
Shengrui Li
Junzhe Chen
Xueting Han
Jing Bai
24
6
0
15 Feb 2024
Model Compression and Efficient Inference for Large Language Models: A Survey
Wenxiao Wang
Wei Chen
Yicong Luo
Yongliu Long
Zhengkai Lin
Liye Zhang
Binbin Lin
Deng Cai
Xiaofei He
MQ
46
48
0
15 Feb 2024
An Accelerated Distributed Stochastic Gradient Method with Momentum
Kun-Yen Huang
Shi Pu
Angelia Nedić
35
8
0
15 Feb 2024
How to Train Data-Efficient LLMs
Noveen Sachdeva
Benjamin Coleman
Wang-Cheng Kang
Jianmo Ni
Lichan Hong
Ed H. Chi
James Caverlee
Julian McAuley
D. Cheng
34
52
0
15 Feb 2024
Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation
Xiaoying Zhang
Baolin Peng
Ye Tian
Jingyan Zhou
Lifeng Jin
Linfeng Song
Haitao Mi
Helen Meng
HILM
45
45
0
14 Feb 2024
Towards Next-Level Post-Training Quantization of Hyper-Scale Transformers
Junhan Kim
Kyungphil Park
Chungman Lee
Ho-Young Kim
Joonyoung Kim
Yongkweon Jeon
MQ
28
2
0
14 Feb 2024
Improving Generalization in Semantic Parsing by Increasing Natural Language Variation
Irina Saparina
Mirella Lapata
27
1
0
13 Feb 2024
Visually Dehallucinative Instruction Generation
Sungguk Cha
Jusung Lee
Younghyun Lee
Cheoljong Yang
MLLM
22
5
0
13 Feb 2024
Eliciting Personality Traits in Large Language Models
Airlie Hilliard
Cristian Muñoz
Zekun Wu
Adriano Soares Koshiyama
19
7
0
13 Feb 2024
LoTa-Bench: Benchmarking Language-oriented Task Planners for Embodied Agents
Jae-Woo Choi
Youngwoo Yoon
Hyobin Ong
Jaehong Kim
Minsu Jang
19
13
0
13 Feb 2024
Differentially Private Zeroth-Order Methods for Scalable Large Language Model Finetuning
Zhicheng Liu
Jian Lou
Wenxuan Bao
Yihan Hu
Baochun Li
Zengchang Qin
K. Ren
37
7
0
12 Feb 2024
TransGPT: Multi-modal Generative Pre-trained Transformer for Transportation
Peng Wang
Xiang Wei
Fangxu Hu
Wenjuan Han
38
17
0
11 Feb 2024
Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models
Keisuke Kamahori
Tian Tang
Yile Gu
Kan Zhu
Baris Kasikci
71
20
0
10 Feb 2024
SuperBench: Improving Cloud AI Infrastructure Reliability with Proactive Validation
Yifan Xiong
Yuting Jiang
Ziyue Yang
L. Qu
Guoshuai Zhao
...
Luke Melton
Joe Chau
Peng Cheng
Yongqiang Xiong
Lidong Zhou
60
6
0
09 Feb 2024