2205.01068
Cited By
OPT: Open Pre-trained Transformer Language Models
2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
VLM
OSLM
AI4CE
Papers citing
"OPT: Open Pre-trained Transformer Language Models"
50 / 2,492 papers shown
Title
R2GenGPT: Radiology Report Generation with Frozen LLMs
Zhanyu Wang
Lingqiao Liu
Lei Wang
Luping Zhou
MedIm
LM&MA
VLM
38
69
0
18 Sep 2023
Fabricator: An Open Source Toolkit for Generating Labeled Training Data with Teacher LLMs
Jonas Golde
Patrick Haller
Felix Hamborg
Julian Risch
Alan Akbik
73
8
0
18 Sep 2023
Summarization is (Almost) Dead
Xiao Pu
Mingqi Gao
Xiaojun Wan
HILM
83
40
0
18 Sep 2023
ODSum: New Benchmarks for Open Domain Multi-Document Summarization
Yijie Zhou
Kejian Shi
Wencai Zhang
Yixin Liu
Yilun Zhao
Arman Cohan
RALM
42
2
0
16 Sep 2023
Self-Assessment Tests are Unreliable Measures of LLM Personality
Akshat Gupta
Xiaoyang Song
Gopala Anumanchipalli
29
19
0
15 Sep 2023
Oobleck: Resilient Distributed Training of Large Models Using Pipeline Templates
Insu Jang
Zhenning Yang
Zhen Zhang
Xin Jin
Mosharaf Chowdhury
MoE
AI4CE
OODD
49
44
0
15 Sep 2023
A Data Source for Reasoning Embodied Agents
Jack Lanchantin
Sainbayar Sukhbaatar
Gabriel Synnaeve
Yuxuan Sun
Kavya Srinet
Arthur Szlam
LM&Ro
LRM
38
5
0
14 Sep 2023
Generative AI Text Classification using Ensemble LLM Approaches
Harika Abburi
Michael Suesserman
Nirmala Pudota
Balaji Veeramani
Edward Bowen
Sanmitra Bhattacharya
DeLMO
39
47
0
14 Sep 2023
Tree of Uncertain Thoughts Reasoning for Large Language Models
Shentong Mo
Miao Xin
LRM
AI4CE
22
12
0
14 Sep 2023
SwitchGPT: Adapting Large Language Models for Non-Text Outputs
Xinyu Wang
Bohan Zhuang
Qi Wu
MLLM
52
3
0
14 Sep 2023
A Fast Optimization View: Reformulating Single Layer Attention in LLM Based on Tensor and SVM Trick, and Solving It in Matrix Multiplication Time
Yeqi Gao
Zhao Song
Weixin Wang
Junze Yin
37
26
0
14 Sep 2023
VoxtLM: unified decoder-only models for consolidating speech recognition/synthesis and speech/text continuation tasks
Soumi Maiti
Yifan Peng
Shukjae Choi
Jee-weon Jung
Xuankai Chang
Shinji Watanabe
VLM
AuLLM
38
60
0
14 Sep 2023
Sensitivity, Performance, Robustness: Deconstructing the Effect of Sociodemographic Prompting
Tilman Beck
Hendrik Schuff
Anne Lauscher
Iryna Gurevych
62
35
0
13 Sep 2023
Cognitive Mirage: A Review of Hallucinations in Large Language Models
Hongbin Ye
Tong Liu
Aijia Zhang
Wei Hua
Weiqiang Jia
HILM
55
77
0
13 Sep 2023
The Grand Illusion: The Myth of Software Portability and Implications for ML Progress
Fraser Mince
Dzung Dinh
Jonas Kgomo
Neil Thompson
Sara Hooker
21
6
0
12 Sep 2023
A Distributed Data-Parallel PyTorch Implementation of the Distributed Shampoo Optimizer for Training Neural Networks At-Scale
Hao-Jun Michael Shi
Tsung-Hsien Lee
Shintaro Iwasaki
Jose Gallego-Posada
Zhijing Li
Kaushik Rangadurai
Dheevatsa Mudigere
Michael Rabbat
ODL
37
24
0
12 Sep 2023
Efficient Memory Management for Large Language Model Serving with PagedAttention
Woosuk Kwon
Zhuohan Li
Siyuan Zhuang
Ying Sheng
Lianmin Zheng
Cody Hao Yu
Joseph E. Gonzalez
Haotong Zhang
Ion Stoica
VLM
75
1,981
0
12 Sep 2023
BHASA: A Holistic Southeast Asian Linguistic and Cultural Evaluation Suite for Large Language Models
Wei Qi Leong
Jian Gang Ngui
Yosephine Susanto
Hamsawardhini Rengarajan
Kengatharaiyer Sarveswaran
William-Chandra Tjhi
40
9
0
12 Sep 2023
MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning
Xiang Yue
Xingwei Qu
Ge Zhang
Yao Fu
Wenhao Huang
Huan Sun
Yu-Chuan Su
Wenhu Chen
AIMat
LRM
85
378
0
11 Sep 2023
Incorporating Pre-trained Model Prompting in Multimodal Stock Volume Movement Prediction
Ruibo Chen
Zhiyuan Zhang
Yi Liu
Ruihan Bao
Keiko Harimoto
Xu Sun
AIFin
AI4TS
49
0
0
11 Sep 2023
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs
Wenhua Cheng
Weiwei Zhang
Haihao Shen
Yiyang Cai
Xin He
Kaokao Lv
Yi Liu
MQ
53
24
0
11 Sep 2023
Evaluating the Deductive Competence of Large Language Models
S. M. Seals
V. Shalin
ELM
ReLM
LRM
27
8
0
11 Sep 2023
DePT: Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning
Zhengxiang Shi
Aldo Lipani
VLM
51
31
0
11 Sep 2023
Chat2Brain: A Method for Mapping Open-Ended Semantic Queries to Brain Activation Maps
Yaonai Wei
Tuo Zhang
Han Zhang
Tianyang Zhong
Lin Zhao
...
Muheng Shang
Lei Du
Xiao Li
Tianming Liu
Jun-Feng Han
54
2
0
10 Sep 2023
Neurons in Large Language Models: Dead, N-gram, Positional
Elena Voita
Javier Ferrando
Christoforos Nalmpantis
MILM
69
51
0
09 Sep 2023
Towards Robust Model Watermark via Reducing Parametric Vulnerability
Guanhao Gan
Yiming Li
Dongxian Wu
Shu-Tao Xia
AAML
32
12
0
09 Sep 2023
EPA: Easy Prompt Augmentation on Large Language Models via Multiple Sources and Multiple Targets
Hongyuan Lu
Wai Lam
50
1
0
09 Sep 2023
Can NLP Models 'Identify', 'Distinguish', and 'Justify' Questions that Don't have a Definitive Answer?
Ayushi Agarwal
Nisarg Patel
Neeraj Varshney
Mihir Parmar
Pavan Mallina
Aryan Bhavin Shah
Srihari Sangaraju
Tirth Patel
Nihar Thakkar
Chitta Baral
ELM
45
3
0
08 Sep 2023
Context-Aware Prompt Tuning for Vision-Language Model with Dual-Alignment
Hongyu Hu
Tiancheng Lin
Jie Wang
Zhenbang Sun
Yi Xu
MLLM
VLM
VPVLM
41
1
0
08 Sep 2023
FLM-101B: An Open LLM and How to Train It with $100K Budget
Xiang Li
Yiqun Yao
Xin Jiang
Xuezhi Fang
Xuying Meng
...
Li Du
Bowen Qin
Zheng Zhang
Aixin Sun
Yequan Wang
62
22
0
07 Sep 2023
From Base to Conversational: Japanese Instruction Dataset and Tuning Large Language Models
Masahiro Suzuki
Masanori Hirano
Hiroki Sakaji
56
6
0
07 Sep 2023
J-Guard: Journalism Guided Adversarially Robust Detection of AI-generated News
Tharindu Kumarage
Amrita Bhattacharjee
Djordje Padejski
Kristy Roschke
Dan Gillmor
Scott W. Ruston
Huan Liu
Joshua Garland
DeLMO
42
10
0
06 Sep 2023
Norm Tweaking: High-performance Low-bit Quantization of Large Language Models
Liang Li
Qingyuan Li
Bo Zhang
Xiangxiang Chu
MQ
52
29
0
06 Sep 2023
GPT Can Solve Mathematical Problems Without a Calculator
Zhiyong Yang
Ming Ding
Qingsong Lv
Zhihuan Jiang
Zehai He
Yuyi Guo
Jinfeng Bai
Jie Tang
RALM
LRM
52
53
0
06 Sep 2023
HC3 Plus: A Semantic-Invariant Human ChatGPT Comparison Corpus
Zhenpeng Su
Xing Wu
Wei Zhou
Guangyuan Ma
Song Hu
DeLMO
33
13
0
06 Sep 2023
Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning
L. Yu
Bowen Shi
Ramakanth Pasunuru
Benjamin Muller
O. Yu. Golovneva
...
Yaniv Taigman
Maryam Fazel-Zarandi
Asli Celikyilmaz
Luke Zettlemoyer
Armen Aghajanyan
MLLM
43
136
0
05 Sep 2023
CIEM: Contrastive Instruction Evaluation Method for Better Instruction Tuning
Hongyu Hu
Jiyuan Zhang
Minyi Zhao
Zhenbang Sun
MLLM
30
44
0
05 Sep 2023
PromptTTS 2: Describing and Generating Voices with Text Prompt
Yichong Leng
Zhifang Guo
Kai Shen
Xu Tan
Zeqian Ju
...
Lei He
Xiang-Yang Li
Sheng Zhao
Tao Qin
Jiang Bian
VLM
DiffM
67
44
0
05 Sep 2023
Data-Juicer: A One-Stop Data Processing System for Large Language Models
Daoyuan Chen
Yilun Huang
Zhijian Ma
Hesen Chen
Xuchen Pan
...
Zhaoyang Liu
Jinyang Gao
Yaliang Li
Bolin Ding
Jingren Zhou
SyDa
VLM
42
32
0
05 Sep 2023
NICE: CVPR 2023 Challenge on Zero-shot Image Captioning
Taehoon Kim
Pyunghwan Ahn
Sangyun Kim
Sihaeng Lee
Mark A Marsden
...
Yujin Wang
Yimu Wang
Tiancheng Gu
Xingchang Lv
Mingmao Sun
VLM
55
5
0
05 Sep 2023
QuantEase: Optimization-based Quantization for Language Models
Kayhan Behdin
Ayan Acharya
Aman Gupta
Qingquan Song
Siyu Zhu
S. Keerthi
Rahul Mazumder
MQ
35
21
0
05 Sep 2023
Softmax Bias Correction for Quantized Generative Models
N. Pandey
Marios Fournarakis
Chirag I. Patel
Markus Nagel
DiffM
30
11
0
04 Sep 2023
DeViL: Decoding Vision features into Language
Meghal Dani
Isabel Rio-Torto
Stephan Alaniz
Zeynep Akata
VLM
58
8
0
04 Sep 2023
Memory Efficient Optimizers with 4-bit States
Bingrui Li
Jianfei Chen
Jun Zhu
MQ
30
34
0
04 Sep 2023
Explainability for Large Language Models: A Survey
Haiyan Zhao
Hanjie Chen
Fan Yang
Ninghao Liu
Huiqi Deng
Hengyi Cai
Shuaiqiang Wang
Dawei Yin
Jundong Li
LRM
42
426
0
02 Sep 2023
eDKM: An Efficient and Accurate Train-time Weight Clustering for Large Language Models
Minsik Cho
Keivan Alizadeh Vahid
Qichen Fu
Saurabh N. Adya
C. C. D. Mundo
Mohammad Rastegari
Devang Naik
Peter Zatloukal
MQ
43
6
0
02 Sep 2023
Multilingual Text Representation
Fahim Faisal
32
0
0
02 Sep 2023
RenAIssance: A Survey into AI Text-to-Image Generation in the Era of Large Model
Fengxiang Bie
Yibo Yang
Zhongzhu Zhou
Adam Ghanem
Minjia Zhang
...
Pareesa Ameneh Golnari
David A. Clifton
Yuxiong He
Dacheng Tao
Shuaiwen Leon Song
EGVM
48
20
0
02 Sep 2023
Efficient RLHF: Reducing the Memory Usage of PPO
Michael Santacroce
Yadong Lu
Han Yu
Yuan-Fang Li
Yelong Shen
40
27
0
01 Sep 2023
FederatedScope-LLM: A Comprehensive Package for Fine-tuning Large Language Models in Federated Learning
Weirui Kuang
Bingchen Qian
Zitao Li
Daoyuan Chen
Dawei Gao
Xuchen Pan
Yuexiang Xie
Yaliang Li
Bolin Ding
Jingren Zhou
FedML
46
116
0
01 Sep 2023