arXiv:2205.01068
OPT: Open Pre-trained Transformer Language Models
2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
VLM
OSLM
AI4CE
Papers citing "OPT: Open Pre-trained Transformer Language Models"
50 / 2,456 papers shown
Title
SCM: Enhancing Large Language Model with Self-Controlled Memory Framework
Bin Wang
Xinnian Liang
Jian Yang
Huijia Huang
Shuangzhi Wu
Peihao Wu
Lu Lu
Zejun Ma
Zhoujun Li
LLMAG
KELM
RALM
98
26
0
26 Apr 2023
Stable and low-precision training for large-scale vision-language models
Mitchell Wortsman
Tim Dettmers
Luke Zettlemoyer
Ari S. Morcos
Ali Farhadi
Ludwig Schmidt
MQ
MLLM
VLM
29
39
0
25 Apr 2023
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
Rongjie Huang
Mingze Li
Dongchao Yang
Jiatong Shi
Xuankai Chang
...
Jia-Bin Huang
Jinglin Liu
Yixiang Ren
Zhou Zhao
Shinji Watanabe
LM&MA
AuLLM
51
203
0
25 Apr 2023
PEFT-Ref: A Modular Reference Architecture and Typology for Parameter-Efficient Finetuning Techniques
Mohammed Sabry
Anya Belz
48
8
0
24 Apr 2023
Better Question-Answering Models on a Budget
Yudhanjaya Wijeratne
Ishan Marikar
ALM
28
0
0
24 Apr 2023
LLM+P: Empowering Large Language Models with Optimal Planning Proficiency
B. Liu
Yuqian Jiang
Xiaohan Zhang
Qian Liu
Shiqi Zhang
Joydeep Biswas
Peter Stone
LM&Ro
LLMAG
34
388
0
22 Apr 2023
Pipeline MoE: A Flexible MoE Implementation with Pipeline Parallelism
Xin Chen
Hengheng Zhang
Xiaotao Gu
Kaifeng Bi
Lingxi Xie
Qi Tian
MoE
22
4
0
22 Apr 2023
ChatABL: Abductive Learning via Natural Language Interaction with ChatGPT
Tianyang Zhong
Yaonai Wei
Li Yang
Zihao Wu
Zheng Liu
...
Xi Jiang
Jun-Feng Han
Dinggang Shen
Tianming Liu
Tuo Zhang
LRM
24
27
0
21 Apr 2023
MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models
Deyao Zhu
Jun Chen
Xiaoqian Shen
Xiang Li
Mohamed Elhoseiny
VLM
MLLM
75
1,922
0
20 Apr 2023
Learning to Plan with Natural Language
Yiduo Guo
Yaobo Liang
Chenfei Wu
Wenshan Wu
Dongyan Zhao
Nan Duan
LLMAG
LRM
42
6
0
20 Apr 2023
Attention Scheme Inspired Softmax Regression
Yichuan Deng
Zhihang Li
Zhao Song
49
42
0
20 Apr 2023
Scaling Transformer to 1M tokens and beyond with RMT
Aydar Bulatov
Yuri Kuratov
Yermek Kapushev
Andrey Kravchenko
LRM
27
87
0
19 Apr 2023
A Theory on Adam Instability in Large-Scale Machine Learning
Igor Molybog
Peter Albert
Moya Chen
Zach DeVito
David Esiobu
...
Puxin Xu
Yuchen Zhang
Melanie Kambadur
Stephen Roller
Susan Zhang
AI4CE
33
30
0
19 Apr 2023
Outlier Suppression+: Accurate quantization of large language models by equivalent and optimal shifting and scaling
Xiuying Wei
Yunchen Zhang
Yuhang Li
Xiangguo Zhang
Ruihao Gong
Jian Ren
Zhengang Li
MQ
27
31
0
18 Apr 2023
An Evaluation on Large Language Model Outputs: Discourse and Memorization
Adrian de Wynter
Xun Wang
Alex Sokolov
Qilong Gu
Si-Qing Chen
ELM
90
32
0
17 Apr 2023
Visual Instruction Tuning
Haotian Liu
Chunyuan Li
Qingyang Wu
Yong Jae Lee
SyDa
VLM
MLLM
190
4,325
0
17 Apr 2023
LongForm: Effective Instruction Tuning with Reverse Instructions
Abdullatif Köksal
Timo Schick
Anna Korhonen
Hinrich Schütze
SyDa
ALM
31
34
0
17 Apr 2023
Supporting Qualitative Analysis with Large Language Models: Combining Codebook with GPT-3 for Deductive Coding
Ziang Xiao
Xingdi Yuan
Q. V. Liao
Rania Abdelghani
Pierre-Yves Oudeyer
27
135
0
17 Apr 2023
Towards Better Instruction Following Language Models for Chinese: Investigating the Impact of Training Data and Evaluation
Yunjie Ji
Yan Gong
Yong Deng
Yiping Peng
Qiang Niu
Baochang Ma
Xiangang Li
ALM
ELM
30
22
0
16 Apr 2023
On the Opportunities and Challenges of Foundation Models for Geospatial Artificial Intelligence
Gengchen Mai
Weiming Huang
Jin Sun
Suhang Song
Deepak Mishra
...
Yingjie Hu
Chris Cundy
Ziyuan Li
Rui Zhu
Ni Lao
AI4CE
40
123
0
13 Apr 2023
ChatGPT Needs SPADE (Sustainability, PrivAcy, Digital divide, and Ethics) Evaluation: A Review
Sunder Ali Khowaja
P. Khuwaja
K. Dev
Weizheng Wang
Lewis Nkenyereye
29
76
0
13 Apr 2023
Solving Tensor Low Cycle Rank Approximation
Yichuan Deng
Yeqi Gao
Zhao Song
39
6
0
13 Apr 2023
Are LLMs All You Need for Task-Oriented Dialogue?
Vojtěch Hudeček
Ondřej Dušek
31
57
0
13 Apr 2023
AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models
Wanjun Zhong
Ruixiang Cui
Yiduo Guo
Yaobo Liang
Shuai Lu
Yanlin Wang
Amin Saied
Weizhu Chen
Nan Duan
ALM
ELM
23
496
0
13 Apr 2023
ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation
Jiazheng Xu
Xiao Liu
Yuchen Wu
Yuxuan Tong
Qinkai Li
Ming Ding
Jie Tang
Yuxiao Dong
63
325
0
12 Apr 2023
HiPrompt: Few-Shot Biomedical Knowledge Fusion via Hierarchy-Oriented Prompting
Jiaying Lu
Jiaming Shen
Bo Xiong
Wenjing Ma
Steffen Staab
Carl Yang
32
11
0
12 Apr 2023
ChatGPT Beyond English: Towards a Comprehensive Evaluation of Large Language Models in Multilingual Learning
Viet Dac Lai
Nghia Trung Ngo
Amir Pouran Ben Veyseh
Hieu Man
Franck Dernoncourt
Trung Bui
Thien Huu Nguyen
ELM
LM&MA
35
271
0
12 Apr 2023
User Adaptive Language Learning Chatbots with a Curriculum
Kun Qian
Ryan Shea
Yu Li
Luke K. Fryer
Zhou Yu
35
12
0
11 Apr 2023
Multilingual Machine Translation with Large Language Models: Empirical Results and Analysis
Wenhao Zhu
Hongyi Liu
Qingxiu Dong
Jingjing Xu
Shujian Huang
Lingpeng Kong
Jiajun Chen
Lei Li
LRM
45
142
0
10 Apr 2023
Randomized and Deterministic Attention Sparsification Algorithms for Over-parameterized Feature Dimension
Yichuan Deng
Sridhar Mahadevan
Zhao Song
22
35
0
10 Apr 2023
OpenAGI: When LLM Meets Domain Experts
Yingqiang Ge
Wenyue Hua
Kai Mei
Jianchao Ji
Juntao Tan
Shuyuan Xu
Zelong Li
Yongfeng Zhang
VLM
LRM
57
214
0
10 Apr 2023
Is ChatGPT a Good Sentiment Analyzer? A Preliminary Study
Zengzhi Wang
Qiming Xie
Yi Feng
Zixiang Ding
Zinong Yang
Rui Xia
AI4MH
LLMAG
32
148
0
10 Apr 2023
A Preliminary Evaluation of ChatGPT for Zero-shot Dialogue Understanding
Wenbo Pan
Qiguang Chen
Xiao Xu
Wanxiang Che
Libo Qin
34
44
0
09 Apr 2023
Decoder-Only or Encoder-Decoder? Interpreting Language Model as a Regularized Encoder-Decoder
Z. Fu
W. Lam
Qian Yu
Anthony Man-Cho So
Shengding Hu
Zhiyuan Liu
Nigel Collier
AuLLM
42
41
0
08 Apr 2023
From Retrieval to Generation: Efficient and Effective Entity Set Expansion
Shulin Huang
Shirong Ma
Yongqian Li
Hai-Tao Zheng
Yong-jia Jiang
Haitao Zheng
Ying Shen
39
3
0
07 Apr 2023
Instruction Tuning with GPT-4
Baolin Peng
Chunyuan Li
Pengcheng He
Michel Galley
Jianfeng Gao
SyDa
ALM
LM&MA
171
591
0
06 Apr 2023
Cerebras-GPT: Open Compute-Optimal Language Models Trained on the Cerebras Wafer-Scale Cluster
Nolan Dey
Gurpreet Gosal
Zhiming (Charles) Chen
Hemant Khachane
William Marshall
Ribhu Pathria
Marvin Tom
Joel Hestness
MoE
LRM
25
100
0
06 Apr 2023
Zero-Shot Next-Item Recommendation using Large Pretrained Language Models
Lei Wang
Ee-Peng Lim
LRM
28
54
0
06 Apr 2023
Conceptual structure coheres in human cognition but not in large language models
Siddharth Suresh
Kushin Mukherjee
Xizheng Yu
Wei-Chun Huang
Lisa Padua
Timothy T. Rogers
42
11
0
05 Apr 2023
Adopting Two Supervisors for Efficient Use of Large-Scale Remote Deep Neural Networks
Michael Weiss
Paolo Tonella
AI4CE
21
0
0
05 Apr 2023
Effective Theory of Transformers at Initialization
Emily Dinan
Sho Yaida
Susan Zhang
32
14
0
04 Apr 2023
LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models
Zhiqiang Hu
Lei Wang
Yihuai Lan
Wanyu Xu
Ee-Peng Lim
Lidong Bing
Xing Xu
Soujanya Poria
Roy Ka-wei Lee
ALM
51
238
0
04 Apr 2023
Resources and Few-shot Learners for In-context Learning in Slavic Languages
Michal Štefánik
Marek Kadlčík
Piotr Gramacki
Petr Sojka
31
3
0
04 Apr 2023
Mastering Symbolic Operations: Augmenting Language Models with Compiled Neural Networks
Yixuan Weng
Minjun Zhu
Fei Xia
Bin Li
Shizhu He
Kang Liu
Jun Zhao
36
5
0
04 Apr 2023
Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling
Stella Biderman
Hailey Schoelkopf
Quentin G. Anthony
Herbie Bradley
Kyle O'Brien
...
USVSN Sai Prashanth
Edward Raff
Aviya Skowron
Lintang Sutawika
Oskar van der Wal
36
1,185
0
03 Apr 2023
RPTQ: Reorder-based Post-training Quantization for Large Language Models
Zhihang Yuan
Lin Niu
Jia-Wen Liu
Wenyu Liu
Xinggang Wang
Yuzhang Shang
Guangyu Sun
Qiang Wu
Jiaxiang Wu
Bingzhe Wu
MQ
35
79
0
03 Apr 2023
Can the Inference Logic of Large Language Models be Disentangled into Symbolic Concepts?
Wen Shen
Lei Cheng
Yuxiao Yang
Mingjie Li
Quanshi Zhang
LRM
43
8
0
03 Apr 2023
Exploring the Use of Large Language Models for Reference-Free Text Quality Evaluation: An Empirical Study
Yi Chen
Rui Wang
Haiyun Jiang
Shuming Shi
Ruifeng Xu
LM&MA
41
75
0
03 Apr 2023
LLMMaps -- A Visual Metaphor for Stratified Evaluation of Large Language Models
Patrik Puchert
Poonam Poonam
Christian van Onzenoodt
Timo Ropinski
28
8
0
02 Apr 2023
Evaluating Large Language Models on a Highly-specialized Topic, Radiation Oncology Physics
J. Holmes
Zheng Liu
Lian-Cheng Zhang
Yuzhen Ding
Terence T. Sio
...
Jonathan B. Ashman
Xiang Li
Tianming Liu
Jiajian Shen
Wen Liu
LM&MA
AI4CE
ELM
35
120
0
01 Apr 2023