2205.01068
OPT: Open Pre-trained Transformer Language Models
2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
VLM
OSLM
AI4CE
Papers citing
"OPT: Open Pre-trained Transformer Language Models"
50 / 2,460 papers shown
Title
Prompt-Guided Retrieval Augmentation for Non-Knowledge-Intensive Tasks
Zhicheng Guo
Sijie Cheng
Yile Wang
Peng Li
Yang Liu
RALM
30
18
0
28 May 2023
Integrating Action Knowledge and LLMs for Task Planning and Situation Handling in Open Worlds
Yan Ding
Xiaohan Zhang
S. Amiri
Nieqing Cao
Hao Yang
Andy Kaminski
Chad Esselink
Shiqi Zhang
LM&Ro
40
49
0
27 May 2023
The Curse of Recursion: Training on Generated Data Makes Models Forget
Ilia Shumailov
Zakhar Shumaylov
Yiren Zhao
Y. Gal
Nicolas Papernot
Ross J. Anderson
DiffM
31
285
0
27 May 2023
CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers
Dachuan Shi
Chaofan Tao
Anyi Rao
Zhendong Yang
Chun Yuan
Jiaqi Wang
VLM
45
22
0
27 May 2023
Query-Efficient Black-Box Red Teaming via Bayesian Optimization
Deokjae Lee
JunYeong Lee
Jung-Woo Ha
Jin-Hwa Kim
Sang-Woo Lee
Hwaran Lee
Hyun Oh Song
AAML
32
23
0
27 May 2023
DNA-GPT: Divergent N-Gram Analysis for Training-Free Detection of GPT-Generated Text
Xianjun Yang
Wei Cheng
Yue Wu
Linda R. Petzold
William Yang Wang
Haifeng Chen
DeLMO
39
89
0
27 May 2023
Fine-Tuning Language Models with Just Forward Passes
Sadhika Malladi
Tianyu Gao
Eshaan Nichani
Alexandru Damian
Jason D. Lee
Danqi Chen
Sanjeev Arora
48
180
0
27 May 2023
Augmentation-Adapted Retriever Improves Generalization of Language Models as Generic Plug-In
Zichun Yu
Chenyan Xiong
S. Yu
Zhiyuan Liu
KELM
VLM
35
62
0
27 May 2023
Generating Images with Multimodal Language Models
Jing Yu Koh
Daniel Fried
Ruslan Salakhutdinov
MLLM
44
243
0
26 May 2023
Large Language Models as Tool Makers
Tianle Cai
Xuezhi Wang
Tengyu Ma
Xinyun Chen
Denny Zhou
LLMAG
42
192
0
26 May 2023
Scissorhands: Exploiting the Persistence of Importance Hypothesis for LLM KV Cache Compression at Test Time
Zichang Liu
Aditya Desai
Fangshuo Liao
Weitao Wang
Victor Xie
Zhaozhuo Xu
Anastasios Kyrillidis
Anshumali Shrivastava
33
204
0
26 May 2023
NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models
Gengze Zhou
Yicong Hong
Qi Wu
ELM
LM&Ro
LLMAG
LRM
30
143
0
26 May 2023
MixCE: Training Autoregressive Language Models by Mixing Forward and Reverse Cross-Entropies
Shiyue Zhang
Shijie Wu
Ozan Irsoy
Steven Lu
Joey Tianyi Zhou
Mark Dredze
David S. Rosenberg
29
9
0
26 May 2023
Few-shot Fine-tuning vs. In-context Learning: A Fair Comparison and Evaluation
Marius Mosbach
Tiago Pimentel
Shauli Ravfogel
Dietrich Klakow
Yanai Elazar
55
124
0
26 May 2023
On Evaluating Adversarial Robustness of Large Vision-Language Models
Yunqing Zhao
Tianyu Pang
Chao Du
Xiao Yang
Chongxuan Li
Ngai-man Cheung
Min Lin
VLM
AAML
MLLM
35
166
0
26 May 2023
MultiTool-CoT: GPT-3 Can Use Multiple External Tools with Chain of Thought Prompting
Tatsuro Inaba
Hirokazu Kiyomaru
Fei Cheng
Sadao Kurohashi
KELM
LRM
32
23
0
26 May 2023
Parameter-Efficient Fine-Tuning without Introducing New Latency
Baohao Liao
Yan Meng
Christof Monz
24
49
0
26 May 2023
Efficient Detection of LLM-generated Texts with a Bayesian Surrogate Model
Yibo Miao
Hongcheng Gao
Hao Zhang
Zhijie Deng
DeLMO
38
20
0
26 May 2023
On the Tool Manipulation Capability of Open-source Large Language Models
Qiantong Xu
Fenglu Hong
Yangqiu Song
Changran Hu
Zheng Chen
Jian Zhang
LLMAG
35
70
0
25 May 2023
Scaling Data-Constrained Language Models
Niklas Muennighoff
Alexander M. Rush
Boaz Barak
Teven Le Scao
Aleksandra Piktus
Nouamane Tazi
S. Pyysalo
Thomas Wolf
Colin Raffel
ALM
45
203
0
25 May 2023
Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer
Yuandong Tian
Yiping Wang
Beidi Chen
S. Du
MLT
36
73
0
25 May 2023
Automated Tensor Model Parallelism with Overlapped Communication for Efficient Foundation Model Training
Shengwei Li
Zhiquan Lai
Yanqi Hao
Weijie Liu
Ke-shi Ge
Xiaoge Deng
Dongsheng Li
KaiCheng Lu
24
10
0
25 May 2023
ChatBridge: Bridging Modalities with Large Language Model as a Language Catalyst
Zijia Zhao
Longteng Guo
Tongtian Yue
Si-Qing Chen
Shuai Shao
Xinxin Zhu
Zehuan Yuan
Jing Liu
MLLM
45
53
0
25 May 2023
Self-contradictory Hallucinations of Large Language Models: Evaluation, Detection and Mitigation
Niels Mündler
Jingxuan He
Slobodan Jenko
Martin Vechev
HILM
22
109
0
25 May 2023
RewriteLM: An Instruction-Tuned Large Language Model for Text Rewriting
Lei Shu
Liangchen Luo
Jayakumar Hoskere
Yun Zhu
Canoee Liu
Simon Tong
Jindong Chen
Lei Meng
KELM
LRM
40
44
0
25 May 2023
Flocks of Stochastic Parrots: Differentially Private Prompt Learning for Large Language Models
Haonan Duan
Adam Dziedzic
Nicolas Papernot
Franziska Boenisch
AAML
24
61
0
24 May 2023
The Larger They Are, the Harder They Fail: Language Models do not Recognize Identifier Swaps in Python
Antonio Valerio Miceli Barone
Fazl Barez
Ioannis Konstas
Shay B. Cohen
27
32
0
24 May 2023
Towards Revealing the Mystery behind Chain of Thought: A Theoretical Perspective
Guhao Feng
Bohang Zhang
Yuntian Gu
Haotian Ye
Di He
Liwei Wang
LRM
47
225
0
24 May 2023
Gorilla: Large Language Model Connected with Massive APIs
Shishir G. Patil
Tianjun Zhang
Xin Wang
Joseph E. Gonzalez
ELM
CLL
ALM
SyDa
42
525
0
24 May 2023
Winner-Take-All Column Row Sampling for Memory Efficient Adaptation of Language Model
Zirui Liu
Guanchu Wang
Shaochen Zhong
Zhaozhuo Xu
Daochen Zha
...
Zhimeng Jiang
Kaixiong Zhou
V. Chaudhary
Shuai Xu
Xia Hu
52
12
0
24 May 2023
Revisiting Parallel Context Windows: A Frustratingly Simple Alternative and Chain-of-Thought Deterioration
Kejuan Yang
Xiao Liu
Kaiwen Men
Aohan Zeng
Yuxiao Dong
Jie Tang
LLMAG
LRM
29
3
0
24 May 2023
Spoken Question Answering and Speech Continuation Using Spectrogram-Powered LLM
Eliya Nachmani
Alon Levkovitch
Roy Hirsch
Julián Salazar
Chulayutsh Asawaroengchai
Soroosh Mariooryad
Ehud Rivlin
RJ Skerry-Ryan
Michelle Tadmor Ramanovich
AuLLM
39
35
0
24 May 2023
SAIL: Search-Augmented Instruction Learning
Hongyin Luo
Yung-Sung Chuang
Yuan Gong
Tianhua Zhang
Yoon Kim
Xixin Wu
D. Fox
Helen Meng
James R. Glass
ALM
LRM
RALM
41
23
0
24 May 2023
Visually-Situated Natural Language Understanding with Contrastive Reading Model and Frozen Large Language Models
Geewook Kim
Hodong Lee
D. Kim
Haeji Jung
S. Park
Yoon Kim
Sangdoo Yun
Taeho Kil
Bado Lee
Seunghyun Park
VLM
53
4
0
24 May 2023
PathAsst: A Generative Foundation AI Assistant Towards Artificial General Intelligence of Pathology
Yuxuan Sun
Chenglu Zhu
S. Zheng
Kai Zhang
Xiaoxuan Yu
Zhongyi Shui
Yunlong Zhang
Honglin Li
Lin Yang
LM&MA
MedIm
24
44
0
24 May 2023
AutoPlan: Automatic Planning of Interactive Decision-Making Tasks With Large Language Models
Siqi Ouyang
Lei Li
LM&Ro
LLMAG
22
9
0
24 May 2023
ImageNetVC: Zero- and Few-Shot Visual Commonsense Evaluation on 1000 ImageNet Categories
Heming Xia
Qingxiu Dong
Lei Li
Jingjing Xu
Tianyu Liu
Ziwei Qin
Zhifang Sui
MLLM
VLM
18
3
0
24 May 2023
Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models
Gen Luo
Yiyi Zhou
Tianhe Ren
Shen Chen
Xiaoshuai Sun
Rongrong Ji
VLM
MLLM
31
91
0
24 May 2023
A RelEntLess Benchmark for Modelling Graded Relations between Named Entities
Asahi Ushio
Jose Camacho-Collados
Steven Schockaert
34
1
0
24 May 2023
The Art of SOCRATIC QUESTIONING: Recursive Thinking with Large Language Models
Jingyuan Qi
Zhiyang Xu
Ying Shen
Minqian Liu
Dingnan Jin
Qifan Wang
Lifu Huang
ReLM
LRM
KELM
27
11
0
24 May 2023
LAraBench: Benchmarking Arabic AI with Large Language Models
Ahmed Abdelali
Hamdy Mubarak
Shammur A. Chowdhury
Maram Hasanain
Basel Mousi
...
Yousseif Elshahawy
Ahmed M. Ali
Nadir Durrani
Natasa Milic-Frayling
Firoj Alam
ELM
LM&MA
25
19
0
24 May 2023
GPTAraEval: A Comprehensive Evaluation of ChatGPT on Arabic NLP
Md. Tawkat Islam Khondaker
Abdul Waheed
El Moatez Billah Nagoudi
Muhammad Abdul-Mageed
ELM
LM&MA
32
63
0
24 May 2023
Tricking LLMs into Disobedience: Formalizing, Analyzing, and Detecting Jailbreaks
Abhinav Rao
S. Vashistha
Atharva Naik
Somak Aditya
Monojit Choudhury
43
17
0
24 May 2023
How Predictable Are Large Language Model Capabilities? A Case Study on BIG-bench
Qinyuan Ye
Harvey Yiyun Fu
Xiang Ren
Robin Jia
ELM
26
22
0
24 May 2023
Do LLMs Understand Social Knowledge? Evaluating the Sociability of Large Language Models with SocKET Benchmark
Minje Choi
Jiaxin Pei
Sagar Kumar
Chang Shu
David Jurgens
ALM
LLMAG
45
70
0
24 May 2023
Improving Probability-based Prompt Selection Through Unified Evaluation and Analysis
Sohee Yang
Jonghyeon Kim
Joel Jang
Seonghyeon Ye
Hyunji Lee
Minjoon Seo
36
9
0
24 May 2023
Mitigating Temporal Misalignment by Discarding Outdated Facts
Michael J.Q. Zhang
Eunsol Choi
KELM
HILM
29
17
0
24 May 2023
Estimating Large Language Model Capabilities without Labeled Test Data
Harvey Yiyun Fu
Qinyuan Ye
Albert Xu
Xiang Ren
Robin Jia
28
8
0
24 May 2023
Adapting Language Models to Compress Contexts
Alexis Chevalier
Alexander Wettig
Anirudh Ajith
Danqi Chen
LLMAG
16
176
0
24 May 2023
Alt-Text with Context: Improving Accessibility for Images on Twitter
Nikita Srivatsan
Sofia Samaniego
Omar U. Florez
Taylor Berg-Kirkpatrick
28
3
0
24 May 2023