Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2205.01068
Cited By
OPT: Open Pre-trained Transformer Language Models
2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
VLM
OSLM
AI4CE
Re-assign community
ArXiv
PDF
HTML
Papers citing
"OPT: Open Pre-trained Transformer Language Models"
50 / 2,456 papers shown
Title
Mind Your Bias: A Critical Review of Bias Detection Methods for Contextual Language Models
Silke Husse
Andreas Spitz
30
6
0
15 Nov 2022
Breadth-First Pipeline Parallelism
J. Lamy-Poirier
GNN
MoE
AI4CE
33
1
0
11 Nov 2022
Measuring Reliability of Large Language Models through Semantic Consistency
Harsh Raj
Domenic Rosati
S. Majumdar
HILM
27
30
0
10 Nov 2022
Collateral facilitation in humans and language models
J. Michaelov
Benjamin Bergen
25
11
0
09 Nov 2022
Grammatical Error Correction: A Survey of the State of the Art
Christopher Bryant
Zheng Yuan
Muhammad Reza Qorib
Hannan Cao
Hwee Tou Ng
Ted Briscoe
3DV
34
79
0
09 Nov 2022
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BigScience Workshop
:
Teven Le Scao
Angela Fan
Christopher Akiki
...
Zhongli Xie
Zifan Ye
M. Bras
Younes Belkada
Thomas Wolf
VLM
160
2,319
0
09 Nov 2022
Active Example Selection for In-Context Learning
Yiming Zhang
Shi Feng
Chenhao Tan
SILM
LRM
32
187
0
08 Nov 2022
Intriguing Properties of Compression on Multilingual Models
Kelechi Ogueji
Orevaoghene Ahia
Gbemileke Onilude
Sebastian Gehrmann
Sara Hooker
Julia Kreutzer
31
12
0
04 Nov 2022
MolE: a molecular foundation model for drug discovery
Oscar Méndez-Lucio
C. Nicolaou
Berton Earnshaw
20
11
0
03 Nov 2022
LMentry: A Language Model Benchmark of Elementary Language Tasks
Avia Efrat
Or Honovich
Omer Levy
34
20
0
03 Nov 2022
Large Language Models Are Human-Level Prompt Engineers
Yongchao Zhou
Andrei Ioan Muresanu
Ziwen Han
Keiran Paster
Silviu Pitis
Harris Chan
Jimmy Ba
ALM
LLMAG
21
839
0
03 Nov 2022
SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control
Xiaochuang Han
Sachin Kumar
Yulia Tsvetkov
47
80
0
31 Oct 2022
GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers
Elias Frantar
Saleh Ashkboos
Torsten Hoefler
Dan Alistarh
MQ
33
905
0
31 Oct 2022
A Solvable Model of Neural Scaling Laws
A. Maloney
Daniel A. Roberts
J. Sully
52
51
0
30 Oct 2022
Class Based Thresholding in Early Exit Semantic Segmentation Networks
Alperen Görmez
Erdem Koyuncu
23
5
0
27 Oct 2022
TRScore: A Novel GPT-based Readability Scorer for ASR Segmentation and Punctuation model evaluation and selection
Piyush Behre
S.S. Tan
A. Shah
Harini Kesavamoorthy
Shuangyu Chang
Fei Zuo
C. Basoglu
Sayan D. Pathak
21
0
0
27 Oct 2022
Personalized Dialogue Generation with Persona-Adaptive Attention
Qiushi Huang
Yu Zhang
Tom Ko
Xubo Liu
Boyong Wu
Wenwu Wang
Lilian H. Y. Tang
34
19
0
27 Oct 2022
Multi-lingual Evaluation of Code Generation Models
Ben Athiwaratkun
Sanjay Krishna Gouda
Zijian Wang
Xiaopeng Li
Yuchen Tian
...
Baishakhi Ray
Parminder Bhatia
Sudipta Sengupta
Dan Roth
Bing Xiang
ELM
120
160
0
26 Oct 2022
RoMQA: A Benchmark for Robust, Multi-evidence, Multi-answer Question Answering
Victor Zhong
Weijia Shi
Wen-tau Yih
Luke Zettlemoyer
17
19
0
25 Oct 2022
Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models
Hong Liu
Sang Michael Xie
Zhiyuan Li
Tengyu Ma
AI4CE
42
50
0
25 Oct 2022
Weakly Supervised Data Augmentation Through Prompting for Dialogue Understanding
Maximillian Chen
Alexandros Papangelis
Chenyang Tao
Andrew Rosenbaum
Seokhwan Kim
Yang Liu
Zhou Yu
Dilek Z. Hakkani-Tür
44
33
0
25 Oct 2022
Contrastive Search Is What You Need For Neural Text Generation
Yixuan Su
Nigel Collier
25
50
0
25 Oct 2022
Towards Better Few-Shot and Finetuning Performance with Forgetful Causal Language Models
Hao Liu
Xinyang Geng
Lisa Lee
Igor Mordatch
Sergey Levine
Sharan Narang
Pieter Abbeel
KELM
CLL
35
2
0
24 Oct 2022
Neural Theory-of-Mind? On the Limits of Social Intelligence in Large LMs
Maarten Sap
Ronan Le Bras
Daniel Fried
Yejin Choi
32
210
0
24 Oct 2022
The Curious Case of Absolute Position Embeddings
Koustuv Sinha
Amirhossein Kazemnejad
Siva Reddy
J. Pineau
Dieuwke Hupkes
Adina Williams
87
15
0
23 Oct 2022
Augmentation with Projection: Towards an Effective and Efficient Data Augmentation Paradigm for Distillation
Ziqi Wang
Yuexin Wu
Frederick Liu
Daogao Liu
Le Hou
Hongkun Yu
Jing Li
Heng Ji
45
5
0
21 Oct 2022
SMaLL-100: Introducing Shallow Multilingual Machine Translation Model for Low-Resource Languages
Alireza Mohammadshahi
Vassilina Nikoulina
Alexandre Berard
Caroline Brun
James Henderson
Laurent Besacier
VLM
MoE
LRM
29
20
0
20 Oct 2022
lo-fi: distributed fine-tuning without communication
Mitchell Wortsman
Suchin Gururangan
Shen Li
Ali Farhadi
Ludwig Schmidt
Michael G. Rabbat
Ari S. Morcos
39
24
0
19 Oct 2022
Attribution and Obfuscation of Neural Text Authorship: A Data Mining Perspective
Adaku Uchendu
Thai Le
Dongwon Lee
DeLMO
34
41
0
19 Oct 2022
Prompting GPT-3 To Be Reliable
Chenglei Si
Zhe Gan
Zhengyuan Yang
Shuohang Wang
Jianfeng Wang
Jordan L. Boyd-Graber
Lijuan Wang
KELM
LRM
62
283
0
17 Oct 2022
Machine Generated Text: A Comprehensive Survey of Threat Models and Detection Methods
Evan Crothers
Nathalie Japkowicz
H. Viktor
DeLMO
52
107
0
13 Oct 2022
Visual Classification via Description from Large Language Models
Sachit Menon
Carl Vondrick
VLM
38
291
0
13 Oct 2022
On Divergence Measures for Bayesian Pseudocoresets
Balhae Kim
J. Choi
Seanie Lee
Yoonho Lee
Jung-Woo Ha
Juho Lee
DD
19
11
0
12 Oct 2022
Generating Executable Action Plans with Environmentally-Aware Language Models
Maitrey Gramopadhye
D. Szafir
LM&Ro
LLMAG
30
22
0
10 Oct 2022
AlphaTuning: Quantization-Aware Parameter-Efficient Adaptation of Large-Scale Pre-Trained Language Models
S. Kwon
Jeonghoon Kim
Jeongin Bae
Kang Min Yoo
Jin-Hwa Kim
Baeseong Park
Byeongwook Kim
Jung-Woo Ha
Nako Sung
Dongsoo Lee
MQ
38
30
0
08 Oct 2022
Few-Shot Anaphora Resolution in Scientific Protocols via Mixtures of In-Context Experts
Nghia T. Le
Fan Bai
Alan Ritter
42
12
0
07 Oct 2022
State-of-the-art generalisation research in NLP: A taxonomy and review
Dieuwke Hupkes
Mario Giulianelli
Verna Dankers
Mikel Artetxe
Yanai Elazar
...
Leila Khalatbari
Maria Ryskina
Rita Frieske
Ryan Cotterell
Zhijing Jin
129
95
0
06 Oct 2022
Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners
Seonghyeon Ye
Doyoung Kim
Joel Jang
Joongbo Shin
Minjoon Seo
FedML
VLM
UQCV
LRM
26
25
0
06 Oct 2022
Large Language Models are Pretty Good Zero-Shot Video Game Bug Detectors
Mohammad Reza Taesiri
Finlay Macklon
Yihe Wang
Hengshuo Shen
Cor-Paul Bezemer
ELM
LLMAG
MLLM
47
13
0
05 Oct 2022
Ask Me Anything: A simple strategy for prompting language models
Simran Arora
A. Narayan
Mayee F. Chen
Laurel J. Orr
Neel Guha
Kush S. Bhatia
Ines Chami
Frederic Sala
Christopher Ré
ReLM
LRM
235
208
0
05 Oct 2022
GLM-130B: An Open Bilingual Pre-trained Model
Aohan Zeng
Xiao Liu
Zhengxiao Du
Zihan Wang
Hanyu Lai
...
Jidong Zhai
Wenguang Chen
Peng Zhang
Yuxiao Dong
Jie Tang
BDL
LRM
275
1,077
0
05 Oct 2022
Robot Task Planning and Situation Handling in Open Worlds
Yan Ding
Xiaohan Zhang
S. Amiri
Nieqing Cao
Hao Yang
Chad Esselink
Shiqi Zhang
LM&Ro
31
19
0
04 Oct 2022
Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks
Zhenhailong Wang
Xiaoman Pan
Dian Yu
Dong Yu
Jianshu Chen
Heng Ji
VLM
48
9
0
01 Oct 2022
AudioGen: Textually Guided Audio Generation
Felix Kreuk
Gabriel Synnaeve
Adam Polyak
Uriel Singer
Alexandre Défossez
Jade Copet
Devi Parikh
Yaniv Taigman
Yossi Adi
DiffM
27
290
0
30 Sep 2022
SmallCap: Lightweight Image Captioning Prompted with Retrieval Augmentation
R. Ramos
Bruno Martins
Desmond Elliott
Yova Kementchedjhieva
VLM
35
86
0
30 Sep 2022
Bidirectional Language Models Are Also Few-shot Learners
Ajay Patel
Bryan Li
Mohammad Sadegh Rasooli
Noah Constant
Colin Raffel
Chris Callison-Burch
LRM
70
45
0
29 Sep 2022
Deep Generative Multimedia Children's Literature
Matthew Lyle Olson
24
0
0
27 Sep 2022
Learning to Drop Out: An Adversarial Approach to Training Sequence VAEs
Ðorðe Miladinovic
Kumar Shridhar
Kushal Kumar Jain
Max B. Paulus
J. M. Buhmann
Mrinmaya Sachan
Carl Allen
DRL
30
5
0
26 Sep 2022
Moral Mimicry: Large Language Models Produce Moral Rationalizations Tailored to Political Identity
Gabriel Simmons
108
57
0
24 Sep 2022
Variational Open-Domain Question Answering
Valentin Liévin
Andreas Geert Motzfeldt
Ida Riis Jensen
Ole Winther
OOD
BDL
41
8
0
23 Sep 2022
Previous
1
2
3
...
47
48
49
50
Next