2205.01068
Cited By
OPT: Open Pre-trained Transformer Language Models
2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
VLM
OSLM
AI4CE
Papers citing
"OPT: Open Pre-trained Transformer Language Models"
50 / 2,460 papers shown
Title
Gen-Z: Generative Zero-Shot Text Classification with Contextualized Label Descriptions
Sachin Kumar
Chan Young Park
Yulia Tsvetkov
VLM
34
2
0
13 Nov 2023
Towards the Law of Capacity Gap in Distilling Language Models
Chen Zhang
Dawei Song
Zheyu Ye
Yan Gao
ELM
43
20
0
13 Nov 2023
ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models
İlker Kesen
Andrea Pedrotti
Mustafa Dogan
Michele Cafagna
Emre Can Acikgoz
...
Iacer Calixto
Anette Frank
Albert Gatt
Aykut Erdem
Erkut Erdem
41
15
0
13 Nov 2023
Tunable Soft Prompts are Messengers in Federated Learning
Chenhe Dong
Yuexiang Xie
Bolin Ding
Ying Shen
Yaliang Li
FedML
56
7
0
12 Nov 2023
Detecting and Correcting Hate Speech in Multimodal Memes with Large Visual Language Model
Minh-Hao Van
Xintao Wu
VLM
MLLM
40
10
0
12 Nov 2023
Simple and Effective Input Reformulations for Translation
Brian Yu
Hansen Lillemark
Kurt Keutzer
44
0
0
12 Nov 2023
The Shape of Learning: Anisotropy and Intrinsic Dimensions in Transformer-Based Models
Anton Razzhigaev
Matvey Mikhalchuk
Elizaveta Goncharova
Ivan Oseledets
Denis Dimitrov
Andrey Kuznetsov
35
7
0
10 Nov 2023
Let's Reinforce Step by Step
Sarah Pan
Vladislav Lialin
Sherin Muckatira
Anna Rumshisky
ReLM
LRM
24
8
0
10 Nov 2023
A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions
Lei Huang
Weijiang Yu
Weitao Ma
Weihong Zhong
Zhangyin Feng
...
Qianglong Chen
Weihua Peng
Xiaocheng Feng
Bing Qin
Ting Liu
LRM
HILM
61
751
0
09 Nov 2023
PRODIGy: a PROfile-based DIalogue Generation dataset
Daniela Occhipinti
Serra Sinem Tekiroğlu
Marco Guerini
26
3
0
09 Nov 2023
Enhancing Computation Efficiency in Large Language Models through Weight and Activation Quantization
Jangwhan Lee
Minsoo Kim
Seungcheol Baek
Seok Joong Hwang
Wonyong Sung
Jungwook Choi
MQ
21
17
0
09 Nov 2023
Zero-shot Translation of Attention Patterns in VQA Models to Natural Language
Leonard Salewski
A. Sophia Koepke
Hendrik P. A. Lensch
Zeynep Akata
47
2
0
08 Nov 2023
Large GPT-like Models are Bad Babies: A Closer Look at the Relationship between Linguistic Competence and Psycholinguistic Measures
Julius Steuer
Marius Mosbach
Dietrich Klakow
30
10
0
08 Nov 2023
Evaluating multiple large language models in pediatric ophthalmology
J. Holmes
Rui Peng
Yiwei Li
Jinyu Hu
Zheng Liu
...
Wei Liu
Hong Wei
Jie Zou
Tianming Liu
Yi Shao
AI4Ed
ELM
LM&MA
29
0
0
07 Nov 2023
Black-Box Prompt Optimization: Aligning Large Language Models without Model Training
Jiale Cheng
Xiao Liu
Kehan Zheng
Pei Ke
Hongning Wang
Yuxiao Dong
Jie Tang
Minlie Huang
31
80
0
07 Nov 2023
Beyond Imitation: Leveraging Fine-grained Quality Signals for Alignment
Geyang Guo
Ranchi Zhao
Tianyi Tang
Wayne Xin Zhao
Ji-Rong Wen
ALM
45
28
0
07 Nov 2023
Detecting Any Human-Object Interaction Relationship: Universal HOI Detector with Spatial Prompt Learning on Foundation Models
Yichao Cao
Qingfei Tang
Xiu Su
Chen Song
Shan You
Xiaobo Lu
Chang Xu
38
21
0
07 Nov 2023
Instructed Language Models with Retrievers Are Powerful Entity Linkers
Zilin Xiao
Ming Gong
Jie Wu
Xingyao Zhang
Linjun Shou
Jian Pei
Daxin Jiang
LRM
42
12
0
06 Nov 2023
In-Context Learning for Knowledge Base Question Answering for Unmanned Systems based on Large Language Models
Yunlong Chen
Yaming Zhang
Jianfei Yu
Li Yang
Rui Xia
ELM
32
0
0
06 Nov 2023
Octavius: Mitigating Task Interference in MLLMs via LoRA-MoE
Zeren Chen
Ziqin Wang
Zhen Wang
Huayang Liu
Zhen-fei Yin
Si Liu
Lu Sheng
Wanli Ouyang
Yu Qiao
Jing Shao
MoE
51
7
0
05 Nov 2023
Joint Composite Latent Space Bayesian Optimization
Natalie Maus
Zhiyuan Jerry Lin
Maximilian Balandat
E. Bakshy
BDL
53
2
0
03 Nov 2023
Post Turing: Mapping the landscape of LLM Evaluation
Alexey Tikhonov
Ivan P. Yamshchikov
ELM
63
4
0
03 Nov 2023
Active Reasoning in an Open-World Environment
Manjie Xu
Guangyuan Jiang
Weihan Liang
Chi Zhang
Yixin Zhu
LLMAG
LRM
26
10
0
03 Nov 2023
The language of prompting: What linguistic properties make a prompt successful?
Alina Leidinger
R. Rooij
Ekaterina Shutova
46
43
0
03 Nov 2023
Indicative Summarization of Long Discussions
S. Syed
Dominik Schwabe
Khalid Al Khatib
Martin Potthast
36
1
0
03 Nov 2023
Sentiment Analysis through LLM Negotiations
Xiaofei Sun
Xiaoya Li
Shengyu Zhang
Shuhe Wang
Fei Wu
Jiwei Li
Tianwei Zhang
Guoyin Wang
48
16
0
03 Nov 2023
AFPQ: Asymmetric Floating Point Quantization for LLMs
Yijia Zhang
Sicheng Zhang
Shijie Cao
Dayou Du
Jianyu Wei
Ting Cao
Ningyi Xu
MQ
33
6
0
03 Nov 2023
AWEQ: Post-Training Quantization with Activation-Weight Equalization for Large Language Models
Baisong Li
Xingwang Wang
Haixiao Xu
MQ
30
0
0
02 Nov 2023
Learning A Multi-Task Transformer Via Unified And Customized Instruction Tuning For Chest Radiograph Interpretation
Lijian Xu
Ziyu Ni
Xinglong Liu
Xiaosong Wang
Hongsheng Li
Shaoting Zhang
MedIm
LM&MA
32
4
0
02 Nov 2023
Sam-Guided Enhanced Fine-Grained Encoding with Mixed Semantic Learning for Medical Image Captioning
Zhenyu Zhang
Benlu Wang
Weijie Liang
Yizhi Li
Xuechen Guo
Guanhong Wang
Shiyan Li
Gaoang Wang
MedIm
LM&MA
32
7
0
02 Nov 2023
Attention Alignment and Flexible Positional Embeddings Improve Transformer Length Extrapolation
Ta-Chung Chi
Ting-Han Fan
Alexander I. Rudnicky
30
4
0
01 Nov 2023
Efficient LLM Inference on CPUs
Haihao Shen
Hanwen Chang
Bo Dong
Yu Luo
Hengyu Meng
MQ
20
17
0
01 Nov 2023
Learning From Mistakes Makes LLM Better Reasoner
Shengnan An
Zexiong Ma
Zeqi Lin
Nanning Zheng
Jian-Guang Lou
Weizhu Chen
LRM
37
75
0
31 Oct 2023
The Expressibility of Polynomial based Attention Scheme
Zhao Song
Guangyi Xu
Junze Yin
41
5
0
30 Oct 2023
MiLe Loss: a New Loss for Mitigating the Bias of Learning Difficulties in Generative Language Models
Zhenpeng Su
Xing Wu
Xue Bai
Zijia Lin
Hui Chen
Guiguang Ding
Wei Zhou
Songlin Hu
32
5
0
30 Oct 2023
Constituency Parsing using LLMs
Xuefeng Bai
Jialong Wu
Yulong Chen
Zhongqing Wang
Yue Zhang
46
1
0
30 Oct 2023
Mean BERTs make erratic language teachers: the effectiveness of latent bootstrapping in low-resource settings
David Samuel
26
3
0
30 Oct 2023
Pre-trained Recommender Systems: A Causal Debiasing Perspective
Ziqian Lin
Hao Ding
Nghia Hoang
Branislav Kveton
Anoop Deoras
Hao Wang
CML
52
4
0
30 Oct 2023
TeacherLM: Teaching to Fish Rather Than Giving the Fish, Language Modeling Likewise
Nan He
Hanyu Lai
Chenyang Zhao
Zirui Cheng
Junting Pan
...
Zhaohui Hou
Zhiyuan Huang
Shaoqing Lu
Ding Liang
Mingjie Zhan
LRM
29
13
0
29 Oct 2023
The Synergy of Speculative Decoding and Batching in Serving Large Language Models
Qidong Su
Christina Giannoula
Gennady Pekhimenko
27
10
0
28 Oct 2023
TLM: Token-Level Masking for Transformers
Yangjun Wu
Kebin Fang
Dongxian Zhang
Han Wang
Hao Zhang
Gang Chen
31
1
0
28 Oct 2023
Publicly-Detectable Watermarking for Language Models
Jaiden Fairoze
Sanjam Garg
Somesh Jha
Saeed Mahloujifar
Mohammad Mahmoody
Mingyuan Wang
WaLM
139
45
0
27 Oct 2023
Expanding the Set of Pragmatic Considerations in Conversational AI
S. M. Seals
V. Shalin
37
2
0
27 Oct 2023
FP8-LM: Training FP8 Large Language Models
Houwen Peng
Kan Wu
Yixuan Wei
Guoshuai Zhao
Yuxiang Yang
...
Zheng Zhang
Shuguang Liu
Joe Chau
Han Hu
Peng Cheng
MQ
59
40
0
27 Oct 2023
"Honey, Tell Me What's Wrong", Global Explanation of Textual Discriminative Models through Cooperative Generation
Antoine Chaffin
Julien Delaunay
18
0
0
27 Oct 2023
Text Augmented Spatial-aware Zero-shot Referring Image Segmentation
Yuchen Suo
Linchao Zhu
Yi Yang
39
13
0
27 Oct 2023
Transformers as Graph-to-Graph Models
James Henderson
Alireza Mohammadshahi
Andrei Catalin Coman
Lesly Miculicich
GNN
37
6
0
27 Oct 2023
Knowing What LLMs DO NOT Know: A Simple Yet Effective Self-Detection Method
Yukun Zhao
Lingyong Yan
Weiwei Sun
Guoliang Xing
Chong Meng
Shuaiqiang Wang
Zhicong Cheng
Zhaochun Ren
Dawei Yin
38
37
0
27 Oct 2023
Improving Zero-shot Reader by Reducing Distractions from Irrelevant Documents in Open-Domain Question Answering
Sukmin Cho
Jeongyeon Seo
Soyeong Jeong
Jong C. Park
RALM
34
2
0
26 Oct 2023
Automatic Logical Forms improve fidelity in Table-to-Text generation
Iñigo Alonso
Eneko Agirre
LMTD
22
3
0
26 Oct 2023