ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2205.01068
  4. Cited By
OPT: Open Pre-trained Transformer Language Models

OPT: Open Pre-trained Transformer Language Models

2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
    VLM
    OSLM
    AI4CE
ArXivPDFHTML

Papers citing "OPT: Open Pre-trained Transformer Language Models"

50 / 2,460 papers shown
Title
Measuring the Knowledge Acquisition-Utilization Gap in Pretrained
  Language Models
Measuring the Knowledge Acquisition-Utilization Gap in Pretrained Language Models
Amirhossein Kazemnejad
Mehdi Rezagholizadeh
Prasanna Parthasarathi
Sarath Chandar
ELM
27
2
0
24 May 2023
David helps Goliath: Inference-Time Collaboration Between Small
  Specialized and Large General Diffusion LMs
David helps Goliath: Inference-Time Collaboration Between Small Specialized and Large General Diffusion LMs
Xiaochuang Han
Sachin Kumar
Yulia Tsvetkov
Marjan Ghazvininejad
DiffM
34
3
0
24 May 2023
DialogVCS: Robust Natural Language Understanding in Dialogue System
  Upgrade
DialogVCS: Robust Natural Language Understanding in Dialogue System Upgrade
Zefan Cai
Xin Zheng
Tianyu Liu
Xu Wang
H. Meng
Jiaqi Han
Gang Yuan
Binghuai Lin
Baobao Chang
Yunbo Cao
28
4
0
24 May 2023
Trusting Your Evidence: Hallucinate Less with Context-aware Decoding
Trusting Your Evidence: Hallucinate Less with Context-aware Decoding
Weijia Shi
Xiaochuang Han
M. Lewis
Yulia Tsvetkov
Luke Zettlemoyer
Scott Yih
HILM
32
192
0
24 May 2023
Have Large Language Models Developed a Personality?: Applicability of
  Self-Assessment Tests in Measuring Personality in LLMs
Have Large Language Models Developed a Personality?: Applicability of Self-Assessment Tests in Measuring Personality in LLMs
Xiaoyang Song
Akshat Gupta
Kiyan Mohebbizadeh
Shujie Hu
Anant Singh
34
25
0
24 May 2023
PEARL: Prompting Large Language Models to Plan and Execute Actions Over
  Long Documents
PEARL: Prompting Large Language Models to Plan and Execute Actions Over Long Documents
Simeng Sun
Yongxu Liu
Shuohang Wang
Chenguang Zhu
Mohit Iyyer
RALM
LRM
ReLM
41
52
0
23 May 2023
MathDial: A Dialogue Tutoring Dataset with Rich Pedagogical Properties
  Grounded in Math Reasoning Problems
MathDial: A Dialogue Tutoring Dataset with Rich Pedagogical Properties Grounded in Math Reasoning Problems
Jakub Macina
Nico Daheim
Sankalan Pal Chowdhury
Tanmay Sinha
Manu Kapur
Iryna Gurevych
Mrinmaya Sachan
LRM
27
56
0
23 May 2023
BAND: Biomedical Alert News Dataset
BAND: Biomedical Alert News Dataset
Z. Fu
Meiru Zhang
Zaiqiao Meng
Yannan Shen
David L. Buckeridge
Nigel Collier
25
3
0
23 May 2023
Prompting Language-Informed Distribution for Compositional Zero-Shot
  Learning
Prompting Language-Informed Distribution for Compositional Zero-Shot Learning
Wentao Bao
Lichang Chen
Heng-Chiao Huang
Yu Kong
CoGe
VLM
38
12
0
23 May 2023
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model
  Pre-training
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training
Hong Liu
Zhiyuan Li
David Leo Wright Hall
Percy Liang
Tengyu Ma
VLM
60
133
0
23 May 2023
QLoRA: Efficient Finetuning of Quantized LLMs
QLoRA: Efficient Finetuning of Quantized LLMs
Tim Dettmers
Artidoro Pagnoni
Ari Holtzman
Luke Zettlemoyer
ALM
73
2,394
0
23 May 2023
WikiChat: Stopping the Hallucination of Large Language Model Chatbots by
  Few-Shot Grounding on Wikipedia
WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia
Sina J. Semnani
Violet Z. Yao
He Zhang
M. Lam
KELM
AI4MH
35
72
0
23 May 2023
Weakly-Supervised Learning of Visual Relations in Multimodal Pretraining
Weakly-Supervised Learning of Visual Relations in Multimodal Pretraining
Emanuele Bugliarello
Aida Nematzadeh
Lisa Anne Hendricks
SSL
37
5
0
23 May 2023
Active Learning Principles for In-Context Learning with Large Language
  Models
Active Learning Principles for In-Context Learning with Large Language Models
Katerina Margatina
Timo Schick
Nikolaos Aletras
Jane Dwivedi-Yu
37
39
0
23 May 2023
Skill-Based Few-Shot Selection for In-Context Learning
Skill-Based Few-Shot Selection for In-Context Learning
Shengnan An
Bo Zhou
Zeqi Lin
Qiang Fu
B. Chen
Nanning Zheng
Weizhu Chen
Jian-Guang Lou
41
31
0
23 May 2023
DetGPT: Detect What You Need via Reasoning
DetGPT: Detect What You Need via Reasoning
Renjie Pi
Jiahui Gao
Shizhe Diao
Rui Pan
Hanze Dong
...
Lewei Yao
Jianhua Han
Hang Xu
Lingpeng Kong Tong Zhang
Tong Zhang
LRM
LM&Ro
29
92
0
23 May 2023
Memory-Efficient Fine-Tuning of Compressed Large Language Models via
  sub-4-bit Integer Quantization
Memory-Efficient Fine-Tuning of Compressed Large Language Models via sub-4-bit Integer Quantization
Jeonghoon Kim
J. H. Lee
Sungdong Kim
Joonsuk Park
Kang Min Yoo
S. Kwon
Dongsoo Lee
MQ
49
100
0
23 May 2023
When Does Monolingual Data Help Multilingual Translation: The Role of
  Domain and Model Scale
When Does Monolingual Data Help Multilingual Translation: The Role of Domain and Model Scale
Christos Baziotis
Biao Zhang
Alexandra Birch
Barry Haddow
37
2
0
23 May 2023
Revisiting Acceptability Judgements
Revisiting Acceptability Judgements
Hai Hu
Ziyin Zhang
Wei Huang
J. Lai
Aini Li
Yi Ma
Jiahui Huang
Peng Zhang
Chien-Jer Charles Lin
Rui Wang
50
2
0
23 May 2023
Can Language Models Understand Physical Concepts?
Can Language Models Understand Physical Concepts?
Lei Li
Jingjing Xu
Qingxiu Dong
Ce Zheng
Qi Liu
Lingpeng Kong
Xu Sun
ALM
38
18
0
23 May 2023
The CoT Collection: Improving Zero-shot and Few-shot Learning of
  Language Models via Chain-of-Thought Fine-Tuning
The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning
Seungone Kim
Se June Joo
Doyoung Kim
Joel Jang
Seonghyeon Ye
Jamin Shin
Minjoon Seo
ALM
RALM
LRM
25
97
0
23 May 2023
Towards A Unified View of Sparse Feed-Forward Network in Pretraining
  Large Language Model
Towards A Unified View of Sparse Feed-Forward Network in Pretraining Large Language Model
Leo Liu
Tim Dettmers
Xi Lin
Ves Stoyanov
Xian Li
MoE
26
9
0
23 May 2023
DetectLLM: Leveraging Log Rank Information for Zero-Shot Detection of
  Machine-Generated Text
DetectLLM: Leveraging Log Rank Information for Zero-Shot Detection of Machine-Generated Text
Jinyan Su
Terry Yue Zhuo
Di Wang
Preslav Nakov
DeLMO
63
126
0
23 May 2023
PaD: Program-aided Distillation Can Teach Small Models Reasoning Better
  than Chain-of-thought Fine-tuning
PaD: Program-aided Distillation Can Teach Small Models Reasoning Better than Chain-of-thought Fine-tuning
Xuekai Zhu
Biqing Qi
Kaiyan Zhang
Xingwei Long
Zhouhan Lin
Bowen Zhou
ALM
LRM
43
19
0
23 May 2023
A Trip Towards Fairness: Bias and De-Biasing in Large Language Models
A Trip Towards Fairness: Bias and De-Biasing in Large Language Models
Leonardo Ranaldi
Elena Sofia Ruzzetti
Davide Venditti
Dario Onorati
Fabio Massimo Zanzotto
45
35
0
23 May 2023
Images in Language Space: Exploring the Suitability of Large Language
  Models for Vision & Language Tasks
Images in Language Space: Exploring the Suitability of Large Language Models for Vision & Language Tasks
Sherzod Hakimov
David Schlangen
VLM
36
5
0
23 May 2023
i-Code Studio: A Configurable and Composable Framework for Integrative
  AI
i-Code Studio: A Configurable and Composable Framework for Integrative AI
Yuwei Fang
Mahmoud Khademi
Chenguang Zhu
Ziyi Yang
Reid Pryzant
...
Yao Qian
Takuya Yoshioka
Lu Yuan
Michael Zeng
Xuedong Huang
38
2
0
23 May 2023
Discrete Prompt Optimization via Constrained Generation for Zero-shot
  Re-ranker
Discrete Prompt Optimization via Constrained Generation for Zero-shot Re-ranker
Sukmin Cho
Soyeong Jeong
Jeongyeon Seo
Jong C. Park
OffRL
73
23
0
23 May 2023
CombLM: Adapting Black-Box Language Models through Small Fine-Tuned
  Models
CombLM: Adapting Black-Box Language Models through Small Fine-Tuned Models
Aitor Ormazabal
Mikel Artetxe
Eneko Agirre
50
19
0
23 May 2023
Do All Languages Cost the Same? Tokenization in the Era of Commercial
  Language Models
Do All Languages Cost the Same? Tokenization in the Era of Commercial Language Models
Orevaoghene Ahia
Sachin Kumar
Hila Gonen
Jungo Kasai
David R. Mortensen
Noah A. Smith
Yulia Tsvetkov
53
83
0
23 May 2023
MemeCap: A Dataset for Captioning and Interpreting Memes
MemeCap: A Dataset for Captioning and Interpreting Memes
EunJeong Hwang
Vered Shwartz
VLM
27
36
0
23 May 2023
Few-Shot Data Synthesis for Open Domain Multi-Hop Question Answering
Few-Shot Data Synthesis for Open Domain Multi-Hop Question Answering
Mingda Chen
Xilun Chen
Wen-tau Yih
SyDa
36
6
0
23 May 2023
Polyglot or Not? Measuring Multilingual Encyclopedic Knowledge in
  Foundation Models
Polyglot or Not? Measuring Multilingual Encyclopedic Knowledge in Foundation Models
Tim Schott
Daniel Furman
Shreshta Bhat
ELM
40
4
0
23 May 2023
ChatGPT as your Personal Data Scientist
ChatGPT as your Personal Data Scientist
Md. Mahadi Hassan
Alex Knipper
Shubhra (Santu) Karmaker
LM&MA
LLMAG
AI4CE
55
18
0
23 May 2023
Small Language Models Improve Giants by Rewriting Their Outputs
Small Language Models Improve Giants by Rewriting Their Outputs
Giorgos Vernikos
Arthur Bravzinskas
Jakub Adamek
Jonathan Mallinson
Aliaksei Severyn
Eric Malmi
BDL
LRM
43
14
0
22 May 2023
Can ChatGPT Detect Intent? Evaluating Large Language Models for Spoken
  Language Understanding
Can ChatGPT Detect Intent? Evaluating Large Language Models for Spoken Language Understanding
Mutian He
Philip N. Garner
ELM
AI4MH
LRM
53
22
0
22 May 2023
Look-back Decoding for Open-Ended Text Generation
Look-back Decoding for Open-Ended Text Generation
Nan Xu
Chunting Zhou
Asli Celikyilmaz
Xuezhe Ma
36
9
0
22 May 2023
Matcher: Segment Anything with One Shot Using All-Purpose Feature
  Matching
Matcher: Segment Anything with One Shot Using All-Purpose Feature Matching
Yang Liu
Muzhi Zhu
Hengtao Li
Hao Chen
Xinlong Wang
Chunhua Shen
VLM
MLLM
88
87
0
22 May 2023
RecurrentGPT: Interactive Generation of (Arbitrarily) Long Text
RecurrentGPT: Interactive Generation of (Arbitrarily) Long Text
Wangchunshu Zhou
Yuchen Eleanor Jiang
Peng Cui
Tiannan Wang
Zhenxin Xiao
Yifan Hou
Ryan Cotterell
Mrinmaya Sachan
RALM
LLMAG
90
59
0
22 May 2023
Measuring Inductive Biases of In-Context Learning with Underspecified
  Demonstrations
Measuring Inductive Biases of In-Context Learning with Underspecified Demonstrations
Chenglei Si
Dan Friedman
Nitish Joshi
Shi Feng
Danqi Chen
He He
15
42
0
22 May 2023
VideoLLM: Modeling Video Sequence with Large Language Models
VideoLLM: Modeling Video Sequence with Large Language Models
Guo Chen
Yin-Dong Zheng
Jiahao Wang
Jilan Xu
Yifei Huang
...
Yi Wang
Yali Wang
Yu Qiao
Tong Lu
Limin Wang
MLLM
103
77
0
22 May 2023
MAGE: Machine-generated Text Detection in the Wild
MAGE: Machine-generated Text Detection in the Wild
Yafu Li
Qintong Li
Leyang Cui
Wei Bi
Zhilin Wang
Longyue Wang
Linyi Yang
Shuming Shi
Yue Zhang
DeLMO
43
46
0
22 May 2023
To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis
To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis
Fuzhao Xue
Yao Fu
Wangchunshu Zhou
Zangwei Zheng
Yang You
88
79
0
22 May 2023
Multi-Task Instruction Tuning of LLaMa for Specific Scenarios: A
  Preliminary Study on Writing Assistance
Multi-Task Instruction Tuning of LLaMa for Specific Scenarios: A Preliminary Study on Writing Assistance
Yue Zhang
Leyang Cui
Deng Cai
Xinting Huang
Tao Fang
Wei Bi
ALM
38
36
0
22 May 2023
Editing Large Language Models: Problems, Methods, and Opportunities
Editing Large Language Models: Problems, Methods, and Opportunities
Yunzhi Yao
Peng Wang
Bo Tian
Shuyang Cheng
Zhoubo Li
Shumin Deng
Huajun Chen
Ningyu Zhang
KELM
39
282
0
22 May 2023
A Pretrainer's Guide to Training Data: Measuring the Effects of Data
  Age, Domain Coverage, Quality, & Toxicity
A Pretrainer's Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity
Shayne Longpre
Gregory Yauney
Emily Reif
Katherine Lee
Adam Roberts
...
Denny Zhou
Jason W. Wei
Kevin Robinson
David M. Mimno
Daphne Ippolito
33
150
0
22 May 2023
RWKV: Reinventing RNNs for the Transformer Era
RWKV: Reinventing RNNs for the Transformer Era
Bo Peng
Eric Alcaide
Quentin G. Anthony
Alon Albalak
Samuel Arcadinho
...
Qihang Zhao
P. Zhou
Qinghua Zhou
Jian Zhu
Rui-Jie Zhu
97
565
0
22 May 2023
Iterative Forward Tuning Boosts In-Context Learning in Language Models
Iterative Forward Tuning Boosts In-Context Learning in Language Models
Jiaxi Yang
Binyuan Hui
Min Yang
Bailin Wang
Bowen Li
Binhua Li
Fei Huang
Yongbin Li
46
16
0
22 May 2023
Textually Pretrained Speech Language Models
Textually Pretrained Speech Language Models
Michael Hassid
Tal Remez
Tu Nguyen
Itai Gat
Alexis Conneau
...
Alexandre Défossez
Gabriel Synnaeve
Emmanuel Dupoux
Roy Schwartz
Yossi Adi
VLM
SyDa
51
54
0
22 May 2023
GPT-SW3: An Autoregressive Language Model for the Nordic Languages
GPT-SW3: An Autoregressive Language Model for the Nordic Languages
Ariel Ekgren
Amaru Cuba Gyllensten
Felix Stollenwerk
Joey Öhman
T. Isbister
Evangelia Gogoulou
F. Carlsson
Alice Heiman
Judit Casademont
Magnus Sahlgren
34
13
0
22 May 2023
Previous
123...404142...484950
Next