ResearchTrend.AI
OPT: Open Pre-trained Transformer Language Models

2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
    VLM
    OSLM
    AI4CE

Papers citing "OPT: Open Pre-trained Transformer Language Models"

50 / 2,460 papers shown
Long-form analogies generated by chatGPT lack human-like psycholinguistic properties
S. M. Seals
V. Shalin
24
11
0
07 Jun 2023
STEPS: A Benchmark for Order Reasoning in Sequential Tasks
Weizhi Wang
Hong Wang
Xi Yan
LRM
37
1
0
07 Jun 2023
MISGENDERED: Limits of Large Language Models in Understanding Pronouns
Tamanna Hossain
Sunipa Dev
Sameer Singh
AILaw
48
34
0
06 Jun 2023
Deductive Verification of Chain-of-Thought Reasoning
Z. Ling
Yunhao Fang
Xuanlin Li
Zhiao Huang
Mingu Lee
Roland Memisevic
Hao Su
ReLM
LRM
37
126
0
06 Jun 2023
The Emergence of Essential Sparsity in Large Pre-trained Models: The Weights that Matter
Ajay Jaiswal
Shiwei Liu
Tianlong Chen
Zhangyang Wang
VLM
36
33
0
06 Jun 2023
Early Weight Averaging meets High Learning Rates for LLM Pre-training
Sunny Sanyal
A. Neerkaje
Jean Kaddour
Abhishek Kumar
Sujay Sanghavi
MoMe
43
18
0
05 Jun 2023
Information Flow Control in Machine Learning through Modular Model Architecture
Trishita Tiwari
Suchin Gururangan
Chuan Guo
Weizhe Hua
Sanjay Kariyappa
Udit Gupta
Wenjie Xiong
Kiwan Maeng
Hsien-Hsin S. Lee
G. E. Suh
26
6
0
05 Jun 2023
On "Scientific Debt" in NLP: A Case for More Rigour in Language Model Pre-Training Research
Made Nindyatama Nityasya
Haryo Akbarianto Wibowo
Alham Fikri Aji
Genta Indra Winata
Radityo Eko Prasojo
Phil Blunsom
A. Kuncoro
27
8
0
05 Jun 2023
Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
Hang Zhang
Xin Li
Lidong Bing
MLLM
100
970
0
05 Jun 2023
MCTS: A Multi-Reference Chinese Text Simplification Dataset
Ruining Chong
Luming Lu
Liner Yang
Jinran Nie
Zhenghao Liu
Shuo Wang
Shuhan Zhou
Yaoxin Li
Erhong Yang
39
0
0
05 Jun 2023
LexGPT 0.1: pre-trained GPT-J models with Pile of Law
Jieh-Sheng Lee
AILaw
24
10
0
05 Jun 2023
Efficient GPT Model Pre-training using Tensor Train Matrix Representation
V. Chekalina
Georgii Sergeevich Novikov
Julia Gusak
Ivan Oseledets
Alexander Panchenko
22
8
0
05 Jun 2023
CELDA: Leveraging Black-box Language Model as Enhanced Classifier without Labels
Hyunsoo Cho
Youna Kim
Sang-goo Lee
14
3
0
05 Jun 2023
Introduction to Latent Variable Energy-Based Models: A Path Towards Autonomous Machine Intelligence
Anna Dawid
Yann LeCun
DRL
29
30
0
05 Jun 2023
Prompt to be Consistent is Better than Self-Consistent? Few-Shot and Zero-Shot Fact Verification with Pre-trained Language Models
Fengzhu Zeng
Wei Gao
25
5
0
05 Jun 2023
Evaluation of AI Chatbots for Patient-Specific EHR Questions
Alaleh Hamidi
Kirk Roberts
ELM
LM&MA
AI4MH
21
13
0
05 Jun 2023
bgGLUE: A Bulgarian General Language Understanding Evaluation Benchmark
Momchil Hardalov
Pepa Atanasova
Todor Mihaylov
G. Angelova
K. Simov
P. Osenova
Ves Stoyanov
Ivan Koychev
Preslav Nakov
Dragomir R. Radev
ELM
FedML
42
4
0
04 Jun 2023
Temporal Dynamic Quantization for Diffusion Models
Junhyuk So
Jungwon Lee
Daehyun Ahn
Hyungjun Kim
Eunhyeok Park
DiffM
MQ
31
61
0
04 Jun 2023
A Mathematical Abstraction for Balancing the Trade-off Between Creativity and Reality in Large Language Models
Ritwik Sinha
Zhao Song
Dinesh Manocha
40
24
0
04 Jun 2023
OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Models
Changhun Lee
Jungyu Jin
Taesu Kim
Hyungjun Kim
Eunhyeok Park
MQ
19
50
0
04 Jun 2023
A Technical Report for Polyglot-Ko: Open-Source Large-Scale Korean Language Models
H. Ko
Kichang Yang
Minho Ryu
Taekyoon Choi
Seungmu Yang
Jiwung Hyun
Sung-Yong Park
Kyubyong Park
42
29
0
04 Jun 2023
On Optimal Caching and Model Multiplexing for Large Model Inference
Banghua Zhu
Ying Sheng
Lianmin Zheng
Clark W. Barrett
Michael I. Jordan
Jiantao Jiao
35
19
0
03 Jun 2023
Revisiting the Role of Language Priors in Vision-Language Models
Zhiqiu Lin
Xinyue Chen
Deepak Pathak
Pengchuan Zhang
Deva Ramanan
VLM
36
22
0
02 Jun 2023
GAIA Search: Hugging Face and Pyserini Interoperability for NLP Training Data Exploration
Aleksandra Piktus
Odunayo Ogundepo
Christopher Akiki
Akintunde Oladipo
Xinyu Crystina Zhang
Hailey Schoelkopf
Stella Biderman
Martin Potthast
Jimmy J. Lin
CVBM
44
10
0
02 Jun 2023
Responsible Task Automation: Empowering Large Language Models as Responsible Task Automators
Zhizheng Zhang
Xiaoyi Zhang
Wenxuan Xie
Yan Lu
29
13
0
02 Jun 2023
The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only
Guilherme Penedo
Quentin Malartic
Daniel Hesslow
Ruxandra-Aimée Cojocaru
Alessandro Cappelli
Hamza Alobeidli
B. Pannier
Ebtesam Almazrouei
Julien Launay
71
755
0
01 Jun 2023
MEWL: Few-shot multimodal word learning with referential uncertainty
Guangyuan Jiang
Manjie Xu
Shiji Xin
Weihan Liang
Yujia Peng
Chi Zhang
Yixin Zhu
OffRL
41
16
0
01 Jun 2023
Make Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning
Baohao Liao
Shaomu Tan
Christof Monz
KELM
23
29
0
01 Jun 2023
CFL: Causally Fair Language Models Through Token-level Attribute Controlled Generation
Rahul Madhavan
Rishabh Garg
Kahini Wadhawan
S. Mehta
38
5
0
01 Jun 2023
FlexRound: Learnable Rounding based on Element-wise Division for Post-Training Quantization
J. H. Lee
Jeonghoon Kim
S. Kwon
Dongsoo Lee
MQ
35
33
0
01 Jun 2023
A Survey on Large Language Models for Recommendation
Likang Wu
Zhilan Zheng
Zhaopeng Qiu
Hao Wang
Hongchao Gu
...
Chen Zhu
Hengshu Zhu
Qi Liu
Hui Xiong
Enhong Chen
58
367
0
31 May 2023
Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models
Sivan Doveh
Assaf Arbelle
Sivan Harary
Roei Herzig
Donghyun Kim
...
Yikang Shen
Raja Giryes
Rogerio Feris
S. Ullman
Leonid Karlinsky
VLM
CoGe
62
53
0
31 May 2023
The Impact of Positional Encoding on Length Generalization in Transformers
Amirhossein Kazemnejad
Inkit Padhi
Karthikeyan N. Ramamurthy
Payel Das
Siva Reddy
47
182
0
31 May 2023
Intriguing Properties of Quantization at Scale
Arash Ahmadian
Saurabh Dash
Hongyu Chen
Bharat Venkitesh
Stephen Gou
Phil Blunsom
Ahmet Üstün
Sara Hooker
MQ
54
38
0
30 May 2023
Generating with Confidence: Uncertainty Quantification for Black-box Large Language Models
Zhen Lin
Shubhendu Trivedi
Jimeng Sun
HILM
29
129
0
30 May 2023
AlphaBlock: Embodied Finetuning for Vision-Language Reasoning in Robot Manipulation
Chuhao Jin
Wenhui Tan
Jiange Yang
Bei Liu
Ruihua Song
Limin Wang
Jianlong Fu
LM&Ro
LRM
30
24
0
30 May 2023
Generate then Select: Open-ended Visual Question Answering Guided by World Knowledge
Xingyu Fu
Shenmin Zhang
Gukyeong Kwon
Pramuditha Perera
Henghui Zhu
...
Zhiguo Wang
Vittorio Castelli
Patrick Ng
Dan Roth
Bing Xiang
37
19
0
30 May 2023
GPT4Tools: Teaching Large Language Model to Use Tools via Self-instruction
Rui Yang
Lin Song
Yanwei Li
Sijie Zhao
Yixiao Ge
Xiu Li
Ying Shan
SyDa
MLLM
41
211
0
30 May 2023
Contextual Object Detection with Multimodal Large Language Models
Yuhang Zang
Wei Li
Jun Han
Kaiyang Zhou
Chen Change Loy
ObjD
VLM
MLLM
50
79
0
29 May 2023
Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models
Myra Cheng
Esin Durmus
Dan Jurafsky
33
181
0
29 May 2023
LM-CPPF: Paraphrasing-Guided Data Augmentation for Contrastive Prompt-Based Few-Shot Fine-Tuning
Amirhossein Abaskohi
S. Rothe
Yadollah Yaghoobzadeh
VLM
40
16
0
29 May 2023
Code Prompting: a Neural Symbolic Method for Complex Reasoning in Large Language Models
Yitao Hu
Haotong Yang
Zhouchen Lin
Muhan Zhang
ReLM
LRM
34
17
0
29 May 2023
BigTranslate: Augmenting Large Language Models with Multilingual Translation Capability over 100 Languages
Wen Yang
Chong Li
Jiajun Zhang
Chengqing Zong
LRM
30
48
0
29 May 2023
Test-Time Adaptation with CLIP Reward for Zero-Shot Generalization in Vision-Language Models
Shuai Zhao
Xiaohan Wang
Linchao Zhu
Yezhou Yang
VLM
39
22
0
29 May 2023
Make-An-Audio 2: Temporal-Enhanced Text-to-Audio Generation
Jia-Bin Huang
Yi Ren
Rongjie Huang
Dongchao Yang
Zhenhui Ye
Chen Zhang
Jinglin Liu
Xiang Yin
Zejun Ma
Zhou Zhao
DiffM
37
61
0
29 May 2023
LLM-QAT: Data-Free Quantization Aware Training for Large Language Models
Zechun Liu
Barlas Oğuz
Changsheng Zhao
Ernie Chang
Pierre Stock
Yashar Mehdad
Yangyang Shi
Raghuraman Krishnamoorthi
Vikas Chandra
MQ
60
193
0
29 May 2023
Language Models are Bounded Pragmatic Speakers: Understanding RLHF from a Bayesian Cognitive Modeling Perspective
Khanh Nguyen
LRM
34
8
0
28 May 2023
Mitigating Label Biases for In-context Learning
Yu Fei
Yifan Hou
Zeming Chen
Antoine Bosselut
43
71
0
28 May 2023
FuseCap: Leveraging Large Language Models for Enriched Fused Image Captions
Noam Rotstein
David Bensaid
Shaked Brody
Roy Ganz
Ron Kimmel
VLM
31
27
0
28 May 2023
Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-Intensive Tasks
Minki Kang
Seanie Lee
Jinheon Baek
Kenji Kawaguchi
Sung Ju Hwang
ALM
LRM
60
56
0
28 May 2023