OPT: Open Pre-trained Transformer Language Models

2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
    VLM
    OSLM
    AI4CE

Papers citing "OPT: Open Pre-trained Transformer Language Models"

50 / 2,473 papers shown
Title
Accurate Retraining-free Pruning for Pretrained Encoder-based Language Models
Seungcheol Park
Ho-Jin Choi
U. Kang
VLM
49
6
0
07 Aug 2023
RecycleGPT: An Autoregressive Language Model with Recyclable Module
Yu Jiang
Qiaozhi He
Xiaomin Zhuang
Zhihua Wu
Kunpeng Wang
Wenlai Zhao
Guangwen Yang
KELM
40
3
0
07 Aug 2023
Improving Generalization of Image Captioning with Unsupervised Prompt Learning
Hongchen Wei
Zhenzhong Chen
VLM
51
3
0
05 Aug 2023
PromptCARE: Prompt Copyright Protection by Watermark Injection and Verification
Hongwei Yao
Jian Lou
Kui Ren
Zhan Qin
AAML
VLM
52
26
0
05 Aug 2023
MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities
Weihao Yu
Zhengyuan Yang
Linjie Li
Jianfeng Wang
Kevin Qinghong Lin
Zicheng Liu
Xinchao Wang
Lijuan Wang
MLLM
60
639
0
04 Aug 2023
Text2KGBench: A Benchmark for Ontology-Driven Knowledge Graph Generation from Text
Nandana Mihindukulasooriya
Sanju Tiwari
Carlos F. Enguix
K. Lata
44
55
0
04 Aug 2023
Baby Llama: knowledge distillation from an ensemble of teachers trained on a small dataset with no performance penalty
I. Timiryasov
J. Tastet
31
48
0
03 Aug 2023
The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World
Weiyun Wang
Min Shi
Qingyun Li
Wen Wang
Zhenhang Huang
...
Zhiguo Cao
Yushi Chen
Tong Lu
Jifeng Dai
Yu Qiao
LRM
MLLM
55
85
0
03 Aug 2023
RegionBLIP: A Unified Multi-modal Pre-training Framework for Holistic and Regional Comprehension
Qiang-feng Zhou
Chaohui Yu
Shaofeng Zhang
Sitong Wu
Zhibin Wang
Fan Wang
39
27
0
03 Aug 2023
OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models
Anas Awadalla
Irena Gao
Josh Gardner
Jack Hessel
Yusuf Hanafy
...
Simon Kornblith
Pang Wei Koh
Gabriel Ilharco
Mitchell Wortsman
Ludwig Schmidt
MLLM
71
406
0
02 Aug 2023
DeepSpeed-Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales
Z. Yao
Reza Yazdani Aminabadi
Olatunji Ruwase
Samyam Rajbhandari
Xiaoxia Wu
...
Heyang Qin
Masahiro Tanaka
Shuai Che
Shuaiwen Leon Song
Yuxiong He
ALM
OffRL
48
69
0
02 Aug 2023
Teaching Smaller Language Models To Generalise To Unseen Compositional Questions
Tim Hartill
N. Tan
Michael Witbrock
Patricia J. Riddle
ReLM
KELM
LRM
39
2
0
02 Aug 2023
Tool Documentation Enables Zero-Shot Tool-Usage with Large Language Models
Cheng-Yu Hsieh
Sibei Chen
Chun-Liang Li
Yasuhisa Fujii
Alexander Ratner
Chen-Yu Lee
Ranjay Krishna
Tomas Pfister
LLMAG
SyDa
59
42
0
01 Aug 2023
Advancing Beyond Identification: Multi-bit Watermark for Large Language Models
Kiyoon Yoo
Wonhyuk Ahn
Nojun Kwak
WaLM
45
17
0
01 Aug 2023
Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering
Vaibhav Adlakha
Parishad BehnamGhader
Xing Han Lù
Nicholas Meade
Siva Reddy
49
122
0
31 Jul 2023
On the Trustworthiness Landscape of State-of-the-art Generative Models: A Survey and Outlook
Mingyuan Fan
Chengyu Wang
Cen Chen
Yang Liu
Jun Huang
HILM
44
3
0
31 Jul 2023
Scaling Sentence Embeddings with Large Language Models
Ting Jiang
Shaohan Huang
Zhongzhi Luan
Deqing Wang
Fuzhen Zhuang
LRM
49
41
0
31 Jul 2023
Transferable Decoding with Visual Entities for Zero-Shot Image Captioning
Junjie Fei
Teng Wang
Jinrui Zhang
Zhenyu He
Chengjie Wang
Feng Zheng
VLM
36
34
0
31 Jul 2023
Camoscio: an Italian Instruction-tuned LLaMA
Andrea Santilli
Emanuele Rodolà
37
26
0
31 Jul 2023
Bridging the Gap: Exploring the Capabilities of Bridge-Architectures for Complex Visual Reasoning Tasks
Kousik Rajesh
Mrigank Raman
M. A. Karim
Pranit Chawla
VLM
30
2
0
31 Jul 2023
An Unforgeable Publicly Verifiable Watermark for Large Language Models
Aiwei Liu
Leyi Pan
Xuming Hu
Shuang Li
Lijie Wen
Irwin King
Philip S. Yu
WaLM
61
33
0
30 Jul 2023
UnIVAL: Unified Model for Image, Video, Audio and Language Tasks
Mustafa Shukor
Corentin Dancette
Alexandre Ramé
Matthieu Cord
MoMe
MLLM
61
44
0
30 Jul 2023
Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback
Viet Dac Lai
Chien Van Nguyen
Nghia Trung Ngo
Thuat Nguyen
Franck Dernoncourt
Ryan Rossi
Thien Huu Nguyen
ALM
55
135
0
29 Jul 2023
Towards Codable Watermarking for Injecting Multi-bits Information to LLMs
Lean Wang
Wenkai Yang
Deli Chen
Hao Zhou
Yankai Lin
Fandong Meng
Jie Zhou
Xu Sun
WaLM
48
16
0
29 Jul 2023
RSGPT: A Remote Sensing Vision Language Model and Benchmark
Yuan Hu
Jianlong Yuan
Congcong Wen
Xiaonan Lu
Xiang Li
VLM
36
104
0
28 Jul 2023
SuperCLUE: A Comprehensive Chinese Large Language Model Benchmark
Liang Xu
Anqi Li
Lei Zhu
Han Xue
Changtai Zhu
Kangkang Zhao
Hao He
Xuanwei Zhang
Qiyue Kang
Zhenzhong Lan
RALM
ELM
LRM
20
52
0
27 Jul 2023
Incrementally-Computable Neural Networks: Efficient Inference for Dynamic Inputs
Or Sharir
Anima Anandkumar
39
0
0
27 Jul 2023
Exploiting the Potential of Seq2Seq Models as Robust Few-Shot Learners
Jihyeon Janel Lee
Dain Kim
Doohae Jung
Boseop Kim
Kyoung-Woon On
36
0
0
27 Jul 2023
Skill-it! A Data-Driven Skills Framework for Understanding and Training Language Models
Mayee F. Chen
Nicholas Roberts
Kush S. Bhatia
Jue Wang
Ce Zhang
Frederic Sala
Christopher Ré
SyDa
38
54
0
26 Jul 2023
Trustworthiness of Children Stories Generated by Large Language Models
Prabin Bhandari
H. M. Brennan
52
2
0
25 Jul 2023
Foundational Models Defining a New Era in Vision: A Survey and Outlook
Muhammad Awais
Muzammal Naseer
Salman Khan
Rao Muhammad Anwer
Hisham Cholakkal
M. Shah
Ming-Hsuan Yang
Fahad Shahbaz Khan
VLM
51
120
0
25 Jul 2023
Opinion Mining Using Population-tuned Generative Language Models
Allmin Pradhap Singh Susaiyah
Abhinay Pandya
Aki Härmä
25
0
0
24 Jul 2023
Enhancing Human-like Multi-Modal Reasoning: A New Challenging Dataset and Comprehensive Framework
Jingxuan Wei
Cheng Tan
Zhangyang Gao
Linzhuang Sun
Siyuan Li
Bihui Yu
R. Guo
Stan Z. Li
LRM
60
8
0
24 Jul 2023
In-Context Learning Learns Label Relationships but Is Not Conventional Learning
Jannik Kossen
Y. Gal
Tom Rainforth
67
31
0
23 Jul 2023
A Zero-shot and Few-shot Study of Instruction-Finetuned Large Language Models Applied to Clinical and Biomedical Tasks
Yanis Labrak
Mickael Rouvier
Richard Dufour
LM&MA
43
26
0
22 Jul 2023
Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification
Neel Guha
Mayee F. Chen
Kush S. Bhatia
Azalia Mirhoseini
Frederic Sala
Christopher Ré
45
4
0
20 Jul 2023
FinGPT: Democratizing Internet-scale Data for Financial Large Language Models
Xiao-Yang Liu
Guoxuan Wang
Hongyang Yang
Daochen Zha
AIFin
49
43
0
19 Jul 2023
Can Instruction Fine-Tuned Language Models Identify Social Bias through Prompting?
O. Dige
Jacob-Junqi Tian
David B. Emerson
Faiza Khan Khattak
ALM
20
5
0
19 Jul 2023
DialogStudio: Towards Richest and Most Diverse Unified Dataset Collection for Conversational AI
Jianguo Zhang
Kun Qian
Zhiwei Liu
Shelby Heinecke
Rui Meng
Ye Liu
Zhou Yu
Huan Wang
Silvio Savarese
Caiming Xiong
46
22
0
19 Jul 2023
ZeroQuant-FP: A Leap Forward in LLMs Post-Training W4A8 Quantization Using Floating-Point Formats
Xiaoxia Wu
Z. Yao
Yuxiong He
MQ
35
43
0
19 Jul 2023
Thrust: Adaptively Propels Large Language Models with External Knowledge
Xinran Zhao
Hongming Zhang
Xiaoman Pan
Wenlin Yao
Dong Yu
Jianshu Chen
KELM
93
5
0
19 Jul 2023
ChatSpot: Bootstrapping Multimodal LLMs via Precise Referring Instruction Tuning
Liang Zhao
En Yu
Zheng Ge
Jinrong Yang
Hao-Ran Wei
...
Jian‐Yuan Sun
Yuang Peng
Runpei Dong
Chunrui Han
Xiangyu Zhang
MLLM
LRM
44
53
0
18 Jul 2023
Llama 2: Open Foundation and Fine-Tuned Chat Models
Hugo Touvron
Louis Martin
Kevin R. Stone
Peter Albert
Amjad Almahairi
...
Sharan Narang
Aurelien Rodriguez
Robert Stojnic
Sergey Edunov
Thomas Scialom
AI4MH
ALM
150
11,259
0
18 Jul 2023
TableGPT: Towards Unifying Tables, Nature Language and Commands into One GPT
Liangyu Zha
Junlin Zhou
Liyao Li
Rui Wang
Qingyi Huang
...
Xing-yan Deng
Jinfeng Xu
Haobo Wang
Gang Chen
Jiaqi Zhao
RALM
LMTD
37
43
0
17 Jul 2023
On the application of Large Language Models for language teaching and assessment technology
Andrew Caines
Luca Benedetto
Shiva Taslimipoor
Christopher Davis
Yuan Gao
...
Marek Rei
H. Yannakoudakis
Andrew Mullooly
D. Nicholls
P. Buttery
ELM
32
43
0
17 Jul 2023
Zero-th Order Algorithm for Softmax Attention Optimization
Yichuan Deng
Zhihang Li
Sridhar Mahadevan
Zhao Song
43
13
0
17 Jul 2023
Fast Quantum Algorithm for Attention Computation
Yeqi Gao
Zhao Song
Xin Yang
Ruizhe Zhang
LRM
48
22
0
16 Jul 2023
Planting a SEED of Vision in Large Language Model
Yuying Ge
Yixiao Ge
Ziyun Zeng
Xintao Wang
Ying Shan
VLM
MLLM
16
93
0
16 Jul 2023
A Survey of Techniques for Optimizing Transformer Inference
Krishna Teja Chitty-Venkata
Sparsh Mittal
M. Emani
V. Vishwanath
Arun Somani
54
63
0
16 Jul 2023
Creating a Dataset for High-Performance Computing Code Translation using LLMs: A Bridge Between OpenMP Fortran and C++
Bin Lei
Caiwen Ding
Le Chen
Pei-Hung Lin
Chunhua Liao
19
9
0
15 Jul 2023