ResearchTrend.AI
OPT: Open Pre-trained Transformer Language Models


2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
    VLM
    OSLM
    AI4CE

Papers citing "OPT: Open Pre-trained Transformer Language Models"

50 / 2,486 papers shown
Title
Network Memory Footprint Compression Through Jointly Learnable Codebooks and Mappings
Vittorio Giammarino
Arnaud Dapogny
Kévin Bailly
MQ
29
1
0
29 Sep 2023
PB-LLM: Partially Binarized Large Language Models
Yuzhang Shang
Zhihang Yuan
Qiang Wu
Zhen Dong
MQ
38
44
0
29 Sep 2023
Training and inference of large language models using 8-bit floating point
Sergio P. Perez
Yan Zhang
James Briggs
Charlie Blake
Prashanth Krishnamurthy
Paul Balanca
Carlo Luschi
Stephen Barlow
Andrew William Fitzgibbon
MQ
42
18
0
29 Sep 2023
Guiding Instruction-based Image Editing via Multimodal Large Language Models
Johannes Frey
Wenze Hu
Xianzhi Du
William Yang Wang
Yinfei Yang
Zhe Gan
45
92
0
29 Sep 2023
Curriculum-Driven Edubot: A Framework for Developing Language Learning Chatbots Through Synthesizing Conversational Data
Yu Li
Shang Qu
Jili Shen
Shangchao Min
Zhou Yu
66
17
0
28 Sep 2023
Qwen Technical Report
Jinze Bai
Shuai Bai
Yunfei Chu
Zeyu Cui
Kai Dang
...
Zhenru Zhang
Chang Zhou
Jingren Zhou
Xiaohuan Zhou
Tianhang Zhu
OSLM
108
1,691
0
28 Sep 2023
At Which Training Stage Does Code Data Help LLMs Reasoning?
Xiaogang Jia
Yue Liu
Yue Yu
Yuanliang Zhang
Yu Jiang
Changjian Wang
Shanshan Li
LRM
SyDa
43
62
0
28 Sep 2023
Large Language Models in Finance: A Survey
Yinheng Li
Shaofei Wang
Han Ding
Hang Chen
AIFin
58
182
0
28 Sep 2023
AnyMAL: An Efficient and Scalable Any-Modality Augmented Language Model
Avamarie Brueggeman
Andrea Madotto
Zhaojiang Lin
Tushar Nagarajan
Matt Smith
...
Peyman Heidari
Yue Liu
Kavya Srinet
Babak Damavandi
Anuj Kumar
MLLM
45
92
0
27 Sep 2023
Identifying and Mitigating Privacy Risks Stemming from Language Models: A Survey
Victoria Smith
Ali Shahin Shamsabadi
Carolyn Ashurst
Adrian Weller
PILM
39
25
0
27 Sep 2023
HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models
Cheng Chen
Yuchen Hu
Chao-Han Huck Yang
Sabato Marco Siniscalchi
Pin-Yu Chen
Eng Siong Chng
38
42
0
27 Sep 2023
Jointly Training Large Autoregressive Multimodal Models
Emanuele Aiello
L. Yu
Yixin Nie
Armen Aghajanyan
Barlas Oğuz
34
30
0
27 Sep 2023
Tackling VQA with Pretrained Foundation Models without Further Training
Alvin De Jun Tan
Bingquan Shen
MLLM
53
1
0
27 Sep 2023
Navigate through Enigmatic Labyrinth A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future
Zheng Chu
Jingchang Chen
Qianglong Chen
Weijiang Yu
Tao He
Haotian Wang
Weihua Peng
Ming-Yuan Liu
Bing Qin
Ting Liu
LRM
AI4CE
62
161
0
27 Sep 2023
Rethinking Channel Dimensions to Isolate Outliers for Low-bit Weight Quantization of Large Language Models
Jung Hwan Heo
Jeonghoon Kim
Beomseok Kwon
Byeongwook Kim
Se Jung Kwon
Dongsoo Lee
MQ
50
9
0
27 Sep 2023
InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition
Pan Zhang
Xiaoyi Wang
Bin Wang
Yuhang Cao
Chao Xu
...
Conghui He
Xingcheng Zhang
Yu Qiao
Da Lin
Jiaqi Wang
MLLM
82
229
0
26 Sep 2023
Large Language Model Alignment: A Survey
Tianhao Shen
Renren Jin
Yufei Huang
Chuang Liu
Weilong Dong
Zishan Guo
Xinwei Wu
Yan Liu
Deyi Xiong
LM&MA
33
182
0
26 Sep 2023
Knowledgeable In-Context Tuning: Exploring and Exploiting Factual Knowledge for In-Context Learning
Jiadong Wang
Chengyu Wang
Chuanqi Tan
Jun Huang
Ming Gao
KELM
50
5
0
26 Sep 2023
QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models
Yuhui Xu
Lingxi Xie
Xiaotao Gu
Xin Chen
Heng Chang
Hengheng Zhang
Zhensu Chen
Xiaopeng Zhang
Qi Tian
MQ
21
96
0
26 Sep 2023
Disinformation Detection: An Evolving Challenge in the Age of LLMs
Qinglong Cao
Yuntian Chen
Ayushi Nirmal
Xiaokang Yang
DeLMO
58
53
0
25 Sep 2023
Small-scale proxies for large-scale Transformer training instabilities
Mitchell Wortsman
Peter J. Liu
Lechao Xiao
Katie Everett
A. Alemi
...
Jascha Narain Sohl-Dickstein
Kelvin Xu
Jaehoon Lee
Justin Gilmer
Simon Kornblith
44
89
0
25 Sep 2023
Reproducing Whisper-Style Training Using an Open-Source Toolkit and Publicly Available Data
Yifan Peng
Jinchuan Tian
Brian Yan
Dan Berrebbi
Xuankai Chang
...
Yui Sudo
Muhammad Shakeel
Jee-weon Jung
Soumi Maiti
Shinji Watanabe
VLM
51
36
0
25 Sep 2023
Can LLM-Generated Misinformation Be Detected?
Canyu Chen
Kai Shu
DeLMO
45
171
0
25 Sep 2023
Resolving References in Visually-Grounded Dialogue via Text Generation
Bram Willemsen
Livia Qian
Gabriel Skantze
38
3
0
23 Sep 2023
From Text to Source: Results in Detecting Large Language Model-Generated Content
Wissam Antoun
Benoît Sagot
Djamé Seddah
DeLMO
45
11
0
23 Sep 2023
Towards Green AI in Fine-tuning Large Language Models via Adaptive Backpropagation
Kai Huang
Hanyu Yin
Heng Huang
Wei Gao
38
11
0
22 Sep 2023
BenLLMEval: A Comprehensive Evaluation into the Potentials and Pitfalls of Large Language Models on Bengali NLP
M. Kabir
Mohammed Saidul Islam
Md Tahmid Rahman Laskar
Mir Tafseer Nayeem
M Saiful Bari
Enamul Hoque
LM&MA
31
16
0
22 Sep 2023
The Cambridge Law Corpus: A Dataset for Legal AI Research
Andreas Ostling
Holli Sargeant
Huiyuan Xie
Ludwig Bull
Alexander Terenin
Leif Jonsson
Maans Magnusson
Felix Steffek
ELM
AILaw
29
7
0
21 Sep 2023
BELT: Bootstrapping Electroencephalography-to-Language Decoding and Zero-Shot Sentiment Classification by Natural Language Supervision
Jinzhao Zhou
Yiqun Duan
Yu-Cheng Chang
Yu-Kai Wang
Chin-Teng Lin
46
6
0
21 Sep 2023
A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models
Haoran Xu
Young Jin Kim
Amr Sharaf
Hany Awadalla
43
64
0
20 Sep 2023
BTLM-3B-8K: 7B Parameter Performance in a 3B Parameter Model
Nolan Dey
Daria Soboleva
Faisal Al-Khateeb
Bowen Yang
Ribhu Pathria
...
Robert Myers
Jacob Robert Steeves
Natalia Vassilieva
Marvin Tom
Joel Hestness
MoE
41
15
0
20 Sep 2023
DreamLLM: Synergistic Multimodal Comprehension and Creation
Runpei Dong
Chunrui Han
Yuang Peng
Zekun Qi
Zheng Ge
...
Hao-Ran Wei
Xiangwen Kong
Xiangyu Zhang
Kaisheng Ma
Li Yi
MLLM
50
182
0
20 Sep 2023
CoT-BERT: Enhancing Unsupervised Sentence Representation through Chain-of-Thought
Bowen Zhang
Kehua Chang
Chunping Li
SSL
55
6
0
20 Sep 2023
Exploring the Relationship between LLM Hallucinations and Prompt Linguistic Nuances: Readability, Formality, and Concreteness
Vipula Rawte
Prachi Priya
S.M. Towhidul Islam Tonmoy
M. M. Zaman
A. Sheth
Amitava Das
33
18
0
20 Sep 2023
XATU: A Fine-grained Instruction-based Benchmark for Explainable Text Updates
Haopeng Zhang
Hayate Iso
Sairam Gurajada
Nikita Bhutani
59
6
0
20 Sep 2023
In-Context Learning for Text Classification with Many Labels
Aristides Milios
Siva Reddy
Dzmitry Bahdanau
27
35
0
19 Sep 2023
SlimPajama-DC: Understanding Data Combinations for LLM Training
Zhiqiang Shen
Tianhua Tao
Liqun Ma
Willie Neiswanger
Zhengzhong Liu
...
Bowen Tan
Joel Hestness
Natalia Vassilieva
Daria Soboleva
Eric Xing
43
44
0
19 Sep 2023
A Blueprint for Precise and Fault-Tolerant Analog Neural Networks
Cansu Demirkıran
Lakshmi Nair
D. Bunandar
Ajay Joshi
29
3
0
19 Sep 2023
GPT4AIGChip: Towards Next-Generation AI Accelerator Design Automation via Large Language Models
Yonggan Fu
Yongan Zhang
Zhongzhi Yu
Sixu Li
Zhifan Ye
Chaojian Li
Cheng Wan
Ying Lin
55
65
0
19 Sep 2023
OpenBA: An Open-sourced 15B Bilingual Asymmetric seq2seq Model Pre-trained from Scratch
Juntao Li
Zecheng Tang
Yuyang Ding
Pinzheng Wang
Pei Guo
...
Wenliang Chen
Guohong Fu
Qiaoming Zhu
Guodong Zhou
Hao Fei
59
5
0
19 Sep 2023
Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity
Haojun Xia
Zhen Zheng
Yuchao Li
Donglin Zhuang
Zhongzhu Zhou
Xiafei Qiu
Yong Li
Wei Lin
Shuaiwen Leon Song
70
11
0
19 Sep 2023
Baichuan 2: Open Large-scale Language Models
Ai Ming Yang
Bin Xiao
Bingning Wang
Borong Zhang
Ce Bian
...
Youxin Jiang
Yuchen Gao
Yupeng Zhang
Guosheng Dong
Zhiying Wu
ELM
LRM
77
725
0
19 Sep 2023
Understanding Catastrophic Forgetting in Language Models via Implicit Inference
Suhas Kotha
Jacob Mitchell Springer
Aditi Raghunathan
CLL
67
62
0
18 Sep 2023
Automatic Personalized Impression Generation for PET Reports Using Large Language Models
Xin Tie
Muheon Shin
Ali Pirasteh
Nevein Ibrahim
Zachary Huemann
...
K. M. Kelly
John W. Garrett
Junjie Hu
Steve Y. Cho
Tyler Bradshaw
LM&MA
60
10
0
18 Sep 2023
R2GenGPT: Radiology Report Generation with Frozen LLMs
Zhanyu Wang
Lingqiao Liu
Lei Wang
Luping Zhou
MedIm
LM&MA
VLM
38
69
0
18 Sep 2023
Fabricator: An Open Source Toolkit for Generating Labeled Training Data with Teacher LLMs
Jonas Golde
Patrick Haller
Felix Hamborg
Julian Risch
Alan Akbik
73
8
0
18 Sep 2023
Summarization is (Almost) Dead
Xiao Pu
Mingqi Gao
Xiaojun Wan
HILM
83
40
0
18 Sep 2023
ODSum: New Benchmarks for Open Domain Multi-Document Summarization
Yijie Zhou
Kejian Shi
Wencai Zhang
Yixin Liu
Yilun Zhao
Arman Cohan
RALM
42
2
0
16 Sep 2023
Self-Assessment Tests are Unreliable Measures of LLM Personality
Akshat Gupta
Xiaoyang Song
Gopala Anumanchipalli
29
19
0
15 Sep 2023
Oobleck: Resilient Distributed Training of Large Models Using Pipeline Templates
Insu Jang
Zhenning Yang
Zhen Zhang
Xin Jin
Mosharaf Chowdhury
MoE
AI4CE
OODD
49
44
0
15 Sep 2023