ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2205.01068
  4. Cited By
OPT: Open Pre-trained Transformer Language Models

OPT: Open Pre-trained Transformer Language Models

2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
    VLM
    OSLM
    AI4CE
ArXivPDFHTML

Papers citing "OPT: Open Pre-trained Transformer Language Models"

50 / 2,456 papers shown
Title
Learn To be Efficient: Build Structured Sparsity in Large Language
  Models
Learn To be Efficient: Build Structured Sparsity in Large Language Models
Haizhong Zheng
Xiaoyan Bai
Xueshen Liu
Z. Morley Mao
Beidi Chen
Fan Lai
Atul Prakash
56
11
0
09 Feb 2024
Large Language Models: A Survey
Large Language Models: A Survey
Shervin Minaee
Tomas Mikolov
Narjes Nikzad
M. Asgari-Chenaghlu
R. Socher
Xavier Amatriain
Jianfeng Gao
ALM
LM&MA
ELM
134
377
0
09 Feb 2024
An Interactive Agent Foundation Model
An Interactive Agent Foundation Model
Zane Durante
Bidipta Sarkar
Ran Gong
Rohan Taori
Yusuke Noda
...
Katsushi Ikeuchi
Fei-Fei Li
Jianfeng Gao
Naoki Wake
Qiuyuan Huang
LM&Ro
AI4CE
LLMAG
91
16
0
08 Feb 2024
On the Convergence of Zeroth-Order Federated Tuning for Large Language
  Models
On the Convergence of Zeroth-Order Federated Tuning for Large Language Models
Zhenqing Ling
Daoyuan Chen
Liuyi Yao
Yaliang Li
Ying Shen
FedML
54
12
0
08 Feb 2024
SpiRit-LM: Interleaved Spoken and Written Language Model
SpiRit-LM: Interleaved Spoken and Written Language Model
Tu Nguyen
Benjamin Muller
Bokai Yu
Marta R. Costa-jussá
Maha Elbayad
...
Itai Gat
Gabriel Synnaeve
Juan Pino
Benoît Sagot
Emmanuel Dupoux
AuLLM
VLM
56
34
0
08 Feb 2024
Real-World Robot Applications of Foundation Models: A Review
Real-World Robot Applications of Foundation Models: A Review
Kento Kawaharazuka
T. Matsushima
Andrew Gambardella
Jiaxian Guo
Chris Paxton
Andy Zeng
OffRL
VLM
LM&Ro
51
47
0
08 Feb 2024
RepQuant: Towards Accurate Post-Training Quantization of Large
  Transformer Models via Scale Reparameterization
RepQuant: Towards Accurate Post-Training Quantization of Large Transformer Models via Scale Reparameterization
Zhikai Li
Xuewen Liu
Jing Zhang
Qingyi Gu
MQ
54
7
0
08 Feb 2024
Pretrained Generative Language Models as General Learning Frameworks for
  Sequence-Based Tasks
Pretrained Generative Language Models as General Learning Frameworks for Sequence-Based Tasks
Ben Fauber
32
2
0
08 Feb 2024
In-Context Principle Learning from Mistakes
In-Context Principle Learning from Mistakes
Tianjun Zhang
Aman Madaan
Luyu Gao
Steven Zheng
Swaroop Mishra
Yiming Yang
Niket Tandon
Uri Alon
KELM
ReLM
38
24
0
08 Feb 2024
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models
Chris Liu
Renrui Zhang
Longtian Qiu
Siyuan Huang
Weifeng Lin
...
Hao Shao
Pan Lu
Hongsheng Li
Yu Qiao
Peng Gao
MLLM
130
110
0
08 Feb 2024
L4Q: Parameter Efficient Quantization-Aware Fine-Tuning on Large
  Language Models
L4Q: Parameter Efficient Quantization-Aware Fine-Tuning on Large Language Models
Hyesung Jeon
Yulhwa Kim
Jae-Joon Kim
MQ
29
4
0
07 Feb 2024
Progressive Gradient Flow for Robust N:M Sparsity Training in
  Transformers
Progressive Gradient Flow for Robust N:M Sparsity Training in Transformers
Abhimanyu Bambhaniya
Amir Yazdanbakhsh
Suvinay Subramanian
Sheng-Chun Kao
Shivani Agrawal
Utku Evci
Tushar Krishna
59
14
0
07 Feb 2024
ApiQ: Finetuning of 2-Bit Quantized Large Language Model
ApiQ: Finetuning of 2-Bit Quantized Large Language Model
Baohao Liao
Christian Herold
Shahram Khadivi
Christof Monz
CLL
MQ
55
12
0
07 Feb 2024
The Fine-Grained Complexity of Gradient Computation for Training Large
  Language Models
The Fine-Grained Complexity of Gradient Computation for Training Large Language Models
Josh Alman
Zhao Song
37
12
0
07 Feb 2024
DistiLLM: Towards Streamlined Distillation for Large Language Models
DistiLLM: Towards Streamlined Distillation for Large Language Models
Jongwoo Ko
Sungnyun Kim
Tianyi Chen
SeYoung Yun
71
27
0
06 Feb 2024
BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
Wei Huang
Yangdong Liu
Haotong Qin
Ying Li
Shiming Zhang
Xianglong Liu
Michele Magno
Xiaojuan Qi
MQ
82
72
0
06 Feb 2024
ReLU$^2$ Wins: Discovering Efficient Activation Functions for Sparse
  LLMs
ReLU2^22 Wins: Discovering Efficient Activation Functions for Sparse LLMs
Zhengyan Zhang
Yixin Song
Guanghui Yu
Xu Han
Yankai Lin
Chaojun Xiao
Chenyang Song
Zhiyuan Liu
Zeyu Mi
Maosong Sun
27
31
0
06 Feb 2024
INSIDE: LLMs' Internal States Retain the Power of Hallucination
  Detection
INSIDE: LLMs' Internal States Retain the Power of Hallucination Detection
Chao Chen
Kai-Chun Liu
Ze Chen
Yi Gu
Yue-bo Wu
Mingyuan Tao
Zhihang Fu
Jieping Ye
HILM
85
87
0
06 Feb 2024
Partially Recentralization Softmax Loss for Vision-Language Models
  Robustness
Partially Recentralization Softmax Loss for Vision-Language Models Robustness
Hao Wang
Xin Zhang
Jinzhe Jiang
Yaqian Zhao
Chen Li
AAML
32
0
0
06 Feb 2024
A Survey on Transformer Compression
A Survey on Transformer Compression
Yehui Tang
Yunhe Wang
Jianyuan Guo
Zhijun Tu
Kai Han
Hailin Hu
Dacheng Tao
46
30
0
05 Feb 2024
Shortened LLaMA: Depth Pruning for Large Language Models with Comparison
  of Retraining Methods
Shortened LLaMA: Depth Pruning for Large Language Models with Comparison of Retraining Methods
Bo-Kyeong Kim
Geonmin Kim
Tae-Ho Kim
Thibault Castells
Shinkook Choi
Junho Shin
Hyoung-Kyu Song
62
30
0
05 Feb 2024
DeAL: Decoding-time Alignment for Large Language Models
DeAL: Decoding-time Alignment for Large Language Models
James Y. Huang
Sailik Sengupta
Daniele Bonadiman
Yi-An Lai
Arshit Gupta
Nikolaos Pappas
Saab Mansour
Katrin Kirchoff
Dan Roth
64
29
0
05 Feb 2024
LHRS-Bot: Empowering Remote Sensing with VGI-Enhanced Large Multimodal
  Language Model
LHRS-Bot: Empowering Remote Sensing with VGI-Enhanced Large Multimodal Language Model
Dilxat Muhtar
Zhenshi Li
Feng-Xue Gu
Xue-liang Zhang
Pengfeng Xiao
82
53
0
04 Feb 2024
Knowledge Generation for Zero-shot Knowledge-based VQA
Knowledge Generation for Zero-shot Knowledge-based VQA
Rui Cao
Jing Jiang
28
2
0
04 Feb 2024
LQER: Low-Rank Quantization Error Reconstruction for LLMs
LQER: Low-Rank Quantization Error Reconstruction for LLMs
Cheng Zhang
Jianyi Cheng
George A. Constantinides
Yiren Zhao
MQ
33
9
0
04 Feb 2024
AutoTimes: Autoregressive Time Series Forecasters via Large Language
  Models
AutoTimes: Autoregressive Time Series Forecasters via Large Language Models
Yong Liu
Guo Qin
Xiangdong Huang
Jianmin Wang
Mingsheng Long
AI4TS
37
22
0
04 Feb 2024
NetLLM: Adapting Large Language Models for Networking
NetLLM: Adapting Large Language Models for Networking
Duo Wu
Xianda Wang
Yaqi Qiao
Zhi Wang
Junchen Jiang
Shuguang Cui
Fangxin Wang
45
30
0
04 Feb 2024
Copyright Protection in Generative AI: A Technical Perspective
Copyright Protection in Generative AI: A Technical Perspective
Jie Ren
Han Xu
Pengfei He
Yingqian Cui
Shenglai Zeng
...
Hongzhi Wen
Jiayuan Ding
Hui Liu
Yi Chang
Jiliang Tang
DeLMO
36
33
0
04 Feb 2024
Selecting Large Language Model to Fine-tune via Rectified Scaling Law
Selecting Large Language Model to Fine-tune via Rectified Scaling Law
Haowei Lin
Baizhou Huang
Haotian Ye
Qinyu Chen
Zihao Wang
Sujian Li
Jianzhu Ma
Xiaojun Wan
James Zou
Yitao Liang
90
20
0
04 Feb 2024
Frequency Explains the Inverse Correlation of Large Language Models'
  Size, Training Data Amount, and Surprisal's Fit to Reading Times
Frequency Explains the Inverse Correlation of Large Language Models' Size, Training Data Amount, and Surprisal's Fit to Reading Times
Byung-Doh Oh
Shisen Yue
William Schuler
58
16
0
03 Feb 2024
Do Moral Judgment and Reasoning Capability of LLMs Change with Language?
  A Study using the Multilingual Defining Issues Test
Do Moral Judgment and Reasoning Capability of LLMs Change with Language? A Study using the Multilingual Defining Issues Test
Aditi Khandelwal
Utkarsh Agarwal
Kumar Tanmay
Monojit Choudhury
ELM
LRM
37
6
0
03 Feb 2024
COMET: Generating Commit Messages using Delta Graph Context
  Representation
COMET: Generating Commit Messages using Delta Graph Context Representation
Abhinav Reddy Mandli
Saurabhsingh Rajput
Tushar Sharma
31
1
0
02 Feb 2024
Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and
  Dialogue Abilities
Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities
Zhifeng Kong
Arushi Goel
Rohan Badlani
Ming-Yu Liu
Rafael Valle
Bryan Catanzaro
AuLLM
LM&MA
MLLM
78
75
0
02 Feb 2024
Stochastic Two Points Method for Deep Model Zeroth-order Optimization
Stochastic Two Points Method for Deep Model Zeroth-order Optimization
Yijiang Pang
Jiayu Zhou
35
0
0
02 Feb 2024
Enhancing Stochastic Gradient Descent: A Unified Framework and Novel
  Acceleration Methods for Faster Convergence
Enhancing Stochastic Gradient Descent: A Unified Framework and Novel Acceleration Methods for Faster Convergence
Yichuan Deng
Zhao Song
Chiwun Yang
34
1
0
02 Feb 2024
Can MLLMs Perform Text-to-Image In-Context Learning?
Can MLLMs Perform Text-to-Image In-Context Learning?
Yuchen Zeng
Wonjun Kang
Yicong Chen
Hyung Il Koo
Kangwook Lee
MLLM
36
9
0
02 Feb 2024
Vaccine: Perturbation-aware Alignment for Large Language Model
Vaccine: Perturbation-aware Alignment for Large Language Model
Tiansheng Huang
Sihao Hu
Ling Liu
55
36
0
02 Feb 2024
LLM-based NLG Evaluation: Current Status and Challenges
LLM-based NLG Evaluation: Current Status and Challenges
Mingqi Gao
Xinyu Hu
Jie Ruan
Xiao Pu
Xiaojun Wan
ELM
LM&MA
76
30
0
02 Feb 2024
Decoding Speculative Decoding
Decoding Speculative Decoding
Minghao Yan
Saurabh Agarwal
Shivaram Venkataraman
LRM
45
6
0
02 Feb 2024
Plan-Grounded Large Language Models for Dual Goal Conversational
  Settings
Plan-Grounded Large Language Models for Dual Goal Conversational Settings
Diogo Glória-Silva
Rafael Ferreira
Diogo Tavares
David Semedo
João Magalhães
LLMAG
50
4
0
01 Feb 2024
Evaluating Large Language Models for Generalization and Robustness via
  Data Compression
Evaluating Large Language Models for Generalization and Robustness via Data Compression
Yucheng Li
Yunhao Guo
Frank Guerin
Chenghua Lin
ELM
35
5
0
01 Feb 2024
Can Large Language Models Understand Context?
Can Large Language Models Understand Context?
Yilun Zhu
Joel Ruben Antony Moniz
Shruti Bhargava
Jiarui Lu
Dhivya Piraviperumal
Site Li
Yuan-kang Zhang
Hong-ye Yu
Bo-Hsiang Tseng
58
21
0
01 Feb 2024
OLMo: Accelerating the Science of Language Models
OLMo: Accelerating the Science of Language Models
Dirk Groeneveld
Iz Beltagy
Pete Walsh
Akshita Bhagia
Rodney Michael Kinney
...
Jesse Dodge
Kyle Lo
Luca Soldaini
Noah A. Smith
Hanna Hajishirzi
OSLM
143
367
0
01 Feb 2024
ReAGent: A Model-agnostic Feature Attribution Method for Generative
  Language Models
ReAGent: A Model-agnostic Feature Attribution Method for Generative Language Models
Zhixue Zhao
Boxuan Shan
39
5
0
01 Feb 2024
Non-Exchangeable Conformal Language Generation with Nearest Neighbors
Non-Exchangeable Conformal Language Generation with Nearest Neighbors
Dennis Ulmer
Chrysoula Zerva
André F. T. Martins
46
11
0
01 Feb 2024
CroissantLLM: A Truly Bilingual French-English Language Model
CroissantLLM: A Truly Bilingual French-English Language Model
Manuel Faysse
Patrick Fernandes
Nuno M. Guerreiro
António Loison
Duarte M. Alves
...
François Yvon
André F.T. Martins
Gautier Viaud
C´eline Hudelot
Pierre Colombo
63
32
0
01 Feb 2024
SpeechComposer: Unifying Multiple Speech Tasks with Prompt Composition
SpeechComposer: Unifying Multiple Speech Tasks with Prompt Composition
Yihan Wu
Soumi Maiti
Yifan Peng
Wangyou Zhang
Chenda Li
Yuyue Wang
Xihua Wang
Shinji Watanabe
Ruihua Song
40
3
0
31 Jan 2024
ControlCap: Controllable Region-level Captioning
ControlCap: Controllable Region-level Captioning
Yuzhong Zhao
Yue Liu
Zonghao Guo
Weijia Wu
Chen Gong
Fang Wan
QiXiang Ye
26
5
0
31 Jan 2024
Probing Language Models' Gesture Understanding for Enhanced Human-AI
  Interaction
Probing Language Models' Gesture Understanding for Enhanced Human-AI Interaction
Philipp Wicke
43
2
0
31 Jan 2024
SwarmBrain: Embodied agent for real-time strategy game StarCraft II via
  large language models
SwarmBrain: Embodied agent for real-time strategy game StarCraft II via large language models
Xiao Shao
Weifu Jiang
Fei Zuo
Mengqing Liu
LLMAG
39
7
0
31 Jan 2024
Previous
123...222324...484950
Next