OPT: Open Pre-trained Transformer Language Models

2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
    VLM
    OSLM
    AI4CE

Papers citing "OPT: Open Pre-trained Transformer Language Models"

50 / 2,456 papers shown
Evaluating Quantized Large Language Models
Shiyao Li
Xuefei Ning
Luning Wang
Tengxuan Liu
Xiangsheng Shi
Shengen Yan
Guohao Dai
Huazhong Yang
Yu Wang
MQ
45
47
0
28 Feb 2024
Cutting Off the Head Ends the Conflict: A Mechanism for Interpreting and Mitigating Knowledge Conflicts in Language Models
Zhuoran Jin
Pengfei Cao
Hongbang Yuan
Yubo Chen
Jiexin Xu
Huaijun Li
Xiaojian Jiang
Kang Liu
Jun Zhao
196
40
0
28 Feb 2024
MedAide: Leveraging Large Language Models for On-Premise Medical Assistance on Edge Devices
Abdul Basit
Khizar Hussain
Muhammad Abdullah Hanif
Mohamed Bennai
LM&MA
27
5
0
28 Feb 2024
Polos: Multimodal Metric Learning from Human Feedback for Image Captioning
Yuiga Wada
Kanta Kaneda
Daichi Saito
Komei Sugiura
39
24
0
28 Feb 2024
Token-Specific Watermarking with Enhanced Detectability and Semantic Coherence for Large Language Models
Mingjia Huo
Sai Ashish Somayajula
Youwei Liang
Ruisi Zhang
F. Koushanfar
Pengtao Xie
WaLM
44
15
0
28 Feb 2024
Characterizing Truthfulness in Large Language Model Generations with Local Intrinsic Dimension
Fan Yin
Jayanth Srinivasa
Kai-Wei Chang
HILM
65
20
0
28 Feb 2024
FlattenQuant: Breaking Through the Inference Compute-bound for Large Language Models with Per-tensor Quantization
Yi Zhang
Fei Yang
Shuang Peng
Fangyu Wang
Aimin Pan
MQ
36
2
0
28 Feb 2024
All in an Aggregated Image for In-Image Learning
Lei Wang
Wanyu Xu
Zhiqiang Hu
Yihuai Lan
Shan Dong
Hao Wang
Roy Ka-wei Lee
Ee-Peng Lim
VLM
51
1
0
28 Feb 2024
SparseLLM: Towards Global Pruning for Pre-trained Language Models
Guangji Bai
Yijiang Li
Chen Ling
Kibaek Kim
Liang Zhao
38
7
0
28 Feb 2024
On the Challenges and Opportunities in Generative AI
Laura Manduchi
Kushagra Pandey
Robert Bamler
Ryan Cotterell
Sina Daubener
...
F. Wenzel
Frank Wood
Stephan Mandt
Vincent Fortuin
56
17
0
28 Feb 2024
EmMark: Robust Watermarks for IP Protection of Embedded Quantized Large Language Models
Ruisi Zhang
F. Koushanfar
VLM
WaLM
46
1
0
27 Feb 2024
On the Societal Impact of Open Foundation Models
Sayash Kapoor
Rishi Bommasani
Kevin Klyman
Shayne Longpre
Ashwin Ramaswami
...
Victor Storchan
Daniel Zhang
Daniel E. Ho
Percy Liang
Arvind Narayanan
31
54
0
27 Feb 2024
Compass: A Decentralized Scheduler for Latency-Sensitive ML Workflows
Yuting Yang
Andrea Merlina
Weijia Song
Tiancheng Yuan
Ken Birman
Roman Vitenberg
49
0
0
27 Feb 2024
Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization
Wenqi Zhang
Ke Tang
Hai Wu
Mengna Wang
Yongliang Shen
Guiyang Hou
Zeqi Tan
Peng Li
Yueting Zhuang
Weiming Lu
LLMAG
44
37
0
27 Feb 2024
Retrieval is Accurate Generation
Bowen Cao
Deng Cai
Leyang Cui
Xuxin Cheng
Wei Bi
Yuexian Zou
Shuming Shi
40
6
0
27 Feb 2024
When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method
Biao Zhang
Zhongtao Liu
Colin Cherry
Orhan Firat
LRM
68
128
0
27 Feb 2024
Think Big, Generate Quick: LLM-to-SLM for Fast Autoregressive Decoding
Benjamin Bergner
Andrii Skliar
Amelie Royer
Tijmen Blankevoort
Yuki Markus Asano
B. Bejnordi
58
5
0
26 Feb 2024
Why Transformers Need Adam: A Hessian Perspective
Yushun Zhang
Congliang Chen
Tian Ding
Ziniu Li
Ruoyu Sun
Zhimin Luo
40
43
0
26 Feb 2024
Multi-Bit Distortion-Free Watermarking for Large Language Models
Massieh Kordi Boroujeny
Ya Jiang
Kai Zeng
Brian L. Mark
WaLM
VLM
48
4
0
26 Feb 2024
On Languaging a Simulation Engine
Han Liu
Liantang Li
39
0
0
26 Feb 2024
Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models
Tianyi Tang
Wenyang Luo
Haoyang Huang
Dongdong Zhang
Xiaolei Wang
Xin Zhao
Furu Wei
Ji-Rong Wen
64
50
0
26 Feb 2024
LLM Inference Unveiled: Survey and Roofline Model Insights
Zhihang Yuan
Yuzhang Shang
Yang Zhou
Zhen Dong
Zhe Zhou
...
Yong Jae Lee
Yan Yan
Beidi Chen
Guangyu Sun
Kurt Keutzer
61
82
0
26 Feb 2024
No Free Lunch in LLM Watermarking: Trade-offs in Watermarking Design Choices
Qi Pang
Shengyuan Hu
Wenting Zheng
Virginia Smith
WaLM
56
12
0
25 Feb 2024
AVI-Talking: Learning Audio-Visual Instructions for Expressive 3D Talking Face Generation
Yasheng Sun
Wenqing Chu
Hang Zhou
Kaisiyuan Wang
Hideki Koike
42
5
0
25 Feb 2024
Towards Accurate Post-training Quantization for Reparameterized Models
Luoming Zhang
Yefei He
Wen Fei
Zhenyu Lou
Weijia Wu
YangWei Ying
Hong Zhou
MQ
43
0
0
25 Feb 2024
How Large Language Models Encode Context Knowledge? A Layer-Wise Probing Study
Tianjie Ju
Weiwei Sun
Wei Du
Xinwei Yuan
Zhaochun Ren
Gongshen Liu
KELM
39
24
0
25 Feb 2024
Look Before You Leap: Problem Elaboration Prompting Improves Mathematical Reasoning in Large Language Models
Haoran Liao
Jidong Tian
Shaohua Hu
Hao He
Yaohui Jin
ReLM
LRM
46
1
0
24 Feb 2024
Sparse MeZO: Less Parameters for Better Performance in Zeroth-Order LLM Fine-Tuning
Yong Liu
Zirui Zhu
Chaoyu Gong
Minhao Cheng
Cho-Jui Hsieh
Yang You
MoE
50
16
0
24 Feb 2024
Addressing Order Sensitivity of In-Context Demonstration Examples in Causal Language Models
Yanzheng Xiang
Hanqi Yan
Lin Gui
Yulan He
42
6
0
23 Feb 2024
MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs
Ziheng Jiang
Yanghua Peng
Yinmin Zhong
Qi Huang
Yangrui Chen
...
Zhe Li
X. Jia
Jia-jun Ye
Xin Jin
Xin Liu
LRM
46
105
0
23 Feb 2024
Fast Adversarial Attacks on Language Models In One GPU Minute
Vinu Sankar Sadasivan
Shoumik Saha
Gaurang Sriramanan
Priyatham Kattakinda
Atoosa Malemir Chegini
S. Feizi
MIALM
45
34
0
23 Feb 2024
MemoryPrompt: A Light Wrapper to Improve Context Tracking in Pre-trained Language Models
Nathanaël Carraz Rakotonirina
Marco Baroni
VLM
KELM
35
0
0
23 Feb 2024
Where Visual Speech Meets Language: VSP-LLM Framework for Efficient and Context-Aware Visual Speech Processing
Jeong Hun Yeo
Seunghee Han
Minsu Kim
Y. Ro
61
759
0
23 Feb 2024
KIEval: A Knowledge-grounded Interactive Evaluation Framework for Large Language Models
Zhuohao Yu
Chang Gao
Wenjin Yao
Yidong Wang
Wei Ye
Jindong Wang
Xing Xie
Yue Zhang
Shikun Zhang
42
22
0
23 Feb 2024
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
Zechun Liu
Changsheng Zhao
Forrest N. Iandola
Chen Lai
Yuandong Tian
...
Ernie Chang
Yangyang Shi
Raghuraman Krishnamoorthi
Liangzhen Lai
Vikas Chandra
ALM
43
78
0
22 Feb 2024
Identifying Multiple Personalities in Large Language Models with External Evaluation
Xiaoyang Song
Yuta Adachi
Jessie Feng
Mouwei Lin
Linhao Yu
Frank Li
Akshat Gupta
Gopala Anumanchipalli
Simerjot Kaur
LLMAG
35
8
0
22 Feb 2024
A Usage-centric Take on Intent Understanding in E-Commerce
Wendi Zhou
Tianyi Li
Pavlos Vougiouklis
Mark Steedman
Jeff Z. Pan
19
5
0
22 Feb 2024
An LLM-Enhanced Adversarial Editing System for Lexical Simplification
Keren Tan
Kangyang Luo
Yunshi Lan
Zheng Yuan
Jinlong Shu
AAML
30
6
0
22 Feb 2024
LLMs with Industrial Lens: Deciphering the Challenges and Prospects -- A Survey
Ashok Urlana
Charaka Vinayak Kumar
Ajeet Kumar Singh
B. Garlapati
S. Chalamala
Rahul Mishra
37
5
0
22 Feb 2024
Rule or Story, Which is a Better Commonsense Expression for Talking with Large Language Models?
Ning Bian
Xianpei Han
Hongyu Lin
Yaojie Lu
Xianpei Han
Le Sun
36
1
0
22 Feb 2024
Take the Bull by the Horns: Hard Sample-Reweighted Continual Training Improves LLM Generalization
Xuxi Chen
Zhendong Wang
Daouda Sow
Junjie Yang
Tianlong Chen
Yingbin Liang
Mingyuan Zhou
Zhangyang Wang
34
6
0
22 Feb 2024
Word-Sequence Entropy: Towards Uncertainty Estimation in Free-Form Medical Question Answering Applications and Beyond
Zhiyuan Wang
Jinhao Duan
Chenxi Yuan
Qingyu Chen
Tianlong Chen
Huaxiu Yao
Yue Zhang
Ren Wang
Kaidi Xu
Xiaoshuang Shi
UQLM
47
10
0
22 Feb 2024
COPR: Continual Human Preference Learning via Optimal Policy Regularization
Han Zhang
Lin Gui
Yu Lei
Yuanzhao Zhai
Yehong Zhang
...
Hui Wang
Yue Yu
Kam-Fai Wong
Bin Liang
Ruifeng Xu
CLL
42
4
0
22 Feb 2024
Masked Matrix Multiplication for Emergent Sparsity
Brian Wheatman
Meghana Madhyastha
Randal C. Burns
34
0
0
21 Feb 2024
Improving Language Understanding from Screenshots
Tianyu Gao
Zirui Wang
Adithya Bhaskar
Danqi Chen
VLM
43
10
0
21 Feb 2024
Analysing The Impact of Sequence Composition on Language Model Pre-Training
Yu Zhao
Yuanbin Qu
Konrad Staniszewski
Szymon Tworkowski
Wei Liu
Piotr Miłoś
Yuxiang Wu
Pasquale Minervini
39
14
0
21 Feb 2024
Can You Learn Semantics Through Next-Word Prediction? The Case of Entailment
William Merrill
Zhaofeng Wu
Norihito Naka
Yoon Kim
Tal Linzen
51
7
0
21 Feb 2024
MORE: Multi-mOdal REtrieval Augmented Generative Commonsense Reasoning
Wanqing Cui
Keping Bi
Jiafeng Guo
Xueqi Cheng
SyDa
ReLM
RALM
LRM
39
8
0
21 Feb 2024
ProSparse: Introducing and Enhancing Intrinsic Activation Sparsity within Large Language Models
Chenyang Song
Xu Han
Zhengyan Zhang
Shengding Hu
Xiyu Shi
...
Chen Chen
Zhiyuan Liu
Guanglin Li
Tao Yang
Maosong Sun
58
25
0
21 Feb 2024
A Survey on Knowledge Distillation of Large Language Models
Xiaohan Xu
Ming Li
Chongyang Tao
Tao Shen
Reynold Cheng
Jinyang Li
Can Xu
Dacheng Tao
Dinesh Manocha
KELM
VLM
46
104
0
20 Feb 2024