Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2205.01068
Cited By
OPT: Open Pre-trained Transformer Language Models
2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
VLM
OSLM
AI4CE
Re-assign community
ArXiv
PDF
HTML
Papers citing
"OPT: Open Pre-trained Transformer Language Models"
50 / 2,454 papers shown
Title
Mobile Edge Intelligence for Large Language Models: A Contemporary Survey
Guanqiao Qu
Qiyuan Chen
Wei Wei
Zheng Lin
Xianhao Chen
Kaibin Huang
45
43
0
09 Jul 2024
Data, Data Everywhere: A Guide for Pretraining Dataset Construction
Jupinder Parmar
Shrimai Prabhumoye
Joseph Jennings
Bo Liu
Aastha Jhunjhunwala
Zhilin Wang
M. Patwary
M. Shoeybi
Bryan Catanzaro
53
6
0
08 Jul 2024
Vision-Language Models under Cultural and Inclusive Considerations
Antonia Karamolegkou
Phillip Rust
Yong Cao
Ruixiang Cui
Anders Søgaard
Daniel Hershcovich
VLM
58
7
0
08 Jul 2024
Pruning Large Language Models to Intra-module Low-rank Architecture with Transitional Activations
Bowen Shen
Zheng Lin
Daren Zha
Wei Liu
Jian Luan
Bin Wang
Weiping Wang
62
1
0
08 Jul 2024
LLMBox: A Comprehensive Library for Large Language Models
Tianyi Tang
Yiwen Hu
Bingqian Li
Wenyang Luo
Zijing Qin
...
Chunxuan Xia
Junyi Li
Kun Zhou
Wayne Xin Zhao
Ji-Rong Wen
53
1
0
08 Jul 2024
AI Safety in Generative AI Large Language Models: A Survey
Jaymari Chua
Yun Yvonna Li
Shiyi Yang
Chen Wang
Lina Yao
LM&MA
47
12
0
06 Jul 2024
Lazarus: Resilient and Elastic Training of Mixture-of-Experts Models with Adaptive Expert Placement
Yongji Wu
Wenjie Qu
Tianyang Tao
Zhuang Wang
Wei Bai
Zhuohao Li
Yuan Tian
Jiaheng Zhang
Matthew Lentz
Danyang Zhuo
74
3
0
05 Jul 2024
Identifying the Source of Generation for Large Language Models
Bumjin Park
Jaesik Choi
36
0
0
05 Jul 2024
SpikeLLM: Scaling up Spiking Neural Network to Large Language Models via Saliency-based Spiking
Xingrun Xing
Boyan Gao
Zheng Zhang
David A. Clifton
Shitao Xiao
Li Du
Guoqi Li
Jiajun Zhang
63
5
0
05 Jul 2024
TongGu: Mastering Classical Chinese Understanding with Knowledge-Grounded Large Language Models
Jiahuan Cao
Dezhi Peng
Peirong Zhang
Yongxin Shi
Yang Liu
Kai Ding
Lianwen Jin
31
0
0
04 Jul 2024
On the Benchmarking of LLMs for Open-Domain Dialogue Evaluation
John Mendonça
A. Lavie
Isabel Trancoso
ELM
43
2
0
04 Jul 2024
MSfusion: A Dynamic Model Splitting Approach for Resource-Constrained Machines to Collaboratively Train Larger Models
Jin Xie
Songze Li
FedML
49
0
0
04 Jul 2024
Social Bias Evaluation for Large Language Models Requires Prompt Variations
Rem Hida
Masahiro Kaneko
Naoaki Okazaki
46
14
0
03 Jul 2024
Improving Conversational Abilities of Quantized Large Language Models via Direct Preference Alignment
Janghwan Lee
Seongmin Park
S. Hong
Minsoo Kim
Du-Seong Chang
Jungwook Choi
37
4
0
03 Jul 2024
GPTQT: Quantize Large Language Models Twice to Push the Efficiency
Yipin Guo
Yilin Lang
Qinyuan Ren
MQ
26
1
0
03 Jul 2024
Efficient Training of Language Models with Compact and Consistent Next Token Distributions
Ashutosh Sathe
Sunita Sarawagi
40
0
0
03 Jul 2024
Whispering Experts: Neural Interventions for Toxicity Mitigation in Language Models
Xavier Suau
Pieter Delobelle
Katherine Metcalf
Armand Joulin
N. Apostoloff
Luca Zappella
P. Rodríguez
MU
AAML
47
9
0
02 Jul 2024
LogEval: A Comprehensive Benchmark Suite for Large Language Models In Log Analysis
Tianyu Cui
Shiyu Ma
Ziang Chen
Tong Xiao
Shimin Tao
...
Changchang Liu
Yuzhe Cai
Weibin Meng
Yongqian Sun
Dan Pei
ELM
35
5
0
02 Jul 2024
Meerkat: Audio-Visual Large Language Model for Grounding in Space and Time
Sanjoy Chowdhury
Sayan Nag
Subhrajyoti Dasgupta
Jun Chen
Mohamed Elhoseiny
Ruohan Gao
Dinesh Manocha
VLM
MLLM
49
9
0
01 Jul 2024
ScanReason: Empowering 3D Visual Grounding with Reasoning Capabilities
Chenming Zhu
Tai Wang
Wenwei Zhang
Kai Chen
Xihui Liu
ReLM
LRM
49
17
0
01 Jul 2024
Mobile-Bench: An Evaluation Benchmark for LLM-based Mobile Agents
Shihan Deng
Weikai Xu
Hongda Sun
Wei Liu
Tao Tan
...
Ang Li
Jian Luan
Bin Wang
Rui Yan
Shuo Shang
LLMAG
52
8
0
01 Jul 2024
FoldGPT: Simple and Effective Large Language Model Compression Scheme
Songwei Liu
Chao Zeng
Lianqiang Li
Chenqian Yan
Lean Fu
Xing Mei
Fangmin Chen
48
4
0
01 Jul 2024
LLM4GEN: Leveraging Semantic Representation of LLMs for Text-to-Image Generation
Mushui Liu
Yuhang Ma
Yang Zhen
Jun Dan
Yunlong Yu
Zeng Zhao
Zhipeng Hu
Bai Liu
Changjie Fan
VLM
DiffM
71
14
0
30 Jun 2024
InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management
Wonbeom Lee
Jungi Lee
Junghwan Seo
Jaewoong Sim
RALM
34
75
0
28 Jun 2024
MMRo: Are Multimodal LLMs Eligible as the Brain for In-Home Robotics?
Jinming Li
Yichen Zhu
Zhiyuan Xu
Jindong Gu
Minjie Zhu
Xin Liu
Ning Liu
Yaxin Peng
Feifei Feng
Jian Tang
LRM
LM&Ro
36
7
0
28 Jun 2024
A Survey on Failure Analysis and Fault Injection in AI Systems
Guangba Yu
Gou Tan
Haojia Huang
Zhenyu Zhang
Pengfei Chen
Roberto Natella
Zibin Zheng
62
4
0
28 Jun 2024
Direct Preference Knowledge Distillation for Large Language Models
Yixing Li
Yuxian Gu
Li Dong
Dequan Wang
Yu Cheng
Furu Wei
45
6
0
28 Jun 2024
Adaptive Draft-Verification for Efficient Large Language Model Decoding
Xukun Liu
Bowen Lei
Ruqi Zhang
Dongkuan Xu
39
3
0
27 Jun 2024
Accurate Prediction of Ligand-Protein Interaction Affinities with Fine-Tuned Small Language Models
Ben Fauber
LM&MA
48
1
0
27 Jun 2024
Resolving Discrepancies in Compute-Optimal Scaling of Language Models
Tomer Porian
Mitchell Wortsman
J. Jitsev
Ludwig Schmidt
Y. Carmon
63
21
0
27 Jun 2024
CELLO: Causal Evaluation of Large Vision-Language Models
Meiqi Chen
Bo Peng
Yan Zhang
Chaochao Lu
LRM
ELM
53
0
0
27 Jun 2024
Fairness and Bias in Multimodal AI: A Survey
Tosin Adewumi
Lama Alkhaled
Namrata Gurung
G. V. Boven
Irene Pagliai
58
9
0
27 Jun 2024
LLM-based Frameworks for API Argument Filling in Task-Oriented Conversational Systems
J. Mok
Mohammad Kachuee
Shuyang Dai
Shayan Ray
Tara Taghavi
Sungroh Yoon
LLMAG
37
2
0
27 Jun 2024
FFN: a Fine-grained Chinese-English Financial Domain Parallel Corpus
Yuxin Fu
Shijing Si
Leyi Mai
Xi-ang Li
47
1
0
27 Jun 2024
OutlierTune: Efficient Channel-Wise Quantization for Large Language Models
Jinguang Wang
Yuexi Yin
Haifeng Sun
Qi Qi
Jingyu Wang
Zirui Zhuang
Tingting Yang
Jianxin Liao
46
2
0
27 Jun 2024
Selective Prompting Tuning for Personalized Conversations with LLMs
Qiushi Huang
Xubo Liu
Tom Ko
Bo Wu
Wenwu Wang
Yu Zhang
Lilian H. Y. Tang
46
5
0
26 Jun 2024
PharmaGPT: Domain-Specific Large Language Models for Bio-Pharmaceutical and Chemistry
Linqing Chen
Weilei Wang
Zilong Bai
Peng Xu
Yan Fang
...
Lisha Zhang
Fu Bian
Zhongkai Ye
Lidong Pei
Changyang Tu
AI4MH
LM&MA
53
2
0
26 Jun 2024
DKPROMPT: Domain Knowledge Prompting Vision-Language Models for Open-World Planning
Xiaohan Zhang
Zainab Altaweel
Yohei Hayamizu
Yan Ding
S. Amiri
Hao Yang
Andy Kaminski
Chad Esselink
Shiqi Zhang
VLM
LM&Ro
41
7
0
25 Jun 2024
Layer-Wise Quantization: A Pragmatic and Effective Method for Quantizing LLMs Beyond Integer Bit-Levels
Razvan-Gabriel Dumitru
Vikas Yadav
Rishabh Maheshwary
Paul-Ioan Clotan
Sathwik Tejaswi Madhusudhan
Mihai Surdeanu
MQ
46
2
0
25 Jun 2024
ARES: Alternating Reinforcement Learning and Supervised Fine-Tuning for Enhanced Multi-Modal Chain-of-Thought Reasoning Through Diverse AI Feedback
Ju-Seung Byun
Jiyun Chun
Jihyung Kil
Andrew Perrault
ReLM
LRM
43
2
0
25 Jun 2024
Leveraging LLMs for Dialogue Quality Measurement
Jinghan Jia
A. Komma
Timothy Leffel
Xujun Peng
Ajay Nagesh
Tamer Soliman
Aram Galstyan
Anoop Kumar
47
5
0
25 Jun 2024
OPT-Tree: Speculative Decoding with Adaptive Draft Tree Structure
Jikai Wang
Yi Su
Juntao Li
Qingrong Xia
Zi Ye
Xinyu Duan
Zhefeng Wang
Min Zhang
46
14
0
25 Jun 2024
Paraphrase and Aggregate with Large Language Models for Minimizing Intent Classification Errors
Vikas Yadav
Zheng Tang
Vijay Srinivasan
40
8
0
24 Jun 2024
MM-SpuBench: Towards Better Understanding of Spurious Biases in Multimodal LLMs
Wenqian Ye
Guangtao Zheng
Yunsheng Ma
Xu Cao
Bolin Lai
James M. Rehg
Aidong Zhang
37
10
0
24 Jun 2024
Evaluating the Quality of Hallucination Benchmarks for Large Vision-Language Models
Bei Yan
Jie Zhang
Zheng Yuan
Shiguang Shan
Xilin Chen
VLM
46
4
0
24 Jun 2024
Scaling Laws for Linear Complexity Language Models
Xuyang Shen
Dong Li
Ruitao Leng
Zhen Qin
Weigao Sun
Yiran Zhong
LRM
33
6
0
24 Jun 2024
LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training
Tong Zhu
Xiaoye Qu
Daize Dong
Jiacheng Ruan
Jingqi Tong
Conghui He
Yu Cheng
MoE
ALM
54
73
0
24 Jun 2024
Large Vocabulary Size Improves Large Language Models
Sho Takase
Ryokan Ri
Shun Kiyono
Takuya Kato
45
3
0
24 Jun 2024
Building on Efficient Foundations: Effectively Training LLMs with Structured Feedforward Layers
Xiuying Wei
Skander Moalla
Razvan Pascanu
Çağlar Gülçehre
34
0
0
24 Jun 2024
Compensate Quantization Errors: Make Weights Hierarchical to Compensate Each Other
Yifei Gao
Jie Ou
Lei Wang
Yuting Xiao
Zhiyuan Xiang
Ruiting Dai
Jun Cheng
MQ
36
3
0
24 Jun 2024
Previous
1
2
3
...
11
12
13
...
48
49
50
Next