Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2205.01068
Cited By
OPT: Open Pre-trained Transformer Language Models
2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
VLM
OSLM
AI4CE
Re-assign community
ArXiv
PDF
HTML
Papers citing
"OPT: Open Pre-trained Transformer Language Models"
50 / 2,456 papers shown
Title
Ethos: Rectifying Language Models in Orthogonal Parameter Space
Lei Gao
Yue Niu
Tingting Tang
A. Avestimehr
Murali Annavaram
MU
40
10
0
13 Mar 2024
Do Language Models Care About Text Quality? Evaluating Web-Crawled Corpora Across 11 Languages
Rik van Noord
Taja Kuzman
Peter Rupnik
Nikola Ljubesic
Miquel Espla-Gomis
Gema Ramírez-Sánchez
Antonio Toral
ALM
40
2
0
13 Mar 2024
Mastering Text, Code and Math Simultaneously via Fusing Highly Specialized Language Models
Ning Ding
Yulin Chen
Ganqu Cui
Xingtai Lv
Weilin Zhao
Ruobing Xie
Bowen Zhou
Zhiyuan Liu
Maosong Sun
ALM
MoMe
AI4CE
40
7
0
13 Mar 2024
Learning to Watermark LLM-generated Text via Reinforcement Learning
Xiaojun Xu
Yuanshun Yao
Yang Liu
29
10
0
13 Mar 2024
MoleculeQA: A Dataset to Evaluate Factual Accuracy in Molecular Comprehension
Xingyu Lu
He Cao
Zijing Liu
Shengyuan Bai
Leqing Chen
Yuan Yao
Hai-Tao Zheng
Yu-Feng Li
HILM
26
7
0
13 Mar 2024
DialogGen: Multi-modal Interactive Dialogue System for Multi-turn Text-to-Image Generation
Minbin Huang
Yanxin Long
Xinchi Deng
Ruihang Chu
Jiangfeng Xiong
Xiaodan Liang
Hong Cheng
Qinglin Lu
Wei Liu
MLLM
EGVM
65
8
0
13 Mar 2024
CHAI: Clustered Head Attention for Efficient LLM Inference
Saurabh Agarwal
Bilge Acun
Basil Homer
Mostafa Elhoushi
Yejin Lee
Shivaram Venkataraman
Dimitris Papailiopoulos
Carole-Jean Wu
63
8
0
12 Mar 2024
Beyond Text: Frozen Large Language Models in Visual Signal Comprehension
Lei Zhu
Fangyun Wei
Yanye Lu
MLLM
VLM
54
18
0
12 Mar 2024
Rethinking Generative Large Language Model Evaluation for Semantic Comprehension
Fangyun Wei
Xi Chen
Linzi Luo
ELM
ALM
LRM
38
7
0
12 Mar 2024
Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM
Sainbayar Sukhbaatar
O. Yu. Golovneva
Vasu Sharma
Hu Xu
Xi Lin
...
Jacob Kahn
Shang-Wen Li
Wen-tau Yih
Jason Weston
Xian Li
MoMe
OffRL
MoE
45
62
0
12 Mar 2024
ORPO: Monolithic Preference Optimization without Reference Model
Jiwoo Hong
Noah Lee
James Thorne
OSLM
42
213
0
12 Mar 2024
Characterization of Large Language Model Development in the Datacenter
Qi Hu
Zhisheng Ye
Zerui Wang
Guoteng Wang
Mengdie Zhang
...
Dahua Lin
Xiaolin Wang
Yingwei Luo
Yonggang Wen
Tianwei Zhang
56
45
0
12 Mar 2024
LLMvsSmall Model? Large Language Model Based Text Augmentation Enhanced Personality Detection Model
Linmei Hu
Hongyu He
Duokang Wang
Ziwang Zhao
Yingxia Shao
Liqiang Nie
34
15
0
12 Mar 2024
MoAI: Mixture of All Intelligence for Large Language and Vision Models
Byung-Kwan Lee
Beomchan Park
Chae Won Kim
Yonghyun Ro
MLLM
VLM
50
20
0
12 Mar 2024
SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression
Xin Wang
Yu Zheng
Zhongwei Wan
Mi Zhang
MQ
57
44
0
12 Mar 2024
MEND: Meta dEmonstratioN Distillation for Efficient and Effective In-Context Learning
Yichuan Li
Xiyao Ma
Sixing Lu
Kyumin Lee
Xiaohu Liu
Chenlei Guo
29
6
0
11 Mar 2024
ConspEmoLLM: Conspiracy Theory Detection Using an Emotion-Based Large Language Model
Zhiwei Liu
Boyang Liu
Paul Thompson
Kailai Yang
Sophia Ananiadou
45
3
0
11 Mar 2024
ACT-MNMT Auto-Constriction Turning for Multilingual Neural Machine Translation
Shaojie Dai
Xin Liu
Ping Luo
Yue Yu
LRM
42
1
0
11 Mar 2024
Unsupervised Real-Time Hallucination Detection based on the Internal States of Large Language Models
Weihang Su
Changyue Wang
Qingyao Ai
Hu Yiran
Zhijing Wu
Yujia Zhou
Yiqun Liu
HILM
52
28
0
11 Mar 2024
What Makes Quantization for Large Language Models Hard? An Empirical Study from the Lens of Perturbation
Zhuocheng Gong
Jiahao Liu
Jingang Wang
Xunliang Cai
Dongyan Zhao
Rui Yan
MQ
35
8
0
11 Mar 2024
Mipha: A Comprehensive Overhaul of Multimodal Assistant with Small Language Models
Minjie Zhu
Yichen Zhu
Xin Liu
Ning Liu
Zhiyuan Xu
Yaxin Peng
Chaomin Shen
Zhicai Ou
Feifei Feng
Jian Tang
VLM
57
20
0
10 Mar 2024
Tuning-Free Accountable Intervention for LLM Deployment -- A Metacognitive Approach
Zhen Tan
Jie Peng
Tianlong Chen
Huan Liu
37
6
0
08 Mar 2024
GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM
Hao Kang
Qingru Zhang
Souvik Kundu
Geonhwa Jeong
Zaoxing Liu
Tushar Krishna
Tuo Zhao
MQ
49
79
0
08 Mar 2024
Multimodal Infusion Tuning for Large Models
Hao Sun
Yu Song
Xinyao Yu
Jiaqing Liu
Yen-Wei Chen
Lanfen Lin
VLM
40
0
0
08 Mar 2024
MEIT: Multi-Modal Electrocardiogram Instruction Tuning on Large Language Models for Report Generation
Zhongwei Wan
Che Liu
Xin Wang
Chaofan Tao
Hui Shen
Zhenwu Peng
Jie Fu
Rossella Arcucci
Huaxiu Yao
Mi Zhang
57
7
0
07 Mar 2024
QAQ: Quality Adaptive Quantization for LLM KV Cache
Shichen Dong
Wenfang Cheng
Jiayu Qin
Wei Wang
MQ
51
34
0
07 Mar 2024
Backtracing: Retrieving the Cause of the Query
Rose E. Wang
Pawan Wirawarn
Omar Khattab
Noah D. Goodman
Dorottya Demszky
50
1
0
06 Mar 2024
SaulLM-7B: A pioneering Large Language Model for Law
Pierre Colombo
T. Pires
Malik Boudiaf
Dominic Culver
Rui Melo
...
Andre F. T. Martins
Fabrizio Esposito
Vera Lúcia Raposo
Sofia Morgado
Michael Desa
ELM
AILaw
54
66
0
06 Mar 2024
Popeye: A Unified Visual-Language Model for Multi-Source Ship Detection from Remote Sensing Imagery
Wei Zhang
Miaoxin Cai
Tong Zhang
Guoqiang Lei
Zhuang Yin
Xuerui Mao
35
7
0
06 Mar 2024
MeaCap: Memory-Augmented Zero-shot Image Captioning
Zequn Zeng
Yan Xie
Hao Zhang
Chiyu Chen
Zhengjue Wang
Boli Chen
VLM
43
14
0
06 Mar 2024
General2Specialized LLMs Translation for E-commerce
Kaidi Chen
Ben Chen
Dehong Gao
Huangyu Dai
Wen Jiang
Wei Ning
Shanqing Yu
Libin Yang
Xiaoyan Cai
15
8
0
06 Mar 2024
WaterMax: breaking the LLM watermark detectability-robustness-quality trade-off
Eva Giboulot
Furon Teddy
WaLM
45
13
0
06 Mar 2024
EasyQuant: An Efficient Data-free Quantization Algorithm for LLMs
Hanlin Tang
Yifu Sun
Decheng Wu
Kai Liu
Jianchen Zhu
Zhanhui Kang
MQ
28
11
0
05 Mar 2024
Balancing Enhancement, Harmlessness, and General Capabilities: Enhancing Conversational LLMs with Direct RLHF
Chen Zheng
Ke Sun
Hang Wu
Chenguang Xi
Xun Zhou
60
12
0
04 Mar 2024
RIFF: Learning to Rephrase Inputs for Few-shot Fine-tuning of Language Models
Saeed Najafi
Alona Fyshe
37
1
0
04 Mar 2024
Not All Layers of LLMs Are Necessary During Inference
Siqi Fan
Xin Jiang
Xiang Li
Xuying Meng
Peng Han
Shuo Shang
Aixin Sun
Yequan Wang
Zhongyuan Wang
51
33
0
04 Mar 2024
Differentially Private Synthetic Data via Foundation Model APIs 2: Text
Chulin Xie
Zinan Lin
A. Backurs
Sivakanth Gopi
Da Yu
...
Haotian Jiang
Huishuai Zhang
Yin Tat Lee
Bo Li
Sergey Yekhanin
SyDa
63
34
0
04 Mar 2024
On the Compressibility of Quantized Large Language Models
Yu Mao
Weilan Wang
Hongchao Du
Nan Guan
Chun Jason Xue
MQ
36
6
0
03 Mar 2024
Automatic Question-Answer Generation for Long-Tail Knowledge
Rohan Kumar
Youngmin Kim
Sunitha Ravi
Haitian Sun
Christos Faloutsos
Ruslan Salakhutdinov
Minji Yoon
25
8
0
03 Mar 2024
OSSCAR: One-Shot Structured Pruning in Vision and Language Models with Combinatorial Optimization
Xiang Meng
Shibal Ibrahim
Kayhan Behdin
Hussein Hazimeh
Natalia Ponomareva
Rahul Mazumder
VLM
49
5
0
02 Mar 2024
Dissecting Language Models: Machine Unlearning via Selective Pruning
Nicholas Pochinkov
Nandi Schoots
MILM
MU
26
16
0
02 Mar 2024
HeteGen: Heterogeneous Parallel Inference for Large Language Models on Resource-Constrained Devices
Xuanlei Zhao
Bin Jia
Hao Zhou
Ziming Liu
Shenggan Cheng
Yang You
34
4
0
02 Mar 2024
LLM-PQ: Serving LLM on Heterogeneous Clusters with Phase-Aware Partition and Adaptive Quantization
Juntao Zhao
Borui Wan
Size Zheng
Yanghua Peng
Chuan Wu
MQ
49
13
0
02 Mar 2024
MediSwift: Efficient Sparse Pre-trained Biomedical Language Models
Vithursan Thangarasa
Mahmoud Salem
Shreyas Saxena
Kevin Leong
Joel Hestness
Sean Lie
MedIm
40
1
0
01 Mar 2024
Artwork Explanation in Large-scale Vision Language Models
Kazuki Hayashi
Yusuke Sakai
Hidetaka Kamigaito
Katsuhiko Hayashi
Taro Watanabe
23
0
0
29 Feb 2024
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers
Tsai-Shien Chen
Aliaksandr Siarohin
Willi Menapace
Ekaterina Deyneka
Hsiang-wei Chao
...
Yuwei Fang
Hsin-Ying Lee
Jian Ren
Ming-Hsuan Yang
Sergey Tulyakov
VGen
89
180
0
29 Feb 2024
On the Scaling Laws of Geographical Representation in Language Models
Nathan Godey
Eric Villemonte de la Clergerie
Benoît Sagot
51
6
0
29 Feb 2024
Batch size invariant Adam
Xi Wang
Laurence Aitchison
46
2
0
29 Feb 2024
FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning
Xupeng Miao
Gabriele Oliaro
Xinhao Cheng
Vineeth Kada
Ruohan Gao
...
April Yang
Yingcheng Wang
Mengdi Wu
Colin Unger
Zhihao Jia
MoE
94
9
0
29 Feb 2024
Rethinking the Bounds of LLM Reasoning: Are Multi-Agent Discussions the Key?
Qineng Wang
Zihao Wang
Ying Su
Hanghang Tong
Yangqiu Song
LLMAG
LRM
43
64
0
28 Feb 2024
Previous
1
2
3
...
19
20
21
...
48
49
50
Next