2205.01068
Cited By
OPT: Open Pre-trained Transformer Language Models
2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
VLM
OSLM
AI4CE
Papers citing
"OPT: Open Pre-trained Transformer Language Models"
50 / 2,449 papers shown
Title
Investigating the Influence of Prompt-Specific Shortcuts in AI Generated Text Detection
Choonghyun Park
Sungmin Cho
Junyeob Kim
Youna Kim
Taeuk Kim
Hyunsoo Cho
Hwiyeol Jo
Sang-goo Lee
Kang Min Yoo
AAML
46
1
0
24 Jun 2024
M2Lingual: Enhancing Multilingual, Multi-Turn Instruction Alignment in Large Language Models
Rishabh Maheshwary
Vikas Yadav
Hoang Nguyen
Khyati Mahajan
Sathwik Tejaswi Madhusudhan
49
3
0
24 Jun 2024
Token-based Decision Criteria Are Suboptimal in In-context Learning
Hakaze Cho
Yoshihiro Sakai
Mariko Kato
Kenshiro Tanaka
Akira Ishii
Naoya Inoue
46
3
0
24 Jun 2024
ReCaLL: Membership Inference via Relative Conditional Log-Likelihoods
Roy Xie
Junlin Wang
Ruomin Huang
Minxing Zhang
Rong Ge
Jian Pei
Neil Zhenqiang Gong
Bhuwan Dhingra
MIALM
63
14
0
23 Jun 2024
Reading Is Believing: Revisiting Language Bottleneck Models for Image Classification
Honori Udo
Takafumi Koshinaka
VLM
43
0
0
22 Jun 2024
Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibration
Zhongzhi Yu
Zheng Wang
Yonggan Fu
Huihong Shi
Khalid Shaikh
Yingyan Celine Lin
51
21
0
22 Jun 2024
MetaGreen: Meta-Learning Inspired Transformer Selection for Green Semantic Communication
Shubhabrata Mukherjee
Cory Beard
Sejun Song
45
0
0
22 Jun 2024
DEM: Distribution Edited Model for Training with Mixed Data Distributions
Dhananjay Ram
Aditya Rawal
Momchil Hardalov
Nikolaos Pappas
Sheng Zha
MoMe
65
1
0
21 Jun 2024
MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression
Tianyu Fu
Haofeng Huang
Xuefei Ning
Genghan Zhang
Boju Chen
...
Shiyao Li
Shengen Yan
Guohao Dai
Huazhong Yang
Yu Wang
MQ
52
17
0
21 Jun 2024
Rethinking Pruning Large Language Models: Benefits and Pitfalls of Reconstruction Error Minimization
Sungbin Shin
Wonpyo Park
Jaeho Lee
Namhoon Lee
46
1
0
21 Jun 2024
PostMark: A Robust Blackbox Watermark for Large Language Models
Yapei Chang
Kalpesh Krishna
Amir Houmansadr
John Wieting
Mohit Iyyer
42
5
0
20 Jun 2024
Instruction Pre-Training: Language Models are Supervised Multitask Learners
Daixuan Cheng
Yuxian Gu
Shaohan Huang
Junyu Bi
Minlie Huang
Furu Wei
SyDa
67
20
0
20 Jun 2024
IWISDM: Assessing instruction following in multimodal models at scale
Xiaoxuan Lei
Lucas Gomez
Hao Yuan Bai
P. Bashivan
VLM
33
1
0
20 Jun 2024
Protecting Privacy Through Approximating Optimal Parameters for Sequence Unlearning in Language Models
Dohyun Lee
Daniel Rim
Minseok Choi
Jaegul Choo
PILM
MU
65
4
0
20 Jun 2024
LangTopo: Aligning Language Descriptions of Graphs with Tokenized Topological Modeling
Zhong Guan
Hongke Zhao
Likang Wu
Ming He
Jianpin Fan
40
3
0
19 Jun 2024
AdaMoE: Token-Adaptive Routing with Null Experts for Mixture-of-Experts Language Models
Zihao Zeng
Yibo Miao
Hongcheng Gao
Hao Zhang
Zhijie Deng
MoE
52
8
0
19 Jun 2024
BoA: Attention-aware Post-training Quantization without Backpropagation
Junhan Kim
Ho-Young Kim
Eulrang Cho
Chungman Lee
Joonyoung Kim
Yongkweon Jeon
MQ
38
0
0
19 Jun 2024
ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools
Team GLM:
Aohan Zeng
Bin Xu
Bowen Wang
...
Zhaoyu Wang
Zhen Yang
Zhengxiao Du
Zhenyu Hou
Zihan Wang
ALM
79
515
0
18 Jun 2024
FuseGen: PLM Fusion for Data-generation based Zero-shot Learning
Tianyuan Zou
Yang Liu
Ziwei Sun
Jianqing Zhang
Jingjing Liu
Ya-Qin Zhang
36
3
0
18 Jun 2024
Mixture of Scales: Memory-Efficient Token-Adaptive Binarization for Large Language Models
Dongwon Jo
Taesu Kim
Yulhwa Kim
Jae-Joon Kim
52
3
0
18 Jun 2024
Soft Prompting for Unlearning in Large Language Models
Karuna Bhaila
Minh-Hao Van
Xintao Wu
MU
KELM
38
4
0
17 Jun 2024
Prefixing Attention Sinks can Mitigate Activation Outliers for Large Language Model Quantization
Seungwoo Son
Wonpyo Park
Woohyun Han
Kyuyeun Kim
Jaeho Lee
MQ
37
10
0
17 Jun 2024
RepLiQA: A Question-Answering Dataset for Benchmarking LLMs on Unseen Reference Content
Joao Monteiro
Pierre-Andre Noel
Étienne Marcotte
Sai Rajeswar
Valentina Zantedeschi
David Vazquez
Nicolas Chapados
Christopher Pal
Perouz Taslakian
41
5
0
17 Jun 2024
Transcendence: Generative Models Can Outperform The Experts That Train Them
Edwin Zhang
Vincent Zhu
Naomi Saphra
Anat Kleiman
Benjamin L. Edelman
Milind Tambe
Sham Kakade
Eran Malach
40
10
0
17 Jun 2024
Endor: Hardware-Friendly Sparse Format for Offloaded LLM Inference
Donghyeon Joo
Ramyad Hadidi
S. Feizi
Bahar Asgari
MQ
39
0
0
17 Jun 2024
Towards an End-to-End Framework for Invasive Brain Signal Decoding with Large Language Models
Sheng Feng
Heyang Liu
Yu Wang
Yanfeng Wang
24
3
0
17 Jun 2024
BAMBINO-LM: (Bilingual-)Human-Inspired Continual Pretraining of BabyLM
Zhewen Shen
Aditya Joshi
Ruey-Cheng Chen
CLL
52
2
0
17 Jun 2024
An Empirical Investigation of Matrix Factorization Methods for Pre-trained Transformers
Ashim Gupta
Sina Mahdipour Saravani
P. Sadayappan
Vivek Srikumar
35
2
0
17 Jun 2024
Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with Instruction Tuning
Zebang Cheng
Zhi-Qi Cheng
Jun-Yan He
Jingdong Sun
Kai Wang
Yuxiang Lin
Zheng Lian
Xiaojiang Peng
Alexander G. Hauptmann
MLLM
40
31
0
17 Jun 2024
Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization
Wenkai Yang
Shiqi Shen
Guangyao Shen
Zhi Gong
Yankai Lin
Ji-Rong Wen
61
13
0
17 Jun 2024
MFC-Bench: Benchmarking Multimodal Fact-Checking with Large Vision-Language Models
Shengkang Wang
Hongzhan Lin
Ziyang Luo
Zhen Ye
Guang Chen
Jing Ma
68
3
0
17 Jun 2024
Taking a Deep Breath: Enhancing Language Modeling of Large Language Models with Sentinel Tokens
Weiyao Luo
Suncong Zheng
Heming Xia
Weikang Wang
Yan Lei
Tianyu Liu
Shuang Chen
Zhifang Sui
45
1
0
16 Jun 2024
Promoting Data and Model Privacy in Federated Learning through Quantized LoRA
Jianhao Zhu
Changze Lv
Xiaohua Wang
Muling Wu
Wenhao Liu
Tianlong Li
Zixuan Ling
Cenyuan Zhang
Xiaoqing Zheng
Xuanjing Huang
46
4
0
16 Jun 2024
Tender: Accelerating Large Language Models via Tensor Decomposition and Runtime Requantization
Jungi Lee
Wonbeom Lee
Jaewoong Sim
MQ
46
14
0
16 Jun 2024
ShareLoRA: Parameter Efficient and Robust Large Language Model Fine-tuning via Shared Low-Rank Adaptation
Yurun Song
Junchen Zhao
Ian G. Harris
Sangeetha Abdu Jyothi
32
3
0
16 Jun 2024
MIND: Multimodal Shopping Intention Distillation from Large Vision-language Models for E-commerce Purchase Understanding
Baixuan Xu
Weiqi Wang
Haochen Shi
Wenxuan Ding
Huihao Jing
Tianqing Fang
Jiaxin Bai
Long Chen
Yangqiu Song
44
10
0
15 Jun 2024
Improving Large Models with Small models: Lower Costs and Better Performance
Dong Chen
Shuo Zhang
Yueting Zhuang
Siliang Tang
Qidong Liu
Hua Wang
Mingliang Xu
45
4
0
15 Jun 2024
Applications of Generative AI in Healthcare: algorithmic, ethical, legal and societal considerations
Onyekachukwu R. Okonji
Kamol Yunusov
Bonnie Gordon
MedIm
46
3
0
15 Jun 2024
Evaluating the Generalization Ability of Quantized LLMs: Benchmark, Analysis, and Toolbox
Yijun Liu
Yuan Meng
Fang Wu
Shenhao Peng
Hang Yao
Chaoyu Guan
Chen Tang
Xinzhu Ma
Zhi Wang
Wenwu Zhu
MQ
62
7
0
15 Jun 2024
IntentionQA: A Benchmark for Evaluating Purchase Intention Comprehension Abilities of Language Models in E-commerce
Wenxuan Ding
Weiqi Wang
Sze Heng Douglas Kwok
Minghao Liu
Tianqing Fang
Jiaxin Bai
Junxian He
Yangqiu Song
RALM
44
8
0
14 Jun 2024
MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers
Yiwen Chen
Tong He
Di Huang
Weicai Ye
Sijin Chen
...
Zhongang Cai
Lei Yang
Gang Yu
Guosheng Lin
Chi Zhang
53
49
0
14 Jun 2024
QQQ: Quality Quattuor-Bit Quantization for Large Language Models
Ying Zhang
Peng Zhang
Mincong Huang
Jingyang Xiang
Yujie Wang
Chao Wang
Yineng Zhang
Lei Yu
Chuan Liu
Wei Lin
VLM
MQ
50
5
0
14 Jun 2024
GEB-1.3B: Open Lightweight Large Language Model
Jie Wu
Yufeng Zhu
Lei Shen
Xuqing Lu
ALM
37
0
0
14 Jun 2024
Pcc-tuning: Breaking the Contrastive Learning Ceiling in Semantic Textual Similarity
Bowen Zhang
Chunping Li
50
0
0
14 Jun 2024
Multi-Modal Retrieval For Large Language Model Based Speech Recognition
J. Kolehmainen
Aditya Gourav
Prashanth Gurunath Shivakumar
Yile Gu
Ankur Gandhe
Ariya Rastrow
Grant P. Strimel
I. Bulyko
40
4
0
13 Jun 2024
MLKV: Multi-Layer Key-Value Heads for Memory Efficient Transformer Decoding
Zayd Muhammad Kawakibi Zuhri
Muhammad Farid Adilazuarda
Ayu Purwarianti
Alham Fikri Aji
47
9
0
13 Jun 2024
Enhancing Cross-Modal Fine-Tuning with Gradually Intermediate Modality Generation
Lincan Cai
Shuang Li
Wenxuan Ma
Jingxuan Kang
Binhui Xie
Zixun Sun
Chengwei Zhu
MoE
MoMe
42
0
0
13 Jun 2024
Hierarchical Compression of Text-Rich Graphs via Large Language Models
Shichang Zhang
Da Zheng
Jiani Zhang
Qi Zhu
Xiang Song
Soji Adeshina
Christos Faloutsos
George Karypis
Yizhou Sun
VLM
33
1
0
13 Jun 2024
ProTrain: Efficient LLM Training via Memory-Aware Techniques
Hanmei Yang
Jin Zhou
Yao Fu
Xiaoqun Wang
Ramine Roane
Hui Guan
Tongping Liu
VLM
36
0
0
12 Jun 2024
A Concept-Based Explainability Framework for Large Multimodal Models
Jayneel Parekh
Pegah Khayatan
Mustafa Shukor
A. Newson
Matthieu Cord
40
16
0
12 Jun 2024