ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2205.01068
  4. Cited By
OPT: Open Pre-trained Transformer Language Models

OPT: Open Pre-trained Transformer Language Models

2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
    VLM
    OSLM
    AI4CE
ArXivPDFHTML

Papers citing "OPT: Open Pre-trained Transformer Language Models"

50 / 2,447 papers shown
Title
How Many Bytes Can You Take Out Of Brain-To-Text Decoding?
How Many Bytes Can You Take Out Of Brain-To-Text Decoding?
Richard Antonello
Nihita Sarma
Jerry Tang
Jiaru Song
Alexander G. Huth
43
1
0
22 May 2024
Dense Connector for MLLMs
Dense Connector for MLLMs
Huanjin Yao
Wenhao Wu
Taojiannan Yang
Yuxin Song
Mengxi Zhang
Haocheng Feng
Yifan Sun
Zhiheng Li
Wanli Ouyang
Jingdong Wang
MLLM
VLM
42
18
0
22 May 2024
AdpQ: A Zero-shot Calibration Free Adaptive Post Training Quantization
  Method for LLMs
AdpQ: A Zero-shot Calibration Free Adaptive Post Training Quantization Method for LLMs
Alireza Ghaffari
Sharareh Younesian
Vahid Partovi Nia
Boxing Chen
M. Asgharian
MQ
55
0
0
22 May 2024
How to set AdamW's weight decay as you scale model and dataset size
How to set AdamW's weight decay as you scale model and dataset size
Xi Wang
Laurence Aitchison
46
10
0
22 May 2024
Large Language Models Meet NLP: A Survey
Large Language Models Meet NLP: A Survey
Libo Qin
Qiguang Chen
Xiachong Feng
Yang Wu
Yongheng Zhang
Hai-Tao Zheng
Min Li
Wanxiang Che
Philip S. Yu
ALM
LM&MA
ELM
LRM
52
49
0
21 May 2024
Unlocking Data-free Low-bit Quantization with Matrix Decomposition for
  KV Cache Compression
Unlocking Data-free Low-bit Quantization with Matrix Decomposition for KV Cache Compression
Peiyu Liu
Zeming Gao
Wayne Xin Zhao
Yipeng Ma
Tao Wang
Ji-Rong Wen
MQ
37
5
0
21 May 2024
Context-Enhanced Video Moment Retrieval with Large Language Models
Context-Enhanced Video Moment Retrieval with Large Language Models
Weijia Liu
Bo Miao
Jiuxin Cao
Xueling Zhu
Bo Liu
Mehwish Nasim
Ajmal Mian
52
2
0
21 May 2024
PyramidInfer: Pyramid KV Cache Compression for High-throughput LLM
  Inference
PyramidInfer: Pyramid KV Cache Compression for High-throughput LLM Inference
Dongjie Yang
Xiaodong Han
Yan Gao
Yao Hu
Shilin Zhang
Hai Zhao
41
53
0
21 May 2024
Sparse Autoencoders Enable Scalable and Reliable Circuit Identification
  in Language Models
Sparse Autoencoders Enable Scalable and Reliable Circuit Identification in Language Models
Charles OÑeill
Thang Bui
43
5
0
21 May 2024
Data Contamination Calibration for Black-box LLMs
Data Contamination Calibration for Black-box LLMs
Wen-song Ye
Jiaqi Hu
Liyao Li
Haobo Wang
Gang Chen
Junbo Zhao
40
7
0
20 May 2024
Unveiling and Manipulating Prompt Influence in Large Language Models
Unveiling and Manipulating Prompt Influence in Large Language Models
Zijian Feng
Hanzhang Zhou
Zixiao Zhu
Junlang Qian
Kezhi Mao
45
2
0
20 May 2024
Quantifying In-Context Reasoning Effects and Memorization Effects in
  LLMs
Quantifying In-Context Reasoning Effects and Memorization Effects in LLMs
Siyu Lou
Yuntian Chen
Xiaodan Liang
Liang Lin
Quanshi Zhang
45
2
0
20 May 2024
Efficient Prompt Tuning by Multi-Space Projection and Prompt Fusion
Efficient Prompt Tuning by Multi-Space Projection and Prompt Fusion
Pengxiang Lan
Enneng Yang
Yuting Liu
Guibing Guo
Linying Jiang
Jianzhe Zhao
Xingwei Wang
VLM
AAML
41
1
0
19 May 2024
TriLoRA: Integrating SVD for Advanced Style Personalization in
  Text-to-Image Generation
TriLoRA: Integrating SVD for Advanced Style Personalization in Text-to-Image Generation
Chengcheng Feng
Mu He
Qiuyu Tian
Haojie Yin
Xiaofang Zhao
Hongwei Tang
Xingqiang Wei
DiffM
38
3
0
18 May 2024
The Future of Large Language Model Pre-training is Federated
The Future of Large Language Model Pre-training is Federated
Lorenzo Sani
Alexandru Iacob
Zeyu Cao
Bill Marino
Yan Gao
...
Wanru Zhao
William F. Shen
Preslav Aleksandrov
Xinchi Qiu
Nicholas D. Lane
AI4CE
39
13
0
17 May 2024
Lean Attention: Hardware-Aware Scalable Attention Mechanism for the Decode-Phase of Transformers
Lean Attention: Hardware-Aware Scalable Attention Mechanism for the Decode-Phase of Transformers
Rya Sanovar
Srikant Bharadwaj
Renée St. Amant
Victor Rühle
Saravan Rajmohan
61
6
0
17 May 2024
Conformal Alignment: Knowing When to Trust Foundation Models with
  Guarantees
Conformal Alignment: Knowing When to Trust Foundation Models with Guarantees
Yu Gui
Ying Jin
Zhimei Ren
MedIm
40
18
0
16 May 2024
A Systematic Evaluation of Large Language Models for Natural Language
  Generation Tasks
A Systematic Evaluation of Large Language Models for Natural Language Generation Tasks
Xuanfan Ni
Piji Li
ELM
LRM
34
8
0
16 May 2024
MarkLLM: An Open-Source Toolkit for LLM Watermarking
MarkLLM: An Open-Source Toolkit for LLM Watermarking
Leyi Pan
Aiwei Liu
Zhiwei He
Zitian Gao
Xuandong Zhao
...
Shuliang Liu
Xuming Hu
Lijie Wen
Irwin King
Philip S. Yu
49
29
0
16 May 2024
A Robust Autoencoder Ensemble-Based Approach for Anomaly Detection in
  Text
A Robust Autoencoder Ensemble-Based Approach for Anomaly Detection in Text
Jeremie Pantin
Christophe Marsala
29
0
0
16 May 2024
IGOT: Information Gain Optimized Tokenizer on Domain Adaptive
  Pretraining
IGOT: Information Gain Optimized Tokenizer on Domain Adaptive Pretraining
Dawei Feng
Yihai Zhang
Zhixuan Xu
SyDa
35
0
0
16 May 2024
A Survey on Transformers in NLP with Focus on Efficiency
A Survey on Transformers in NLP with Focus on Efficiency
Wazib Ansar
Saptarsi Goswami
Amlan Chakrabarti
MedIm
40
2
0
15 May 2024
Xmodel-VLM: A Simple Baseline for Multimodal Vision Language Model
Xmodel-VLM: A Simple Baseline for Multimodal Vision Language Model
Wanting Xu
Yang Liu
Langping He
Xucheng Huang
Ling Jiang
VLM
MLLM
43
2
0
15 May 2024
FreeVA: Offline MLLM as Training-Free Video Assistant
FreeVA: Offline MLLM as Training-Free Video Assistant
Wenhao Wu
VLM
OffRL
40
20
0
13 May 2024
EMS-SD: Efficient Multi-sample Speculative Decoding for Accelerating
  Large Language Models
EMS-SD: Efficient Multi-sample Speculative Decoding for Accelerating Large Language Models
Yunsheng Ni
Chuanjian Liu
Yehui Tang
Kai Han
Yunhe Wang
31
0
0
13 May 2024
PLeak: Prompt Leaking Attacks against Large Language Model Applications
PLeak: Prompt Leaking Attacks against Large Language Model Applications
Bo Hui
Haolin Yuan
Neil Zhenqiang Gong
Philippe Burlina
Yinzhi Cao
LLMAG
AAML
SILM
39
36
0
10 May 2024
OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage
  Pruning
OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning
Dan Qiao
Yi Su
Pinzheng Wang
Jing Ye
Wen Xie
...
Wenliang Chen
Guohong Fu
Guodong Zhou
Qiaoming Zhu
Min Zhang
MQ
40
0
0
09 May 2024
LLMC: Benchmarking Large Language Model Quantization with a Versatile
  Compression Toolkit
LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit
Ruihao Gong
Yang Yong
Shiqiao Gu
Yushi Huang
Chentao Lv
Yunchen Zhang
Xianglong Liu
Dacheng Tao
MQ
42
7
0
09 May 2024
KV-Runahead: Scalable Causal LLM Inference by Parallel Key-Value Cache
  Generation
KV-Runahead: Scalable Causal LLM Inference by Parallel Key-Value Cache Generation
Minsik Cho
Mohammad Rastegari
Devang Naik
32
4
0
08 May 2024
DALK: Dynamic Co-Augmentation of LLMs and KG to answer Alzheimer's
  Disease Questions with Scientific Literature
DALK: Dynamic Co-Augmentation of LLMs and KG to answer Alzheimer's Disease Questions with Scientific Literature
Dawei Li
Shu Yang
Zhen Tan
Jae Young Baik
Sunkwon Yun
...
D. Duong-Tran
Ying Ding
Huan Liu
Li Shen
Tianlong Chen
59
34
0
08 May 2024
Bridging the Bosphorus: Advancing Turkish Large Language Models through
  Strategies for Low-Resource Language Adaptation and Benchmarking
Bridging the Bosphorus: Advancing Turkish Large Language Models through Strategies for Low-Resource Language Adaptation and Benchmarking
Emre Can Acikgoz
Mete Erdogan
Deniz Yuret
39
7
0
07 May 2024
SUTRA: Scalable Multilingual Language Model Architecture
SUTRA: Scalable Multilingual Language Model Architecture
Abhijit Bendale
Michael Sapienza
Steven Ripplinger
Simon Gibbs
Jaewon Lee
Pranav Mistry
LRM
ELM
36
4
0
07 May 2024
Optimizing Language Model's Reasoning Abilities with Weak Supervision
Optimizing Language Model's Reasoning Abilities with Weak Supervision
Yongqi Tong
Sizhe Wang
Dawei Li
Yifan Wang
Simeng Han
Zi Lin
Chengsong Huang
Jiaxin Huang
Jingbo Shang
LRM
ReLM
42
8
0
07 May 2024
KV Cache is 1 Bit Per Channel: Efficient Large Language Model Inference
  with Coupled Quantization
KV Cache is 1 Bit Per Channel: Efficient Large Language Model Inference with Coupled Quantization
Tianyi Zhang
Jonah Yi
Zhaozhuo Xu
Anshumali Shrivastava
MQ
29
26
0
07 May 2024
Who Wrote This? The Key to Zero-Shot LLM-Generated Text Detection Is GECScore
Who Wrote This? The Key to Zero-Shot LLM-Generated Text Detection Is GECScore
Junchao Wu
Runzhe Zhan
Derek F. Wong
Shu Yang
Xuebo Liu
Lidia S. Chao
Min Zhang
DeLMO
46
4
0
07 May 2024
Self-Improving Customer Review Response Generation Based on LLMs
Self-Improving Customer Review Response Generation Based on LLMs
Guy Azov
Tatiana Pelc
Adi Fledel Alon
Gila Kamhi
40
0
0
06 May 2024
A Controlled Experiment on the Energy Efficiency of the Source Code
  Generated by Code Llama
A Controlled Experiment on the Energy Efficiency of the Source Code Generated by Code Llama
Vlad-Andrei Cursaru
Laura Duits
Joel Milligan
Damla Ural
Berta Rodriguez Sanchez
Vincenzo Stoico
I. Malavolta
32
3
0
06 May 2024
Learning from Students: Applying t-Distributions to Explore Accurate and
  Efficient Formats for LLMs
Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs
Jordan Dotzel
Yuzong Chen
Bahaa Kotb
Sushma Prasad
Gang Wu
Sheng Li
Mohamed S. Abdelfattah
Zhiru Zhang
31
9
0
06 May 2024
Compressing Long Context for Enhancing RAG with AMR-based Concept
  Distillation
Compressing Long Context for Enhancing RAG with AMR-based Concept Distillation
Kaize Shi
Xueyao Sun
Qing Li
Guandong Xu
51
13
0
06 May 2024
Assessing Adversarial Robustness of Large Language Models: An Empirical
  Study
Assessing Adversarial Robustness of Large Language Models: An Empirical Study
Zeyu Yang
Zhao Meng
Xiaochen Zheng
Roger Wattenhofer
ELM
AAML
31
7
0
04 May 2024
Enhancing Contextual Understanding in Large Language Models through
  Contrastive Decoding
Enhancing Contextual Understanding in Large Language Models through Contrastive Decoding
Zheng Zhao
Emilio Monti
Jens Lehmann
H. Assem
42
24
0
04 May 2024
Random Masking Finds Winning Tickets for Parameter Efficient Fine-tuning
Random Masking Finds Winning Tickets for Parameter Efficient Fine-tuning
Jing Xu
Jingzhao Zhang
39
7
0
04 May 2024
PICLe: Eliciting Diverse Behaviors from Large Language Models with
  Persona In-Context Learning
PICLe: Eliciting Diverse Behaviors from Large Language Models with Persona In-Context Learning
Hyeong Kyu Choi
Yixuan Li
69
17
0
03 May 2024
A Survey of Time Series Foundation Models: Generalizing Time Series
  Representation with Large Language Model
A Survey of Time Series Foundation Models: Generalizing Time Series Representation with Large Language Model
Weiqi Zhang
Jiexia Ye
Ke Yi
Yongzi Yu
Ziyue Li
Jia Li
Fugee Tsung
AI4TS
AI4CE
49
22
0
03 May 2024
D2PO: Discriminator-Guided DPO with Response Evaluation Models
D2PO: Discriminator-Guided DPO with Response Evaluation Models
Prasann Singhal
Nathan Lambert
S. Niekum
Tanya Goyal
Greg Durrett
OffRL
EGVM
48
4
0
02 May 2024
When Quantization Affects Confidence of Large Language Models?
When Quantization Affects Confidence of Large Language Models?
Irina Proskurina
Luc Brun
Guillaume Metzler
Julien Velcin
MQ
47
2
0
01 May 2024
RAG and RAU: A Survey on Retrieval-Augmented Language Model in Natural
  Language Processing
RAG and RAU: A Survey on Retrieval-Augmented Language Model in Natural Language Processing
Yucheng Hu
Yuxing Lu
RALM
60
18
0
30 Apr 2024
AppPoet: Large Language Model based Android malware detection via
  multi-view prompt engineering
AppPoet: Large Language Model based Android malware detection via multi-view prompt engineering
Wenxiang Zhao
Juntao Wu
Zhaoyi Meng
AAML
37
11
0
29 Apr 2024
Time Machine GPT
Time Machine GPT
Felix Drinkall
Eghbal Rahimikia
J. Pierrehumbert
Stefan Zohren
AI4TS
AI4CE
KELM
SyDa
44
3
0
29 Apr 2024
Ethical Reasoning and Moral Value Alignment of LLMs Depend on the
  Language we Prompt them in
Ethical Reasoning and Moral Value Alignment of LLMs Depend on the Language we Prompt them in
Utkarsh Agarwal
Kumar Tanmay
Aditi Khandelwal
Monojit Choudhury
LRM
31
7
0
29 Apr 2024
Previous
123...151617...474849
Next