ResearchTrend.AI
OPT: Open Pre-trained Transformer Language Models

2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
    VLM
    OSLM
    AI4CE

Papers citing "OPT: Open Pre-trained Transformer Language Models"

50 / 2,434 papers shown
MixPE: Quantization and Hardware Co-design for Efficient LLM Inference
Yu Zhang
Ming Wang
Lancheng Zou
Wulong Liu
Hui-Ling Zhen
M. Yuan
Bei Yu
MQ
79
1
0
25 Nov 2024
Anda: Unlocking Efficient LLM Inference with a Variable-Length Grouped Activation Data Format
Chao Fang
Man Shi
Robin Geens
Arne Symons
Zhongfeng Wang
Marian Verhelst
76
0
0
24 Nov 2024
State-Space Large Audio Language Models
Saurabhchand Bhati
Yuan Gong
Leonid Karlinsky
Hilde Kuehne
Rogerio Feris
James Glass
99
0
0
24 Nov 2024
Lifelong Knowledge Editing for Vision Language Models with Low-Rank Mixture-of-Experts
Qizhou Chen
Chengyu Wang
Dakan Wang
Taolin Zhang
Wangyue Li
Xiaofeng He
KELM
83
1
0
23 Nov 2024
FuseGPT: Learnable Layers Fusion of Generative Pre-trained Transformers
Zehua Pei
Hui-Ling Zhen
Xianzhi Yu
Sinno Jialin Pan
M. Yuan
Bei Yu
AI4CE
89
0
0
21 Nov 2024
Star-Agents: Automatic Data Optimization with LLM Agents for Instruction Tuning
Hang Zhou
Yehui Tang
Haochen Qin
Yujie Yang
Renren Jin
Deyi Xiong
Kai Han
Yunhe Wang
65
2
0
21 Nov 2024
Hymba: A Hybrid-head Architecture for Small Language Models
Xin Dong
Y. Fu
Shizhe Diao
Wonmin Byeon
Zijia Chen
...
Min-Hung Chen
Yoshi Suhara
Y. Lin
Jan Kautz
Pavlo Molchanov
Mamba
102
21
0
20 Nov 2024
WaterPark: A Robustness Assessment of Language Model Watermarking
Jiacheng Liang
Zian Wang
Lauren Hong
Shouling Ji
Ting Wang
AAML
106
0
0
20 Nov 2024
AIDBench: A benchmark for evaluating the authorship identification capability of large language models
Zichen Wen
Dadi Guo
Huishuai Zhang
79
0
0
20 Nov 2024
Hardware Scaling Trends and Diminishing Returns in Large-Scale Distributed Training
Jared Fernandez
Luca Wehrstedt
Leonid Shamis
Mostafa Elhoushi
Kalyan Saladi
Yonatan Bisk
Emma Strubell
Jacob Kahn
254
3
0
20 Nov 2024
Bi-Mamba: Towards Accurate 1-Bit State Space Models
Shengkun Tang
Liqun Ma
Yiming Li
Mingjie Sun
Zhiqiang Shen
Mamba
81
3
0
18 Nov 2024
BitMoD: Bit-serial Mixture-of-Datatype LLM Acceleration
Yuzong Chen
Ahmed F. AbouElhamayed
Xilai Dai
Yang Wang
Marta Andronic
George A. Constantinides
Mohamed S. Abdelfattah
MQ
108
1
0
18 Nov 2024
SEFD: Semantic-Enhanced Framework for Detecting LLM-Generated Text
Weiqing He
Bojian Hou
Tianqi Shang
Davoud Ataee Tarzanagh
Qi Long
Li Shen
DeLMO
85
0
0
17 Nov 2024
AmoebaLLM: Constructing Any-Shape Large Language Models for Efficient and Instant Deployment
Y. Fu
Zhongzhi Yu
Junwei Li
Jiayi Qian
Yongan Zhang
Xiangchi Yuan
Dachuan Shi
Roman Yakunin
Y. Lin
43
2
0
15 Nov 2024
Xmodel-1.5: An 1B-scale Multilingual LLM
Wang Qun
Liu Yang
Lin Qingquan
Jiang Ling
LRM
44
0
0
15 Nov 2024
SmartInv: Multimodal Learning for Smart Contract Invariant Inference
Sally Junsong Wang
Kexin Pei
Junfeng Yang
64
12
0
14 Nov 2024
ClevrSkills: Compositional Language and Visual Reasoning in Robotics
Sanjay Haresh
Daniel Dijkman
Apratim Bhattacharyya
Roland Memisevic
CoGe
LRM
42
1
0
13 Nov 2024
NavAgent: Multi-scale Urban Street View Fusion For UAV Embodied Vision-and-Language Navigation
Youzhi Liu
Fanglong Yao
Yuanchang Yue
Guangluan Xu
Xian Sun
Kun Fu
LM&Ro
42
3
0
13 Nov 2024
New Emerged Security and Privacy of Pre-trained Model: a Survey and Outlook
Meng Yang
Tianqing Zhu
Chi Liu
Wanlei Zhou
Shui Yu
Philip S. Yu
AAML
ELM
PILM
64
1
0
12 Nov 2024
Beyond the Safety Bundle: Auditing the Helpful and Harmless Dataset
Khaoula Chehbouni
Jonathan Colaço-Carr
Yash More
Jackie CK Cheung
G. Farnadi
78
0
0
12 Nov 2024
Efficient Adaptive Optimization via Subset-Norm and Subspace-Momentum: Fast, Memory-Reduced Training with Convergence Guarantees
T. Nguyen
Huy Le Nguyen
ODL
33
0
0
11 Nov 2024
Zeroth-Order Adaptive Neuron Alignment Based Pruning without Re-Training
Elia Cunegatti
Leonardo Lucio Custode
Giovanni Iacca
52
0
0
11 Nov 2024
Model Fusion through Bayesian Optimization in Language Model Fine-Tuning
Chaeyun Jang
Hyungi Lee
Jungtaek Kim
Juho Lee
MoMe
53
0
0
11 Nov 2024
CityGuessr: City-Level Video Geo-Localization on a Global Scale
P. Kulkarni
Gaurav Kumar Nayak
Mubarak Shah
ViT
AI4TS
29
2
0
10 Nov 2024
Towards Low-Resource Harmful Meme Detection with LMM Agents
Jianzhao Huang
Hongzhan Lin
Ziyan Liu
Ziyang Luo
Guang Chen
Jing Ma
38
2
0
08 Nov 2024
Scaling Laws for Precision
Tanishq Kumar
Zachary Ankner
Benjamin Spector
Blake Bordelon
Niklas Muennighoff
Mansheej Paul
Cengiz Pehlevan
Christopher Ré
Aditi Raghunathan
AIFin
MoMe
54
14
0
07 Nov 2024
Prompt-Guided Internal States for Hallucination Detection of Large Language Models
Fujie Zhang
Peiqi Yu
Biao Yi
Baolei Zhang
Tong Li
Zheli Liu
HILM
LRM
57
0
0
07 Nov 2024
Interactions Across Blocks in Post-Training Quantization of Large Language Models
Khasmamad Shabanovi
Lukas Wiest
Vladimir Golkov
Daniel Cremers
Thomas Pfeil
MQ
33
1
0
06 Nov 2024
Crystal: Illuminating LLM Abilities on Language and Code
Tianhua Tao
Junbo Li
Bowen Tan
Hongyi Wang
William Marshall
...
Joel Hestness
Natalia Vassilieva
Zhiqiang Shen
Eric P. Xing
Zhengzhong Liu
52
4
0
06 Nov 2024
Understanding the Effects of Human-written Paraphrases in LLM-generated Text Detection
Hiu Ting Lau
Arkaitz Zubiaga
DeLMO
47
1
0
06 Nov 2024
No Culture Left Behind: ArtELingo-28, a Benchmark of WikiArt with Captions in 28 Languages
Youssef Mohamed
Runjia Li
Ibrahim Said Ahmad
Kilichbek Haydarov
Philip Torr
Kenneth Church
Mohamed Elhoseiny
VLM
38
7
0
06 Nov 2024
KptLLM: Unveiling the Power of Large Language Model for Keypoint Comprehension
Jie-jin Yang
Wang Zeng
Sheng Jin
Lumin Xu
Wentao Liu
Chen Qian
Ruimao Zhang
MLLM
65
2
0
04 Nov 2024
What Goes Into a LM Acceptability Judgment? Rethinking the Impact of Frequency and Length
Lindia Tjuatja
Graham Neubig
Tal Linzen
Sophie Hao
47
1
0
04 Nov 2024
Explaining and Improving Contrastive Decoding by Extrapolating the Probabilities of a Huge and Hypothetical LM
Haw-Shiuan Chang
Nanyun Peng
Mohit Bansal
Anil Ramakrishna
Tagyoung Chung
39
2
0
03 Nov 2024
Privacy Risks of Speculative Decoding in Large Language Models
Jiankun Wei
Abdulrahman Abdulrazzag
Tianchen Zhang
Adel Muursepp
Gururaj Saileshwar
35
2
0
01 Nov 2024
SimpleFSDP: Simpler Fully Sharded Data Parallel with torch.compile
Ruisi Zhang
Tianyu Liu
Will Feng
Andrew Gu
Sanket Purandare
Wanchao Liang
Francisco Massa
31
1
0
01 Nov 2024
LLM-Inference-Bench: Inference Benchmarking of Large Language Models on AI Accelerators
Krishna Teja Chitty-Venkata
Siddhisanket Raskar
B. Kale
Farah Ferdaus
Aditya Tanikanti
Ken Raffenetti
Valerie Taylor
M. Emani
V. Vishwanath
49
7
0
31 Oct 2024
ALISE: Accelerating Large Language Model Serving with Speculative Scheduling
Youpeng Zhao
Jun Wang
37
0
0
31 Oct 2024
Tiny Transformers Excel at Sentence Compression
Peter Belcak
Roger Wattenhofer
40
0
0
30 Oct 2024
A Comprehensive Study on Quantization Techniques for Large Language Models
Jiedong Lang
Zhehao Guo
Shuyu Huang
MQ
44
10
0
30 Oct 2024
TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters
Haiyang Wang
Yue Fan
Muhammad Ferjad Naeem
Yongqin Xian
J. E. Lenssen
Liwei Wang
F. Tombari
Bernt Schiele
49
2
0
30 Oct 2024
Improving Performance of Commercially Available AI Products in a Multi-Agent Configuration
Cory Hymel
Sida Peng
Kevin Xu
Charath Ranganathan
LLMAG
31
0
0
29 Oct 2024
Enhancing Adversarial Attacks through Chain of Thought
Jingbo Su
LRM
31
2
0
29 Oct 2024
ProMoE: Fast MoE-based LLM Serving using Proactive Caching
Xiaoniu Song
Zihang Zhong
Rong Chen
Haibo Chen
MoE
65
4
0
29 Oct 2024
SocialGPT: Prompting LLMs for Social Relation Reasoning via Greedy Segment Optimization
Wanhua Li
Zibin Meng
Jiawei Zhou
D. Wei
Chuang Gan
Hanspeter Pfister
LRM
VLM
29
5
0
28 Oct 2024
LLMCBench: Benchmarking Large Language Model Compression for Efficient Deployment
Ge Yang
Changyi He
J. Guo
Jianyu Wu
Yifu Ding
Aishan Liu
Haotong Qin
Pengliang Ji
Xianglong Liu
MQ
33
4
0
28 Oct 2024
FACTS: A Factored State-Space Framework For World Modelling
Li Nanbo
Firas Laakom
Yucheng Xu
Wenyi Wang
Jürgen Schmidhuber
AI4TS
205
0
0
28 Oct 2024
Dynamic layer selection in decoder-only transformers
Theodore Glavas
Joud Chataoui
Florence Regol
Wassim Jabbour
Antonios Valkanas
Boris N. Oreshkin
Mark J. Coates
AI4CE
26
0
0
26 Oct 2024
Deep Optimizer States: Towards Scalable Training of Transformer Models Using Interleaved Offloading
Avinash Maurya
Jie Ye
M. Rafique
Franck Cappello
Bogdan Nicolae
29
1
0
26 Oct 2024
Computational Bottlenecks of Training Small-scale Large Language Models
Saleh Ashkboos
Iman Mirzadeh
Keivan Alizadeh
Mohammad Hossein Sekhavat
Moin Nabi
Mehrdad Farajtabar
Fartash Faghri
26
0
0
25 Oct 2024