ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2205.01068
  4. Cited By
OPT: Open Pre-trained Transformer Language Models

OPT: Open Pre-trained Transformer Language Models

2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
    VLM
    OSLM
    AI4CE
ArXivPDFHTML

Papers citing "OPT: Open Pre-trained Transformer Language Models"

50 / 767 papers shown
Title
HALO: Hardware-aware quantization with low critical-path-delay weights for LLM acceleration
HALO: Hardware-aware quantization with low critical-path-delay weights for LLM acceleration
Rohan Juneja
Shivam Aggarwal
Safeen Huda
Tulika Mitra
L. Peh
50
0
0
27 Feb 2025
ANPMI: Assessing the True Comprehension Capabilities of LLMs for Multiple Choice Questions
ANPMI: Assessing the True Comprehension Capabilities of LLMs for Multiple Choice Questions
Gyeongje Cho
Yeonkyoung So
Jaejin Lee
ELM
62
0
0
26 Feb 2025
When Compression Meets Model Compression: Memory-Efficient Double Compression for Large Language Models
When Compression Meets Model Compression: Memory-Efficient Double Compression for Large Language Models
Weilan Wang
Yu Mao
Dongdong Tang
Hongchao Du
Nan Guan
Chun Jason Xue
MQ
67
1
0
24 Feb 2025
Probe Pruning: Accelerating LLMs through Dynamic Pruning via Model-Probing
Probe Pruning: Accelerating LLMs through Dynamic Pruning via Model-Probing
Qi Le
Enmao Diao
Ziyan Wang
Xinran Wang
Jie Ding
Li Yang
Ali Anwar
77
2
0
24 Feb 2025
SpinQuant: LLM quantization with learned rotations
SpinQuant: LLM quantization with learned rotations
Zechun Liu
Changsheng Zhao
Igor Fedorov
Bilge Soran
Dhruv Choudhary
Raghuraman Krishnamoorthi
Vikas Chandra
Yuandong Tian
Tijmen Blankevoort
MQ
137
85
0
21 Feb 2025
Dynamic Low-Rank Sparse Adaptation for Large Language Models
Dynamic Low-Rank Sparse Adaptation for Large Language Models
Weizhong Huang
Yuxin Zhang
Xiawu Zheng
Yong-Jin Liu
Jing Lin
Yiwu Yao
Rongrong Ji
97
1
0
21 Feb 2025
LOVA3: Learning to Visual Question Answering, Asking and Assessment
LOVA3: Learning to Visual Question Answering, Asking and Assessment
Henry Hengyuan Zhao
Pan Zhou
Difei Gao
Zechen Bai
Mike Zheng Shou
82
8
0
21 Feb 2025
Pretrained Image-Text Models are Secretly Video Captioners
Pretrained Image-Text Models are Secretly Video Captioners
Chunhui Zhang
Yiren Jian
Z. Ouyang
Soroush Vosoughi
VLM
82
4
0
20 Feb 2025
FedSpaLLM: Federated Pruning of Large Language Models
FedSpaLLM: Federated Pruning of Large Language Models
Guangji Bai
Yijiang Li
Zilinghan Li
Liang Zhao
Kibaek Kim
FedML
68
4
0
20 Feb 2025
One Model for All: Large Language Models are Domain-Agnostic Recommendation Systems
One Model for All: Large Language Models are Domain-Agnostic Recommendation Systems
Zuoli Tang
Zhaoxin Huan
Zihao Li
Xiaolu Zhang
Jun Hu
Chilin Fu
Jun Zhou
Lixin Zou
Chenliang Li
61
15
0
20 Feb 2025
EvoP: Robust LLM Inference via Evolutionary Pruning
EvoP: Robust LLM Inference via Evolutionary Pruning
Shangyu Wu
Hongchao Du
Ying Xiong
Shuai Chen
Tei-Wei Kuo
Nan Guan
Chun Jason Xue
34
1
0
19 Feb 2025
PTQ1.61: Push the Real Limit of Extremely Low-Bit Post-Training Quantization Methods for Large Language Models
PTQ1.61: Push the Real Limit of Extremely Low-Bit Post-Training Quantization Methods for Large Language Models
Jun Zhao
Miao Zhang
Ming Wang
Yuzhang Shang
Kaihao Zhang
Weili Guan
Yaowei Wang
Min Zhang
MQ
49
0
0
18 Feb 2025
RTBAS: Defending LLM Agents Against Prompt Injection and Privacy Leakage
RTBAS: Defending LLM Agents Against Prompt Injection and Privacy Leakage
Peter Yong Zhong
Siyuan Chen
Ruiqi Wang
McKenna McCall
Ben L. Titzer
Heather Miller
Phillip B. Gibbons
LLMAG
93
3
0
17 Feb 2025
Eye Tracking Based Cognitive Evaluation of Automatic Readability Assessment Measures
Eye Tracking Based Cognitive Evaluation of Automatic Readability Assessment Measures
Keren Gruteke Klein
Shachar Frenkel
Omer Shubi
Yevgeni Berzak
46
0
0
16 Feb 2025
Smoothing Out Hallucinations: Mitigating LLM Hallucination with Smoothed Knowledge Distillation
Smoothing Out Hallucinations: Mitigating LLM Hallucination with Smoothed Knowledge Distillation
Hieu Nguyen
Zihao He
Shoumik Atul Gandre
Ujjwal Pasupulety
Sharanya Kumari Shivakumar
Kristina Lerman
HILM
59
1
0
16 Feb 2025
LLM-Enhanced Multiple Instance Learning for Joint Rumor and Stance Detection with Social Context Information
LLM-Enhanced Multiple Instance Learning for Joint Rumor and Stance Detection with Social Context Information
Ruichao Yang
Jing Ma
Wei Gao
Hongzhan Lin
68
0
0
13 Feb 2025
Automated Consistency Analysis of LLMs
Automated Consistency Analysis of LLMs
Aditya Patwardhan
Vivek Vaidya
Ashish Kundu
60
0
0
10 Feb 2025
Learning to Substitute Words with Model-based Score Ranking
Learning to Substitute Words with Model-based Score Ranking
Hongye Liu
Ricardo Henao
43
0
0
09 Feb 2025
Efficient Knowledge Feeding to Language Models: A Novel Integrated Encoder-Decoder Architecture
Efficient Knowledge Feeding to Language Models: A Novel Integrated Encoder-Decoder Architecture
Sachin Kumar
Rishi Gottimukkala
Supriya Devidutta
K. Spindler
RALM
KELM
3DV
52
0
0
07 Feb 2025
Multilingual Machine Translation with Open Large Language Models at Practical Scale: An Empirical Study
Multilingual Machine Translation with Open Large Language Models at Practical Scale: An Empirical Study
Menglong Cui
Pengzhi Gao
Wei Liu
Jian Luan
Bin Wang
LRM
45
2
0
04 Feb 2025
Large Language Models Are Human-Like Internally
Large Language Models Are Human-Like Internally
Tatsuki Kuribayashi
Yohei Oseki
Souhaib Ben Taieb
Kentaro Inui
Timothy Baldwin
73
4
0
03 Feb 2025
Progressive Binarization with Semi-Structured Pruning for LLMs
Progressive Binarization with Semi-Structured Pruning for LLMs
Xinyu Yan
Tianao Zhang
Zhiteng Li
Yulun Zhang
MQ
54
0
0
03 Feb 2025
Wizard of Shopping: Target-Oriented E-commerce Dialogue Generation with Decision Tree Branching
Wizard of Shopping: Target-Oriented E-commerce Dialogue Generation with Decision Tree Branching
Xuelong Li
Zhiyu Zoey Chen
J. Choi
Nikhita Vedula
B. Fetahu
Oleg Rokhlenko
S. Malmasi
83
2
0
03 Feb 2025
Evaluating Small Language Models for News Summarization: Implications and Factors Influencing Performance
Evaluating Small Language Models for News Summarization: Implications and Factors Influencing Performance
Borui Xu
Yao Chen
Zeyi Wen
Weiguo Liu
Bingsheng He
84
1
0
02 Feb 2025
Symmetric Pruning of Large Language Models
Symmetric Pruning of Large Language Models
Kai Yi
Peter Richtárik
AAML
VLM
73
0
0
31 Jan 2025
Memory-Efficient Fine-Tuning of Transformers via Token Selection
Memory-Efficient Fine-Tuning of Transformers via Token Selection
Antoine Simoulin
Namyong Park
Xiaoyi Liu
Grey Yang
115
0
0
31 Jan 2025
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
Tianzhe Chu
Yuexiang Zhai
Jihan Yang
Shengbang Tong
Saining Xie
Dale Schuurmans
Quoc V. Le
Sergey Levine
Yi Ma
OffRL
70
60
0
28 Jan 2025
Mobile Manipulation Instruction Generation from Multiple Images with Automatic Metric Enhancement
Mobile Manipulation Instruction Generation from Multiple Images with Automatic Metric Enhancement
Kei Katsumata
Motonari Kambara
Daichi Yashima
Ryosuke Korekata
Komei Sugiura
65
0
0
28 Jan 2025
A Survey of Large Language Models for Healthcare: from Data, Technology, and Applications to Accountability and Ethics
A Survey of Large Language Models for Healthcare: from Data, Technology, and Applications to Accountability and Ethics
Kai He
Rui Mao
Qika Lin
Yucheng Ruan
Xiang Lan
Mengling Feng
Min Zhang
LM&MA
AILaw
98
154
0
28 Jan 2025
Merino: Entropy-driven Design for Generative Language Models on IoT Devices
Merino: Entropy-driven Design for Generative Language Models on IoT Devices
Youpeng Zhao
Ming Lin
Huadong Tang
Qiang Wu
Jun Wang
83
0
0
28 Jan 2025
Towards Cross-Tokenizer Distillation: the Universal Logit Distillation Loss for LLMs
Towards Cross-Tokenizer Distillation: the Universal Logit Distillation Loss for LLMs
Nicolas Boizard
Kevin El Haddad
C´eline Hudelot
Pierre Colombo
80
15
0
28 Jan 2025
Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models
Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models
Jingwei Yi
Yueqi Xie
Bin Zhu
Emre Kiciman
Guangzhong Sun
Xing Xie
Fangzhao Wu
AAML
62
65
0
28 Jan 2025
Addressing Out-of-Label Hazard Detection in Dashcam Videos: Insights from the COOOL Challenge
Anh-Kiet Duong
Petra Gomez-Krämer
49
2
0
27 Jan 2025
Decentralized Low-Rank Fine-Tuning of Large Language Models
Sajjad Ghiasvand
Mahnoosh Alizadeh
Ramtin Pedarsani
ALM
66
0
0
26 Jan 2025
EDoRA: Efficient Weight-Decomposed Low-Rank Adaptation via Singular Value Decomposition
EDoRA: Efficient Weight-Decomposed Low-Rank Adaptation via Singular Value Decomposition
Hamid Nasiri
Peter Garraghan
41
1
0
21 Jan 2025
Human-like conceptual representations emerge from language prediction
Human-like conceptual representations emerge from language prediction
Ningyu Xu
Qi Zhang
Chao Du
Qiang Luo
Xipeng Qiu
Xuanjing Huang
Menghan Zhang
70
0
0
21 Jan 2025
BiMarker: Enhancing Text Watermark Detection for Large Language Models with Bipolar Watermarks
BiMarker: Enhancing Text Watermark Detection for Large Language Models with Bipolar Watermarks
Zhuang Li
50
1
0
21 Jan 2025
Can AI-Generated Text be Reliably Detected?
Can AI-Generated Text be Reliably Detected?
Vinu Sankar Sadasivan
Aounon Kumar
S. Balasubramanian
Wenxiao Wang
S. Feizi
DeLMO
81
365
0
20 Jan 2025
Dynamic Scene Understanding from Vision-Language Representations
Dynamic Scene Understanding from Vision-Language Representations
Shahaf Pruss
Morris Alper
Hadar Averbuch-Elor
OCL
218
0
0
20 Jan 2025
Enhancing Patient-Centric Communication: Leveraging LLMs to Simulate Patient Perspectives
Enhancing Patient-Centric Communication: Leveraging LLMs to Simulate Patient Perspectives
Xinyao Ma
Rui Zhu
Zihao Wang
Jingwei Xiong
Qingyu Chen
Haixu Tang
L. Jean Camp
Lucila Ohno-Machado
LM&MA
46
0
0
12 Jan 2025
SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training
SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training
Tianjin Huang
Ziquan Zhu
Gaojie Jin
Lu Liu
Zhangyang Wang
Shiwei Liu
47
1
0
12 Jan 2025
Navigating the Designs of Privacy-Preserving Fine-tuning for Large Language Models
Navigating the Designs of Privacy-Preserving Fine-tuning for Large Language Models
Haonan Shi
Tu Ouyang
An Wang
36
0
0
08 Jan 2025
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks
Jiannan Wu
Muyan Zhong
Sen Xing
Zeqiang Lai
Zhaoyang Liu
...
Lewei Lu
Tong Lu
Ping Luo
Yu Qiao
Jifeng Dai
MLLM
VLM
LRM
102
48
0
03 Jan 2025
WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
Haipeng Luo
Qingfeng Sun
Can Xu
Pu Zhao
Jian-Guang Lou
...
Xiubo Geng
Qingwei Lin
Shifeng Chen
Yansong Tang
Dongmei Zhang
OSLM
LRM
110
412
0
03 Jan 2025
Altogether: Image Captioning via Re-aligning Alt-text
Altogether: Image Captioning via Re-aligning Alt-text
Hu Xu
Po-Yao (Bernie) Huang
Xiaoqing Ellen Tan
Ching-Feng Yeh
Jacob Kahn
...
Luke Zettlemoyer
Wen-tau Yih
Shang-Wen Li
Saining Xie
Christoph Feichtenhofer
DiffM
46
6
0
31 Dec 2024
Adaptive Batch Size Schedules for Distributed Training of Language Models with Data and Model Parallelism
Adaptive Batch Size Schedules for Distributed Training of Language Models with Data and Model Parallelism
Tim Tsz-Kit Lau
Weijian Li
Chenwei Xu
Han Liu
Mladen Kolar
191
0
0
30 Dec 2024
PTQ4VM: Post-Training Quantization for Visual Mamba
PTQ4VM: Post-Training Quantization for Visual Mamba
Younghyun Cho
Changhun Lee
Seonggon Kim
Eunhyeok Park
MQ
Mamba
46
2
0
29 Dec 2024
GQSA: Group Quantization and Sparsity for Accelerating Large Language Model Inference
GQSA: Group Quantization and Sparsity for Accelerating Large Language Model Inference
Chao Zeng
Songwei Liu
Shu Yang
Fangmin Chen
Xing Mei
Lean Fu
MQ
44
0
0
23 Dec 2024
Multi-Level Optimal Transport for Universal Cross-Tokenizer Knowledge Distillation on Language Models
Multi-Level Optimal Transport for Universal Cross-Tokenizer Knowledge Distillation on Language Models
Xiao Cui
Mo Zhu
Yulei Qin
Liang Xie
Wengang Zhou
Yiming Li
91
4
0
19 Dec 2024
Next Patch Prediction for Autoregressive Visual Generation
Next Patch Prediction for Autoregressive Visual Generation
Yatian Pang
Peng Jin
Shuo Yang
Bin Lin
Bin Zhu
...
Liuhan Chen
Francis E. H. Tay
Ser-Nam Lim
Harry Yang
Li Yuan
129
9
0
19 Dec 2024
Previous
12345...141516
Next