OPT: Open Pre-trained Transformer Language Models


2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
    VLM
    OSLM
    AI4CE

Papers citing "OPT: Open Pre-trained Transformer Language Models"

50 / 767 papers shown
Title
Class Distillation with Mahalanobis Contrast: An Efficient Training Paradigm for Pragmatic Language Understanding Tasks
Chenlu Wang
Weimin Lyu
Ritwik Banerjee
12
0
0
17 May 2025
Fast RoPE Attention: Combining the Polynomial Method and Fast Fourier Transform
Josh Alman
Zhao Song
15
0
0
17 May 2025
From Trade-off to Synergy: A Versatile Symbiotic Watermarking Framework for Large Language Models
Yidan Wang
Yubing Ren
Yanan Cao
Binxing Fang
32
0
0
15 May 2025
Resource-Efficient Language Models: Quantization for Fast and Accessible Inference
Tollef Emil Jørgensen
MQ
54
0
0
13 May 2025
Whitened CLIP as a Likelihood Surrogate of Images and Captions
Roy Betser
Meir Yossef Levi
Guy Gilboa
31
0
0
11 May 2025
Evaluating Financial Sentiment Analysis with Annotators Instruction Assisted Prompting: Enhancing Contextual Interpretation and Stock Prediction Accuracy
A M Muntasir Rahman
Ajim Uddin
Guiling Wang
21
0
0
09 May 2025
Challenging GPU Dominance: When CPUs Outperform for On-Device LLM Inference
Haolin Zhang
Jeff Huang
35
0
0
09 May 2025
X-Transfer Attacks: Towards Super Transferable Adversarial Attacks on CLIP
Hanxun Huang
Sarah Monazam Erfani
Yige Li
Xingjun Ma
James Bailey
AAML
53
0
0
08 May 2025
Revealing Weaknesses in Text Watermarking Through Self-Information Rewrite Attacks
Yixin Cheng
Hongcheng Guo
Yangming Li
Leonid Sigal
AAML
WaLM
59
0
0
08 May 2025
Diffusion Model Quantization: A Review
Qian Zeng
Chenggong Hu
Mingli Song
Jie Song
MQ
45
0
0
08 May 2025
Reliably Bounding False Positives: A Zero-Shot Machine-Generated Text Detection Framework via Multiscaled Conformal Prediction
Xiaowei Zhu
Yubing Ren
Yanan Cao
Xixun Lin
Fang Fang
Yangxi Li
47
0
0
08 May 2025
SPAP: Structured Pruning via Alternating Optimization and Penalty Methods
Hanyu Hu
Xiaoming Yuan
50
0
0
06 May 2025
Optimizing LLMs for Resource-Constrained Environments: A Survey of Model Compression Techniques
Sanjay Surendranath Girija
Shashank Kapoor
Lakshit Arora
Dipen Pradhan
Aman Raj
Ankit Shetgaonkar
57
0
0
05 May 2025
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
Jiahui Geng
Jintao Guo
Shanshan Zhao
Minghao Fu
Lunhao Duan
Guo-Hua Wang
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
DiffM
74
0
0
05 May 2025
An End-to-End Model For Logits Based Large Language Models Watermarking
Kahim Wong
Jicheng Zhou
Jiantao Zhou
Yain-Whar Si
WaLM
39
2
0
05 May 2025
Radio: Rate-Distortion Optimization for Large Language Model Compression
Sean I. Young
MQ
26
0
0
05 May 2025
Demystifying optimized prompts in language models
Rimon Melamed
Lucas H. McCabe
H. H. Huang
44
0
0
04 May 2025
MateICL: Mitigating Attention Dispersion in Large-Scale In-Context Learning
Murtadha Ahmed
Wenbo
Liu yunfeng
41
0
0
02 May 2025
LLM Watermarking Using Mixtures and Statistical-to-Computational Gaps
Pedro Abdalla
Roman Vershynin
WaLM
40
0
0
02 May 2025
Don't be lazy: CompleteP enables compute-efficient deep transformers
Nolan Dey
Bin Claire Zhang
Lorenzo Noci
Mufan Li
Blake Bordelon
Shane Bergsma
Cengiz Pehlevan
Boris Hanin
Joel Hestness
44
0
0
02 May 2025
Fast and Low-Cost Genomic Foundation Models via Outlier Removal
Haozheng Luo
Chenghao Qiu
Maojiang Su
Zhihan Zhou
Zoe Mehta
Guo Ye
Jerry Yao-Chieh Hu
Han Liu
AAML
55
1
0
01 May 2025
An Evaluation of a Visual Question Answering Strategy for Zero-shot Facial Expression Recognition in Still Images
Modesto Castrillón-Santana
Oliverio J. Santana
David Freire-Obregón
Daniel Hernández-Sosa
J. Lorenzo-Navarro
54
0
0
30 Apr 2025
Detecting and Mitigating Hateful Content in Multimodal Memes with Vision-Language Models
Minh-Hao Van
Xintao Wu
VLM
88
0
0
30 Apr 2025
A Domain-Agnostic Scalable AI Safety Ensuring Framework
Beomjun Kim
Kangyeon Kim
Sunwoo Kim
Heejin Ahn
57
0
0
29 Apr 2025
Perturbation-efficient Zeroth-order Optimization for Hardware-friendly On-device Training
Qitao Tan
Sung-En Chang
Rui Xia
Huidong Ji
Chence Yang
...
Zheng Zhan
Zhou Zou
Yufei Wang
Jin Lu
Geng Yuan
41
0
0
28 Apr 2025
R-Sparse: Rank-Aware Activation Sparsity for Efficient LLM Inference
Zhenyu Zhang
Zechun Liu
Yuandong Tian
Harshit Khaitan
Zihan Wang
Steven Li
68
0
0
28 Apr 2025
AndroidGen: Building an Android Language Agent under Data Scarcity
Hanyu Lai
Junjie Gao
Xiao-Yang Liu
Zifei Shan
S. Zhang
Yuxiao Dong
Jie Tang
LLMAG
77
0
0
27 Apr 2025
Toward Generalizable Evaluation in the LLM Era: A Survey Beyond Benchmarks
Yixin Cao
Shibo Hong
Xuzhao Li
Jiahao Ying
Yubo Ma
...
Juanzi Li
Aixin Sun
Xuanjing Huang
Tat-Seng Chua
Tianwei Zhang
ALM
ELM
91
2
0
26 Apr 2025
Revisiting Transformers through the Lens of Low Entropy and Dynamic Sparsity
Ruifeng Ren
Yong Liu
180
0
0
26 Apr 2025
Leveraging Decoder Architectures for Learned Sparse Retrieval
Jingfen Qiao
Thong Nguyen
Evangelos Kanoulas
Andrew Yates
56
0
0
25 Apr 2025
Comparing Uncertainty Measurement and Mitigation Methods for Large Language Models: A Systematic Review
Toghrul Abbasli
Kentaroh Toyoda
Yuan Wang
Leon Witt
Muhammad Asif Ali
Yukai Miao
Dan Li
Qingsong Wei
UQCV
94
0
0
25 Apr 2025
Evaluating Evaluation Metrics -- The Mirage of Hallucination Detection
Atharva Kulkarni
Yuan-kang Zhang
Joel Ruben Antony Moniz
Xiou Ge
Bo-Hsiang Tseng
Dhivya Piraviperumal
Shri Kiran Srinivasan
Hong-ye Yu
HILM
86
0
0
25 Apr 2025
The Big Send-off: High Performance Collectives on GPU-based Supercomputers
Siddharth Singh
Mahua Singh
A. Bhatele
54
0
0
25 Apr 2025
Context-Enhanced Contrastive Search for Improved LLM Text Generation
Jaydip Sen
Rohit Pandey
Hetvi Waghela
55
0
0
22 Apr 2025
EmoVoice: LLM-based Emotional Text-To-Speech Model with Freestyle Text Prompting
Guanrou Yang
Chen Yang
Qian Chen
Ziyang Ma
Wenxi Chen
...
Fan Yu
Zhihao Du
Zhifu Gao
Shiliang Zhang
Xie Chen
AuLLM
57
0
0
17 Apr 2025
CSPLADE: Learned Sparse Retrieval with Causal Language Models
Zhichao Xu
Aosong Feng
Yijun Tian
Haibo Ding
Lin Lee Cheong
RALM
47
0
0
15 Apr 2025
Transferable text data distillation by trajectory matching
Rong Yao
Hailin Hu
Yifei Fu
Hanting Chen
Wenyi Fang
Fanyi Du
Kai Han
Yunhe Wang
28
0
0
14 Apr 2025
Quantization Error Propagation: Revisiting Layer-Wise Post-Training Quantization
Yamato Arai
Yuma Ichikawa
MQ
34
0
0
13 Apr 2025
Findings of the BabyLM Challenge: Sample-Efficient Pretraining on Developmentally Plausible Corpora
Alex Warstadt
Aaron Mueller
Leshem Choshen
E. Wilcox
Chengxu Zhuang
...
Rafael Mosquera
Bhargavi Paranjape
Adina Williams
Tal Linzen
Ryan Cotterell
43
108
0
10 Apr 2025
Your Image Generator Is Your New Private Dataset
Nicolo Resmini
Eugenio Lomurno
Cristian Sbrolli
Matteo Matteucci
31
0
0
06 Apr 2025
A Perplexity and Menger Curvature-Based Approach for Similarity Evaluation of Large Language Models
Yuantao Zhang
Zhankui Yang
AAML
38
0
0
05 Apr 2025
When Reasoning Meets Compression: Benchmarking Compressed Large Reasoning Models on Complex Reasoning Tasks
Nan Zhang
Yusen Zhang
Prasenjit Mitra
Rui Zhang
MQ
LRM
61
2
0
02 Apr 2025
Leaking LoRa: An Evaluation of Password Leaks and Knowledge Storage in Large Language Models
Ryan Marinelli
Magnus Eckhoff
PILM
57
0
0
29 Mar 2025
TempTest: Local Normalization Distortion and the Detection of Machine-generated Text
Tom Kempton
Stuart Burrell
Connor Cheverall
DeLMO
111
0
0
26 Mar 2025
Oaken: Fast and Efficient LLM Serving with Online-Offline Hybrid KV Cache Quantization
Minsu Kim
Seongmin Hong
RyeoWook Ko
S. Choi
Hunjong Lee
Junsoo Kim
Joo-Young Kim
Jongse Park
57
0
0
24 Mar 2025
Disentangling Fine-Tuning from Pre-Training in Visual Captioning with Hybrid Markov Logic
Monika Shah
Somdeb Sarkhel
Deepak Venugopal
MLLM
BDL
VLM
85
0
0
18 Mar 2025
Direction-Aware Diagonal Autoregressive Image Generation
Yijia Xu
Jianzhong Ju
Jian Luan
J. Cui
57
0
0
14 Mar 2025
Prompt Inversion Attack against Collaborative Inference of Large Language Models
Wenjie Qu
Yuguang Zhou
Yongji Wu
Tingsong Xiao
Binhang Yuan
Heng Chang
Jiaheng Zhang
76
0
0
12 Mar 2025
VisRL: Intention-Driven Visual Perception via Reinforced Reasoning
Zhangquan Chen
Xufang Luo
Dongsheng Li
OffRL
LRM
72
3
0
10 Mar 2025
Enhancing Vision-Language Compositional Understanding with Multimodal Synthetic Data
Haoxin Li
Boyang Li
CoGe
73
0
0
03 Mar 2025