ResearchTrend.AI
OPT: Open Pre-trained Transformer Language Models


2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
    VLM
    OSLM
    AI4CE

Papers citing "OPT: Open Pre-trained Transformer Language Models"

50 / 2,454 papers shown
CoMMIT: Coordinated Instruction Tuning for Multimodal Large Language Models
Junda Wu
Xintong Li
Tong Yu
Yu Wang
Xiang Chen
Jiuxiang Gu
Lina Yao
Jingbo Shang
Julian McAuley
52
0
0
29 Jul 2024
Understanding Memorisation in LLMs: Dynamics, Influencing Factors, and Implications
Till Speicher
Mohammad Aflah Khan
Qinyuan Wu
Vedant Nanda
Soumi Das
Bishwamittra Ghosh
Krishna P. Gummadi
Evimaria Terzi
49
3
0
27 Jul 2024
Adaptive Contrastive Search: Uncertainty-Guided Decoding for Open-Ended Text Generation
Esteban Garces Arias
Julian Rodemann
Meimingwei Li
Christian Heumann
Matthias Aßenmacher
45
4
0
26 Jul 2024
SWIFT: Semantic Watermarking for Image Forgery Thwarting
Gautier Evennou
Vivien Chappelier
Ewa Kijak
Teddy Furon
53
1
0
26 Jul 2024
Fairness Definitions in Language Models Explained
Thang Viet Doan
Zhibo Chu
Zichong Wang
Wenbin Zhang
ALM
63
10
0
26 Jul 2024
Dallah: A Dialect-Aware Multimodal Large Language Model for Arabic
Fakhraddin Alwajih
Gagan Bhatia
Muhammad Abdul-Mageed
45
5
0
25 Jul 2024
Learn while Unlearn: An Iterative Unlearning Framework for Generative Language Models
Haoyu Tang
Ye Liu
Xukai Liu
Yanghai Zhang
Kai Zhang
Xiaofang Zhou
Enhong Chen
MU
75
3
0
25 Jul 2024
Graph-Structured Speculative Decoding
Zhuocheng Gong
Jiahao Liu
Ziyue Wang
Pengfei Wu
Jingang Wang
Xunliang Cai
Dongyan Zhao
Rui Yan
31
3
0
23 Jul 2024
INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model
Yiwei Ma
Zhibin Wang
Xiaoshuai Sun
Weihuang Lin
Qiang-feng Zhou
Jiayi Ji
Rongrong Ji
MLLM
VLM
57
1
0
23 Jul 2024
Accelerating Pre-training of Multimodal LLMs via Chain-of-Sight
Ziyuan Huang
Kaixiang Ji
Biao Gong
Zhiwu Qing
Qinglong Zhang
Kecheng Zheng
Jian Wang
Jingdong Chen
Ming Yang
LRM
47
2
0
22 Jul 2024
dMel: Speech Tokenization made Simple
Richard He Bai
Tatiana Likhomanenko
Ruixiang Zhang
Zijin Gu
Zakaria Aldeneh
Navdeep Jaitly
48
4
0
22 Jul 2024
Compensate Quantization Errors+: Quantized Models Are Inquisitive Learners
Yifei Gao
Jie Ou
Lei Wang
Fanhua Shang
Jaji Wu
MQ
68
0
0
22 Jul 2024
Token-Picker: Accelerating Attention in Text Generation with Minimized Memory Transfer via Probability Estimation
Junyoung Park
Myeonggu Kang
Yunki Han
Yang-Gon Kim
Jaekang Shin
Lee-Sup Kim
27
0
0
21 Jul 2024
Hard Prompts Made Interpretable: Sparse Entropy Regularization for Prompt Tuning with RL
Yunseon Choi
Sangmin Bae
Seonghyun Ban
Minchan Jeong
Chuheng Zhang
Lei Song
Li Zhao
Jiang Bian
Kee-Eung Kim
VLM
AAML
38
3
0
20 Jul 2024
Impact of Model Size on Fine-tuned LLM Performance in Data-to-Text Generation: A State-of-the-Art Investigation
Joy Mahapatra
Utpal Garain
47
8
0
19 Jul 2024
Watermark Smoothing Attacks against Language Models
Hongyan Chang
Hamed Hassani
Reza Shokri
WaLM
67
3
0
19 Jul 2024
X-Former: Unifying Contrastive and Reconstruction Learning for MLLMs
S. Swetha
Jinyu Yang
T. Neiman
Mamshad Nayeem Rizve
Son Tran
Benjamin Z. Yao
Trishul Chilimbi
Mubarak Shah
62
2
0
18 Jul 2024
Reconstruct the Pruned Model without Any Retraining
Pingjie Wang
Ziqing Fan
Shengchao Hu
Zhe Chen
Yanfeng Wang
Yu Wang
53
1
0
18 Jul 2024
Integrated Hardware Architecture and Device Placement Search
Irene Wang
Jakub Tarnawski
Amar Phanishayee
Divya Mahajan
43
1
0
18 Jul 2024
ViLLa: Video Reasoning Segmentation with Large Language Model
Rongkun Zheng
Lu Qi
Xi Chen
Yi Wang
Kun Wang
Yu Qiao
Hengshuang Zhao
VOS
LRM
80
2
0
18 Jul 2024
Establishing Knowledge Preference in Language Models
Sizhe Zhou
Sha Li
Yu Meng
Yizhu Jiao
Heng Ji
Jiawei Han
KELM
85
0
0
17 Jul 2024
SmartQuant: CXL-based AI Model Store in Support of Runtime Configurable Weight Quantization
Rui Xie
Asad Ul Haq
Linsen Ma
Krystal Sun
Sanchari Sen
Swagath Venkataramani
Liu Liu
Tong Zhang
MQ
32
1
0
17 Jul 2024
Matryoshka-Adaptor: Unsupervised and Supervised Tuning for Smaller Embedding Dimensions
Jinsung Yoon
Raj Sinha
Sercan Ö. Arik
Tomas Pfister
24
1
0
17 Jul 2024
GeoHard: Towards Measuring Class-wise Hardness through Modelling Class Semantics
Fengyu Cai
Xinran Zhao
Hongming Zhang
Iryna Gurevych
Heinz Koeppl
34
0
0
17 Jul 2024
InstructAV: Instruction Fine-tuning Large Language Models for Authorship Verification
Yujia Hu
Zhiqiang Hu
C. Seah
Roy Ka-wei Lee
36
0
0
16 Jul 2024
Enhancing Parameter Efficiency and Generalization in Large-Scale Models: A Regularized and Masked Low-Rank Adaptation Approach
Yuzhu Mao
Siqi Ping
Zihao Zhao
Yang Liu
Wenbo Ding
37
1
0
16 Jul 2024
SwitchCIT: Switching for Continual Instruction Tuning of Large Language Models
Xinbo Wu
Max Hartman
Vidhata Arjun Jayaraman
Lav Varshney
CLL
LRM
41
1
0
16 Jul 2024
How Are LLMs Mitigating Stereotyping Harms? Learning from Search Engine Studies
Alina Leidinger
Richard Rogers
39
5
0
16 Jul 2024
MINI-LLM: Memory-Efficient Structured Pruning for Large Language Models
Hongrong Cheng
Miao Zhang
J. Q. Shi
57
2
0
16 Jul 2024
Co-Designing Binarized Transformer and Hardware Accelerator for Efficient End-to-End Edge Deployment
Yuhao Ji
Chao Fang
Shaobo Ma
Haikuo Shao
Zhongfeng Wang
MQ
47
1
0
16 Jul 2024
Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models
Jinrui Zhang
Teng Wang
Haigang Zhang
Ping Lu
Feng Zheng
MLLM
LRM
VLM
44
3
0
16 Jul 2024
LRQ: Optimizing Post-Training Quantization for Large Language Models by Learning Low-Rank Weight-Scaling Matrices
Jung Hyun Lee
Jeonghoon Kim
J. Yang
S. Kwon
Eunho Yang
Kang Min Yoo
Dongsoo Lee
MQ
36
2
0
16 Jul 2024
BiasAlert: A Plug-and-play Tool for Social Bias Detection in LLMs
Zhiting Fan
Ruizhe Chen
Ruiling Xu
Zuozhu Liu
KELM
27
16
0
14 Jul 2024
Multi-Granularity Semantic Revision for Large Language Model Distillation
Xiaoyu Liu
Yun-feng Zhang
Wei Li
Simiao Li
Xu Huang
Hanting Chen
Yehui Tang
Jie Hu
Zhiwei Xiong
Yunhe Wang
43
1
0
14 Jul 2024
Minimizing PLM-Based Few-Shot Intent Detectors
Haode Zhang
Xiao-Ming Wu
Albert Y. S. Lam
VLM
38
0
0
13 Jul 2024
Investigating Low-Rank Training in Transformer Language Models: Efficiency and Scaling Analysis
Xiuying Wei
Skander Moalla
Razvan Pascanu
Çağlar Gülçehre
33
1
0
13 Jul 2024
MaskMoE: Boosting Token-Level Learning via Routing Mask in Mixture-of-Experts
Zhenpeng Su
Zijia Lin
Xue Bai
Xing Wu
Yizhe Xiong
...
Guangyuan Ma
Hui Chen
Guiguang Ding
Wei Zhou
Songlin Hu
MoE
34
5
0
13 Jul 2024
Mitigating Entity-Level Hallucination in Large Language Models
Weihang Su
Yichen Tang
Qingyao Ai
Changyue Wang
Zhijing Wu
Yiqun Liu
HILM
47
7
0
12 Jul 2024
H2O-Danube3 Technical Report
Pascal Pfeiffer
Philipp Singer
Yauhen Babakhin
Gabor Fodor
Nischay Dhankhar
Sri Satish Ambati
24
3
0
12 Jul 2024
A Survey on Symbolic Knowledge Distillation of Large Language Models
Kamal Acharya
Alvaro Velasquez
Haoze Song
SyDa
44
5
0
12 Jul 2024
GPT-4 is judged more human than humans in displaced and inverted Turing tests
Ishika Rathi
Sydney Taylor
Benjamin K. Bergen
Cameron R. Jones
DeLMO
38
5
0
11 Jul 2024
MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis
Wanggui He
Siming Fu
Mushui Liu
Xierui Wang
Wenyi Xiao
...
Zhelun Yu
Haoyuan Li
Ziwei Huang
Leilei Gan
Hao Jiang
DiffM
29
23
0
10 Jul 2024
Bucket Pre-training is All You Need
Hongtao Liu
Qiyao Peng
Qing Yang
Kai Liu
Hongyan Xu
28
1
0
10 Jul 2024
A Survey of Attacks on Large Vision-Language Models: Resources, Advances, and Future Trends
Daizong Liu
Mingyu Yang
Xiaoye Qu
Pan Zhou
Yu Cheng
Wei Hu
ELM
AAML
37
25
0
10 Jul 2024
Inference Performance Optimization for Large Language Models on CPUs
Pujiang He
Shan Zhou
Wenhuan Huang
Changqing Li
Duyi Wang
Bin Guo
Chen Meng
Sheng Gui
Weifei Yu
Yi Xie
38
4
0
10 Jul 2024
FBI-LLM: Scaling Up Fully Binarized LLMs from Scratch via Autoregressive Distillation
Liqun Ma
Mingjie Sun
Zhiqiang Shen
31
7
0
09 Jul 2024
ICLGuard: Controlling In-Context Learning Behavior for Applicability Authorization
Wai Man Si
Michael Backes
Yang Zhang
54
1
0
09 Jul 2024
Who is better at math, Jenny or Jingzhen? Uncovering Stereotypes in Large Language Models
Zara Siddique
Liam D. Turner
Luis Espinosa-Anke
42
0
0
09 Jul 2024
CEIA: CLIP-Based Event-Image Alignment for Open-World Event-Based Understanding
Wenhao Xu
Wenming Weng
Yueyi Zhang
Zhiwei Xiong
VLM
49
0
0
09 Jul 2024
VQA-Diff: Exploiting VQA and Diffusion for Zero-Shot Image-to-3D Vehicle Asset Generation in Autonomous Driving
Yibo Liu
Zheyuan Yang
Guile Wu
Y. Ren
Kejian Lin
Bingbing Liu
Yang Liu
Jinjun Shan
44
5
0
09 Jul 2024