OPT: Open Pre-trained Transformer Language Models

2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
    VLM
    OSLM
    AI4CE

Papers citing "OPT: Open Pre-trained Transformer Language Models"

50 / 2,454 papers shown
Unleashing the Temporal-Spatial Reasoning Capacity of GPT for Training-Free Audio and Language Referenced Video Object Segmentation
Shaofei Huang
Rui Ling
Hongyu Li
Tianrui Hui
Zongheng Tang
Xiaoming Wei
Jizhong Han
Si Liu
VOS
42
4
0
28 Aug 2024
Efficient LLM Scheduling by Learning to Rank
Yichao Fu
Siqi Zhu
Runlong Su
Aurick Qiao
Ion Stoica
Hao Zhang
58
19
0
28 Aug 2024
BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Competitive Large Language Model Baseline
Guosheng Dong
Zhuoran Zhang
Yiding Sun
Da Pan
Zheng Liang
...
Bingning Wang
Wentao Zhang
Jiaxin Mao
Zenan Zhou
Weipeng Chen
ALM
48
2
0
27 Aug 2024
HPT++: Hierarchically Prompting Vision-Language Models with Multi-Granularity Knowledge Generation and Improved Structure Modeling
Yubin Wang
Xinyang Jiang
De Cheng
Wenli Sun
Dongsheng Li
Cairong Zhao
VLM
48
0
0
27 Aug 2024
An Evaluation of Explanation Methods for Black-Box Detectors of Machine-Generated Text
Loris Schoenegger
Yuxi Xia
Benjamin Roth
FAtt
46
0
0
26 Aug 2024
Has Multimodal Learning Delivered Universal Intelligence in Healthcare? A Comprehensive Survey
Qika Lin
Yifan Zhu
Xin Mei
Ling Huang
Jingying Ma
Kai He
Zhen Peng
Min Zhang
Mengling Feng
49
19
0
23 Aug 2024
IAA: Inner-Adaptor Architecture Empowers Frozen Large Language Model with Multimodal Capabilities
Bin Wang
Chunyu Xie
Dawei Leng
Yuhui Yin
MLLM
54
1
0
23 Aug 2024
A Tighter Complexity Analysis of SparseGPT
Xiaoyu Li
Yingyu Liang
Zhenmei Shi
Zhao Song
80
22
0
22 Aug 2024
Matmul or No Matmal in the Era of 1-bit LLMs
Jinendra Malekar
Mohammed E. Elbtity
Ramtin Zand
MQ
34
2
0
21 Aug 2024
MARLIN: Mixed-Precision Auto-Regressive Parallel Inference on Large Language Models
Elias Frantar
Roberto L. Castro
Jiale Chen
Torsten Hoefler
Dan Alistarh
MQ
31
12
0
21 Aug 2024
First Activations Matter: Training-Free Methods for Dynamic Activation in Large Language Models
Chi Ma
Mincong Huang
Ying Zhang
Chao Wang
Yujie Wang
Lei Yu
Chuan Liu
Wei Lin
AI4CE
LLMSV
53
2
0
21 Aug 2024
EmbodiedSAM: Online Segment Any 3D Thing in Real Time
Xiuwei Xu
Huangxing Chen
Linqing Zhao
Ziwei Wang
Jie Zhou
Jiwen Lu
42
15
0
21 Aug 2024
CodeJudge-Eval: Can Large Language Models be Good Judges in Code Understanding?
Yuwei Zhao
Ziyang Luo
Yuchen Tian
Hongzhan Lin
Weixiang Yan
Annan Li
Jing Ma
ELM
ALM
LRM
50
8
0
20 Aug 2024
LLM-Barber: Block-Aware Rebuilder for Sparsity Mask in One-Shot for Large Language Models
Yupeng Su
Ziyi Guan
Xiaoqun Liu
Tianlai Jin
Dongkuan Wu
G. Chesi
Ngai Wong
Hao Yu
45
1
0
20 Aug 2024
Enhancing One-shot Pruned Pre-trained Language Models through Sparse-Dense-Sparse Mechanism
Guanchen Li
Xiandong Zhao
Lian Liu
Zeping Li
Dong Li
Lu Tian
Jie He
Ashish Sirasao
E. Barsoum
VLM
37
0
0
20 Aug 2024
CLIP-DPO: Vision-Language Models as a Source of Preference for Fixing Hallucinations in LVLMs
Yassine Ouali
Adrian Bulat
Brais Martínez
Georgios Tzimiropoulos
VLM
MLLM
40
18
0
19 Aug 2024
MoDeGPT: Modular Decomposition for Large Language Model Compression
Chi-Heng Lin
Shangqian Gao
James Seale Smith
Abhishek Patel
Shikhar Tuli
Yilin Shen
Hongxia Jin
Yen-Chang Hsu
71
9
0
19 Aug 2024
WPN: An Unlearning Method Based on N-pair Contrastive Learning in Language Models
Guitao Chen
Yunshen Wang
Hongye Sun
Guang Chen
MU
28
1
0
18 Aug 2024
CogLM: Tracking Cognitive Development of Large Language Models
Xinglin Wang
Peiwen Yuan
Shaoxiong Feng
Yiwei Li
Boyuan Pan
Heda Wang
Yao Hu
Kan Li
ELM
67
0
0
17 Aug 2024
CIKMar: A Dual-Encoder Approach to Prompt-Based Reranking in Educational Dialogue Systems
Joanito Agili Lopo
Marina Indah Prasasti
Alma Permatasari
41
0
0
16 Aug 2024
BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts
Qizhen Zhang
Nikolas Gritsch
Dwaraknath Gnaneshwar
Simon Guo
David Cairuz
...
Jakob N. Foerster
Phil Blunsom
Sebastian Ruder
Ahmet Üstün
Acyr Locatelli
MoMe
MoE
56
5
0
15 Aug 2024
mhGPT: A Lightweight Generative Pre-Trained Transformer for Mental Health Text Analysis
Dae-young Kim
Rebecca Hwa
Muhammad Mahbubur Rahman
LM&MA
AI4MH
27
2
0
15 Aug 2024
CROME: Cross-Modal Adapters for Efficient Multimodal LLM
Sayna Ebrahimi
Sercan Ö. Arik
Tejas Nama
Tomas Pfister
49
1
0
13 Aug 2024
FuxiTranyu: A Multilingual Large Language Model Trained with Balanced Data
Haoran Sun
Renren Jin
Shaoyang Xu
Leiyu Pan
Supryadi
...
Lei Yang
Ling Shi
Juesi Xiao
Shaolin Zhu
Deyi Xiong
65
2
0
12 Aug 2024
Eigen Attention: Attention in Low-Rank Space for KV Cache Compression
Utkarsh Saxena
Gobinda Saha
Sakshi Choudhary
Kaushik Roy
44
9
0
10 Aug 2024
Your Context Is Not an Array: Unveiling Random Access Limitations in Transformers
MohammadReza Ebrahimi
Sunny Panchal
Roland Memisevic
41
5
0
10 Aug 2024
Hyperbolic Learning with Multimodal Large Language Models
Paolo Mandica
Luca Franco
Konstantinos Kallidromitis
Suzanne Petryk
Fabio Galasso
44
1
0
09 Aug 2024
Generating novel experimental hypotheses from language models: A case study on cross-dative generalization
Kanishka Misra
Najoung Kim
29
3
0
09 Aug 2024
Instruction Tuning-free Visual Token Complement for Multimodal LLMs
Dongsheng Wang
Jiequan Cui
Miaoge Li
Wang Lin
Bo Chen
Hanwang Zhang
MLLM
34
3
0
09 Aug 2024
Generalisation First, Memorisation Second? Memorisation Localisation for Natural Language Classification Tasks
Verna Dankers
Ivan Titov
45
5
0
09 Aug 2024
Towards a Generative Approach for Emotion Detection and Reasoning
Ankita Bhaumik
T. Strzalkowski
ReLM
LRM
42
3
0
09 Aug 2024
Scaling Deep Learning Computation over the Inter-Core Connected Intelligence Processor with T10
Yiqi Liu
Yuqi Xue
Yu Cheng
Lingxiao Ma
Ziming Miao
Jilong Xue
Jian Huang
GNN
26
1
0
09 Aug 2024
MM-Forecast: A Multimodal Approach to Temporal Event Forecasting with Large Language Models
Haoxuan Li
Zhengmao Yang
Yunshan Ma
Yi Bin
Yang Yang
Tat-Seng Chua
41
0
0
08 Aug 2024
A Convex-optimization-based Layer-wise Post-training Pruner for Large Language Models
Pengxiang Zhao
Hanyu Hu
Ping Li
Yi Zheng
Zhefeng Wang
Xiaoming Yuan
44
1
0
07 Aug 2024
AgentsCoMerge: Large Language Model Empowered Collaborative Decision Making for Ramp Merging
Senkang Hu
Zhengru Fang
Zihan Fang
Yiqin Deng
Xianhao Chen
Yuguang Fang
Sam Kwong
65
14
0
07 Aug 2024
From Recognition to Prediction: Leveraging Sequence Reasoning for Action Anticipation
Xin Liu
Chao Hao
Zitong Yu
Huanjing Yue
Jingyu Yang
41
1
0
05 Aug 2024
MeshAnything V2: Artist-Created Mesh Generation With Adjacent Mesh Tokenization
Yiwen Chen
Yikai Wang
Yihao Luo
Zhilin Wang
Zilong Chen
Jun Zhu
Chi Zhang
Guosheng Lin
33
24
0
05 Aug 2024
From LLMs to LLM-based Agents for Software Engineering: A Survey of Current, Challenges and Future
Haolin Jin
Linghan Huang
Haipeng Cai
Jun Yan
Bo Li
Huaming Chen
78
30
0
05 Aug 2024
Effective Demonstration Annotation for In-Context Learning via Language Model-Based Determinantal Point Process
Peng Wang
Xiaobin Wang
Chao Lou
Shengyu Mao
Pengjun Xie
Yong-jia Jiang
54
0
0
04 Aug 2024
Cross-layer Attention Sharing for Large Language Models
Yongyu Mu
Yuzhang Wu
Yuchun Fan
Chenglong Wang
Hengyu Li
Qiaozhi He
Murun Yang
Tong Xiao
Jingbo Zhu
42
5
0
04 Aug 2024
STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs
Peijie Dong
Lujun Li
Dayou Du
Yuhan Chen
Zhenheng Tang
...
Wei Xue
Wenhan Luo
Qi-fei Liu
Yi-Ting Guo
Xiaowen Chu
MQ
58
4
0
03 Aug 2024
The Phantom Menace: Unmasking Privacy Leakages in Vision-Language Models
Simone Caldarella
Massimiliano Mancini
Elisa Ricci
Rahaf Aljundi
PILM
58
2
0
02 Aug 2024
SynesLM: A Unified Approach for Audio-visual Speech Recognition and Translation via Language Model and Synthetic Data
Yichen Lu
Álvaro Huertas-García
Xuankai Chang
Hengwei Bian
Soumi Maiti
Shinji Watanabe
46
2
0
01 Aug 2024
Memorization Capacity for Additive Fine-Tuning with Small ReLU Networks
Jy-yong Sohn
Dohyun Kwon
Seoyeon An
Kangwook Lee
48
0
0
01 Aug 2024
Adversarial Text Rewriting for Text-aware Recommender Systems
Ganesh Ghalme
Reshef Meir
Srijan Kumar
42
0
0
01 Aug 2024
Evaluating SAM2's Role in Camouflaged Object Detection: From SAM to SAM2
Lv Tang
Bo Li
VLM
40
7
0
31 Jul 2024
Accelerating Large Language Model Inference with Self-Supervised Early Exits
Florian Valade
LRM
44
1
0
30 Jul 2024
Decoding Linguistic Representations of Human Brain
Yu Wang
Heyang Liu
Yuhao Wang
Chuan Xuan
Yixuan Hou
Sheng Feng
Hongcheng Liu
Yusheng Liao
Yanfeng Wang
AI4CE
36
1
0
30 Jul 2024
Pruning Large Language Models with Semi-Structural Adaptive Sparse Training
Weiyu Huang
Yuezhou Hu
Guohao Jian
Jun Zhu
Jianfei Chen
35
5
0
30 Jul 2024
A2SF: Accumulative Attention Scoring with Forgetting Factor for Token Pruning in Transformer Decoder
Hyun Rae Jo
Dong Kun Shin
40
4
0
30 Jul 2024