ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2205.01068
  4. Cited By
OPT: Open Pre-trained Transformer Language Models

OPT: Open Pre-trained Transformer Language Models

2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
    VLM
    OSLM
    AI4CE
ArXivPDFHTML

Papers citing "OPT: Open Pre-trained Transformer Language Models"

50 / 2,449 papers shown
Title
HaloScope: Harnessing Unlabeled LLM Generations for Hallucination
  Detection
HaloScope: Harnessing Unlabeled LLM Generations for Hallucination Detection
Xuefeng Du
Chaowei Xiao
Yixuan Li
HILM
37
20
0
26 Sep 2024
MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models
MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models
Gongfan Fang
Hongxu Yin
Saurav Muralidharan
Greg Heinrich
Jeff Pool
Jan Kautz
Pavlo Molchanov
Xinchao Wang
43
3
0
26 Sep 2024
Accumulator-Aware Post-Training Quantization
Accumulator-Aware Post-Training Quantization
Ian Colbert
Fabian Grob
Giuseppe Franco
Jinjie Zhang
Rayan Saab
MQ
35
4
0
25 Sep 2024
Decoding Large-Language Models: A Systematic Overview of Socio-Technical
  Impacts, Constraints, and Emerging Questions
Decoding Large-Language Models: A Systematic Overview of Socio-Technical Impacts, Constraints, and Emerging Questions
Zeyneb N. Kaya
Souvick Ghosh
42
0
0
25 Sep 2024
Zero-Shot Detection of LLM-Generated Text using Token Cohesiveness
Zero-Shot Detection of LLM-Generated Text using Token Cohesiveness
Shixuan Ma
Quan Wang
40
2
0
25 Sep 2024
Speech Recognition Rescoring with Large Speech-Text Foundation Models
Speech Recognition Rescoring with Large Speech-Text Foundation Models
Prashanth Gurunath Shivakumar
J. Kolehmainen
Aditya Gourav
Yi Gu
Ankur Gandhe
Ariya Rastrow
I. Bulyko
AuLLM
31
0
0
25 Sep 2024
Ascend HiFloat8 Format for Deep Learning
Ascend HiFloat8 Format for Deep Learning
Yuanyong Luo
Zhongxing Zhang
Richard Wu
Hu Liu
Ying Jin
...
Korviakov Vladimir
Bobrin Maxim
Yuhao Hu
Guanfu Chen
Zeyi Huang
MQ
38
1
0
25 Sep 2024
Dynamic-Width Speculative Beam Decoding for Efficient LLM Inference
Dynamic-Width Speculative Beam Decoding for Efficient LLM Inference
Zongyue Qin
Zifan He
Neha Prakriya
Jason Cong
Yizhou Sun
30
4
0
25 Sep 2024
Exploring Fine-grained Retail Product Discrimination with Zero-shot
  Object Classification Using Vision-Language Models
Exploring Fine-grained Retail Product Discrimination with Zero-shot Object Classification Using Vision-Language Models
Anil Osman Tur
Alessandro Conti
Cigdem Beyan
Davide Boscaini
Roberto Larcher
S. Messelodi
Fabio Poiesi
Elisa Ricci
VLM
39
0
0
23 Sep 2024
VLM's Eye Examination: Instruct and Inspect Visual Competency of Vision
  Language Models
VLM's Eye Examination: Instruct and Inspect Visual Competency of Vision Language Models
Nam Hyeon-Woo
Moon Ye-Bin
Wonseok Choi
Lee Hyun
Tae-Hyun Oh
CoGe
28
3
0
23 Sep 2024
Pretraining Data Detection for Large Language Models: A Divergence-based Calibration Method
Pretraining Data Detection for Large Language Models: A Divergence-based Calibration Method
Weichao Zhang
Ruqing Zhang
Jiafeng Guo
Maarten de Rijke
Yixing Fan
Xueqi Cheng
38
9
0
23 Sep 2024
Unleashing the Power of Emojis in Texts via Self-supervised Graph
  Pre-Training
Unleashing the Power of Emojis in Texts via Self-supervised Graph Pre-Training
Zhou Zhang
Dongzeng Tan
Jiaan Wang
Yilong Chen
Jiarong Xu
29
0
0
22 Sep 2024
Order of Magnitude Speedups for LLM Membership Inference
Order of Magnitude Speedups for LLM Membership Inference
Rongting Zhang
Martín Bertrán
Aaron Roth
47
1
0
22 Sep 2024
TalkMosaic: Interactive PhotoMosaic with Multi-modal LLM Q&A
  Interactions
TalkMosaic: Interactive PhotoMosaic with Multi-modal LLM Q&A Interactions
Kevin Li
Fulu Li
21
0
0
20 Sep 2024
From Linguistic Giants to Sensory Maestros: A Survey on Cross-Modal
  Reasoning with Large Language Models
From Linguistic Giants to Sensory Maestros: A Survey on Cross-Modal Reasoning with Large Language Models
Shengsheng Qian
Zuyi Zhou
Dizhan Xue
Bing Wang
Changsheng Xu
LRM
39
1
0
19 Sep 2024
TinyVLA: Towards Fast, Data-Efficient Vision-Language-Action Models for Robotic Manipulation
TinyVLA: Towards Fast, Data-Efficient Vision-Language-Action Models for Robotic Manipulation
Junjie Wen
Bo Li
Jinming Li
Minjie Zhu
Kun Wu
...
Ran Cheng
Yaxin Peng
Chaomin Shen
Feifei Feng
Jian Tang
LM&Ro
76
50
0
19 Sep 2024
ChefFusion: Multimodal Foundation Model Integrating Recipe and Food
  Image Generation
ChefFusion: Multimodal Foundation Model Integrating Recipe and Food Image Generation
Peiyu Li
Xiaobao Huang
Yijun Tian
Nitesh Chawla
35
0
0
18 Sep 2024
Evaluating the Impact of Compression Techniques on Task-Specific
  Performance of Large Language Models
Evaluating the Impact of Compression Techniques on Task-Specific Performance of Large Language Models
Bishwash Khanal
Jeffery M. Capone
33
1
0
17 Sep 2024
Reasoning Graph Enhanced Exemplars Retrieval for In-Context Learning
Reasoning Graph Enhanced Exemplars Retrieval for In-Context Learning
Yukang Lin
Bingchen Zhong
Shuoran Jiang
Joanna Siebert
Qingcai Chen
RALM
ReLM
LRM
30
0
0
17 Sep 2024
Diversity-grounded Channel Prototypical Learning for Out-of-Distribution
  Intent Detection
Diversity-grounded Channel Prototypical Learning for Out-of-Distribution Intent Detection
Bo Liu
Liming Zhan
Yujie Feng
Zexin Lu
Chengqiang Xie
Lei Xue
Albert Y. S. Lam
Xiao-Ming Wu
OODD
41
1
0
17 Sep 2024
Large Language Models are Good Multi-lingual Learners : When LLMs Meet Cross-lingual Prompts
Large Language Models are Good Multi-lingual Learners : When LLMs Meet Cross-lingual Prompts
Teng Wang
Zhenqi He
Wing-Yin Yu
Xiaojin Fu
Xiongwei Han
LRM
59
5
0
17 Sep 2024
MARCA: Mamba Accelerator with ReConfigurable Architecture
MARCA: Mamba Accelerator with ReConfigurable Architecture
Jinhao Li
Shan Huang
Jiaming Xu
Jun Liu
Li Ding
Ningyi Xu
Guohao Dai
42
6
0
16 Sep 2024
Fit and Prune: Fast and Training-free Visual Token Pruning for
  Multi-modal Large Language Models
Fit and Prune: Fast and Training-free Visual Token Pruning for Multi-modal Large Language Models
Weihao Ye
Qiong Wu
Wenhao Lin
Yiyi Zhou
VLM
41
10
0
16 Sep 2024
ASFT: Aligned Supervised Fine-Tuning through Absolute Likelihood
ASFT: Aligned Supervised Fine-Tuning through Absolute Likelihood
Ruoyu Wang
Jiachen Sun
Shaowei Hua
Quan Fang
21
1
0
14 Sep 2024
Multi-modal Speech Transformer Decoders: When Do Multiple Modalities Improve Accuracy?
Multi-modal Speech Transformer Decoders: When Do Multiple Modalities Improve Accuracy?
Yiwen Guan
V. Trinh
Vivek Voleti
Jacob Whitehill
44
1
0
13 Sep 2024
Your Weak LLM is Secretly a Strong Teacher for Alignment
Your Weak LLM is Secretly a Strong Teacher for Alignment
Leitian Tao
Yixuan Li
88
5
0
13 Sep 2024
Generated Data with Fake Privacy: Hidden Dangers of Fine-tuning Large Language Models on Generated Data
Generated Data with Fake Privacy: Hidden Dangers of Fine-tuning Large Language Models on Generated Data
Atilla Akkus
Mingjie Li
Junjie Chu
Junjie Chu
Michael Backes
Sinem Sav
Sinem Sav
SILM
SyDa
48
1
0
12 Sep 2024
Stable Language Model Pre-training by Reducing Embedding Variability
Stable Language Model Pre-training by Reducing Embedding Variability
Woojin Chung
Jiwoo Hong
Na Min An
James Thorne
Se-Young Yun
38
2
0
12 Sep 2024
PiTe: Pixel-Temporal Alignment for Large Video-Language Model
PiTe: Pixel-Temporal Alignment for Large Video-Language Model
Yang Liu
Pengxiang Ding
Siteng Huang
Min Zhang
Han Zhao
Donglin Wang
40
7
0
11 Sep 2024
FreeRide: Harvesting Bubbles in Pipeline Parallelism
FreeRide: Harvesting Bubbles in Pipeline Parallelism
Jiashu Zhang
Zihan Pan
Molly
Xu
Khuzaima S. Daudjee
90
0
0
11 Sep 2024
Keyword-Aware ASR Error Augmentation for Robust Dialogue State Tracking
Keyword-Aware ASR Error Augmentation for Robust Dialogue State Tracking
Jihyun Lee
Solee Im
Wonjun Lee
Gary Geunbae Lee
36
0
0
10 Sep 2024
Accelerating Large Language Model Pretraining via LFR Pedagogy: Learn, Focus, and Review
Accelerating Large Language Model Pretraining via LFR Pedagogy: Learn, Focus, and Review
Neha Prakriya
Jui-Nan Yen
Cho-Jui Hsieh
Jason Cong
KELM
AI4CE
LRM
35
1
0
10 Sep 2024
Evidence from fMRI Supports a Two-Phase Abstraction Process in Language
  Models
Evidence from fMRI Supports a Two-Phase Abstraction Process in Language Models
Emily Cheng
Richard Antonello
80
4
0
09 Sep 2024
ELMS: Elasticized Large Language Models On Mobile Devices
ELMS: Elasticized Large Language Models On Mobile Devices
Wangsong Yin
Rongjie Yi
Daliang Xu
Gang Huang
Mengwei Xu
Xuanzhe Liu
37
5
0
08 Sep 2024
InstInfer: In-Storage Attention Offloading for Cost-Effective
  Long-Context LLM Inference
InstInfer: In-Storage Attention Offloading for Cost-Effective Long-Context LLM Inference
Xiurui Pan
Endian Li
Qiao Li
Shengwen Liang
Yizhou Shan
Ke Zhou
Yingwei Luo
Xiaolin Wang
Jie Zhang
47
10
0
08 Sep 2024
UNIT: Unifying Image and Text Recognition in One Vision Encoder
UNIT: Unifying Image and Text Recognition in One Vision Encoder
Yi Zhu
Yanpeng Zhou
Chunwei Wang
Yang Cao
Jianhua Han
Lu Hou
Hang Xu
ViT
VLM
34
4
0
06 Sep 2024
COLUMBUS: Evaluating COgnitive Lateral Understanding through
  Multiple-choice reBUSes
COLUMBUS: Evaluating COgnitive Lateral Understanding through Multiple-choice reBUSes
Koen Kraaijveld
Yifan Jiang
Kaixin Ma
Filip Ilievski
LRM
29
1
0
06 Sep 2024
Open-MAGVIT2: An Open-Source Project Toward Democratizing Auto-regressive Visual Generation
Open-MAGVIT2: An Open-Source Project Toward Democratizing Auto-regressive Visual Generation
Zhuoyan Luo
Fengyuan Shi
Yixiao Ge
Yujiu Yang
Limin Wang
Ying Shan
VLM
52
52
0
06 Sep 2024
LAST: Language Model Aware Speech Tokenization
LAST: Language Model Aware Speech Tokenization
A. Turetzky
Yossi Adi
37
3
0
05 Sep 2024
Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding
Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding
Yunze Man
Shuhong Zheng
Zhipeng Bao
M. Hebert
Liang-Yan Gui
Yu-xiong Wang
78
15
0
05 Sep 2024
Foundations of Large Language Model Compression -- Part 1: Weight
  Quantization
Foundations of Large Language Model Compression -- Part 1: Weight Quantization
Sean I. Young
MQ
52
1
0
03 Sep 2024
General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
Haoran Wei
Chenglong Liu
Jinyue Chen
Jia Wang
Lingyu Kong
...
Liang Zhao
Jianjian Sun
Yuang Peng
Chunrui Han
Xiangyu Zhang
VLM
52
44
0
03 Sep 2024
Efficient LLM Context Distillation
Efficient LLM Context Distillation
Rajesh Upadhayayaya
Zachary Smith
Chritopher Kottmyer
Manish Raj Osti
50
1
0
03 Sep 2024
CHESS: Optimizing LLM Inference via Channel-Wise Thresholding and
  Selective Sparsification
CHESS: Optimizing LLM Inference via Channel-Wise Thresholding and Selective Sparsification
Junhui He
Shangyu Wu
Weidong Wen
Chun Jason Xue
Qingan Li
23
5
0
02 Sep 2024
CLIBE: Detecting Dynamic Backdoors in Transformer-based NLP Models
CLIBE: Detecting Dynamic Backdoors in Transformer-based NLP Models
Rui Zeng
Xi Chen
Yuwen Pu
Xuhong Zhang
Tianyu Du
Shouling Ji
43
2
0
02 Sep 2024
Duplex: A Device for Large Language Models with Mixture of Experts,
  Grouped Query Attention, and Continuous Batching
Duplex: A Device for Large Language Models with Mixture of Experts, Grouped Query Attention, and Continuous Batching
Sungmin Yun
Kwanhee Kyung
Juhwan Cho
Jaewan Choi
Jongmin Kim
Byeongho Kim
Sukhan Lee
Kyomin Sohn
Jung Ho Ahn
MoE
49
6
0
02 Sep 2024
LuWu: An End-to-End In-Network Out-of-Core Optimizer for 100B-Scale
  Model-in-Network Data-Parallel Training on Distributed GPUs
LuWu: An End-to-End In-Network Out-of-Core Optimizer for 100B-Scale Model-in-Network Data-Parallel Training on Distributed GPUs
Mo Sun
Zihan Yang
Changyue Liao
Yingtao Li
Fei Wu
Zeke Wang
60
1
0
02 Sep 2024
SAM4MLLM: Enhance Multi-Modal Large Language Model for Referring
  Expression Segmentation
SAM4MLLM: Enhance Multi-Modal Large Language Model for Referring Expression Segmentation
Yi-Chia Chen
Wei-Hua Li
Cheng Sun
Yu-Chiang Frank Wang
Chu-Song Chen
VLM
45
11
0
01 Sep 2024
Investigating Neuron Ablation in Attention Heads: The Case for Peak
  Activation Centering
Investigating Neuron Ablation in Attention Heads: The Case for Peak Activation Centering
Nicholas Pochinkov
Ben Pasero
Skylar Shibayama
27
1
0
30 Aug 2024
Leveraging Large Language Models for Wireless Symbol Detection via
  In-Context Learning
Leveraging Large Language Models for Wireless Symbol Detection via In-Context Learning
Momin Abbas
Koushik Kar
Tianyi Chen
32
5
0
28 Aug 2024
Previous
123...8910...474849
Next