OPT: Open Pre-trained Transformer Language Models

2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
    VLM
    OSLM
    AI4CE

Papers citing "OPT: Open Pre-trained Transformer Language Models"

50 / 2,468 papers shown
Title
Exploring the Integration of Large Language Models into Automatic Speech Recognition Systems: An Empirical Study
Zeping Min
Jinbo Wang
AuLLM
57
13
0
13 Jul 2023
No Train No Gain: Revisiting Efficient Training Algorithms For Transformer-based Language Models
Jean Kaddour
Oscar Key
Piotr Nawrot
Pasquale Minervini
Matt J. Kusner
38
41
0
12 Jul 2023
A Comprehensive Overview of Large Language Models
Humza Naveed
Asad Ullah Khan
Shi Qiu
Muhammad Saqib
Saeed Anwar
Muhammad Usman
Naveed Akhtar
Nick Barnes
Ajmal Mian
OffRL
72
548
0
12 Jul 2023
VELMA: Verbalization Embodiment of LLM Agents for Vision and Language Navigation in Street View
Raphael Schumann
Wanrong Zhu
Weixi Feng
Tsu-Jui Fu
Stefan Riezler
William Yang Wang
LM&Ro
34
64
0
12 Jul 2023
PolyLM: An Open Source Polyglot Large Language Model
Xiangpeng Wei
Hao-Ran Wei
Huan Lin
Tianhao Li
Pei Zhang
...
Yu Bowen
Dayiheng Liu
Baosong Yang
Fei Huang
Jun Xie
LRM
48
57
0
12 Jul 2023
Predictive Pipelined Decoding: A Compute-Latency Trade-off for Exact LLM Decoding
Seongjun Yang
Gibbeum Lee
Jaewoong Cho
Dimitris Papailiopoulos
Kangwook Lee
28
33
0
12 Jul 2023
BLUEX: A benchmark based on Brazilian Leading Universities Entrance eXams
Thales Sales Almeida
Thiago Laitz
Giovana K. Bonás
Rodrigo Nogueira
ELM
24
6
0
11 Jul 2023
Ethicist: Targeted Training Data Extraction Through Loss Smoothed Soft Prompting and Calibrated Confidence Estimation
Zhexin Zhang
Jiaxin Wen
Minlie Huang
38
32
0
10 Jul 2023
Assessing the efficacy of large language models in generating accurate teacher responses
Yann Hicke
Abhishek Masand
Wentao Guo
Tushaar Gangavarapu
ELM
AI4Ed
39
9
0
09 Jul 2023
A Stitch in Time Saves Nine: Detecting and Mitigating Hallucinations of LLMs by Validating Low-Confidence Generation
Neeraj Varshney
Wenlin Yao
Hongming Zhang
Jianshu Chen
Dong Yu
HILM
47
161
0
08 Jul 2023
QIGen: Generating Efficient Kernels for Quantized Inference on Large Language Models
Tommaso Pegolotti
Elias Frantar
Dan Alistarh
Markus Püschel
MQ
24
3
0
07 Jul 2023
PREADD: Prefix-Adaptive Decoding for Controlled Text Generation
Jonathan Pei
Kevin Kaichuang Yang
Dan Klein
61
21
0
06 Jul 2023
A Survey on Evaluation of Large Language Models
Yu-Chu Chang
Xu Wang
Jindong Wang
Yuanyi Wu
Linyi Yang
...
Yue Zhang
Yi-Ju Chang
Philip S. Yu
Qian Yang
Xingxu Xie
ELM
LM&MA
ALM
85
1,551
0
06 Jul 2023
Pruning vs Quantization: Which is Better?
Andrey Kuzmin
Markus Nagel
M. V. Baalen
Arash Behboodi
Tijmen Blankevoort
MQ
43
48
0
06 Jul 2023
Scaling In-Context Demonstrations with Structured Attention
Tianle Cai
Kaixuan Huang
Jason D. Lee
Mengdi Wang
LRM
44
8
0
05 Jul 2023
SkipDecode: Autoregressive Skip Decoding with Batching and Caching for Efficient LLM Inference
Luciano Del Corro
Allison Del Giorno
Sahaj Agarwal
Ting Yu
Ahmed Hassan Awadallah
Subhabrata Mukherjee
46
54
0
05 Jul 2023
Several categories of Large Language Models (LLMs): A Short Survey
Saurabh Pahune
Manoj Chandrasekharan
AILaw
27
14
0
05 Jul 2023
What Matters in Training a GPT4-Style Language Model with Multimodal Inputs?
Yan Zeng
Hanbo Zhang
Jiani Zheng
Jiangnan Xia
Guoqiang Wei
Yang Wei
Yuchen Zhang
Tao Kong
MLLM
27
73
0
05 Jul 2023
External Reasoning: Towards Multi-Large-Language-Models Interchangeable Assistance with Human Feedback
Akide Liu
KELM
LRM
31
0
0
05 Jul 2023
Won't Get Fooled Again: Answering Questions with False Premises
Shengding Hu
Yi-Xiao Luo
Huadong Wang
Xingyi Cheng
Zhiyuan Liu
Maosong Sun
37
22
0
05 Jul 2023
Becoming self-instruct: introducing early stopping criteria for minimal instruct tuning
Waseem Alshikh
Manhal Daaboul
K. Goddard
Brock Imel
Kiran Kamble
Parikshit Kulkarni
M. Russak
ALM
11
13
0
05 Jul 2023
Towards Open Federated Learning Platforms: Survey and Vision from Technical and Legal Perspectives
Moming Duan
Qinbin Li
Linshan Jiang
Bingsheng He
FedML
39
4
0
05 Jul 2023
ProPILE: Probing Privacy Leakage in Large Language Models
Siwon Kim
Sangdoo Yun
Hwaran Lee
Martin Gubri
Sungroh Yoon
Seong Joon Oh
PILM
398
97
3
04 Jul 2023
Shifting Attention to Relevance: Towards the Predictive Uncertainty Quantification of Free-Form Large Language Models
Jinhao Duan
Hao-Ran Cheng
Shiqi Wang
Alex Zavalny
Chenan Wang
Renjing Xu
B. Kailkhura
Kaidi Xu
61
37
0
03 Jul 2023
Trainable Transformer in Transformer
A. Panigrahi
Sadhika Malladi
Mengzhou Xia
Sanjeev Arora
VLM
52
13
0
03 Jul 2023
VOLTA: Improving Generative Diversity by Variational Mutual Information Maximizing Autoencoder
Yueen Ma
Dafeng Chi
Jingjing Li
Kai Song
Yuzheng Zhuang
Irwin King
DRL
38
0
0
03 Jul 2023
PatternGPT: A Pattern-Driven Framework for Large Language Model Text Generation
Le Xiao
Xin Shan
23
5
0
02 Jul 2023
SysNoise: Exploring and Benchmarking Training-Deployment System Inconsistency
Yan Wang
Yuhang Li
Ruihao Gong
Aishan Liu
Yanfei Wang
...
Yongqiang Yao
Yunchen Zhang
Tianzi Xiao
F. Yu
Xianglong Liu
AAML
37
0
0
01 Jul 2023
InstructEval: Systematic Evaluation of Instruction Selection Methods
Anirudh Ajith
Chris Pan
Mengzhou Xia
Ameet Deshpande
Karthik Narasimhan
ELM
33
16
0
01 Jul 2023
Still No Lie Detector for Language Models: Probing Empirical and Conceptual Roadblocks
B. Levinstein
Daniel A. Herrmann
30
55
0
30 Jun 2023
Stitched ViTs are Flexible Vision Backbones
Zizheng Pan
Jing Liu
Haoyu He
Jianfei Cai
Bohan Zhuang
33
2
0
30 Jun 2023
SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs
Lijun Yu
Yong Cheng
Zhiruo Wang
Vivek Kumar
Wolfgang Macherey
...
Yonatan Bisk
Ming-Hsuan Yang
Kevin Patrick Murphy
Alexander G. Hauptmann
Lu Jiang
MLLM
27
52
0
30 Jun 2023
Look, Remember and Reason: Grounded reasoning in videos with language models
Apratim Bhattacharyya
Sunny Panchal
Mingu Lee
Reza Pourreza
Pulkit Madan
Roland Memisevic
LRM
58
7
0
30 Jun 2023
Provable Robust Watermarking for AI-Generated Text
Xuandong Zhao
P. Ananth
Lei Li
Yu-Xiang Wang
WaLM
49
160
0
30 Jun 2023
Towards Language Models That Can See: Computer Vision Through the LENS of Natural Language
William Berrios
Gautam Mittal
Tristan Thrush
Douwe Kiela
Amanpreet Singh
MLLM
VLM
18
61
0
28 Jun 2023
On the Exploitability of Instruction Tuning
Manli Shu
Jiong Wang
Chen Zhu
Jonas Geiping
Chaowei Xiao
Tom Goldstein
SILM
58
93
0
28 Jun 2023
Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias
Yue Yu
Yuchen Zhuang
Jieyu Zhang
Yu Meng
Alexander Ratner
Ranjay Krishna
Jiaming Shen
Chao Zhang
ALM
44
209
0
28 Jun 2023
Extending Context Window of Large Language Models via Positional Interpolation
Shouyuan Chen
Sherman Wong
Liangjian Chen
Yuandong Tian
48
503
0
27 Jun 2023
Understanding In-Context Learning via Supportive Pretraining Data
Xiaochuang Han
Daniel Simig
Todor Mihaylov
Yulia Tsvetkov
Asli Celikyilmaz
Tianlu Wang
AIMat
55
36
0
26 Jun 2023
WinoQueer: A Community-in-the-Loop Benchmark for Anti-LGBTQ+ Bias in Large Language Models
Virginia K. Felkner
Ho-Chun Herbert Chang
Eugene Jang
Jonathan May
OSLM
29
31
0
26 Jun 2023
MotionGPT: Human Motion as a Foreign Language
Biao Jiang
Xin Chen
Wen Liu
Jingyi Yu
Gang Yu
Tao Chen
MLLM
24
272
0
26 Jun 2023
RobuT: A Systematic Study of Table QA Robustness Against Human-Annotated Adversarial Perturbations
Yilun Zhao
Chen Zhao
Linyong Nan
Zhenting Qi
Wenlin Zhang
Xiangru Tang
Boyu Mi
Dragomir R. Radev
AAML
LMTD
35
34
0
25 Jun 2023
Mirage: Towards Low-interruption Services on Batch GPU Clusters with Reinforcement Learning
Qi-Dong Ding
Pengfei Zheng
Shreyas Kudari
Shivaram Venkataraman
Zhao-jie Zhang
VLM
OffRL
26
3
0
25 Jun 2023
Symbolic Chain-of-Thought Distillation: Small Models Can Also "Think" Step-by-Step
Liunian Harold Li
Jack Hessel
Youngjae Yu
Xiang Ren
Kai-Wei Chang
Yejin Choi
LRM
AI4CE
ReLM
34
131
0
24 Jun 2023
H$_2$O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models
Zhenyu Zhang
Ying Sheng
Dinesh Manocha
Tianlong Chen
Lianmin Zheng
...
Yuandong Tian
Christopher Ré
Clark W. Barrett
Zhangyang Wang
Beidi Chen
VLM
69
263
0
24 Jun 2023
Computron: Serving Distributed Deep Learning Models with Model Parallel Swapping
Daniel Zou
X. Jin
Xueyang Yu
Haotian Zhang
J. Demmel
MoE
32
0
0
24 Jun 2023
System-Level Natural Language Feedback
Weizhe Yuan
Kyunghyun Cho
Jason Weston
52
5
0
23 Jun 2023
Long-range Language Modeling with Self-retrieval
Ohad Rubin
Jonathan Berant
RALM
KELM
49
18
0
23 Jun 2023
Quantizable Transformers: Removing Outliers by Helping Attention Heads Do Nothing
Yelysei Bondarenko
Markus Nagel
Tijmen Blankevoort
MQ
28
87
0
22 Jun 2023
Generative Multimodal Entity Linking
Senbao Shi
Zhenran Xu
Baotian Hu
Hao Fei
MLLM
VLM
34
5
0
22 Jun 2023