OPT: Open Pre-trained Transformer Language Models


2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
    VLM
    OSLM
    AI4CE

Papers citing "OPT: Open Pre-trained Transformer Language Models"

50 / 2,473 papers shown
Negative Object Presence Evaluation (NOPE) to Measure Object Hallucination in Vision-Language Models
Holy Lovenia
Wenliang Dai
Samuel Cahyawijaya
Ziwei Ji
Pascale Fung
MLLM
47
52
0
09 Oct 2023
Explainable Claim Verification via Knowledge-Grounded Reasoning with Large Language Models
Haoran Wang
Kai Shu
LRM
50
22
0
08 Oct 2023
ChatRadio-Valuer: A Chat Large Language Model for Generalizable Radiology Report Generation Based on Multi-institution and Multi-system Data
Tianyang Zhong
Wei Zhao
Yutong Zhang
Yi Pan
Peixin Dong
...
Dinggang Shen
Jun-Feng Han
Tianming Liu
Jun Liu
Tuo Zhang
MedIm
LM&MA
55
14
0
08 Oct 2023
Do Large Language Models Know about Facts?
Xuming Hu
Junzhe Chen
Xiaochuan Li
Yingxin Lai
Lijie Wen
Philip S. Yu
Zhijiang Guo
HILM
KELM
44
49
0
08 Oct 2023
Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity
Lu Yin
You Wu
Zhenyu Zhang
Cheng-Yu Hsieh
Yaqing Wang
...
Mykola Pechenizkiy
Yi Liang
Michael Bendersky
Zhangyang Wang
Shiwei Liu
53
82
0
08 Oct 2023
On the Zero-Shot Generalization of Machine-Generated Text Detectors
Xiao Pu
Jingyu Zhang
Xiaochuang Han
Yulia Tsvetkov
Tianxing He
DeLMO
41
14
0
08 Oct 2023
MenatQA: A New Dataset for Testing the Temporal Comprehension and Reasoning Abilities of Large Language Models
Yifan Wei
Yisong Su
Huanhuan Ma
Xiaoyan Yu
Fangyu Lei
Yuanzhe Zhang
Jun Zhao
Kang Liu
LRM
35
10
0
08 Oct 2023
Fast-DetectGPT: Efficient Zero-Shot Detection of Machine-Generated Text via Conditional Probability Curvature
Guangsheng Bao
Yanbin Zhao
Zhiyang Teng
Linyi Yang
Yue Zhang
37
137
0
08 Oct 2023
Zero-Shot Detection of Machine-Generated Codes
Xianjun Yang
Kexun Zhang
Haifeng Chen
Linda R. Petzold
William Y. Wang
Wei Cheng
DeLMO
49
12
0
08 Oct 2023
How Reliable Are AI-Generated-Text Detectors? An Assessment Framework Using Evasive Soft Prompts
Tharindu Kumarage
Paras Sheth
Raha Moraffah
Joshua Garland
Huan Liu
DeLMO
39
24
0
08 Oct 2023
Revisiting Block-based Quantisation: What is Important for Sub-8-bit LLM Inference?
Cheng Zhang
Jianyi Cheng
Ilia Shumailov
George A. Constantinides
Yiren Zhao
MQ
29
9
0
08 Oct 2023
Counter Turing Test CT^2: AI-Generated Text Detection is Not as Easy as You May Think -- Introducing AI Detectability Index
Megha Chakraborty
S.M. Towhidul Islam Tonmoy
S. M. Mehedi
Krish Sharma
Niyar R. Barman
...
Tanay Kumar
Vinija Jain
Aman Chadha
Amit P. Sheth
Amitava Das
DeLMO
27
21
0
08 Oct 2023
Compresso: Structured Pruning with Collaborative Prompting Learns Compact Large Language Models
Song Guo
Jiahang Xu
Li Zhang
Mao Yang
38
14
0
08 Oct 2023
Video-Teller: Enhancing Cross-Modal Generation with Fusion and Decoupling
Haogeng Liu
Qihang Fan
Tingkai Liu
Linjie Yang
Yunzhe Tao
Huaibo Huang
Ran He
Hongxia Yang
VGen
32
12
0
08 Oct 2023
The Troubling Emergence of Hallucination in Large Language Models -- An Extensive Definition, Quantification, and Prescriptive Remediations
Vipula Rawte
Swagata Chakraborty
Agnibh Pathak
Anubhav Sarkar
S.M. Towhidul Islam Tonmoy
Aman Chadha
Amit P. Sheth
Amitava Das
HILM
48
120
0
08 Oct 2023
Dual Grained Quantization: Efficient Fine-Grained Quantization for LLM
Luoming Zhang
Wen Fei
Weijia Wu
Yefei He
Zhenyu Lou
Hong Zhou
MQ
42
5
0
07 Oct 2023
Resprompt: Residual Connection Prompting Advances Multi-Step Reasoning in Large Language Models
Song Jiang
Zahra Shakeri
Aaron Chan
Maziar Sanjabi
Hamed Firooz
...
Bugra Akyildiz
Yizhou Sun
Jinchao Li
Qifan Wang
Asli Celikyilmaz
LRM
ReLM
31
8
0
07 Oct 2023
Reinforced UI Instruction Grounding: Towards a Generic UI Task Automation API
Zhizheng Zhang
Wenxuan Xie
Xiaoyi Zhang
Yan Lu
54
10
0
07 Oct 2023
EMO: Earth Mover Distance Optimization for Auto-Regressive Language Modeling
Siyu Ren
Zhiyong Wu
Kenny Q. Zhu
39
4
0
07 Oct 2023
The Cost of Down-Scaling Language Models: Fact Recall Deteriorates before In-Context Learning
Tian Jin
Nolan Clement
Xin Dong
Vaishnavh Nagarajan
Michael Carbin
Jonathan Ragan-Kelley
Gintare Karolina Dziugaite
LRM
74
5
0
07 Oct 2023
ReLU Strikes Back: Exploiting Activation Sparsity in Large Language Models
Iman Mirzadeh
Keivan Alizadeh-Vahid
Sachin Mehta
Carlo C. Del Mundo
Oncel Tuzel
Golnoosh Samei
Mohammad Rastegari
Mehrdad Farajtabar
126
62
0
06 Oct 2023
Reward Dropout Improves Control: Bi-objective Perspective on Reinforced LM
Changhun Lee
Chiehyeon Lim
45
0
0
06 Oct 2023
How to Capture Higher-order Correlations? Generalizing Matrix Softmax Attention to Kronecker Computation
Josh Alman
Zhao Song
60
34
0
06 Oct 2023
MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning
Ke Wang
Houxing Ren
Aojun Zhou
Zimu Lu
Sichun Luo
Weikang Shi
Renrui Zhang
Linqi Song
Mingjie Zhan
Hongsheng Li
ReLM
LRM
SyDa
35
94
0
05 Oct 2023
GoLLIE: Annotation Guidelines improve Zero-Shot Information-Extraction
Oscar Sainz
Iker García-Ferrero
Rodrigo Agerri
Oier López de Lacalle
German Rigau
Eneko Agirre
58
77
0
05 Oct 2023
Balancing Autonomy and Alignment: A Multi-Dimensional Taxonomy for Autonomous LLM-powered Multi-Agent Architectures
Thorsten Händler
LLMAG
35
23
0
05 Oct 2023
Evaluating Hallucinations in Chinese Large Language Models
Qinyuan Cheng
Tianxiang Sun
Wenwei Zhang
Siyin Wang
Xiangyang Liu
...
Junliang He
Mianqiu Huang
Zhangyue Yin
Kai Chen
Xipeng Qiu
HILM
ELM
46
25
0
05 Oct 2023
Expedited Training of Visual Conditioned Language Generation via Redundancy Reduction
Yiren Jian
Tingkai Liu
Yunzhe Tao
Chunhui Zhang
Soroush Vosoughi
Hongxia Yang
VLM
25
7
0
05 Oct 2023
InstructProtein: Aligning Human and Protein Language via Knowledge Instruction
Zeyuan Wang
Qiang Zhang
Keyan Ding
Ming Qin
Zhuang Xiang
Xiaotong Li
Huajun Chen
46
29
0
05 Oct 2023
Deep Representations of First-person Pronouns for Prediction of Depression Symptom Severity
Xinyang Ren
Hannah A. Burkhardt
Patricia A. Areán
Thomas D Hull
Trevor Cohen
18
0
0
05 Oct 2023
JsonTuning: Towards Generalizable, Robust, and Controllable Instruction Tuning
Chang Gao
Wenxuan Zhang
Guizhen Chen
Wai Lam
62
6
0
04 Oct 2023
Shadow Alignment: The Ease of Subverting Safely-Aligned Language Models
Xianjun Yang
Xiao Wang
Qi Zhang
Linda R. Petzold
William Y. Wang
Xun Zhao
Dahua Lin
36
172
0
04 Oct 2023
MAD Max Beyond Single-Node: Enabling Large Machine Learning Model Acceleration on Distributed Systems
Samuel Hsia
Alicia Golden
Bilge Acun
Newsha Ardalani
Zach DeVito
Gu-Yeon Wei
David Brooks
Carole-Jean Wu
MoE
53
9
0
04 Oct 2023
The Role of Linguistic Priors in Measuring Compositional Generalization of Vision-Language Models
Chenwei Wu
Erran L. Li
Stefano Ermon
Patrick Haffner
Rong Ge
Zaiwei Zhang
VLM
CoGe
55
0
0
04 Oct 2023
Improving Automatic VQA Evaluation Using Large Language Models
Oscar Manas
Benno Krojer
Aishwarya Agrawal
37
21
0
04 Oct 2023
TWIZ-v2: The Wizard of Multimodal Conversational-Stimulus
Rafael Ferreira
Diogo Tavares
Diogo Glória-Silva
Rodrigo Valerio
João Bordalo
Ines Simoes
Vasco Ramos
David Semedo
João Magalhães
29
4
0
03 Oct 2023
OceanGPT: A Large Language Model for Ocean Science Tasks
Zhen Bi
Ningyu Zhang
Yida Xue
Yixin Ou
Daxiong Ji
Guozhou Zheng
Huajun Chen
ALM
LLMAG
43
29
0
03 Oct 2023
SEA: Sparse Linear Attention with Estimated Attention Mask
Heejun Lee
Jina Kim
Jeffrey Willette
Sung Ju Hwang
43
7
0
03 Oct 2023
VAL: Interactive Task Learning with GPT Dialog Parsing
Lane Lawley
Christopher MacLellan
VLM
21
11
0
02 Oct 2023
Language Model Decoding as Direct Metrics Optimization
Haozhe Ji
Pei Ke
Hongning Wang
Minlie Huang
29
7
0
02 Oct 2023
Necessary and Sufficient Watermark for Large Language Models
Yuki Takezawa
Ryoma Sato
Han Bao
Kenta Niwa
Makoto Yamada
WaLM
57
7
0
02 Oct 2023
Analyzing and Mitigating Object Hallucination in Large Vision-Language Models
Yiyang Zhou
Chenhang Cui
Jaehong Yoon
Linjun Zhang
Zhun Deng
Chelsea Finn
Mohit Bansal
Huaxiu Yao
MLLM
75
170
0
01 Oct 2023
JoMA: Demystifying Multilayer Transformers via JOint Dynamics of MLP and Attention
Yuandong Tian
Yiping Wang
Zhenyu Zhang
Beidi Chen
Simon Shaolei Du
40
36
0
01 Oct 2023
Dynamic Demonstrations Controller for In-Context Learning
Fei Zhao
Taotian Pang
Zhen Wu
Zheng Ma
Shujian Huang
Xinyu Dai
35
5
0
30 Sep 2023
Understanding In-Context Learning from Repetitions
Jianhao Yan
Jin Xu
Chiyu Song
Chenming Wu
Yafu Li
Yue Zhang
43
22
0
30 Sep 2023
Corex: Pushing the Boundaries of Complex Reasoning through Multi-Model Collaboration
Qiushi Sun
Zhangyue Yin
Xiang Li
Zhiyong Wu
Xipeng Qiu
Lingpeng Kong
LRM
LLMAG
41
44
0
30 Sep 2023
Efficient Streaming Language Models with Attention Sinks
Guangxuan Xiao
Yuandong Tian
Beidi Chen
Song Han
Mike Lewis
AI4TS
RALM
44
689
0
29 Sep 2023
Network Memory Footprint Compression Through Jointly Learnable Codebooks and Mappings
Edouard Yvinec
Arnaud Dapogny
Kévin Bailly
MQ
29
1
0
29 Sep 2023
PB-LLM: Partially Binarized Large Language Models
Yuzhang Shang
Zhihang Yuan
Qiang Wu
Zhen Dong
MQ
38
44
0
29 Sep 2023
Training and inference of large language models using 8-bit floating point
Sergio P. Perez
Yan Zhang
James Briggs
Charlie Blake
Prashanth Krishnamurthy
Paul Balanca
Carlo Luschi
Stephen Barlow
Andrew William Fitzgibbon
MQ
42
18
0
29 Sep 2023